MLOps L2 Support Engineer
Job Description
Role: MLOps L2 Support Engineer
Location: Reading, PA
Duration: Long Term
Key Responsibilities:
Incident Management & Support:
· Provide L2 support for MLOps production environments, ensuring uptime and reliability.
· Troubleshoot ML pipelines, data processing jobs, and API issues.
· Monitor logs, alerts, and performance metrics using Dataiku, Prometheus, Grafana, or AWS tools such as CloudWatch.
· Perform root cause analysis (RCA) and resolve incidents within SLAs.
· Escalate unresolved issues to L3 engineering teams when needed.
Dataiku Platform Management:
· Manage Dataiku DSS workflows, troubleshoot job failures, and optimize performance.
· Monitor and support Dataiku plugins, APIs, and automation scenarios.
· Collaborate with Data Scientists and Data Engineers to debug ML model deployments.
· Maintain version control and CI/CD integration for Dataiku projects.
Deployment & Automation:
· Support CI/CD pipelines for ML model deployment (Bamboo, Bitbucket, etc.).
· Deploy ML models and data pipelines using Docker, Kubernetes, or Dataiku Flow.
· Automate monitoring and alerting for ML model drift, data quality, and performance.
Cloud & Infrastructure Support:
· Monitor AWS-based ML workloads (SageMaker, Lambda, ECS, S3, RDS).
· Manage storage and compute resources for ML workflows.
· Support database connections, data ingestion, and ETL pipelines (SQL, Spark, Kafka).
Security & Compliance:
· Ensure secure access control for ML models and data pipelines.
· Support audit, compliance, and governance for Dataiku and MLOps workflows.