ML Ops Engineer
Machine Learning Engineer | Full Time | Remote | Team 1,001-5,000 | H1B Sponsor
Location
New Jersey
Posted
55 days ago
Salary
$127K - $160.6K / year
Bachelor Degree | 2 yrs exp | English | AWS | Cloud | Docker | Kubernetes | NumPy | Pandas | Python | SQL | TensorFlow
Job Description
• Build and maintain monitoring infrastructure for conventional machine learning models, with capabilities for performance tracking, drift detection, and alerting.
• Research, evaluate, and implement monitoring strategies and tools for Generative AI systems, including LLMs and Agentic AI architectures.
• Collaborate with ML Engineers, Data Scientists, and DevOps teams to deploy, manage, and monitor models in production.
• Develop and support scalable, secure, and automated data pipelines using Snowflake, SQL, and Python for training, serving, and monitoring ML and GenAI models.
• Leverage AutoML tools and frameworks (e.g., MLflow, Kubeflow, SageMaker Autopilot) to streamline experimentation and deployment.
• Design dashboards and reporting systems to visualize model health metrics and surface key operational insights.
• Ensure auditability, reproducibility, and compliance for model performance and data flow in production environments, with consideration for regulatory standards like GDPR and HIPAA.
• Maintain CI/CD workflows and version-controlled codebases (e.g., Git) for ML infrastructure and pipelines.
• Use containerization and orchestration technologies (e.g., Docker, Kubernetes) to manage scalable ML infrastructure.
• Leverage tools such as Streamlit and Python visualization libraries to present insights from model and data monitoring.
• Perform root cause analyses on model degradation or data quality issues, and proactively implement improvements.
• Stay current on industry developments related to ML observability, model governance, responsible GenAI practices, and AI security.
• Contribute to analytics projects and data engineering initiatives as needed.
• Provide off-hours support for critical deployments or urgent data/model issues.
Job Requirements
- 2–5 years of experience in ML Ops, ML Engineering, or a related role with a focus on production-level model monitoring, automation, and deployment.
- Strong experience with ML observability tools or custom-built monitoring systems.
- Experience with monitoring LLMs and Generative AI models, including prompt evaluation, hallucination tracking, and agent behavior auditing.
- Experience in deploying and managing ML workloads using containerization and orchestration platforms such as Docker, Kubernetes, Kubeflow, or TensorFlow Extended.
- Familiarity with AutoML pipelines and workflow management tools (e.g., MLflow, SageMaker Autopilot).
- Experience working in cloud environments, preferably AWS (e.g., SageMaker, S3, Lambda, ECS/EKS).
- Understanding of ML lifecycle tools (e.g., MLflow, SageMaker Pipelines) and CI/CD practices.
- Strong security and compliance awareness, particularly related to model/data governance (e.g., HIPAA, GDPR).
- Proficiency in Python and key data libraries (Pandas, NumPy, Matplotlib, etc.).
- Advanced SQL skills and experience with Snowflake or similar data warehousing platforms.
- Proficiency with version control (Git) and agile development methodologies.
- Strong collaboration and communication skills, with the ability to explain technical issues to both technical and non-technical stakeholders.
- Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field, or equivalent industry experience.
- Domain experience in healthcare data (claims, payments) is preferred.
Benefits
- 401k plan with employer match
- flexible paid time off
- holidays
- parental leave
- life and disability insurance
- health benefits including medical, dental, vision, and prescription drug coverage