SumerSports
AI insights + over 600 years of NFL expertise.
MLOps, ML Platform Engineer
Location: United States
Posted: 13 days ago
Salary: Not specified
4 yrs exp, English, AWS, Azure, Cloud, Google Cloud Platform, Kubernetes, Python, Spark
Job Description
• Design and operate ML infrastructure: Manage data, training, serving, and inference systems for high-throughput model workflows
• Build scalable pipelines: Implement reproducible training and evaluation pipelines with versioning, scheduling, and artifact tracking
• Optimize compute and cost: Tune GPU and CPU workloads, manage clusters, and drive efficiency via rightsizing, spot scheduling, and caching
• Serve models in production: Operate APIs for low-latency inference with autoscaling, blue-green or canary rollouts, and rollback safety
• Ensure reliability and observability: Define and own SLOs; instrument pipelines and services to track latency, cost, drift, and data quality
• Secure and automate: Manage IAM, secrets, and container security; automate deployment pipelines via CI/CD and infrastructure as code
• Collaborate cross-functionally: Partner with research scientists and AI engineers to deliver models from experiment to production with minimal friction
• Document and enable: Build templates, runbooks, and internal tooling that make ML workflows repeatable, safe, and fast
Job Requirements
- 4+ years of experience in ML platform, DevOps, or infrastructure engineering
- Deep knowledge of Kubernetes, CI/CD, containers, and cloud infrastructure (AWS, GCP, or Azure)
- Hands-on experience managing GPU clusters and training/inference pipelines
- Familiarity with data orchestration, processing engines, and storage formats (Spark, Polars, Delta, Parquet)
- Proven ability to ship and operate production ML systems with SLOs
- Strong Python skills and comfort with infrastructure as code and automation
- Experience with observability and cost optimization at scale
Benefits
- Competitive Salary and Bonus Plan
- Comprehensive health insurance plan
- Retirement savings plan (401k) with company match
- Remote working environment
- A flexible, unlimited time-off policy
- Generous paid holiday schedule: 13 holidays in total, including the Monday after the Super Bowl