Alpaca
Developer APIs for stocks and crypto trading, investing apps, and embedded fintech.
Staff Site Reliability Engineer, Streaming
Location
United States
Posted
111 days ago
Salary
Not specified
5 yrs exp, English, Distributed Systems, Kafka, Kubernetes, Linux, Prometheus, RabbitMQ, Go
Job Description
• Triage difficult technical problems and implement solutions.
• Enhance our RabbitMQ and Redpanda observability stack by defining Service Level Objectives (SLOs) and alerts, as well as implementing profiling and logging.
• Improve the reliability of our RabbitMQ and Redpanda clients.
• Incident Management: Respond to and resolve incidents in a timely manner, conducting post-incident reviews to identify and implement improvements.
• Collaboration: Work closely with development teams to ensure new features and services are designed with reliability and scalability in mind.
• Capacity Planning: Monitor system capacity and performance, making recommendations and implementing changes to handle future growth.
Job Requirements
- 5+ years of experience in Site Reliability Engineering, Performance Engineering, or similar roles.
- 5+ years of experience with message brokers such as Kafka, RabbitMQ, or Redpanda.
- Proven track record of managing and maintaining large-scale, high-availability, and high-performance distributed systems.
- Experience designing and implementing SLIs, SLOs, and SLAs for internal and third-party systems with comprehensive alerting and monitoring.
- Strong ability to work independently, lead and deliver on large tasks, and collaborate with other members of the organization or external partners.
- Significant production experience with Kubernetes.
- Proficient with Go.
- Proficient with Prometheus.
- Proficient with Linux.
- Experience troubleshooting message broker performance issues.
Benefits
- Competitive Salary & Stock Options
- Health Benefits
- New Hire Home-Office Setup: One-time USD $500
- Monthly Stipend: USD $150 per month via a Brex Card