
Site Reliability Engineer
Job Description
Design, implement, and maintain scalable, highly available infrastructure on AWS, ensuring reliability, performance, and security of production systems.
Monitor system health, troubleshoot incidents, and lead root cause analysis (RCA) to minimize downtime and improve service stability.
Automate deployment, monitoring, and operational processes using tools like Terraform, CloudFormation, and CI/CD pipelines.
Collaborate with development and DevOps teams to improve system architecture, implement best practices, and enhance observability (logging, metrics, alerting).
Manage and optimize AWS services such as EC2, S3, RDS, Lambda, and Kubernetes (EKS), with a focus on cost efficiency and performance tuning.
Similar Jobs
Android Engineer
California
QA Engineer
Texas
: DevOps Engineer With Active Directory
California
Quality Engineering & Test Architecture
Connecticut
Senior QA Engineer / QA Lead / Quality Engineering Lead
Virginia