Senior Site Reliability Engineer
Nitya Inc
Contract
Required Skillset:
Python, Linux, Azure, Docker, Jenkins, Kubernetes, Unix, Go, Bash, Terraform, Prometheus, Grafana, AWS, GCP, GitHub Actions, CloudFormation, GitLab CI, ELK stack, AWS Certified DevOps Engineer, Kubernetes Certification
Job Description
- Implement and manage observability tools (logging, monitoring, tracing)
- Lead incident response, root cause analysis (RCA), and postmortems
- Improve CI/CD pipelines and release processes
- Collaborate with development teams to enhance system reliability and performance
- Optimize cloud infrastructure costs and performance
- Develop runbooks and operational documentation
Required Qualifications
- Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience)
- 5+ years of experience in SRE, DevOps, or infrastructure engineering
- Strong experience with cloud platforms (AWS, Azure, or GCP)
- Proficiency in scripting/programming (Python, Go, Bash, or similar)
- Experience with containerization and orchestration (Docker, Kubernetes)
- Strong knowledge of Linux/Unix systems
- Experience with CI/CD tools (Jenkins, GitHub Actions, GitLab CI)
- Understanding of networking, security, and distributed systems
Preferred Qualifications
- Experience with observability tools (Prometheus, Grafana, ELK stack)
- Familiarity with microservices architecture
- Experience with Infrastructure as Code tools (Terraform, CloudFormation)
- Knowledge of chaos engineering and reliability testing
- Exposure to security best practices and compliance standards
- Certifications (AWS Certified DevOps Engineer, Kubernetes, etc.)
Skills
- Strong problem-solving and troubleshooting abilities
- Excellent communication and collaboration skills