Site Reliability Engineer
Job Description
Our Digital Operations team is looking for a Systems Engineer or Site Reliability Engineer (SRE) who is passionate about the customer experience and has the analytical and multitasking abilities to thrive in a fast-paced environment. The SRE is responsible for ensuring that, as new features and applications are introduced to production, the essential aspects of reliability (availability, resiliency, latency, efficiency, change management, monitoring, emergency response, and capacity planning) are addressed alongside development of those features and applications. The SRE will develop automation code and scripts to proactively address customer issues, reduce mean time to repair, and improve application availability. The position also involves collaborating closely with feature delivery teams, acting as a bridge between development and operations by applying a software engineering mindset to system administration. Time will be split between operations/on-call duties and guiding the development of systems and software that increase site reliability and performance to deliver business value. The SRE will need intimate knowledge of the current state of the data center and cloud infrastructure, CI/CD pipeline tools, Kubernetes, and Site Reliability Engineering practices, as well as the ability to implement the plan for the desired future state. Attention to detail and strong analytical skills are required.