Mainframe Site Reliability Engineering (SRE)
Job Description
Job Title: Mainframe Site Reliability Engineering (SRE)
Location: Columbus, OH (4 days onsite)
Duration: Long Term
Visa: GC/USC
Need someone from mainframe production support with L2/L3 production support experience. Local candidates are preferred, since a backout has already happened.
Role Summary:
The Mainframe SRE is responsible for ensuring the reliability, availability, performance, and scalability of enterprise mainframe platforms. This role blends traditional mainframe engineering with modern SRE principles, focusing on automation, observability, incident management, and continuous improvement. The lead will guide a team of engineers while partnering closely with application, infrastructure, and operations teams.
Key Responsibilities:
Lead the Mainframe SRE team, providing technical direction, mentoring, and performance guidance
Own the reliability, availability, and resilience of mainframe environments (z/OS and related subsystems)
Define and implement SRE practices such as SLIs, SLOs, SLAs, error budgets, and reliability metrics
Drive automation to reduce manual operations, improve recovery time, and enhance system stability
Oversee monitoring, alerting, and observability for mainframe systems using modern and legacy tools
Lead incident management, root cause analysis (RCA), and post-incident reviews
Partner with application development teams to improve reliability, performance, and deployment practices
Plan and execute capacity management, performance tuning, and workload optimization
Ensure compliance with security, regulatory, and audit requirements
Lead disaster recovery (DR) planning, testing, and high-availability strategies
Champion continuous improvement, DevOps, and SRE culture within mainframe operations
Required Qualifications:
10+ years of experience in mainframe systems engineering or operations
Strong hands-on expertise with IBM z/OS
Experience with core mainframe components such as:
CICS, IMS, DB2
JES2/JES3
MQ, SMF, SDSF
Solid understanding of mainframe performance tuning and capacity planning
Experience leading production support and managing major incidents
Strong scripting and automation skills (REXX, JCL, CLIST, Python, or equivalent)
Familiarity with monitoring and scheduling tools (e.g., OMEGAMON, CA/BMC tools, Control-M)
Preferred Qualifications:
Experience applying SRE principles in a mainframe or hybrid (mainframe + distributed) environment
Exposure to DevOps, CI/CD, and automation frameworks
Knowledge of Linux on Z and cloud integration patterns
Experience with resilience engineering, chaos testing, or fault injection concepts
Prior people-lead or technical-lead experience
Thanks & Regards
Ganesh Lakkimsetty
Siri InfoSolutions Inc.
Email: xxxxxxxxxxxxxxx
Hangouts: xxxxxxxxxxxxxxx
Feel free to reach me by email or on LinkedIn.