
Sr. Databricks Engineer
Job Description
Overview
We are seeking a Data Engineer to lead the modernization of legacy ETL systems by migrating Ab Initio workflows to scalable, modular PySpark pipelines on Databricks. The role involves transforming complex data ecosystems into cloud-native architectures while ensuring data integrity, performance, and reliability.
🎯 Key Responsibilities
ETL Modernization & Development
• Analyze and migrate legacy ETL workflows from Ab Initio to PySpark-based pipelines
• Design and develop scalable data pipelines on Databricks
• Refactor monolithic processes into modular, reusable components
• Leverage existing enterprise datasets to avoid redundancy
Data Integration & Processing
• Build and maintain ETL/ELT pipelines integrating data from Snowflake and other sources
• Process and publish enriched datasets for downstream applications
• Support batch and near real-time data processing
Data Lineage & Optimization
• Create end-to-end data lineage and data flow diagrams
• Identify redundancies and drive process consolidation and optimization
• Ensure adherence to data governance and quality standards
Testing & Validation
• Develop unit, integration, and reconciliation testing frameworks
• Perform dual-run comparisons with legacy systems
• Validate outputs in UAT and pre-production environments
Deployment & Operations
• Support cutover and migration strategy from legacy systems
• Decommission legacy workflows and optimize scheduling (e.g., Control-M)
• Develop runbooks, monitoring, and operational documentation
Collaboration
• Work with data architects, analysts, and downstream application teams
• Coordinate acceptance testing (UAT/FAT) and stakeholder sign-offs