Data Engineer
Job Description
Key Responsibilities:
Develop and maintain scalable ETL/ELT pipelines using AWS Glue, PySpark, and Python
Build event-driven workflows using AWS Lambda
Design and manage real-time streaming solutions using Kafka, ksqlDB (KSQL), and Apache Flink
Implement and enforce comprehensive data quality frameworks, including validation, profiling, monitoring, and reconciliation
Optimize data processing performance, scalability, reliability, and cost in cloud environments
Collaborate with cross-functional teams to deliver reliable, production-grade data platforms and ensure data integrity across the pipeline
Required Skills:
Strong hands-on experience with Python and PySpark
Proven expertise in AWS Glue, Lambda, and other cloud-native data services
Solid experience with the Kafka ecosystem (topics, partitions, consumer groups, streaming patterns)
Demonstrated experience building and supporting data quality frameworks (validation rules, reconciliation checks, profiling, anomaly detection)
Strong understanding of distributed data processing and scalable architecture patterns