Data Engineer
Job Description
Key Responsibilities:
Design and develop scalable batch and real-time data pipelines on GCP
Implement Medallion Architecture (Bronze, Silver, Gold layers)
Build data transformations using Python and PySpark
Develop and optimize complex SQL queries for analytics
Work extensively with BigQuery for large-scale data processing
Develop pipelines using Cloud Dataflow
Orchestrate workflows using Cloud Composer (Airflow)
Manage data storage using Google Cloud Storage (GCS)
Implement CI/CD pipelines using Git-based tools
Ensure data security, governance, and access control using GCP IAM
Optimize performance, scalability, and cost-efficiency
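The Bronze/Silver/Gold layering named above can be illustrated with a minimal pure-Python sketch. In practice each layer would be a PySpark DataFrame persisted to GCS or BigQuery; plain dicts are used here only so the example is self-contained, and field names such as `order_id` and `region` are hypothetical:

```python
# Medallion Architecture sketch: Bronze (raw), Silver (cleaned),
# Gold (aggregated). Plain Python stands in for PySpark DataFrames.

# Bronze layer: raw ingested records, including duplicates and bad rows.
bronze = [
    {"order_id": 1, "amount": "100.0", "region": "EU"},
    {"order_id": 1, "amount": "100.0", "region": "EU"},  # duplicate
    {"order_id": 2, "amount": "bad",   "region": "US"},  # unparseable
    {"order_id": 3, "amount": "50.5",  "region": "US"},
]

def to_silver(rows):
    """Silver layer: deduplicate on order_id and enforce types."""
    seen, out = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows that fail validation
        if row["order_id"] in seen:
            continue  # drop duplicate keys
        seen.add(row["order_id"])
        out.append({**row, "amount": amount})
    return out

def to_gold(rows):
    """Gold layer: business-level aggregate (revenue per region)."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals

silver = to_silver(bronze)   # 2 clean rows survive
gold = to_gold(silver)       # {"EU": 100.0, "US": 50.5}
```

The same shape carries over to PySpark: the Silver step becomes `dropDuplicates` plus cast/filter expressions, and the Gold step becomes a `groupBy("region").sum("amount")`.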
Mandatory Skills:
Strong experience with Google Cloud Platform (GCP)
Expertise in BigQuery (partitioning, clustering, optimization)
Hands-on experience with Medallion Architecture
Strong Python and PySpark experience
Advanced SQL skills (joins, window functions, tuning)
Experience with Cloud Dataflow and Airflow (Composer)
Experience with GCS, CI/CD pipelines, and Git
Knowledge of GCP IAM and cloud security
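As a sketch of the window-function skill listed above: the `OVER (PARTITION BY ... ORDER BY ...)` clause below is the same in BigQuery Standard SQL; Python's bundled sqlite3 (SQLite >= 3.25) is used here only so the query is runnable without GCP credentials, and the `orders` table and its columns are hypothetical:

```python
import sqlite3

# Window-function example: rank orders by amount within each region.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EU", 100.0), ("EU", 40.0), ("US", 50.5), ("US", 75.0)],
)

rows = conn.execute("""
    SELECT region, amount,
           RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
    FROM orders
    ORDER BY region, rnk
""").fetchall()
# rows -> [('EU', 100.0, 1), ('EU', 40.0, 2), ('US', 75.0, 1), ('US', 50.5, 2)]
```

On BigQuery, pairing a query like this with a table partitioned by date and clustered by `region` is what keeps scanned bytes (and cost) down at scale.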