Data Engineer
Job Description
We are seeking a Python-based platform engineer to design and build a containerized API layer that abstracts and governs interactions with compute engines such as Apache Spark through a well-defined API contract. This role focuses on building platform capabilities rather than simply consuming existing data tools, enabling consistent, secure, and scalable access to Spark-based and non-Spark data pipelines.
The ideal candidate has strong experience developing production-grade APIs in Python that interface with data frameworks or pipeline orchestration systems, packaging services using containers (Docker/Kubernetes), and operating them as reusable platform services. A proven background in CI/CD automation using GitHub Actions is required, along with solid software engineering practices around testing, versioning, and deployment.
This role is suited for engineers who have built data or compute platforms, service layers, or internal frameworks—translating complex data pipeline capabilities into stable, contract-driven APIs for broad enterprise use.
Tech Skills
• Experience building control planes
• Design patterns
o Modular containerized services
o Asynchronous processing
o Job-based work queue execution
o Immutability and idempotency
o Metadata management
o Policy enforcement
o Semantic modeling
o Observability and telemetry
o Side-effect capability injection
Testing
• Functional testing
• Behavioral testing
Languages
• Python
Frameworks and Platforms
• FastAPI
• ARQ
• Kubernetes
• Pytest