Data Engineer
Metasoftinc
Contract
Required Skillset:
Python, Azure, Redis, Lambda, Apache Airflow, S3, Redshift, Kafka, Glue, Step Functions, Data Governance, Data Cataloging, Metadata Management, Semantic Layer Design, AWS, FastAPI, MySQL, PostgreSQL, PySpark, BigQuery, pandas, MLflow, LLMs, SageMaker, MLOps, ElasticSearch, DynamoDB, Containerization (Docker), Orchestration (Kubernetes), TS/SCI
Job Description
Job Title: ITP_Data Engineer
Location: On-site; Springfield, VA or Nebraska Avenue Complex (NAC), Washington, DC (DHS Headquarters)
Client: DHS HQ
Employment Type: Full-Time
Clearance Required: Current TS/SCI. Must be a US citizen with the ability to obtain DHS EOD SCI.
Position Overview
We are seeking a skilled ITP_Data Engineer to design, build, and maintain robust data pipelines that enable scalable, secure, and intelligent data processing across cloud environments. The ideal candidate will have hands-on experience in data acquisition from diverse sources, deep familiarity with modern data storage paradigms, and a passion for building pipelines that support advanced analytics and AI/ML use cases.
Key Responsibilities
- Data Ingestion & Acquisition: Collect and integrate data from a wide variety of structured and unstructured sources including APIs, RDBMS, file systems, third-party services, and real-time streams.
- Pipeline Development: Design and implement scalable ETL/ELT pipelines to clean, enrich, normalize, and semantically align data (ontology-driven transformations).
- Cloud Deployment: Build and deploy data pipelines and associated infrastructure on AWS or Azure, using managed services like Lambda, Glue, Step Functions, Azure Data Factory, etc.
- Database Architecture: Understand and optimize for different storage engines, including relational (PostgreSQL, MySQL), columnar (Redshift, BigQuery), indexing engines (ElasticSearch), key-value stores (DynamoDB, Redis), object stores (S3 or similar), and caching layers.
- Streaming Data Processing: Work with Apache Kafka (or similar platforms) to handle high-volume, low-latency data streams (see the consumer sketch after this list).
- Workflow Orchestration: Utilize Apache Airflow (or an equivalent) to schedule and monitor complex data workflows (see the DAG sketch after this list).
- AI/ML Integration: Collaborate with data scientists to integrate LLMs and ML models into pipelines for inference, tagging, enrichment, or intelligent routing of data (see the enrichment sketch after this list).
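As an illustration of the streaming responsibility above, the sketch below shows a minimal Kafka consumer loop in Python. It is hypothetical: the topic name, broker address, and the choice of the kafka-python client are assumptions, not part of this role's actual stack.

```python
# Minimal Kafka consumer sketch using the kafka-python client (an assumption;
# confluent-kafka or another client may be used instead). Topic and broker
# names are hypothetical placeholders.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "raw-events",                          # hypothetical topic name
    bootstrap_servers=["localhost:9092"],  # placeholder broker address
    group_id="ingest-pipeline",            # consumer group for offset tracking
    auto_offset_reset="earliest",          # start from the oldest unread message
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

for message in consumer:
    record = message.value
    # Downstream cleaning/enrichment would go here; the sketch just prints
    # the message coordinates and payload.
    print(message.topic, message.partition, message.offset, record)
```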
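Workflow orchestration with Apache Airflow typically resembles the minimal DAG below. The DAG name, daily schedule, and extract/transform/load callables are placeholders; a pipeline for this role would more likely trigger Glue jobs, Step Functions, or Azure Data Factory activities than plain Python functions.

```python
# Minimal Airflow DAG sketch (assumes Airflow 2.4+ and the classic
# PythonOperator style). All task logic is placeholder code.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Hypothetical: pull new records from an API or RDBMS.
    return "extracted"


def transform():
    # Hypothetical: clean, enrich, and semantically align the records.
    return "transformed"


def load():
    # Hypothetical: write the result to Redshift, S3, or another store.
    return "loaded"


with DAG(
    dag_id="example_etl",           # hypothetical DAG name
    schedule="@daily",              # 'schedule_interval' on Airflow versions before 2.4
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Run extract -> transform -> load in order.
    t_extract >> t_transform >> t_load
```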
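For the AI/ML integration responsibility, an enrichment step in a pipeline often calls a deployed model endpoint. The sketch below invokes a SageMaker endpoint via boto3; the endpoint name, payload shape, and response format are assumptions, and an Azure ML or self-hosted LLM endpoint would follow the same pattern.

```python
# Hypothetical enrichment step: send a record to a deployed model endpoint
# (SageMaker here, via boto3) and attach the returned tags to the record.
# The endpoint name and response fields are assumptions for this sketch.
import json

import boto3

runtime = boto3.client("sagemaker-runtime")


def enrich_record(record: dict) -> dict:
    response = runtime.invoke_endpoint(
        EndpointName="doc-tagging-model",  # hypothetical endpoint name
        ContentType="application/json",
        Body=json.dumps({"text": record.get("text", "")}),
    )
    prediction = json.loads(response["Body"].read())
    # Attach model output (e.g., tags or routing labels) to the record.
    record["tags"] = prediction.get("tags", [])
    return record


if __name__ == "__main__":
    print(enrich_record({"id": 1, "text": "sample document"}))
```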
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 10+ years of experience in data engineering or software development roles.
- Strong proficiency in Python, including experience with libraries like pandas, PySpark, FastAPI, or similar.
- Solid experience with cloud services (AWS or Azure) and cloud-native data engineering tools.
- Proven experience in building and maintaining data pipelines using Kafka, Airflow, and other open-source frameworks.
- Strong grasp of database internals and trade-offs between different storage technologies.
- Familiarity with data governance, lineage, and metadata management concepts.
- Experience or strong interest in integrating LLMs and AI/ML models into production-grade data systems.
Preferred Qualifications
- Knowledge of data cataloging tools and semantic layer design.
- Experience with containerization (Docker) and orchestration (Kubernetes).
- Familiarity with MLOps tools or platforms (e.g., SageMaker, MLflow).
- Prior experience working in regulated or secure environments.