Data Solutions Architect With Azure Databricks
TechStar GROUP
Contract
Required Skillset:
Python, Tableau, Azure Databricks, Power BI, PySpark, Spark SQL, Medallion architecture, Data Solutions Architect, and SQL
Job Description
Data Solutions Architect – Azure Databricks
Remote
Long-term contract
Solution Architecture & Design
- Own the end-to-end Azure Databricks platform architecture, from raw ingestion through Medallion Lakehouse layers (Bronze/Silver/Gold) to BI consumption, and produce detailed, fully documented architecture diagrams that address both functional and non-functional requirements (NFRs: scalability, availability, security, performance, maintainability).
- Design the metadata-driven pipeline framework governing all ingestion patterns: on-premises SQL Server (JDBC, incremental/CDC), REST/SOAP API integrations (auth, pagination, rate-limiting, retry), and flat file ingestion (CSV, JSON, XML via ADLS Gen2/SFTP landing zones).
- Architect the migration path from legacy Snowflake and Azure Data Factory (ADF) to native Databricks tooling, including Databricks Workflows, Databricks Autoloader, and Delta Live Tables, minimizing replatforming risk while maximizing platform capability adoption.
- Define metadata store design (pipeline configuration tables, ingestion control frameworks) that enables engineers to onboard new data sources without bespoke pipeline code.
- Establish Unity Catalog architecture: workspace federation, catalog/schema/table hierarchy, data classification, column-level security, row filters, and audit log strategy.
- Design Collibra integration patterns for bidirectional lineage, data stewardship workflows, and business glossary synchronization with Unity Catalog.
- Architect Okta/Azure Active Directory integration for platform authentication, service principal management, and SCIM provisioning into Databricks and Unity Catalog.
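To make the metadata-driven idea above concrete, here is a minimal sketch in plain Python, offered as an illustration rather than the team's actual framework. It assumes a configuration table whose rows describe each source; the names `SourceConfig`, `HANDLERS`, `register`, `plan_ingestion`, and the sample rows are all hypothetical, and the handlers only describe the planned load instead of calling JDBC or Spark.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SourceConfig:
    """One row of a hypothetical pipeline configuration table."""
    source_name: str
    source_type: str   # hypothetical values: "jdbc", "flat_file"
    target_table: str  # Bronze-layer destination (illustrative name)
    load_mode: str     # "full" or "incremental"


# Registry mapping each source_type to one generic, reusable handler.
HANDLERS: Dict[str, Callable[[SourceConfig], str]] = {}


def register(source_type: str):
    """Decorator that adds a handler to the registry under source_type."""
    def wrap(fn: Callable[[SourceConfig], str]) -> Callable[[SourceConfig], str]:
        HANDLERS[source_type] = fn
        return fn
    return wrap


@register("jdbc")
def ingest_jdbc(cfg: SourceConfig) -> str:
    # A real handler would read from SQL Server over JDBC; this sketch only
    # describes the planned action so it runs without a cluster.
    return f"jdbc:{cfg.load_mode}:{cfg.source_name}->{cfg.target_table}"


@register("flat_file")
def ingest_flat_file(cfg: SourceConfig) -> str:
    # A real handler would pick up files from a landing zone.
    return f"flat_file:{cfg.load_mode}:{cfg.source_name}->{cfg.target_table}"


def plan_ingestion(configs: List[SourceConfig]) -> List[str]:
    """Dispatch every config row to its handler; fail loudly on unknown types."""
    plans = []
    for cfg in configs:
        if cfg.source_type not in HANDLERS:
            raise ValueError(f"No handler registered for source_type={cfg.source_type!r}")
        plans.append(HANDLERS[cfg.source_type](cfg))
    return plans


# Onboarding a new source means adding a row, not writing a new pipeline.
CONFIG_ROWS = [
    SourceConfig("matters_db", "jdbc", "bronze.matters", "incremental"),
    SourceConfig("billing_csv", "flat_file", "bronze.billing", "full"),
]

if __name__ == "__main__":
    for line in plan_ingestion(CONFIG_ROWS):
        print(line)
```

The design choice illustrated here is that handlers are written once per ingestion pattern, while individual sources live only as configuration rows; in a real platform those rows would presumably sit in a governed table rather than a Python list.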
BI & Consumption Layer Architecture
- Design the connectivity architecture between the Databricks Lakehouse and BI tools, namely Power BI (via Databricks SQL connector, DirectQuery, and Fabric integration) and Tableau (via Databricks JDBC/ODBC), including semantic layer patterns, aggregation strategies, and performance optimization.
- Define Databricks SQL Warehouse sizing, clustering policies, and access control patterns to support concurrent BI workloads without degrading pipeline performance.
- Establish certified dataset and Gold layer table design standards that support reliable, governed BI consumption by legal business users.
Engineering Enablement & Technical Guidance
- Serve as the technical authority for the engineering team: translate architecture decisions into detailed implementation guides, runbooks, and reference implementations that 8+ Senior Data Engineers can execute confidently.
- Conduct architecture reviews and design walkthroughs at key delivery milestones; identify deviations from approved patterns early and guide corrective action.
- Partner with the Technical Lead on code review standards, branching strategy, and CI/CD pipeline design in Azure DevOps, ensuring architecture standards are enforced through automation where possible.
- Dedicate approximately 25% of working time to hands-on coding and peer code review: writing reference implementations, metadata framework components, and reusable pipeline modules in PySpark and Python; reviewing engineer pull requests for architectural conformance, code quality, and performance.
- Champion Databricks-native tooling adoption (Autoloader, DLT, Databricks Asset Bundles, Unity Catalog) over legacy ADF patterns; provide clear migration rationale and transition guidance to the team.
Security, Governance & Compliance
- Design secrets management and credential rotation patterns using Azure Key Vault integration within Databricks; ensure no plaintext credentials in code or notebooks.
- Define data governance standards including data lineage (Unity Catalog + Collibra), data quality SLAs, PII classification, and privilege-protection controls aligned with legal industry requirements.
- Establish network architecture standards: private endpoints, VNet injection, IP access lists, and data exfiltration prevention for the Databricks workspace.
Replatforming & Stakeholder Engagement
- Lead technical discovery on the existing Snowflake and ADF estate: catalog existing pipelines, data models, and transformation logic to inform migration sequencing and effort estimation.
- Define the phased replatforming roadmap in collaboration with the Senior Engineering Manager, balancing legacy decommission milestones, risk mitigation, and new capability delivery.
- Engage directly with legal business stakeholders, compliance teams, and BI consumers to validate architecture decisions against functional and non-functional requirements; translate business constraints into platform design guardrails.
- Produce and maintain living architecture documentation: solution design documents (SDDs), data flow diagrams, integration architecture diagrams, and ADRs (Architecture Decision Records).
Education and/or Experience
Required
- Bachelor's degree in Computer Science, Engineering, Data Science, or a related field.
- 10+ years of data engineering and/or data architecture experience, including hands-on delivery of large-scale cloud data platforms.
- 5+ years of deep, production-level Databricks experience, including Unity Catalog, Delta Lake, Databricks Workflows, Autoloader, Databricks SQL, and cluster/workspace administration.
- Proven experience architecting and delivering metadata-driven pipeline frameworks covering on-premises SQL Server (JDBC, CDC), REST/SOAP API, and flat file ingestion patterns at enterprise scale.
- Demonstrated experience leading a Snowflake-to-cloud-native migration or comparable legacy data warehouse replatforming initiative.
- Strong command of Medallion/Lakehouse architecture and Delta Lake internals: schema evolution, liquid clustering, optimize/ZORDER, time travel, and incremental processing patterns.
- Hands-on experience integrating Databricks with Power BI (DirectQuery, Fabric) and Tableau (JDBC/ODBC); ability to design performant semantic layer and Gold table patterns for BI consumption.
- Experience designing Unity Catalog governance architecture including catalog hierarchy, RBAC, column-level security, row filters, and audit logging.
- Proficiency in Azure Key Vault integration, service principal management, and Okta/Azure AD SSO and SCIM provisioning for Databricks.
- Advanced proficiency in PySpark, Spark SQL, Python, and SQL; working knowledge of Scala.
- Strong Azure ecosystem experience: ADLS Gen2, Azure Data Factory, Azure DevOps (CI/CD), Azure Monitor, and Azure networking (VNet injection, private endpoints).
- Proven ability to produce high-quality architecture deliverables: solution design documents, end-to-end data flow diagrams, integration architecture diagrams, and ADRs.
- Excellent communication skills with demonstrated ability to engage senior business stakeholders, compliance teams, and engineering staff simultaneously.
Preferred
- Databricks Certified Data Engineer Professional or Databricks Certified Associate Developer for Apache Spark.
- Microsoft Certified: Azure Data Engineer Associate or Azure Solutions Architect Expert.
- Master's degree in Computer Science, Engineering, or a related field.
- Familiarity with legal industry data domains: matter management, billing analytics, contract lifecycle management, or eDiscovery.
- Experience with data quality frameworks such as Great Expectations or Databricks DQX within Databricks pipelines.
- Hands-on experience with Infrastructure as Code (IaC) using Terraform or Bicep for Azure Databricks workspace and Unity Catalog provisioning.
- Familiarity with Microsoft Fabric and its integration with Databricks and Power BI in a hybrid lakehouse topology.
- Experience with VS Code for Databricks development workflows; familiarity with Claude Code for AI-assisted architecture documentation and engineering productivity.
- Familiarity with Agile/Scrum delivery using Azure DevOps Boards or Jira.