Databricks Architect/Admin

  • Rangam Consultants
  • East Hartford, Connecticut
  • Full Time

Job Title: Databricks Architect/Admin
Department: Data & Analytics Platform

Location: Hartford, CT (Hybrid)

Job Type: Full-Time

Travel: Minimal

Position Summary

  • The Databricks Architect/Admin is a senior individual contributor responsible for the design, implementation, and continuous optimization of the enterprise Databricks platform.
  • This role serves as the technical authority for all aspects of the Databricks environment — including workspace governance, Unity Catalog, cluster and compute strategy, data pipeline architecture, and cost management.
  • The Architect works in close partnership with data engineering, analytics, and infrastructure teams, and operates within a broader multi-platform data ecosystem that includes Ab Initio and Fivetran.
  • A strong background in Unix/Linux systems administration and scripting is essential, as the role requires deep engagement with the underlying compute infrastructure supporting the platform.

Key Responsibilities

Platform Architecture & Design

  • Architect and govern the enterprise Databricks environment, including workspace topology, Unity Catalog structure, and access control frameworks.
  • Define and enforce standards for cluster configuration, runtime versions, instance pool utilization, and auto-scaling policies.
  • Design scalable, performant data pipeline patterns using Delta Live Tables, Databricks Workflows, and Structured Streaming.
  • Establish architectural standards for Delta Lake — including table formats, partitioning strategies, Z-ordering, and OPTIMIZE/VACUUM scheduling.
  • Lead platform integration design with upstream ingestion tools including Fivetran and Ab Initio, ensuring reliable, governed data delivery.

Unix/Linux Infrastructure & Operations

  • Administer and troubleshoot Unix/Linux environments underpinning Databricks compute nodes, init scripts, and cluster lifecycle management.
  • Develop and maintain shell scripts (Bash) and Python automation for platform operations, monitoring, log aggregation, and maintenance tasks.
  • Manage file system operations, permission structures, and data movement tasks in Linux-based storage and compute environments.
  • Support EC2/VM-level diagnostics and tuning in coordination with infrastructure and cloud engineering teams.

Cost Management & Optimization

  • Own DBU consumption tracking and reporting; proactively identify optimization opportunities across jobs, interactive clusters, and SQL warehouses.
  • Implement and maintain cost attribution models to support chargeback or showback reporting by team, product, or LOB.
  • Partner with the Senior Director on capacity planning, contract utilization forecasting, and multi-year commitment management.

Governance, Security & Compliance

  • Design and implement data governance frameworks within Unity Catalog, including lineage, tagging, and access auditing.
  • Collaborate with Cybersecurity to ensure platform configurations satisfy enterprise security controls, including secrets management, network isolation, and encryption.
  • Support audit and compliance activities by maintaining documentation of platform configurations, access policies, and data classification standards.

Automation & Artificial Intelligence

  • Design and implement end-to-end automation frameworks for platform operations, including cluster lifecycle management, job scheduling, alerting, and self-healing workflows.
  • Leverage Databricks AutoML, MLflow, and Model Serving capabilities to support the operationalization of machine learning models within the enterprise data platform.
  • Integrate AI-assisted development tooling (e.g., Databricks Assistant, GitHub Copilot) into engineering workflows to accelerate pipeline development and reduce manual effort.
  • Identify and drive automation opportunities across ingestion, transformation, data quality, and governance processes — reducing toil and improving platform reliability.
  • Collaborate with data science and advanced analytics teams to architect scalable feature engineering pipelines and model deployment patterns on Databricks.
  • Evaluate and recommend emerging AI/ML platform capabilities, including generative AI integrations and LLM-backed data workflows, in alignment with enterprise strategy.
  • Serve as the primary technical escalation point for Databricks platform issues across data engineering and analytics teams.
  • Contribute to sprint planning and project tracking within Jira; manage platform change requests and incidents through ServiceNow.
  • Produce and maintain architectural documentation, runbooks, and onboarding materials for platform consumers.
  • Evaluate and recommend new Databricks features, partner integrations, and tooling investments in support of the platform roadmap.

Required Qualifications

  • 7+ years of experience in data engineering or data platform roles, with a minimum of 4 years of hands-on Databricks implementation experience.
  • Demonstrated expertise with Databricks platform capabilities: Unity Catalog, Delta Lake, Databricks Workflows, Delta Live Tables, and SQL Warehouses.
  • Strong Unix/Linux proficiency — shell scripting, process management, file system operations, cron scheduling, and environment configuration.
  • Proficiency in Python and PySpark for distributed data processing, pipeline development, and platform automation.
  • Experience with cloud infrastructure (AWS, Azure, or Google Cloud Platform), including compute, storage, networking, and IAM/security constructs.
  • Demonstrated ability to design for scale, cost efficiency, and operational reliability in an enterprise data environment.
  • Demonstrated experience designing automation frameworks for data platform operations — including job orchestration, monitoring, alerting, and pipeline self-healing.
  • Familiarity with AI/ML concepts and tooling within the Databricks ecosystem, including MLflow, AutoML, and Model Serving; exposure to generative AI or LLM-integrated workflows is a plus.
  • Experience with Oracle database environments, including SQL development, schema design, and integration patterns for data extraction and pipeline sourcing.
  • Proficiency in Git-based version control — branching strategies, pull request workflows, repository management, and CI/CD pipeline integration for data platform code.
  • Experience working within ITSM and project delivery frameworks such as ServiceNow and Jira.
  • Strong written and verbal communication skills, with the ability to convey complex architectural concepts to both technical and non-technical audiences.

Preferred Qualifications

  • Hands-on experience with MLflow experiment tracking, model registry, and deployment patterns within Databricks.
  • Exposure to generative AI frameworks (LangChain, LlamaIndex) or experience building LLM-integrated data pipelines and retrieval-augmented generation (RAG) workflows.
  • Experience with workflow automation tools such as Apache Airflow, Databricks Workflows, or comparable orchestration platforms at enterprise scale.
  • Experience integrating Databricks with ETL/ELT platforms, including Fivetran or Ab Initio; hands-on Ab Initio development or administration experience is a strong plus.
  • Familiarity with enterprise data governance frameworks and catalog tools (e.g., Collibra, Alation, or Unity Catalog advanced features).
  • Experience supporting Databricks in regulated industries (financial services, insurance) with associated audit and compliance requirements.
  • Working knowledge of Infrastructure-as-Code tooling (Terraform, Ansible) for platform provisioning and configuration management.
  • Background in disaster recovery design and resiliency planning for cloud-hosted data platforms.

Core Technical Competencies

Platform & Data Engineering

  • Databricks (Unity Catalog, DLT, Workflows)
  • Delta Lake / Delta Live Tables
  • Apache Spark / PySpark
  • MLflow, AutoML & Model Serving
  • Generative AI / LLM-Integrated Workflows
  • Fivetran, Ab Initio Integration
  • Cloud Storage (S3, ADLS, GCS)
  • SQL / SparkSQL

Infrastructure & Systems

  • Unix/Linux Administration & Scripting
  • Bash / Shell Scripting
  • EC2 / VM Compute Management
  • Python Automation & Orchestration
  • Git Version Control & Repository Management
  • Oracle Database / SQL Development
  • Secrets Management & Security Hardening
  • ServiceNow

Job ID: 523263467
Originally Posted on: 6/2/2026
