Job Title: Databricks Architect/Admin
Department: Data & Analytics Platform
Location: Hartford, CT (Hybrid)
Job Type: Full-Time
Travel: Minimal
Position Summary
- The Databricks Architect/Admin is a senior individual contributor responsible for the design, implementation, and continuous optimization of the enterprise Databricks platform.
- This role serves as the technical authority for all aspects of the Databricks environment — including workspace governance, Unity Catalog, cluster and compute strategy, data pipeline architecture, and cost management.
- The Architect works in close partnership with data engineering, analytics, and infrastructure teams, and operates within a broader multi-platform data ecosystem that includes Ab Initio and Fivetran.
- A strong background in Unix/Linux systems administration and scripting is essential, as the role requires deep engagement with the underlying compute infrastructure supporting the platform.
Key Responsibilities
Platform Architecture & Design
- Architect and govern the enterprise Databricks environment, including workspace topology, Unity Catalog structure, and access control frameworks.
- Define and enforce standards for cluster configuration, runtime versions, instance pool utilization, and auto-scaling policies.
- Design scalable, performant data pipeline patterns using Delta Live Tables, Databricks Workflows, and Structured Streaming.
- Establish architectural standards for Delta Lake — including table formats, partitioning strategies, Z-ordering, and OPTIMIZE/VACUUM scheduling.
- Lead platform integration design with upstream ingestion tools including Fivetran and Ab Initio, ensuring reliable, governed data delivery.
Unix/Linux Infrastructure & Operations
- Administer and troubleshoot Unix/Linux environments underpinning Databricks compute nodes, init scripts, and cluster lifecycle management.
- Develop and maintain shell scripts (Bash) and Python automation for platform operations, monitoring, log aggregation, and maintenance tasks.
- Manage file system operations, permission structures, and data movement tasks in Linux-based storage and compute environments.
- Support EC2/VM-level diagnostics and tuning in coordination with infrastructure and cloud engineering teams.
Cost Management & Optimization
- Own DBU consumption tracking and reporting; proactively identify optimization opportunities across jobs, interactive clusters, and SQL warehouses.
- Implement and maintain cost attribution models to support chargeback or showback reporting by team, product, or LOB.
- Partner with the Senior Director on capacity planning, contract utilization forecasting, and multi-year commitment management.
Governance, Security & Compliance
- Design and implement data governance frameworks within Unity Catalog, including lineage, tagging, and access auditing.
- Collaborate with Cybersecurity to ensure platform configurations satisfy enterprise security controls, including secrets management, network isolation, and encryption.
- Support audit and compliance activities by maintaining documentation of platform configurations, access policies, and data classification standards.
Automation & Artificial Intelligence
- Design and implement end-to-end automation frameworks for platform operations, including cluster lifecycle management, job scheduling, alerting, and self-healing workflows.
- Leverage Databricks AutoML, MLflow, and Model Serving capabilities to support the operationalization of machine learning models within the enterprise data platform.
- Integrate AI-assisted development tooling (e.g., Databricks Assistant, GitHub Copilot) into engineering workflows to accelerate pipeline development and reduce manual effort.
- Identify and drive automation opportunities across ingestion, transformation, data quality, and governance processes — reducing toil and improving platform reliability.
- Collaborate with data science and advanced analytics teams to architect scalable feature engineering pipelines and model deployment patterns on Databricks.
- Evaluate and recommend emerging AI/ML platform capabilities, including generative AI integrations and LLM-backed data workflows, in alignment with enterprise strategy.
- Serve as the primary technical escalation point for Databricks platform issues across data engineering and analytics teams.
- Contribute to sprint planning and project tracking within Jira; manage platform change requests and incidents through ServiceNow.
- Produce and maintain architectural documentation, runbooks, and onboarding materials for platform consumers.
- Evaluate and recommend new Databricks features, partner integrations, and tooling investments in support of the platform roadmap.
Required Qualifications
- 7+ years of experience in data engineering or data platform roles, including at least 4 years of hands-on Databricks implementation experience.
- Demonstrated expertise with Databricks platform capabilities: Unity Catalog, Delta Lake, Databricks Workflows, Delta Live Tables, and SQL Warehouses.
- Strong Unix/Linux proficiency — shell scripting, process management, file system operations, cron scheduling, and environment configuration.
- Proficiency in Python and PySpark for distributed data processing, pipeline development, and platform automation.
- Experience with cloud infrastructure (AWS, Azure, or Google Cloud Platform), including compute, storage, networking, and IAM/security constructs.
- Demonstrated ability to design for scale, cost efficiency, and operational reliability in an enterprise data environment.
- Demonstrated experience designing automation frameworks for data platform operations — including job orchestration, monitoring, alerting, and pipeline self-healing.
- Familiarity with AI/ML concepts and tooling within the Databricks ecosystem, including MLflow, AutoML, and Model Serving; exposure to generative AI or LLM-integrated workflows is a plus.
- Experience with Oracle database environments, including SQL development, schema design, and integration patterns for data extraction and pipeline sourcing.
- Proficiency in Git-based version control — branching strategies, pull request workflows, repository management, and CI/CD pipeline integration for data platform code.
- Experience working within ITSM and project delivery frameworks such as ServiceNow and Jira.
- Strong written and verbal communication skills, with the ability to convey complex architectural concepts to both technical and non-technical audiences.
Preferred Qualifications
- Hands-on experience with MLflow experiment tracking, model registry, and deployment patterns within Databricks.
- Exposure to generative AI frameworks (LangChain, LlamaIndex) or experience building LLM-integrated data pipelines and retrieval-augmented generation (RAG) workflows.
- Experience with workflow automation tools such as Apache Airflow, Databricks Workflows, or comparable orchestration platforms at enterprise scale.
- Experience integrating Databricks with ETL/ELT platforms such as Fivetran or Ab Initio; hands-on Ab Initio development or administration experience is a strong plus.
- Familiarity with enterprise data governance frameworks and catalog tools (e.g., Collibra, Alation, or Unity Catalog advanced features).
- Experience supporting Databricks in regulated industries (financial services, insurance) with associated audit and compliance requirements.
- Working knowledge of Infrastructure-as-Code tooling (Terraform, Ansible) for platform provisioning and configuration management.
- Background in disaster recovery design and resiliency planning for cloud-hosted data platforms.
Core Technical Competencies
Platform & Data Engineering
- Databricks (Unity Catalog, DLT, Workflows)
- Delta Lake / Delta Live Tables
- Apache Spark / PySpark
- MLflow, AutoML & Model Serving
- Generative AI / LLM-Integrated Workflows
- Fivetran, Ab Initio Integration
- Cloud Storage (S3, ADLS, GCS)
- SQL / SparkSQL
Infrastructure & Systems
- Unix/Linux Administration & Scripting
- Bash / Shell Scripting
- EC2 / VM Compute Management
- Python Automation & Orchestration
- Git Version Control & Repository Management
- Oracle Database / SQL Development
- Secrets Management & Security Hardening
- ServiceNow
Job ID: 523263467
Originally Posted on: 6/2/2026