This job posting is no longer available

Sr. Databricks Solutions Architect

RIT Solutions, Inc.
  • US
    United States

About

Top Skills Needed:
  • Deep hands-on expertise with Databricks platform architecture and governance: Unity Catalog, workspaces, external locations, compute, access controls, cluster governance.
  • Reliability engineering, monitoring, and operational hardening of the Lakehouse: observability, alerting, DR readiness, backup/restore, performance tuning, incident response.
  • Strong experience with ADF, CI/CD, and Terraform for orchestrating and managing the Lakehouse: pipeline orchestration, IaC, DevOps, environment promotion, compute policies.
Typical Day-to-Day:
  • Design how the Databricks Lakehouse should work, including the structure, tools, standards, and best practices.
  • Guide engineering teams on how to build pipelines and use Databricks correctly.
  • Solve technical issues when data jobs fail or performance slows.
  • Work with stakeholders to understand data needs and deliver solutions.
  • Set standards for security, governance, naming conventions, and architecture.
  • Ensure the Databricks platform is stable, reliable, and always available.
  • Build and maintain monitoring, alerting, logging, and health dashboards.
  • Strengthen and fix ingestion pipelines (ADF → landing → raw → curated).
  • Improve data quality checks, anomaly detection, and pipeline reliability.
  • Manage CI/CD pipelines and deployment processes using Azure DevOps or GitHub.
  • Use Terraform (IaC) to deploy and manage Databricks and Azure infrastructure.
  • Partner with Security and FinOps on access controls, compliance, and cost governance.
  • Mentor the Data Engineer and support distributed data engineering teams across the organization.
Key Responsibilities
1. Lakehouse Architecture & Platform Administration (approximately 60% of the role when combined with mentoring & code review)
  • Serve as the primary architect and administrator for the Azure Databricks Lakehouse (Unity Catalog, workspaces, external locations, compute, access controls).
  • Lead execution of a Minimal Viable Hardening Roadmap for the platform, prioritizing:
      • High availability and DR readiness
      • Backup/restore patterns for data and metadata
      • Platform observability and operational metrics
      • Secure and maintainable catalog/namespace structure
      • Robust and proactive data quality assurance
  • Implement and evolve naming conventions, environment strategies, and platform standards that enable long-term maintainability and safe scaling.
  • Act as the Lakehouse-facing counterpart to Enterprise Architecture and Security, collaborating on network architecture, identity & access, compliance controls, and platform guardrails.
2. Reliability, Monitoring, and Incident Management
  • Design, implement, and maintain comprehensive monitoring and alerting for Lakehouse platform components, ingestion jobs, key data assets, and system health indicators.
  • Oversee end-to-end reliability engineering, including capacity planning, throughput tuning, job performance optimization, and preventative maintenance (e.g., IR updates, compute policy reviews).
  • Participate in, and help shape, the on-call rotation for high-priority incidents affecting production workloads, including rapid diagnosis and mitigation during off-hours as needed.
  • Develop and maintain incident response runbooks, escalation pathways, stakeholder communication protocols, and operational readiness checklists.
  • Lead or participate in post-incident Root Cause Analyses, ensuring durable remediation and preventing recurrence.
  • Conduct periodic DR and failover simulations, validating RPO/RTO and documenting improvements.
This role is foundational to ensuring 24/7/365 availability and timely delivery of mission-critical data for clinical, financial, operational, and analytical needs.
3. Pipeline Reliability, Ingestion Patterns & Data Quality
  • Strengthen and standardize ingestion pipelines (ADF → landing → raw → curated), including watermarking, incremental logic, backfills, and retry/cancel/resume patterns.
  • Collaborate with the Data Engineer to modernize logging, automated anomaly detection, pipeline health dashboards, and DQ validation automation.
  • Provide architectural guidance, code reviews, mentoring, and best-practice patterns to distributed engineering teams across MedStar.
  • Support stabilization of existing ingestion and transformation pipelines across clinical (notes, OHDSI), financial, operational, and quality use cases.
4. DevOps, CI/CD, and Infrastructure as Code
  • Administer and improve CI/CD pipelines using Azure DevOps or GitHub Enterprise.
  • Support automated testing, environment promotion, and rollback patterns for Databricks and dbt assets.
  • Maintain and extend Terraform (or adopt Terraform from another IaC background) for Databricks, storage, networking, compute policies, and related infrastructure.
  • Promote version control standards, branching strategies, and deployment governance across data engineering teams.
5. Security, FinOps, and Guardrails Partnership
  • Partner with Enterprise Architecture and Security on platform access controls, identity strategy, encryption, networking, and compliance.
  • Implement and enforce cost tagging, compute policies, and alerts supporting FinOps transparency and cost governance.
  • Collaborate with the team defining agentic coding guardrails, ensuring the Lakehouse platform supports safe & compliant use of AI-assisted code generation and execution.
  • Help assess and optimize serverless SQL, serverless Python, and job compute patterns for cost-efficiency and reliability.
6. Mentorship, Collaboration, & Distributed Enablement
  • Mentor the mid-level Data Engineer on Databricks, ADF, dbt, observability, DevOps, Terraform, and operational engineering patterns.
  • Provide guidance, design patterns, and code review support to multiple distributed data engineering teams (Finance, MCPI, Safety/Risk, Quality, Digital Transformation, etc.).
  • Lead platform knowledge-sharing efforts through documentation, workshops, and best-practice guidance.
  • Demonstrate strong collaboration skills, balancing independence with alignment across teams.
7. Optional / Nice-to-Have: OHDSI Platform Support (Not required for hiring; can be learned on the job.)
  • Assist with or support operational administration of the OHDSI/OMOP stack (Atlas, WebAPI, vocabularies, Kubernetes deployments).
  • Collaborate with partners to ensure the OHDSI platform is secure, maintainable, and well-integrated with the Lakehouse.
Required Qualifications
  • 5+ years in cloud data engineering, platform engineering, or solution architecture.
  • Strong hands-on expertise in Azure Databricks:
      • Unity Catalog
      • Workspaces & external locations
      • SQL/Python notebooks & Jobs
      • Cluster/warehouse governance
  • Solid working experience with Azure Data Factory (pipelines, IRs, linked services).
  • Strong SQL and Python engineering skills.
  • Experience with CI/CD in Azure DevOps or GitHub Enterprise.
  • Experience with Terraform or another IaC framework, and willingness to adopt Terraform.
  • Demonstrated ability to design or support monitoring, alerting, logging, or reliability systems.
  • Strong communication, collaboration, and problem-solving skills.
Preferred Qualifications (Optional)
  • Advanced Terraform experience.
  • Familiarity with healthcare, HIPAA, PHI, or regulated environments.
  • Experience with Purview or enterprise cataloging.
  • Exposure to OHDSI/OMOP.
  • Experience optimizing or refactoring legacy ingestion pipelines.
  • Experience supporting secure, controlled AI/agentic execution environments.
  • Experience with EPIC EHR data exchange and/or EPIC Caboodle or Cogito analytics suite.
Personal Attributes
  • Hands-on, pragmatic, and operationally minded.
  • Comfortable leading both architecture and implementation.
  • Collaborative and mentorship-oriented; thrives in small core teams with broad influence.
  • Values platform stability, observability, and hardening over shiny features.
  • Curious and adaptable, especially with emerging AI-assisted engineering patterns.
  • Ability to remain calm and effective during incidents and high-pressure situations.

Language skills

  • English