
Senior Data Engineer

Pyx Health
  • Remote, Oregon, United States
Apply Now

About

Pyx Health is looking for a talented and motivated Senior Data Engineer to join our team at a high-growth startup. You will play a pivotal role in maintaining and evolving our data infrastructure on Azure, with a core stack of Databricks, Airflow (Astronomer), dbt, and Postgres. You'll own data pipelines end-to-end, from ingestion through a medallion architecture to analytics delivery in Tableau.
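
As a rough illustration of the medallion flow described above, a minimal PySpark sketch of one Bronze-to-Silver hop might look like the following. The table and column names are hypothetical, not Pyx Health's actual schema.

    # Bronze holds raw rows as delivered; Silver is deduplicated and typed.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    bronze = spark.read.table("bronze.member_events_raw")

    silver = (
        bronze
        .dropDuplicates(["event_id"])                         # drop replayed events
        .withColumn("event_ts", F.to_timestamp("event_ts"))   # enforce types
        .filter(F.col("member_id").isNotNull())               # basic integrity
    )

    silver.write.format("delta").mode("overwrite").saveAsTable("silver.member_events")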

ONLY CANDIDATES RESIDING IN THE USA MAY APPLY.

Required Qualifications:

  • Minimum 5 years of experience as a Data Engineer.
  • Deep expertise in Databricks, including Delta Lake optimization (ZORDER, vacuuming, partitioning); see the maintenance sketch after this list.
  • Strong Python skills for data engineering workflows.
  • Proficiency in Postgres (our primary transactional database).
  • Hands-on experience with Airflow, including building and maintaining production-grade DAGs.
  • Hands-on experience with dbt for transformation and data modeling.
  • Solid understanding of medallion architecture principles.
  • Experience with Unity Catalog or comparable data governance tooling.
  • Proficiency with GitHub and CI/CD pipelines for data projects.
  • Ability to start, run, manage, and complete a technical project with minimal oversight.
  • Strong root cause analysis skills.
  • Effective communication with cross-functional teams and stakeholders.
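
The Delta Lake optimization work mentioned above typically reduces to a small set of recurring maintenance commands. A minimal sketch, assuming a hypothetical silver.member_events table:

    # Routine Delta Lake maintenance on Databricks.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Compact small files and co-locate rows by a common filter column,
    # so selective reads scan fewer files.
    spark.sql("OPTIMIZE silver.member_events ZORDER BY (member_id)")

    # Remove data files no longer referenced by the transaction log.
    # 168 hours is Delta's default retention, keeping a week of time travel.
    spark.sql("VACUUM silver.member_events RETAIN 168 HOURS")

Partitioning, by contrast, is fixed at table-creation time; columns that appear in most WHERE clauses are the usual candidates for either technique.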

Preferred Qualifications:

  • Experience with Databricks native extractors and connectors.
  • Familiarity with Great Expectations or similar data quality frameworks; see the sketch after this list.
  • Experience optimizing Databricks costs at scale.
  • Background in semantic layer design and BI performance tuning (Tableau preferred).
  • Prior experience mentoring or leading data engineers.
  • Familiarity with healthcare data regulations (HIPAA).
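
Data quality checks of the kind Great Expectations formalizes can be sketched framework-free; the version-agnostic PySpark below shows the idea, again with hypothetical table and column names:

    # Assert two common invariants on a Silver table before promoting to Gold.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.read.table("silver.member_events")

    null_ids = df.filter(F.col("member_id").isNull()).count()
    dupes = df.count() - df.dropDuplicates(["event_id"]).count()

    assert null_ids == 0, f"{null_ids} rows missing member_id"
    assert dupes == 0, f"{dupes} duplicate event_id rows"

In Great Expectations proper, these correspond to the expect_column_values_to_not_be_null and expect_column_values_to_be_unique expectations.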

Responsibilities:

  • Design, build, and maintain batch data pipelines using Airflow (Astronomer) and dbt, ingesting data from Postgres, Salesforce, other business-critical cloud-based SaaS applications, flat files, and other internal transactional tools (see the DAG sketch after this list).
  • Develop and optimize data models within a medallion architecture (Bronze/Silver/Gold) on Delta Lake.
  • Write production-grade Python for custom extractors, transformations, and pipeline logic.
  • Implement and enforce data governance using Unity Catalog across multi-tenant schemas.
  • Strengthen CI/CD practices for data—automated testing, environment promotion, and deployment pipelines via GitHub.
  • Monitor pipeline health and data quality using Datadog; proactively resolve issues.
  • Optimize Databricks compute costs through cluster policies, spot instances, and query tuning.
  • Collaborate with analysts to improve semantic layer design and Tableau performance at scale.
  • Document pipelines and processes for clarity and maintainability.
  • Participate in code reviews and provide technical mentorship to junior team members.
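
The Airflow-plus-dbt pipeline pattern in the first responsibility above might be wired up roughly as follows. The DAG id, schedule, extractor script, and dbt selectors are hypothetical; the operator imports are standard Airflow 2.x.

    # A minimal extract -> transform DAG: pull from a SaaS source, then run dbt.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    default_args = {
        "retries": 2,
        "retry_delay": timedelta(minutes=5),
    }

    with DAG(
        dag_id="salesforce_to_gold",
        schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
        start_date=datetime(2024, 1, 1),
        catchup=False,
        default_args=default_args,
    ) as dag:
        extract = BashOperator(
            task_id="extract_salesforce",
            bash_command="python extract_salesforce.py",  # hypothetical extractor
        )
        transform = BashOperator(
            task_id="dbt_run",
            bash_command="dbt run --select silver gold",  # hypothetical selectors
        )

        extract >> transform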

Language Skills

  • English

Notice to Users

This listing comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their site.