About
Remote Work: India. Only consultants local to India are eligible.
No visa sponsorship.
Primary Responsibilities:
Design, develop, and maintain scalable data pipelines using Python, PySpark, and other modern programming languages to support both batch and streaming workloads
Build and optimize data processing frameworks on cloud platforms such as Databricks or Snowflake, ensuring performance, reliability, and cost efficiency
Design and implement robust data models, including transactional (OLTP) and dimensional (OLAP) schemas, to support analytics, reporting, and application integration
Develop high-quality SQL code including complex queries, stored procedures, and views, with a focus on performance tuning and efficient data access patterns
Create and manage workflow orchestration using Apache Airflow or similar tools, ensuring reliable scheduling, dependency management, and monitoring
Implement and enforce data governance and metadata standards through tools such as Microsoft Purview, including data lineage, classification, cataloging, and security policies
Build automated data quality and validation frameworks to ensure accuracy, completeness, and reliability of production datasets
Collaborate with cross-functional teams including data architects, analysts, scientists, and business stakeholders to understand requirements and deliver scalable, well-designed data solutions
Lead technical design sessions and code reviews, promoting engineering best practices, reusability, and maintainability
Support cloud infrastructure and DevOps practices, including CI/CD pipelines, version control, test automation, and environment management
Monitor and troubleshoot production data pipelines, proactively addressing issues, performance bottlenecks, and system failures
Contribute to the evolution of the enterprise data platform, recommending tools, frameworks, and architectures to improve scalability and efficiency
Required Qualifications:
5+ years of experience in data engineering, software engineering, or similar disciplines
Hands-on experience with Databricks or Snowflake
Experience with orchestration tools such as Apache Airflow
Experience working with cloud ecosystems (Azure preferred; AWS/GCP acceptable)
Advanced SQL skills and experience with OLTP and OLAP data modeling
Solid understanding of modern data warehousing, data lake, and ELT/ETL design patterns
Familiarity with data governance tools, especially Microsoft Purview
Solid programming expertise in Python, PySpark, or similar languages
Preferred Qualifications:
Healthcare industry experience, including claims, clinical, FHIR, HL7, or provider data
Experience with containerization (Docker, Kubernetes) for data workloads
Experience supporting machine learning workflows or analytical data science pipelines
Knowledge of distributed computing concepts and performance tuning
Language Skills
- English
Notice to Users
This listing comes from a TieTalent partner platform. Click "Apply now" to submit your application directly on their site.