About
Job Title: Sr. Data Engineer
Location: Remote
Duration: 9-month contract; will extend long term
Position Summary

We are seeking a Senior Data Engineer to build and operate large-scale data pipelines and AI/ML data platforms that turn complex healthcare data into actionable insights. You will lead the design, development, and optimization of ETL/ELT workflows, collaborate closely with data scientists to productionize models, and enable both batch and real-time analytics, primarily on Google Cloud Platform (GCP). This role blends hands-on engineering (60-70%) with data curation and model-ready dataset creation (30-40%) to accelerate data science outcomes.
What You'll Do

- Design & Build Data Pipelines: Lead end-to-end development of scalable pipelines and data structures in GCP (e.g., BigQuery, Dataflow, Cloud Storage) to support analytics and ML use cases.
- ETL/ELT Development: Develop efficient, reliable ETL/ELT in Python, Spark, or Java; standardize and integrate data from legacy and modern systems.
- Data Profiling & Validation: Profile large datasets, perform exploratory analysis, validate data quality, and recommend remediation strategies.
- Model Enablement: Partner with data scientists to define features and build training/serving datasets; support batch and streaming model delivery.
- Modernization & Migration: Migrate legacy assets (e.g., Excel/SAS-based pipelines) to cloud-native services (e.g., Dataflow, BigQuery, Pub/Sub).
- API Integrations: Design and implement robust API integrations, including authentication, data mapping, monitoring, and error handling.
- Best Practices & CI/CD: Establish deployment "rails" and standards for code quality, testing, versioning, and CI/CD; reduce tech debt and avoid "laptop deployments."
- Operational Excellence: Instrument pipelines for performance, reliability, and cost efficiency; support production operations and continuous improvement.
Key Projects You'll Support
CCT Price Benchmarking Tool (Caremark/PBM underwriting & client consulting):
Modernize an Excel-based analytics tool and data science pipeline to GCP (Dataflow/BigQuery). Ingest from two sources (a legacy Excel-based system and EOS on GCP) into a unified, trusted data layer. Improve performance and data quality, and add new features after backlog completion.
Caremark Benefit Engine (benefit design recommendations):
Migrate legacy SAS code to Python/cloud-native patterns. Build feature engineering and curated datasets to support modeling and "what-if" analyses. Contribute to evolution toward a client-facing product targeted for broader rollout.
Required Qualifications
- 5+ years of experience in Data Engineering/ETL development with large, complex datasets.
- 5+ years of SQL (advanced queries, performance tuning, data modeling).
- 2+ years building robust data pipelines using Spark, Python, or Java.
- Demonstrated experience with cloud data platforms (GCP preferred; AWS/Azure acceptable).
- Proven data analysis, exploration, profiling, and validation skills.
- Strong collaboration and communication skills across data science, product, and engineering teams.
- Self-starter with a track record of delivering in progressively complex environments.
Preferred Qualifications
- GCP services such as BigQuery, Dataflow, Cloud Storage, Pub/Sub, Composer/Airflow.
- API integration design & development (authN/Z, data mapping, retries, error handling).
- Experience with healthcare or health insurance data (e.g., claims, PBM, underwriting).
- Understanding of data science methods and statistics; familiarity with the model lifecycle.
- CI/CD (e.g., GitHub Actions, Cloud Build), infrastructure-as-code, testing frameworks.
- Experience modernizing legacy Excel/SAS workflows to Python/cloud-native.
Education
Bachelor's degree in Computer Science, Engineering, Machine Learning, or a related discipline; or equivalent work experience. Master's degree or PhD preferred.
Team & Ways of Working
You'll partner with a team of 5 data scientists across two products and collaborate with data engineers supporting multiple initiatives. The role emphasizes ownership and best-practice leadership: you'll bring perspective on how to architect pipelines the right way and help the team scale with standards and shared deployment practices. Expect a highly collaborative environment with technical peers who understand the business context and value rapid, reliable delivery.
Why This Role
Direct impact on pricing, benefit design, and client consulting outcomes in a complex healthcare domain. Opportunity to lead modernization (tech debt reduction, cloud migration) and accelerate model delivery through high-quality data engineering. Build the foundation for a client-facing product while improving internal analytics today.
Nice-to-Haves / Trade-offs: Deep GCP experience is ideal. Candidates with strong drug claims/PBM domain knowledge on AWS or Azure who can quickly cross-train to GCP will be considered.
For more information and other jobs available, please contact our recruitment team at careers@tekfortune.com. To view all the jobs available in the USA and Asia, please visit our website at https://www.tekfortune.com/careers/.
Languages
- English