About
Job Title: Sr. Data Engineer
Location: Remote
Duration: 9-month contract; will extend long term
Position Summary

We are seeking a Senior Data Engineer to build and operate large-scale data pipelines and AI/ML data platforms that turn complex healthcare data into actionable insights. You will lead the design, development, and optimization of ETL/ELT workflows, collaborate closely with data scientists to productionize models, and enable both batch and real-time analytics, primarily on Google Cloud Platform (GCP). This role blends hands-on engineering (60-70%) with data curation and model-ready dataset creation (30-40%) to accelerate data science outcomes.
What You'll Do

- Design & Build Data Pipelines: Lead end-to-end development of scalable pipelines and data structures in GCP (e.g., BigQuery, Dataflow, Cloud Storage) to support analytics and ML use cases.
- ETL/ELT Development: Develop efficient, reliable ETL/ELT in Python, Spark, or Java; standardize and integrate data from legacy and modern systems.
- Data Profiling & Validation: Profile large datasets, perform exploratory analysis, validate data quality, and recommend remediation strategies.
- Model Enablement: Partner with data scientists to define features and build training/serving datasets; support batch and streaming model delivery.
- Modernization & Migration: Migrate legacy assets (e.g., Excel/SAS-based pipelines) to cloud-native services (e.g., Dataflow, BigQuery, Pub/Sub).
- API Integrations: Design and implement robust API integrations, including authentication, data mapping, monitoring, and error handling.
- Best Practices & CI/CD: Establish deployment "rails" and standards for code quality, testing, versioning, and CI/CD; reduce tech debt and avoid "laptop deployments."
- Operational Excellence: Instrument pipelines for performance, reliability, and cost efficiency; support production operations and continuous improvement.
Key Projects You'll Support
CCT Price Benchmarking Tool (Caremark/PBM underwriting & client consulting):
Modernize an Excel-based analytics tool and data science pipeline to GCP (Dataflow/BigQuery). Ingest from two sources (a legacy Excel-based system and EOS on GCP) into a unified, trusted data layer. Improve performance and data quality, and add new features after backlog completion.
Caremark Benefit Engine (benefit design recommendations):
Migrate legacy SAS code to Python/cloud-native patterns. Build feature engineering and curated datasets to support modeling and "what-if" analyses. Contribute to evolution toward a client-facing product targeted for broader rollout.
Required Qualifications
- 5+ years of experience in Data Engineering/ETL development with large, complex datasets.
- 5+ years of SQL (advanced queries, performance tuning, data modeling).
- 2+ years building robust data pipelines using Spark, Python, or Java.
- Demonstrated experience with cloud data platforms (GCP preferred; AWS/Azure acceptable).
- Proven data analysis, exploration, profiling, and validation skills.
- Strong collaboration and communication skills across data science, product, and engineering teams.
- Self-starter with a track record of delivering in progressively complex environments.
Preferred Qualifications
- GCP services such as BigQuery, Dataflow, Cloud Storage, Pub/Sub, Composer/Airflow.
- API integration design & development (authN/Z, data mapping, retries, error handling).
- Experience with healthcare or health insurance data (e.g., claims, PBM, underwriting).
- Understanding of data science methods and statistics; familiarity with the model lifecycle.
- CI/CD (e.g., GitHub Actions, Cloud Build), infrastructure-as-code, testing frameworks.
- Experience modernizing legacy Excel/SAS workflows to Python/cloud-native.
Education
Bachelor's degree in Computer Science, Engineering, Machine Learning, or a related discipline; or equivalent work experience. Master's degree or PhD preferred.
Team & Ways of Working
You'll partner with a team of 5 data scientists across two products and collaborate with data engineers supporting multiple initiatives. The role emphasizes ownership and best-practice leadership: you'll bring perspective on how to architect pipelines the right way and help the team scale with standards and shared deployment practices. Expect a highly collaborative environment with technical peers who understand the business context and value rapid, reliable delivery.
Why This Role
Direct impact on pricing, benefit design, and client consulting outcomes in a complex healthcare domain. Opportunity to lead modernization (tech debt reduction, cloud migration) and accelerate model delivery through high-quality data engineering. Build the foundation for a client-facing product while improving internal analytics today.
Nice-to-Haves / Trade-offs: Deep GCP experience is ideal. Candidates with strong drug claims/PBM domain knowledge on AWS or Azure who can quickly cross-train to GCP will be considered.
For more information and other jobs available, please contact our recruitment team at careers@tekfortune.com. To view all the jobs available in the USA and Asia, please visit our website at https://www.tekfortune.com/careers/.
Languages
- English