
AI Curation Data Scientist

Recruit Group
  • US
    United States

About

What You’ll Do
Develop and optimize software pipelines for extracting and integrating structured and unstructured healthcare data
Build and maintain AI/ML workflows for data classification, normalization, and analysis
Train, fine‑tune, and evaluate large language models and embedding‑based systems
Curate and validate high‑quality datasets used for LLM training and model improvement
Work with complex healthcare data formats including XML, JSON, FHIR, and C‑CDA
Implement de‑identification strategies and ensure compliance with PHI/PII handling policies
Design and execute data quality assessments, validation frameworks, and automated testing processes
Collaborate cross‑functionally with engineering and product teams to improve scalability and system performance
Contribute to code repositories, testing infrastructure, and deployment best practices
Explore emerging AI methodologies and rapidly prototype innovative solutions in a highly iterative environment
What You’ll Need
Required Qualifications
Master’s degree or equivalent experience in Computer Science, Software Engineering, Statistics, Biology, or a related field
5+ years of hands‑on experience in AI/ML engineering, data science, software development, or predictive analytics
Strong experience training and tuning transformer models and LLMs
Significant experience curating datasets for AI model training
Advanced Python development experience, including building extraction, classification, or NLP tools
Hands‑on experience with embedding models, sentence transformers, and modern LLM tooling
Strong experience parsing and processing complex data formats such as XML and JSON
Familiarity with healthcare interoperability standards such as FHIR and/or C‑CDA
Experience with TensorFlow, PyTorch, scikit‑learn, or similar ML frameworks
Proficiency with Git and software development best practices
Experience developing unit and integration tests for scientific or healthcare‑focused applications
Strong communication skills and ability to collaborate effectively within remote teams
A proactive, solutions‑oriented mindset with a passion for building high‑impact products
Preferred Qualifications
Deep understanding of regex and advanced text‑processing techniques
Experience with Unix command‑line tooling such as jq, xq, sed, and bash scripting
Strong AWS experience, particularly around data storage and AI training infrastructure tradeoffs
Experience working with HIPAA, PHI/PII handling, and healthcare de‑identification strategies
Experience extending or customizing open‑source AI tooling
Familiarity with AI‑assisted coding workflows and tools such as GitHub Copilot, Claude Code, or similar platforms
Experience working across multiple programming languages and distributed technical teams
Why This Role
Opportunity to build AI systems that directly improve healthcare outcomes
Work alongside experienced experts in AI, software systems, molecular biology, and clinical medicine
High‑impact role within a fast‑growing and mission‑driven environment
Exposure to cutting‑edge challenges in healthcare interoperability, AI model training, and clinical data engineering
Collaborative culture that values innovation, ownership, and technical excellence
Fully remote flexibility with meaningful opportunities for growth and technical leadership
Let’s Talk
If you’re excited by the opportunity to apply advanced AI and machine learning techniques to real‑world healthcare challenges, while working with a highly talented and mission‑driven team, we’d love to connect.

Language Skills

  • English
Note for Users

This job posting comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their website.