Data Engineer II San Francisco
Knit
- San Francisco, California, United States
About
What you’ll do
As a Data Engineer II, you’ll be a foundational member of a small, high‑impact team building the data backbone of our clinical AI platform. Your work will directly enable the research, products and decisions that shape where the company goes next.
Design, build and maintain the data pipelines and infrastructure that power both our product and research applications – from ingestion through analytics‑ready delivery
Partner closely with our data science and ML teams to integrate, structure and scale the stack as our needs evolve
Help establish and uphold standards for data quality, testing, documentation and observability across the stack
Navigate the complex and often ambiguous landscape of healthcare data, bringing clarity, organization and thoughtful structure to messy problem spaces
Contribute to architectural decisions that will shape how we work with data at scale
We're looking for candidates who meet one of the following:
2-5 years of professional experience specifically in data engineering (building data pipelines, ETL/ELT workflows, data modeling and warehouse architectures)
An advanced degree (MS or PhD) in data science, computer science, computer engineering or an adjacent technical discipline, paired with demonstrable data engineering project work
A combination of internships, research and substantial project experience that clearly demonstrates equivalent data engineering capability
Regardless of path, you should be able to demonstrate proficiency in SQL and Python and hands‑on experience with at least one major cloud platform (Azure, AWS, etc.).
What we’re looking for
Engineering fundamentals: comfort with version control (Git), code review, testing and the habits of writing code others can read, maintain and trust
SQL: strong command of joins, window functions, CTEs and aggregate logic; a basic understanding of query performance and when to worry about it
Python: fluency writing clean, modular code for data manipulation, transformation and scripting; familiarity with common libraries such as pandas and at least one testing framework (pytest or similar)
ML data processing: an understanding of basic machine learning and AI concepts as well as an understanding of typical AI/ML data workflows
Spark / distributed processing: working familiarity with PySpark and an understanding of how distributed compute differs from single‑machine workflows
Cloud platforms: hands‑on experience with at least one major cloud provider; Azure and Databricks preferred, but strong experience with AWS or GCP translates
Data engineering concepts: a solid grounding in batch and streaming processing, data modeling, orchestration, data quality, governance and database fundamentals (both relational and columnar)
Communication: the ability to explain technical tradeoffs clearly, in writing and in conversation, to both engineers and non‑engineers
Healthcare: prior exposure to healthcare data or the healthcare domain more broadly
Nice‑to‑haves
Familiarity with healthcare interoperability standards such as FHIR and HL7
Awareness of healthcare privacy and compliance frameworks (HIPAA, BAAs and similar)
An eye for compute cost structures and the instincts to build with efficiency in mind
Your first year
In your first few months, you’ll get deep exposure to our existing data infrastructure, our healthcare data sources and the research and product workflows your pipelines support. By the end of your first year, we’d expect you to:
Own meaningful pieces of our data platform end‑to‑end, from design through production
Lead the integration of a new data source or domain, including its modeling, quality safeguards and downstream interfaces
Have raised the bar somewhere – whether in testing, documentation, cost, reliability or developer experience
Be a trusted collaborator to our data science and ML teams, shaping how they work with data rather than just responding to requests
Team structure
You’ll report to our Director of Data Engineering
You’ll work alongside the broader data science team on shared infrastructure, tooling and data problems
You’ll partner closely with our core model AI team – the engineers and researchers who consume your data for model training – in a tight feedback loop where data quality directly shapes model performance
You’ll have real visibility into how your work lands downstream and the impact it has on foundation model training
Salary Range
Knit Health offers a competitive compensation package that includes base salary, equity and opportunities for advancement. The starting salary range for the Data Engineer II is approximately $110,000 to $135,000 per year.
Generous benefits for full‑time employees include medical, dental and vision coverage with 100% of premiums paid for employees and dependents (full coverage for dental, vision and our Gold medical plan; employees may choose to buy up to Platinum); coverage begins on the first day of employment. Additional benefits include a 401(k) plan and 24 days of PTO annually.
Final Notes
Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required of the employee for this job. Duties, responsibilities and activities may change at any time with or without notice.
Knit Health is an equal opportunity employer and is committed to a diverse workplace. People from diverse racial, ethnic and cultural backgrounds, women, LGBTQ+ individuals and persons with disabilities are highly encouraged to apply.
Language skills
- English