This job posting is no longer available
Data Scientist/Machine Learning Engineer
Sumble Inc.
- United States
About
Our long-term vision is to become the primary destination for accessing high-quality web data. Try the product at sumble.com.
Our Team: We are a team of 15, including 10 engineers with experience at companies such as Google, Meta, Stack Overflow, and Kaggle.
What you’ll do
Finetuning small language models
Improving the quality of existing data using scalable approaches. Examples include making sure that URLs are associated with the right company, that we have the correct HQ address, and that parent-subsidiary relationships are mapped correctly, using techniques like LLM validation, SERP, and triangulating across sources.
Adding new signals: this usually involves scrubbing and normalizing new signals and matching them to our existing ontology
Pushing solutions into production environments, which may involve touching data pipelines and/or backend systems
Located within Americas time zones
More about Sumble
Our Tech Stack:
ML/Data: PyTorch, Hugging Face, Gemma models, LoRA, vLLM, SkyPilot, Marimo
Languages & Frameworks: Python, FastAPI, React, TypeScript
Cloud Platform: Google Cloud Platform (GCP)
Databases: PostgreSQL, DuckDB
Infrastructure: Cloud Run
Product/Design: Figma, Vercel v0
Challenges We Tackle:
Transforming noisy datasets into high-quality data products
Running expensive analytics computations efficiently
Managing the complexity of a growing number of data sources, machine learning models, and large data operations
Creating a great product-led growth (PLG) experience with upsell pathways
Benefits:
Medical, dental, and vision (US)
401k (US)
Target 4 weeks PTO
Language skills
- English