AIML - Sr Machine Learning Engineer, Evaluation
- Cupertino, California, United States
About
We are seeking a highly skilled and experienced machine learning engineer to join AIML Evaluation to build the systems that evaluate and refine Apple's foundation models and agents. As a key member of the team, you will help design and develop benchmarks, evaluators, simulation environments, and prompt and context optimization pipelines that drive quality improvements across Apple's AI experiences. You will collaborate with product teams and the foundation model team to close the loop between observation and improvement, contributing datasets, environments, and reward signals that drive model and agent quality.
Description
Our team builds the benchmarks, environments, and tooling that power model and agent refinement, and turns observations into actionable opportunities for the next model and agent iteration. We work across the full spectrum of evaluation: offline benchmarks, device-in-the-loop simulation, and on-device observation in production. We develop LLM-as-judge evaluators, train reward models calibrated against human feedback, optimize prompts and context for agents, and contribute targeted datasets and reward signals to foundation model post-training.
In this role, you will design and develop evaluation and refinement infrastructure that supports a broad range of AI products at Apple. You will work on agent and model evaluation across offline, device-in-the-loop, and on-device settings; build automated prompt and context optimization pipelines; and partner with product and research teams to translate failure analysis into measurable model and agent improvements. You will also have the opportunity to engage with product teams across Apple and contribute to advancements in large language models and agentic systems that will reach millions of users.
To succeed in this role, you should have a strong background in machine learning systems and distributed infrastructure, along with a proven track record of building and maintaining ML evaluation or training infrastructure. You should be a proactive problem solver with excellent communication skills and the ability to work effectively across multiple codebases, teams, and organizations. Experience with LLM evaluation, reward modeling, prompt optimization, or agentic systems is highly desirable.
Responsibilities
- Design and build evaluation infrastructure for agents and foundation models.
- Develop LLM judges, reward models, and prompt optimization pipelines.
- Build and integrate simulation environments for agent evaluation and trajectory-based data generation.
- Collaborate with product teams to identify, prioritize, and address quality gaps.
- Contribute datasets, environments, and reward signals to the foundation model post-training loop.
Minimum Qualifications
- Strong background in machine learning and distributed systems.
- Experience building and maintaining ML infrastructure for evaluation, training, or deployment.
- Ability to work effectively across multiple codebases, teams, and organizations.
- 8+ years of professional experience as a software engineer, preferably in machine learning or a related field.
- Bachelor's or Master's degree in Computer Science or a related field.
Preferred Qualifications
- Experience with LLM evaluation, LLM-as-judge, or reward modeling.
- Experience with prompt optimization, agent harness development, or post-training (SFT, DPO, RLHF).
- Proficiency in Python and ML frameworks such as PyTorch.
- Experience with agentic systems, simulation environments, or trajectory-based data generation.
- Familiarity with on-device or privacy-preserving ML.
- Proactive and determined problem-solving skills.
At Apple, base pay is one part of our total compensation package and is determined within a range. The base pay range for this role is between $212,000 and $386,300, and your base pay will depend on your skills, qualifications, experience, and location.
Apple employees also have the opportunity to become shareholders through participation in Apple’s discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards and can purchase Apple stock at a discount if voluntarily participating in Apple’s Employee Stock Purchase Plan. You’ll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation.
Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics.
We believe accessibility is a fundamental human right. By welcoming as many perspectives as possible, we help you build a career where you feel like you belong.
Apple accepts applications to this posting on an ongoing basis.
Language skills
- English