Machine Learning Engineer, ML/GenAI Evaluation
- San Diego, California, United States
About
Would you like to contribute to Machine Learning and Generative AI technologies? Are you passionate about measuring what matters and ensuring AI systems work reliably for everyone? Do you believe that rigorous evaluation — including holding models accountable to fairness standards — is what separates great ML from good ML? We truly believe it is! We are defining what exceptional looks like for machine learning across Wallet, Payments, and Commerce. As a Machine Learning Engineer specializing in Evaluation, you will establish the evaluation criteria, metrics frameworks, and quality standards that determine when models are ready to reach hundreds of millions of users. Your judgment shapes model quality and earns the confidence to ship. You'll work at the intersection of rigorous ML science and high-impact product decisions, collaborating closely with ML Engineering, Product, Privacy, and Legal teams. This unique opportunity puts you at the center of model quality — designing adversarial test strategies, surfacing failure modes before they reach users, and owning the sign-off process that ensures Apple's financial features meet the highest bar for accuracy, robustness, and reliability.
Description
The ideal candidate is a rigorous, curious ML practitioner who believes that how you measure a model is just as important as how you train it. You think critically about what metrics actually capture, know how models break in the real world, and hold quality standards others find uncomfortably high, including on dimensions like fairness. You will own the full evaluation lifecycle for ML models across Wallet features: designing test frameworks, adversarial corpora, and benchmarks that reflect the diversity of Apple's global user base, then making the final quality call before any model ships. Your findings directly shape model development priorities and product decisions at scale.
Responsibilities
Define evaluation criteria and quality metrics for ML models powering Wallet features
Design and maintain structured test sets covering the full diversity of real-world scenarios — varied document formats, distributions, languages, edge cases, and adversarial inputs.
Develop evaluation methodologies for robustness testing: distribution shift, out-of-distribution generalization, temporal drift, and aggressor scenarios
Own fairness evaluation end-to-end — define fairness metrics appropriate to each Wallet feature, build bias test suites across protected attributes and user populations, measure disparate performance across subgroups, and gate model launches on fairness criteria with the same rigor as other conventional metrics.
Build user persona–stratified benchmarks that reflect the breadth of Wallet's global user population across spending patterns, locales, and document types
Evaluate generative and agentic model outputs — assessing hallucination rates, faithfulness, and groundedness using LLM-as-a-judge frameworks, human evaluation protocols, and prompt regression testing
Own model quality sign-off — establish the launch criteria, run final evaluations, and make the call on model readiness before any feature ships
Synthesize evaluation results into clear, actionable insights that guide model development priorities and product decisions
Partner with ML engineers and Quality engineers to identify failure modes early in the development cycle and close the loop between evaluation findings and model improvements
Establish and evangelize evaluation best practices across the Wallet ML team, raising the quality bar for how models are tested, monitored, and maintained post-launch
Minimum Qualifications
M.S. in Machine Learning, Computer Science, Statistics, Applied Mathematics, or a related technical field strongly preferred.
Bachelor's degree with 7+ years hands‑on experience in ML evaluation, model quality, or applied research will be considered
5+ years of hands‑on ML experience, with deep expertise in model evaluation, offline metrics design, and behavioral testing
Strong track record designing evaluation frameworks for production ML systems — not just accuracy/F1, but precision‑recall tradeoffs, calibration, fairness, and task‑specific quality dimensions
Creative mindset with the ability to translate standard ML evaluation metrics (F1, AUC, etc.) into utility and user trust measures
Experience testing for distribution shift, out‑of‑distribution generalization, and temporal drift in real‑world deployed models
Proven ability to construct adversarial test suites, aggressor scenarios, and edge‑case corpora that surface model failure modes before they reach users
Experience with structured and semi‑structured document understanding, OCR pipelines, or financial data extraction is a strong plus
Strong programming skills in Python; fluency with evaluation tooling, data pipelines, and experiment tracking (e.g., MLflow, W&B, or equivalent)
Excellent communication skills — ability to translate metric results into product‑quality narratives for engineering and executive audiences
Experience owning model quality sign‑off in a cross‑functional launch process
Preferred Qualifications
PhD in Computer Science, Data Science, Statistics, AI/ML, or a related field.
Experience with Bayesian or causal graph‑based approaches to data generation.
Experience with causal approaches to fairness evaluation — counterfactual fairness, causal Shapley values, or structural causal model‑based bias auditing.
Experience evaluating models under privacy constraints or on‑device inference settings is a plus.
Familiarity with confidence calibration techniques and uncertainty quantification is a plus.
Background in financial services, fintech, or consumer payment products
At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $171,600 and $302,200, and your base pay will depend on your skills, qualifications, experience, and location.
Apple employees also have the opportunity to become an Apple shareholder through participation in Apple’s discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple’s Employee Stock Purchase Plan. You’ll also receive benefits including: Comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and for formal education related to advancing your career at Apple, reimbursement for certain educational expenses — including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation.
Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics.
At Apple, we believe accessibility is a fundamental human right. You’ll find that idea reflected in everything here — in our culture, our benefits and our digital tools. By welcoming as many perspectives as possible, we help you build a career where you feel like you belong.
Apple accepts applications to this posting on an ongoing basis.
Language skills
- English
This listing comes from a TieTalent partner platform. Click "Apply now" to submit your application directly on their site.