Machine Learning Architect - Conversational Speech
Apple
- United States
About
As the Machine Learning Architect for Conversational Speech, you will define modeling strategy and technical direction across the Speech organization, establishing a unified architectural vision for speech recognition, speech synthesis, dialog systems, multimodal foundation models, and speech-to-speech technologies. You will serve as the organization's foremost modeling expert, providing deep technical guidance to multiple teams working on interconnected speech capabilities. You will evaluate emerging research and industry trends, including advances in large language models, multimodal architectures, and full-duplex natural conversational systems, and translate them into actionable roadmaps. You will champion production-readiness, ensuring architectural decisions account for on-device constraints, latency, scalability, and robustness. You will collaborate broadly with partner teams across Siri, Apple Intelligence, hardware, and platform engineering to ensure speech modeling investments are well-integrated into Apple's broader AI strategy.
- 10+ years of experience in machine learning applied to speech or multimodal systems, with progressively increasing technical scope and leadership.
- Demonstrated expertise as a technical leader or architect who has defined modeling direction across multiple teams or product areas.
- Deep, hands-on proficiency in modern deep learning, including large language models and end-to-end speech systems.
- Significant experience with multimodal LLMs, including architecture design, training, adaptation, and deployment of models that integrate speech, audio, and text modalities.
- Direct experience building speech-to-speech conversational systems, with a strong understanding of full-duplex natural conversational interaction and end-to-end speech pipelines.
- A track record of translating research into production-quality systems at scale.
- Expert programming skills in Python and deep learning frameworks such as PyTorch, JAX, or TensorFlow.

- Ph.D. in Computer Science, Electrical Engineering, Machine Learning, or a similar technical field.
- Experience architecting or leading development of full-duplex natural conversational systems, speech-to-speech models, or multimodal foundation models that have shipped to large-scale user populations.
- Deep familiarity with the full stack of speech technologies (ASR, TTS, spoken dialog, speaker modeling, audio understanding) and an ability to reason about their interactions and dependencies.
- Experience with large-scale distributed training and the infrastructure considerations that shape model design at scale.
- A data-centric perspective on foundation model development, including experience guiding data collection, curation, annotation, and quality strategies.
- Experience with on-device ML deployment, including model compression, quantization, and latency-aware architecture design.
Language skills
- English
Notice to users
This listing comes from a TieTalent partner platform. Click "Apply now" to submit your application directly on their site.