About The Company
Cohere's mission is to scale intelligence to serve humanity. We're training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, retrieval-augmented generation (RAG), and autonomous agents. We believe that our work is instrumental to the widespread adoption of AI, driving innovation and transforming industries worldwide.
We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what's best for our customers. Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products. Join us on our mission and shape the future of artificial intelligence.
About The Role
Are you energized by building high-performance, scalable, and reliable machine learning systems? Do you want to help define and build the next generation of AI platforms powering advanced NLP applications? We are seeking Members of Technical Staff to join our Model Serving team at Cohere. In this role, you will be responsible for developing, deploying, and maintaining our AI platform that delivers Cohere's large language models through user-friendly API endpoints.
You will work closely with cross-functional teams to deploy optimized NLP models into production environments that demand low latency, high throughput, and high availability. Additionally, you will have opportunities to interface directly with customers, creating customized deployments tailored to their specific requirements. This role offers a unique chance to influence the infrastructure that supports cutting-edge AI applications and to solve complex technical challenges in a dynamic environment.
Qualifications
- 5+ years of engineering experience managing production infrastructure at a large scale
- Proficiency in designing large, highly available distributed systems using Kubernetes
- Experience with GPU workloads and clusters in cloud environments
- Hands-on experience with Kubernetes development, deployment, and support in production
- Familiarity with cloud platforms such as GCP, Azure, AWS, OCI, and multi-cloud/hybrid environments
- Strong background in Linux-based computing environments including troubleshooting and support
- Knowledge of compute, storage, network resource management, and cost optimization
- Excellent collaboration and troubleshooting skills for mission-critical systems
- Ability to adapt and solve complex technical challenges in a fast-paced setting
- Understanding of accelerators (GPUs, TPUs, custom accelerators) and their impact on latency and throughput
- Working experience with distributed systems architecture
- Proficiency in programming languages such as Golang, C++, or other high-performance server languages
Responsibilities
- Design, develop, and maintain scalable and reliable machine learning infrastructure
- Deploy and support large NLP models in production environments with low latency and high throughput
- Collaborate with cross-functional teams including research, engineering, and customer success to optimize deployment strategies
- Implement monitoring, logging, and alerting systems to ensure high availability and performance
- Troubleshoot and resolve infrastructure issues efficiently, ensuring minimal downtime
- Optimize resource utilization and manage compute/storage/network costs effectively
- Contribute to the development of best practices for scalable deployment and system architecture
- Interface with customers to understand their needs and create customized deployment solutions
- Stay updated on emerging technologies in cloud computing, accelerators, and distributed systems to continually improve infrastructure
Benefits
- Open and inclusive work environment fostering diversity and innovation
- Opportunity to work alongside a team at the forefront of AI research and development
- Weekly lunch stipend, in-office lunches, and snacks to promote team bonding
- Comprehensive health and dental benefits, including mental health support
- 100% parental leave top-up for up to six months
- Personal enrichment benefits supporting arts, culture, fitness, and well-being
- Remote-flexible work arrangements with offices in Toronto, New York, San Francisco, London, and Paris, plus co-working stipends
- Generous vacation policy offering six weeks (30 working days) of paid time off
Equal Opportunity
Cohere values and celebrates diversity and strives to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. If you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.
Languages
- English