This job posting is no longer available
About
We provide the time and resources needed to deepen your technical and strategic mastery while working collaboratively.
Commitment to Growth: Our culture emphasizes professional development and continuous knowledge-sharing.
Work-Life Balance: We believe in a flexible, supportive environment with unlimited PTO and remote work options.

About the Role
The AI DevOps and Cloud Solutions Engineer I (Senior Staff) is responsible for designing, building, and maintaining secure and scalable cloud infrastructures that support the training, deployment, and monitoring of AI and machine learning systems. As a subject-matter expert in infrastructure automation, distributed compute orchestration, and cloud operations, you will ensure the reliability of AI workloads across all environments. This senior role involves collaboration with various teams to define infrastructure needs, enhance observability, and ensure optimal performance for predictive and generative AI workloads.
- Architect and maintain cloud infrastructure for AI model training and inference services.
- Implement infrastructure-as-code (IaC) for automating the management of cloud resources.
- Create CI/CD pipelines for model training, testing, and deployment of AI applications.
- Optimize Kubernetes clusters and GPU utilization to balance performance and cost.
- Integrate AI models and data pipelines into cloud-native platforms.
- Develop observability solutions using modern telemetry tools.
- Troubleshoot issues across networking, containers, compute, storage, and model-serving layers.
- Conduct performance benchmarking and reliability validation for AI systems.
- Document infrastructure architectures and operational runbooks.
- Support automation for dataset ingestion, model versioning, and testing.
- Ensure compliance with cloud security and responsible AI practices.
- Work with security teams on secure networking and IAM policies.
- Provide mentorship and guidance to junior engineers on best practices.
- Evaluate new cloud services and tools for potential adoption.

Qualifications
- 4+ years of experience in DevOps, cloud engineering, or platform engineering.
- Strong proficiency in Kubernetes and Docker.
- Experience with CI/CD systems and deployment automation.
- Ability to debug distributed systems and networking issues.
- Proficiency in Python, Bash, or similar scripting languages.
- Excellent communication skills and collaboration ability.
- Willingness to travel occasionally for planning and collaboration.

Preferred Qualifications
- Bachelor's degree in Computer Science or a related field; a Master's degree is a plus.
- Experience deploying ML or AI workloads in production environments.
- Cloud certifications (such as Azure or AWS) are preferred.
- Advanced experience with cloud services like AWS and Azure.
- Experience with GPU orchestration for AI workloads.
- Expertise in Terraform or similar infrastructure-as-code frameworks.
- Hands-on experience with observability tools.
- Experience with generative AI workload deployments.
- Familiarity with vector databases and model-serving frameworks.
- Experience with CI/CD pipelines for AI workflows.

We expect our candidates to embody Crowe's values of Care, Trust, Courage, and Stewardship, maintaining ethical standards and integrity throughout their work.
The application deadline for this role is 03/31/2026.
We are committed to equal employment opportunities for all applicants. If you require assistance navigating our website or completing your application, please visit our Applicant Assistance and Accommodations page for more information.
Language skills
- English
Note for users
This job posting was published by one of our partners. You can view the original posting here.