This job offer is no longer available
About
Your Journey at Crowe Starts Here: At Crowe, you can build a meaningful and rewarding career. With real flexibility to balance work with life moments, you're trusted to deliver results and make an impact. We embrace you for who you are, care for your well-being, and nurture your career. Everyone has equitable access to opportunities for career growth and leadership. Over our 80-year history, delivering excellent service through innovation has been a core part of our DNA across our audit, tax, and consulting groups. That's why we continuously invest in innovative ideas, such as AI-enabled insights and technology-powered solutions, to enhance our services. We expect the candidate to uphold Crowe's values of Care, Trust, Courage, and Stewardship. These values define who we are. We expect all of our people to act ethically and with integrity at all times.
About the Role
The AI DevOps and Cloud Infrastructure Engineer I (Senior Staff) designs, builds, and operates scalable, secure, and highly automated cloud environments that support the training, deployment, monitoring, and continuous delivery of AI and machine learning systems. This role serves as a subject-matter expert in infrastructure automation, distributed compute orchestration, and cloud platform operations, ensuring AI workloads perform reliably across development, staging, and production environments. The engineer collaborates closely with AI engineering, MLOps, data engineering, platform, and security teams to define infrastructure requirements, improve observability, and support the performance demands of predictive and generative AI workloads. As a senior staff-level contributor, the role establishes best practices, evaluates emerging cloud and AI infrastructure tooling, and mentors junior engineers to advance DevOps maturity, reliability, and cost efficiency across the organization.
Responsibilities
- Architect and maintain cloud infrastructure for AI model training, inference services, and distributed compute workloads.
- Implement infrastructure-as-code (IaC) to automate provisioning, configuration, scaling, and lifecycle management of cloud resources.
- Design and operate CI/CD pipelines for automated model training, testing, and deployment of AI-enabled applications.
- Optimize Kubernetes clusters, GPU utilization, and compute scaling strategies to balance performance, reliability, and cost.
- Integrate AI models, inference endpoints, and data pipelines into cloud-native platforms.
- Develop monitoring, logging, alerting, and observability solutions using modern telemetry and tracing tools.
- Troubleshoot issues across networking, containers, compute, storage, and model-serving layers.
- Lead performance benchmarking, load testing, and reliability validation for AI systems.
- Document infrastructure architectures, operational runbooks, and engineering standards.
- Support automation for dataset ingestion, model versioning, artifact management, and ML testing.
- Ensure compliance with cloud security, identity management, encryption, and responsible AI guidelines.
- Partner with security teams to implement secure networking, IAM policies, and secrets management.
- Provide technical mentorship, design reviews, and cloud best-practice guidance to junior engineers.
- Evaluate new cloud services, platform capabilities, and AI infrastructure tooling for adoption.
Qualifications
- 4+ years of experience in DevOps, cloud engineering, platform engineering, or infrastructure engineering.
- Strong proficiency with Kubernetes, Docker, and cloud orchestration platforms.
- Deep experience with CI/CD systems and deployment automation.
- Demonstrated ability to debug distributed systems and cloud networking issues.
- Proficiency in Python, Bash, or other automation/scripting languages.
- Strong communication skills and ability to collaborate across engineering and security teams.
- Willingness to travel occasionally for cross-functional planning and collaboration.
Preferred Qualifications
- Bachelor's degree in Computer Science, Cloud Engineering, Information Systems, or a related technical field, or equivalent experience.
- Master's degree in a technical discipline.
- Experience enabling ML or AI workloads at scale in production environments.
- Cloud and platform certifications including Azure (AZ-900, AZ-104, AZ-305, AZ-700, AI-102) or equivalent AWS/GCP certifications.
- Advanced experience with AWS (e.g., EKS, EC2, IAM, Lambda, SageMaker) and/or Azure (e.g., AKS, VMSS, Azure ML).
- Experience with GPU orchestration and scaling strategies for AI workloads.
- Expertise with Terraform or other infrastructure-as-code frameworks.
- Hands-on experience with observability stacks such as Prometheus, Grafana, CloudWatch, and OpenTelemetry.
- Experience deploying and operating generative AI workloads, including LLM inference autoscaling and RAG architectures.
- Familiarity with vector database hosting (e.g., Pinecone, Weaviate, FAISS) and model-serving frameworks (e.g., Hugging Face TGI, vLLM, custom inference containers).
- Experience building CI/CD pipelines for LLM fine-tuning workflows (e.g., LoRA, QLoRA, PEFT) and monitoring generative AI performance metrics such as latency, throughput, and hallucination rates.
Application Deadline
The application deadline for this role is 03/31/2026.
Compensation
The wage range for this role is $74,100.00 to $147,800.00 per year.
Benefits
We offer a comprehensive total rewards package. Optional benefits include unlimited PTO, flexible remote work, and a supportive environment that prioritizes sustainable, long-term performance.
Equal Opportunity Statement
Crowe LLP provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, sexual orientation, gender identity or expression, genetics, national origin, disability or protected veteran status, or any other characteristic protected by federal, state or local laws.
Work Authorization
All persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire. Crowe is not sponsoring for work authorization at this time.
Languages
- English