This job offer is no longer available
About
Location: Fort Mill, SC (Hybrid – 3 Days Onsite)
Position Overview
We are seeking a highly skilled Senior DevOps Engineer to join our engineering team in Fort Mill, SC. This is a hands-on role requiring deep expertise in AWS, Kubernetes, Terraform, Ansible, CI/CD, Infrastructure as Code (IaC), and cloud automation. The ideal candidate will also have exposure to AI/ML infrastructure and experience supporting Generative AI or Agentic AI applications in production. You will work closely with Software Engineering, Platform Engineering, AI/ML teams, and Cloud Architects to build scalable, secure, and highly available cloud infrastructure while driving automation and DevOps best practices.
Key Responsibilities
- Design, implement, and maintain highly available cloud infrastructure on AWS.
- Build and manage Infrastructure as Code (IaC) using Terraform.
- Automate infrastructure provisioning, configuration management, and deployments using Ansible.
- Manage cloud networking, IAM, VPCs, Load Balancers, Auto Scaling Groups, Route53, S3, EKS, EC2, RDS, Lambda, CloudWatch, and related AWS services.
- Optimize cloud environments for scalability, reliability, performance, and cost.
- Deploy and manage containerized applications using Docker and Kubernetes (EKS preferred).
- Configure Kubernetes deployments, services, ingress controllers, namespaces, ConfigMaps, Secrets, Helm Charts, and autoscaling.
- Monitor cluster health and troubleshoot production issues.
- Ensure high availability and disaster recovery strategies.
- Design and maintain CI/CD pipelines using Jenkins, GitHub Actions, GitLab CI, or Azure DevOps.
- Automate application deployments across development, QA, staging, and production environments.
- Integrate security scanning, testing, and compliance into deployment pipelines.
- Support blue-green deployments, canary releases, and rolling deployments.
- Support infrastructure for Generative AI and Agentic AI applications.
- Deploy AI workloads using Kubernetes and cloud-native services.
- Collaborate with AI/ML engineers to optimize GPU-enabled infrastructure.
- Support AI model deployment, inference services, and scalable AI platforms.
- Work with vector databases, AI APIs, or LLM-based applications (preferred).
- Implement monitoring and alerting using Prometheus, Grafana, CloudWatch, Datadog, ELK, or Splunk.
- Troubleshoot production issues and perform root cause analysis.
- Ensure system uptime, reliability, and performance.
- Participate in production support and incident management.
- Implement IAM policies and cloud security best practices.
- Ensure infrastructure complies with organizational security standards.
- Manage secrets, certificates, and secure deployment practices.
- Collaborate with Security and Infrastructure teams on governance and compliance initiatives.
Required Qualifications
- Bachelor's degree in Computer Science, Information Technology, or related field.
- 7+ years of DevOps or Cloud Engineering experience.
- Strong hands-on experience with AWS Cloud.
- Extensive experience with Kubernetes (EKS preferred).
- Strong expertise in Terraform for Infrastructure as Code.
- Hands-on experience with Ansible for configuration management and automation.
- Strong experience with Docker and container orchestration.
- Experience building and maintaining CI/CD pipelines.
- Strong Linux administration and troubleshooting skills.
- Experience with Git, branching strategies, and version control.
- Proficiency in scripting using Python and/or Shell (Bash).
- Strong understanding of networking, DNS, SSL/TLS, IAM, security groups, and load balancing.
- Excellent troubleshooting and production support experience.
Preferred Qualifications
- Experience supporting Generative AI, LLM, or Agentic AI workloads.
- Experience deploying AI/ML models into production.
- Exposure to LangChain, OpenAI APIs, Hugging Face, Bedrock, Vertex AI, or Azure OpenAI.
- Experience with GPU-enabled Kubernetes clusters.
- Knowledge of MLOps concepts and AI infrastructure.
- Experience with GitOps tools such as ArgoCD or FluxCD.
- AWS Solutions Architect, AWS DevOps Engineer, CKA, CKAD, or Terraform certifications are a plus.
Required Technical Skills
- AWS (EC2, EKS, S3, IAM, Lambda, CloudWatch, RDS, VPC)
- Kubernetes (EKS)
- Docker
- Terraform
- Ansible
- Jenkins / GitHub Actions / GitLab CI
- Python
- Bash
- Linux
- Git
- Helm
- Prometheus
- Grafana
- Datadog
- ELK
- Cloud Security
- Infrastructure as Code
- CI/CD
- AI Infrastructure
- Generative AI (Preferred)
- Agentic AI (Preferred)
Nice to Have
- Experience with LaunchDarkly
- MLOps experience
- Vector databases (Pinecone, Weaviate, FAISS)
- LangChain
- OpenAI / Azure OpenAI
- Bedrock
- Kafka
- ArgoCD
Languages
- English