IT - Senior Technology Architect | Cloud Platform | Google Machine Learning - SysMind Tech • United States
This job posting is no longer available
IT - Senior Technology Architect | Cloud Platform | Google Machine Learning
SysMind Tech
- United States
About
ATTENTION ALL SUPPLIERS!!!
READ BEFORE SUBMITTING
• An UPDATED CONTACT NUMBER and EMAIL ID are a MANDATORY REQUEST from our client for all submissions.
• Limited to 1 submission per supplier. Please submit your best.
• We prioritize endorsing those with complete and accurate information.
• Avoid submitting duplicate profiles. We will reject/disqualify them immediately.
• Make sure that the candidate's interview schedules are updated. Please inform the candidate to keep their lines open.
• Please submit profiles within the max proposed rate.
• Please make sure to TAG the profiles correctly if the candidate has WORKED FOR the Client as a SUBCON or FTE.
MANDATORY: Please include in the resume the candidate's complete and updated contact information (phone number, email address, and Skype ID) as well as a set of 5 interview time slots over a 72-hour period after submitting the profile when the hiring managers could potentially reach them. PROFILES WITHOUT THE REQUIRED DETAILS and TIME SLOTS will be REJECTED.
Job Title: Senior Technology Architect | Cloud Platform | Google Machine Learning -- Senior Site Reliability Engineer (SRE)/Cloud Platform Architect
Work Location & Reporting Address: Charlotte, NC 28202 (Onsite-Hybrid. The Charlotte, NC location is highly preferred, but candidates can also work in Dallas, TX. Candidates should be able to work onsite from DAY 1!)
Contract duration: 12
MAX VENDOR RATE: market rate per hour max
Target Start Date: 09 Feb 2026
Does this position require Visa independent candidates only? No. H1B candidates will be considered.
Must Have Skills: • Kubernetes • OpenShift • Terraform • AWS • Azure • CI/CD • Jenkins • GitHub • GitLab • Docker • Prometheus • Grafana • Python • Shell scripting
Nice to Have Skills: • GCP • Prompt Engineering
Detailed Job Description: • We are seeking a highly skilled engineer to design, automate, and operate scalable cloud infrastructure across AWS and Azure, with a heavy focus on Kubernetes/OpenShift, Terraform-based IaC, and high-throughput CI/CD. You'll own reliability, performance, and cost efficiency, building secure delivery pipelines, platform automation, and robust observability for mission-critical services.
Key Responsibilities:
• Architect, deploy, and manage high-availability, scalable, and secure cloud infrastructure across AWS, Azure, and hybrid environments.
• Implement infrastructure-as-code (IaC) using Terraform, Ansible, and similar tools to ensure consistent, version-controlled, and fully automated environment provisioning.
• Design and manage Kubernetes/OpenShift clusters, including node management, autoscaling, ingress/routing, RBAC, quotas, and security policies.
• Optimize cloud resources through right-sizing, workload tuning, and cost-governance practices.
• Build, enhance, and maintain CI/CD pipelines using Jenkins, GitLab CI/CD, GitHub Actions, or similar tools to support automated build, test, and deployment workflows.
• Implement blue-green, rolling, and canary deployment strategies to ensure zero-downtime releases for mission-critical applications.
• Integrate automated testing frameworks, code-quality gates, and security scans into the pipeline to ensure compliance and reliability.
• Containerize applications using Docker and deploy them via Kubernetes/Helm/OpenShift for scalable, resilient microservice environments.
• Improve service reliability through resource tuning, autoscaling (HPA/VPA), service mesh patterns, and optimized workload distribution.
• Deploy and maintain observability pipelines using Prometheus, Grafana, Splunk, Datadog, ELK/EFK, or similar tools to provide deep visibility into system health.
• Build dashboards and alerts for proactive issue detection, significantly reducing MTTR through automation and intelligent triage.
• Conduct root-cause analysis, capacity planning, and performance optimization across distributed systems.
• Implement cloud security practices including IAM/RBAC, key management, secrets rotation, network policies, and encryption in transit/at rest.
• Automate compliance checks for SOC2, HIPAA, PCI-DSS, or internal governance frameworks through policy-as-code and CI/CD integration.
• Ensure secure container images, infrastructure baselines, audit trails, and vulnerability scanning across environments.
• Design and optimize VPC/VNet architectures, load balancers, DNS, ingress/egress routing, firewall rules, and hybrid connectivity.
• Implement resilient traffic strategies such as multi-region failover, geo-redundancy, and fault-tolerant service routing.
• Develop automation scripts with Python, Shell, Groovy, or PowerShell to eliminate manual tasks, reduce operational toil, and speed up deployments.
• Build internal tools, templates, and reusable modules to standardize and accelerate infrastructure provisioning.
• Collaborate closely with development, QA, and architecture teams to streamline release workflows and improve platform reliability.
• Implement and maintain SLOs, SLIs, and SLAs, ensuring service reliability and performance targets are consistently met.
• Drive continuous improvement through chaos engineering, disaster recovery planning, resilience testing, and failover simulations.
• Conduct periodic system reviews, implement performance enhancements, and proactively mitigate production risks.
Required Qualifications:
• 8-10+ years of hands-on experience in DevOps, Site Reliability Engineering, Cloud Engineering, or Platform Engineering roles.
• Strong expertise with Kubernetes or OpenShift platforms, including cluster operations, workload orchestration, security policies, ingress, autoscaling, and production-grade deployments.
• Proven experience with Infrastructure as Code (IaC) using tools such as Terraform, Ansible, and related automation/configuration technologies.
• Hands-on proficiency building and maintaining CI/CD pipelines using Jenkins, GitLab CI/CD, GitHub Actions, or similar enterprise pipelines.
• Strong experience with AWS and/or Azure cloud services, including networking, compute, IAM/RBAC, load balancing, storage, and secrets management.
• Demonstrated background in containerization using Docker and Kubernetes with knowledge of Helm, YAML, and modern deployment strategies.
• Solid scripting ability in Python, Shell, Groovy, or PowerShell for automation, tooling, and workflow optimization.
• Deep understanding of monitoring, logging, and observability using Prometheus, Grafana, Splunk, Datadog, ELK/EFK, or similar stacks.
• Strong foundation in networking concepts: TCP/IP, DNS, SSL, HTTP, routing, and firewall policies.
• Experience implementing high availability, resilience, disaster recovery, and failover strategies in distributed systems.
• Knowledge of cloud security, compliance frameworks, vulnerability scanning, and policy enforcement.
Minimum Years of Experience: • 8-10+ years
Certifications Needed: • No
Top 3 responsibilities you would expect the Subcon to shoulder and execute:
• Design, automate, and manage scalable cloud infrastructure using Kubernetes/OpenShift, Terraform, and cloud-native services to ensure high availability, security, and operational efficiency.
• Build and optimize CI/CD pipelines with automated testing, quality gates, and deployment strategies (blue-green, rolling, canary) to enable fast, reliable, and zero-downtime releases.
• Implement robust observability and reliability engineering practices using monitoring, logging, and alerting tools (Prometheus, Grafana, Splunk, Datadog, ELK/EFK) to reduce MTTR, improve performance, and maintain production resilience.
Interview Process (Is face to face required?) • No
Any additional information you would like to share about the project specs/nature of work:
Language skills
- English