Google Cloud DevOps / Site Reliability Engineer (SRE)
Purple Drive
- United States
About
Location: Alpharetta, GA
Experience: 8-12 Years (Senior Level)
Job Summary
We are seeking an experienced Google Cloud DevOps / SRE Engineer to design, build, and operate highly reliable, scalable, and secure cloud infrastructure on Google Cloud Platform (GCP). The ideal candidate will bring deep Linux expertise, strong cloud networking and security knowledge, and hands-on experience with automation, CI/CD, and Kubernetes-based deployments. This role plays a critical part in ensuring system reliability, performance, and operational excellence across large-scale distributed systems.
Key Responsibilities
Cloud Infrastructure & Platform Engineering
- Design, deploy, and manage cloud infrastructure using Google Cloud Platform services including Compute Engine, GKE, VPC, IAM, Cloud Storage, and Cloud SQL.
- Architect and support highly available, scalable, and fault-tolerant systems on GCP.
- Implement and manage Shared VPCs, VPC peering, firewall rules, load balancers, DNS, and VPN tunnels.
DevOps & Automation
- Build and maintain CI/CD pipelines using Jenkins (Declarative & Scripted) and GitHub Actions.
- Automate infrastructure provisioning and configuration using Terraform, including module development, remote state management, dependency handling, and DRY principles.
- Implement modern deployment strategies such as canary releases and blue/green deployments.
- Manage container artifacts using Docker and Helm.
Site Reliability & Operations
- Ensure high availability, performance, and reliability of production systems.
- Troubleshoot complex system issues including CPU, memory, and disk I/O bottlenecks, kernel issues, and system boot failures.
- Analyze logs and metrics to proactively identify and resolve performance and stability issues.
- Support incident response, root cause analysis, and post-incident reviews.
Linux Systems Engineering (Must Have)
- Demonstrate deep hands-on expertise with Linux systems (RHEL, Ubuntu, CentOS).
- Perform kernel tuning, system optimization, storage management (LVM), and systemd administration.
- Maintain OS-level security, patching, and performance best practices.
Security & Identity Management
- Implement and troubleshoot Cloud IAM, service accounts, and Workload Identity Federation.
- Enforce least-privilege access and security best practices across environments.
- Partner with security teams to maintain compliance and secure cloud operations.
Collaboration & Process
- Work closely with application teams, architects, and security stakeholders.
- Participate in on-call rotations and incident management processes.
- Contribute to operational documentation, runbooks, and best practices.
Required Skills & Qualifications
Must-Have Skills
- Strong hands-on experience with Google Cloud Platform (GCP).
- Deep expertise in Linux systems engineering (RHEL, Ubuntu, CentOS).
- Proficiency in at least one programming language: Python, Go (Golang), or Java.
- Strong troubleshooting and debugging skills across infrastructure and application layers.
- Hands-on experience with Terraform for infrastructure as code.
- Experience with CI/CD pipelines using Jenkins and/or GitHub Actions.
- Kubernetes experience with GKE, Docker, and Helm.
Preferred Qualifications
- GCP certifications: Google Professional Cloud DevOps Engineer, Google Professional Cloud Architect.
- CKA (Certified Kubernetes Administrator).
- Experience supporting large-scale distributed systems and microservices architectures.
- Familiarity with ITIL processes, Change Advisory Board (CAB) workflows, and incident management.
Soft Skills
- Strong analytical and problem-solving abilities.
- Excellent communication skills with the ability to collaborate across teams.
- Ownership mindset with a focus on reliability and continuous improvement.
- Ability to work in fast-paced, production-critical environments.
Languages
- English