Job Opportunities: site reliability sre
Find site reliability sre jobs near you, whether onsite, hybrid, or remote.
Sr. DevOps Engineer - AI and Site Reliability Engineering
Teradata Group | Concord | Our company: At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and
Sr. DevOps Engineer - AI and Site Reliability Engineering
Teradata Corporation (SE) | Salt Lake City | Our company: At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and
Sr. DevOps Engineer - AI and Site Reliability Engineering
Teradata | Baton Rouge | Our company: At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and
Sr. DevOps Engineer - AI and Site Reliability Engineering
Teradata Corporation (SE) | Montgomery | Our company: At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and
Sr. DevOps Engineer - AI and Site Reliability Engineering
Teradata | Dover | Our company: At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and
Sr. DevOps Engineer - AI and Site Reliability Engineering
Teradata Corporation (SE) | Sacramento | Our company: At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and
Sr. DevOps Engineer - AI and Site Reliability Engineering
Teradata Corporation (SE) | Little Rock | Our company: At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and
Sr. DevOps Engineer - AI and Site Reliability Engineering
Teradata Corporation (SE) | Pierre | Our company: At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and
Sr. DevOps Engineer - AI and Site Reliability Engineering
Teradata Corporation (SE) | Bismarck | Our company: At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and
Sr. DevOps Engineer - AI and Site Reliability Engineering
Teradata Corporation (SE) | Des Moines | Our company: At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and
Sr. DevOps Engineer - AI and Site Reliability Engineering
Teradata Corporation (SE) | Tallahassee | Our company: At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and
Sr. DevOps Engineer - AI and Site Reliability Engineering
Teradata Corporation (SE) | Hartford | Our company: At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and
CMMS Integration Lead - Multi-Site Reliability & Analytics
Corning Incorporated | Concord | Corning Inc. is looking for a Computerized Maintenance Management System (CMMS) Integration Manager based in Concord, NC. You will be responsible for leading the configuration, deployment, and enhance
Site Reliability Engineer, Client Platform
HADRIAN | Los Angeles | Hadrian - Manufacturing the Future. Hadrian is building autonomous factories that help aerospace and defense companies manufacture rockets, satellites, jets, and ships up to 10x faster and up to 2x chea
Sr. DevOps Engineer - AI and Site Reliability Engineering
Teradata Group | California | Our company: At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and
On-Site Facilities QA Engineer: Reliability & Quality
Samsung Electronics Perú | Austin | Samsung Electronics Perú in Austin is looking for a Quality Assurance Engineer for Facilities Services to ensure reliability and safety in facility operations. This role emphasizes developing quality
DevOps Architect
Veriipro | Charlotte | We are seeking an experienced and highly motivated DevOps Architect to lead enterprise‑scale cloud transformation and Site Reliability Engineering (SRE) initiatives. The ideal candidate will have deep
Network Engineer
TECHOAUTH SOLUTIONS LLC | Phoenix | Network Engineer. Location: Phoenix, AZ. Employment type: C2C. Job Summary: We are seeking a Senior Network Engineer with deep expertise in AWS cloud networking, strong SRE principles, and an observability
Senior DevOps Engineer / SRE
Ajaib | United States | Senior DevOps Engineer/SRE: We are seeking a highly skilled and motivated senior DevOps engineer/SRE to join our dynamic cloud infrastructure team. In this role, you will be a key contributor to ensurin
(Sr.) DevOps Engineer - Fully Remote
Zealogics | United States | DevOps Engineer: We are a multinational team that collaborates with Fortune 500 clients! The DevOps team is within the Software Development and Engineering department. This team is responsible for maint
Software Engineering Manager Production Support Operations
SunTrust Investment Services, Inc. | United States | Manager of Production Support: The Manager of Production Support leads teams responsible for ensuring the stability, resilience, and operational excellence of critical technology platforms supporting co
Applications Support Sr Analyst - AVP
Citigroup Inc | Tampa | The Apps Support Sr Analyst is a seasoned professional role for L2 SRE. Services Production Support is looking to expand the service offering to incorporate Services Reliability Engineering principles
Lead Cloud Architect
Protective Services LLC | United States | The work we do has an impact on millions of lives, and you can be a part of it. We help protect our customers against life's uncertainties. Regardless of where you work within the company, you'll be he
Sr. Devops Manager
Vistance Networks | Sunnyvale | In our always-on world, we believe it's essential to have a genuine connection with the work you do. RUCKUS Networks builds and delivers purpose-driven networks that perform in the tough, unique environ
Senior Developer - Full Stack .Net/React/AWS Information Technology
United Airlines | Houston | Achieving our goals starts with supporting yours. Grow your career, access top‑tier health and wellness benefits, build lasting connections with your team and our customers, and travel the world using
Sr. DevOps Engineer - AI and Site Reliability Engineering
- Concord, California, United States
About
What You’ll Do
Working on a team of professionals, you will design, implement, test, deploy, administer, and continually improve software solutions to ensure system reliability and availability, mitigate operational risks, track system health, and improve mean‑time‑to‑discover and mean‑time‑to‑respond for operational issues.
You will help lead chaos engineering efforts in a production‑like environment, exposing systems to simulations of real-world turbulence with the objective of identifying and quantifying operational weaknesses and developing remediation strategies.
You will leverage modern AI technologies, including large language models, machine learning, and agentic systems, both to increase the operational efficiency of the team and to measure and improve the reliability, scalability, observability, supportability, and performance of Teradata software.
You will become a subject‑matter expert in the production deployment and upgrade of Teradata software and the full software stack, from the network layer all the way to the observability tooling that it relies on.
Who You’ll Work With
You’ll work on a globally‑distributed team of other DevOps professionals, with engineers focused on site reliability engineering and observability.
You’ll work closely with product engineering and cloud operations personnel to understand operational requirements and identify and remediate operational deficits.
You’ll work with security and compliance teams to help provide evidence necessary to meet Teradata’s compliance obligations.
You’ll report to a Sr. Manager, Site Reliability Engineering.
What Makes You A Qualified Candidate
Bachelor’s degree or equivalent in computer science or a related field; master’s degree or equivalent preferred.
4+ years of industry experience.
Experience with at least one major cloud service provider (AWS, Azure, and/or Google Cloud), preferably all three. CSP developer or architect certifications preferred.
Experience building and deploying complex software solutions to significant operational problems. Proficiency with at least one modern programming language such as Python, and with a modern source control tool, preferably Git.
Familiarity with machine learning libraries such as TensorFlow and scikit-learn.
Experience building and deploying AI systems via cloud‑based generative AI and agentic AI platforms such as AWS Bedrock, AWS SageMaker, Azure AI Foundry, Google Vertex AI, and Google Agentspace.
Experience with at least one modern defect tracking tool, preferably Jira.
Experience with an infrastructure‑as‑code (IaC) cloud provisioning tool, preferably Terraform, and with a configuration management tool such as Ansible or Puppet.
Experience with Grafana or an equivalent observability tool.
Experience with a build/deployment automation tool such as Jenkins or Bamboo.
Familiarity with both SQL and NoSQL databases, and the use cases for each.
Experience administering Linux‑based systems.
What You’ll Bring
4+ years of experience in the software industry in a DevOps or site reliability engineering role.
A passion for constant, iterative improvement over the status quo.
An in‑depth understanding of site reliability engineering principles, and how to measure and improve the reliability, scalability, supportability, and observability of production‑deployed enterprise software, with a focus on real‑world operational and customer experience.
An understanding of enterprise software deployment and security/compliance principles.
Proficiency with multi‑layered technical troubleshooting and root‑cause analysis.
The ability to quickly and comprehensively decompose a problem, identifying dependencies and defining tasks, and to think creatively and holistically about solutions.
The ability to work both independently and collaboratively in a fast‑paced environment, and adjust as priorities change.
The ability to communicate concisely but effectively with colleagues, leaders, and stakeholders, and tailor communications to the needs and understanding of a particular audience.
The flexibility to work on a globally‑distributed team managed from the United States.
Why We Think You'll Love Teradata
We prioritize a people‑first culture because we know our people are at the very heart of our success. We embrace a flexible work model because we trust our people to make decisions about how, when, and where they work. We focus on well‑being because we care about our people and their ability to thrive both personally and professionally. We are committed to actively working to foster an inclusive environment that celebrates people for all of who they are.
Languages
- English
This job comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their site.