Senior Site Reliability Engineer, Observability

RemoteHunter
  • United States

About

About the Opportunity:

The organization is an industry-standard oracle platform enabling capital markets to operate onchain and powering most decentralized finance (DeFi) applications. It provides essential data, interoperability, compliance, and privacy standards for advanced blockchain use cases such as institutional tokenized assets, lending, payments, and stablecoins. Since inventing decentralized oracle networks, the organization has enabled tens of trillions in transaction value and secures the majority of DeFi.

The Observability Team supports development and engineering efforts by building and maintaining reliable observability infrastructure. The Senior Site Reliability Engineer (SRE) role focuses on increasing self-service and reducing cognitive load across teams, with a strong emphasis on DevOps, GitOps, and observability.

Responsibilities:

• Build and orchestrate a modern OTEL-based observability platform
• Support multiple telemetry types including metrics, logs, and traces
• Define and support governance in observability and large-scale problem management
• Ensure reliability, security, and performance exceed defined SLAs
• Collaborate with engineers to troubleshoot issues, deploy products, and improve velocity
• Lead design and deployment of monitoring and observability services with alerting capabilities
• Ingest, aggregate, transform, and utilize data from various sources in a real-time pipeline
• Oversee availability, performance, and supportability of observability infrastructure
• Create and manage alert response processes to ensure reliable data delivery
• Recommend metrics collection for alert creation during new feature releases
• Champion reliability and security by prioritizing quality in all work

Requirements:

• 7 or more years of relevant experience in DevOps, infrastructure, SRE, or platform roles
• Ability to develop software beyond typical infrastructure configurations
• Experience programming in one or more of the following: C, C++, Java, Python, Go, Perl, Ruby
• Expert knowledge in designing, developing, and managing large real-time systems
• Experience with monitoring and logging tools such as Prometheus, the Grafana Stack, the ELK Stack, or Splunk
• Experience with distributed systems and container orchestration including Kubernetes
• Strong communication skills with the ability to provide and receive constructive feedback

Benefits & Perks:

• Fully remote, global role, with some expected overlap with Eastern Standard Time (EST) working hours

Note:

RemoteHunter is not the Employer of Record (EOR) for this role. Our purpose in this opportunity is to connect exceptional candidates with leading employers. We help job seekers worldwide discover roles that match their goals and guide them to complete their full application directly through the hiring company's career page or ATS.

Language skills

  • English
Notice to users

This posting comes from a TieTalent partner platform. Click "Apply now" to submit your application directly on their site.