Site Reliability Engineer (SRE) | Bright Vision Technologies | Frisco, Texas, United States

This job posting is no longer available.


Site Reliability Engineer (SRE)

Bright Vision Technologies
  • US
    Frisco, Texas, United States

About

Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations.
We leverage cutting-edge technologies to create scalable, secure, and user‑friendly applications.
Job Title: Site Reliability Engineer (SRE)
Location: 100% Remote (Continental United States)
Position Type: In‑house Bright Vision Technologies SOW engagement
Experience: 5+ years
Employment Type: Full‑time W2
Engagement: Long‑term, multi‑year, aligned to the Bright Vision SOW delivery roadmap
Compensation: Competitive base salary commensurate with experience, plus benefits
Employment Terms & Visa Policy: 100% remote, full‑time direct W2. No C2C, 1099, or third‑party arrangements. No new H1B sponsorship; H1B transfers welcomed for qualified candidates.
Job Summary
We are seeking an experienced Site Reliability Engineer to ensure the availability, performance, and operational excellence of large‑scale distributed systems in production. As an SRE, you will work at the boundary between development and operations, applying strong software engineering principles to infrastructure and operations problems, and continually pushing the platform toward higher reliability with lower operational toil.
Key Responsibilities
Define, instrument, and continually refine SLOs, SLIs, and error budgets for critical services.
Lead incident response and resolution for production issues; act as incident commander when needed.
Design and implement comprehensive monitoring, logging, and tracing strategies using Prometheus, Grafana, OpenTelemetry, ELK/EFK, Datadog, or similar.
Build and maintain robust on‑call processes, runbooks, and escalation paths.
Automate operational toil aggressively with production‑grade tooling in Python, Go, Bash, or similar.
Architect and operate large‑scale Kubernetes clusters and container‑based workloads.
Design CI/CD pipelines that promote safe, frequent, and observable releases.
Lead capacity planning and performance engineering activities, including load testing and chaos experiments.
Partner closely with application development teams to embed reliability practices early in design.
Strengthen platform resiliency through chaos engineering, fault injection, retries, timeouts, circuit breakers, and failover paths.
Drive continuous improvement of security posture in collaboration with security teams.
Contribute to the technical roadmap for reliability tooling, observability platforms, and developer‑experience improvements.
Mentor engineers across the organization on SRE practices and foster a strong, blameless culture of operational excellence.
Required Qualifications
Bachelor’s degree in Computer Science, Engineering, or a related technical discipline.
Five or more years of SRE, DevOps, or production engineering experience supporting large‑scale distributed systems.
Strong programming skills in at least one of Python, Go, or Java.
Deep, hands‑on experience operating Linux at scale.
Production experience operating Kubernetes and container‑based workloads.
Strong working knowledge of observability tooling such as Prometheus, Grafana, OpenTelemetry, and ELK/EFK.
Hands‑on experience designing and operating CI/CD pipelines.
Solid understanding of distributed system design.
Experience leading incident response and conducting effective post‑incident reviews.
Excellent communication and documentation skills.
Preferred Qualifications
Experience defining and operationalizing SLOs and error budgets in real production environments.
Exposure to chaos engineering practices and tools such as Chaos Monkey, Gremlin, or Litmus.
Hands‑on experience with a major cloud platform (AWS, Azure, or GCP).
Background in capacity planning, performance engineering, or large‑scale load testing.
Familiarity with service mesh technologies such as Istio, Linkerd, or Consul.
How to Apply
Your resume will be reviewed by our hiring team. For immediate consideration, please send your resume to harry@bvteck.com or contact us at (908) 676‑4399. Learn more about Bright Vision Technologies at www.bvteck.com.
Equal Employment Opportunity (EEO) Statement
Bright Vision Technologies (BV Teck) is committed to equal employment opportunity for all employees and applicants without regard to race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, veteran status, or any other protected status as defined by applicable federal, state, or local laws. This commitment extends to all aspects of employment, including recruitment, hiring, training, compensation, promotion, transfer, leaves of absence, termination, layoffs, and recall.
BV Teck expressly prohibits any form of workplace harassment or discrimination. Any improper interference with employees’ ability to perform their job duties may result in disciplinary action up to and including termination of employment.

Language skills

  • English