Kubernetes Platform Engineer
Cadre5
- New York, New York, United States
About
AmSC is a first-of-its-kind, federally funded cloud infrastructure and API platform designed to accelerate AI model development, data sharing, and large-scale computational science across the U.S. Department of Energy (DOE). Oak Ridge National Laboratory (ORNL) is a premier research institution delivering breakthroughs in energy, national security, and advanced computing.
ORNL delivers scientific discoveries and technical breakthroughs needed to realize solutions in energy and national security and provides economic benefit to the nation. This premier research institution located near Knoxville in Oak Ridge, TN, addresses national needs through impactful research and world-leading research centers.
Please note: The first step in the interview process requires candidates to join a Microsoft Teams meeting with the video turned on.
This is a full-time position that allows telecommuting. Occasional travel to the Oak Ridge facility may be required.
Why Cadre5?
Working with highly talented team members
3 weeks’ vacation
Excellent medical insurance, including employer-paid benefits
Job Responsibilities
Cluster Operations & Administration
Manage the full lifecycle of Kubernetes clusters (on-premises K3s/RKE2, GKE, and EKS), including upgrades, security patching, scaling, and capacity planning
Troubleshoot cluster-level issues including control plane problems, node failures, and resource constraints
Implement and maintain cluster security hardening based on CIS benchmarks and organizational security policies
Manage etcd cluster health, backup procedures, and disaster recovery capabilities
Monitor cluster performance and optimize resource utilization across multi-tenant workloads
Coordinate with datacenter operations team for physical infrastructure changes and maintenance windows
Networking & Cilium CNI
Implement, configure, and maintain Cilium CNI across on-premises and cloud Kubernetes environments
Design and enforce network policies to achieve secure multi-tenant isolation
Troubleshoot complex pod networking issues including DNS resolution, service discovery, and connectivity problems
Configure and maintain BGP peering with physical network infrastructure for on-premises integration
Work with network engineering team on firewall rules, VLANs, IPv6 networking, and network architecture
Basic Qualifications
Typically requires a minimum of 8 years of related experience with a Bachelor’s degree; or 6 years and a Master’s degree; or equivalent experience.
Demonstrated experience administering Kubernetes on on-premises infrastructure (K3s, RKE2, or similar bare-metal distributions)
Experience with cloud-managed Kubernetes (GKE and/or EKS)
Strong understanding of Linux networking fundamentals: iptables/nftables, routing tables, DNS, TCP/IP stack, network troubleshooting
Experience with GitOps methodologies and tools such as ArgoCD or Flux
Proficiency in scripting and automation: Bash, Python, Go
Production experience with Cilium CNI or an equivalent CNI
Ability to work collaboratively in a team environment and communicate technical concepts clearly
Understanding of Kubernetes security best practices including Pod Security Standards, RBAC, and secrets management
GCP (Google Cloud Platform) and/or AWS (Amazon Web Services) cloud platform experience
The ability to obtain and maintain a Department of Energy "Q" clearance is required. This requires U.S. citizenship.
Preferred Qualifications
Go programming experience for operator maintenance and platform tooling development
CKA (Certified Kubernetes Administrator) or CKS (Certified Kubernetes Security Specialist) certification
Background in BGP routing protocols and network engineering concepts
IPv6 networking experience
Infrastructure as Code experience with Terraform or Ansible
Experience with internal developer platform (IDP) tools such as Backstage or similar
Experience with service mesh technologies (Istio, Linkerd)
Excellent understanding of code review and familiarity with GitHub and GitLab workflows
Benefits
Cadre5 offers excellent pay and benefits, including full medical, dental, and vision coverage, a 401(k) match, 15 days of PTO, and 10 holidays.
Cadre5 is an equal opportunity employer. All qualified applicants, including individuals with disabilities and protected veterans, are encouraged to apply. Cadre5 is an E-Verify Employer.
Language skills
- English
Notice to users
This listing comes from a TieTalent partner platform. Click "Apply now" to submit your application directly on their site.