Network Engineer
Sesterce Group
- San Francisco, California, United States
About
- Design and deploy InfiniBand (NDR 400G / HDR) and high-speed Ethernet fabrics for GPU clusters of 1,000+ nodes
- Configure and operate Arista, Juniper, and Mellanox/NVIDIA equipment; manage BGP, OSPF, and VXLAN overlays
- Tune RoCE and InfiniBand transport for collective communication workloads (NCCL, UCX)
- Maintain network automation pipelines (Ansible, Netbox, Nautobot) across all sites
- Troubleshoot performance regressions, packet loss, and congestion during live AI training runs
What we are looking for
- 4+ years of datacenter networking experience, including InfiniBand or 400G/800G Ethernet at scale
- Deep familiarity with RDMA, RoCE v2, and GPU training cluster communication patterns
- Solid command of Linux networking internals (DSCP, ECN, PFC, adaptive routing)
- Experience with network-as-code tooling (Ansible, Terraform, Netbox) and CI/CD pipelines
- Ability to interpret low-level diagnostics (tcpdump, perftest, ib_write_bw) and correlate with application performance
Languages
- English