Senior Software Engineer - Accelerated Kubernetes Runtime Team
NVIDIA
Join NVIDIA's Accelerated Kubernetes Runtime team and be at the forefront of building the next generation of GPU-accelerated Kubernetes runtime distributions. As a Software Engineer on the Runtime team, you will design and build automation systems that enable engineers to seamlessly install, upgrade, and manage cluster runtime packages powering NVIDIA's AI Accelerators. You'll work on innovative controller systems that optimize runtime components for the latest GPU architectures, including GB200/GB300, Vera Rubin, and beyond, ensuring that AI researchers and developers have reliable, secure, and performant infrastructure at their fingertips.

The Runtime team is responsible for providing an NVIDIA-Accelerated Kubernetes runtime that can be applied to any cluster using NVIDIA accelerators, empowering engineers with automation-first, self-service tools that minimize manual effort while enhancing reliability and reproducibility.

What you will be doing:
- Design and implement runtime features that orchestrate the lifecycle of runtime components across thousands of Kubernetes clusters without manual intervention
- Build and maintain the systems that configure, package, validate, and distribute accelerated compute components
- Develop Kubernetes controllers, CRDs, and operators that automate runtime installation, upgrade, and rollback operations with API-driven workflows

What we need to see:
- Bachelor's degree in Computer Science, or equivalent experience
- 8+ years of professional experience, including at least 3 years of Kubernetes development
- Experience building production Kubernetes systems, with significant expertise in controllers, operators, and CustomResourceDefinitions
- Strong proficiency in Go and experience building scalable Go services that manage complex distributed systems
- Hands-on experience with Helm, Kustomize, and managing Kubernetes manifest packaging and templating
- Demonstrated ability to design and implement automation systems that replace manual processes with
...