
Senior Software Engineer, CUDA Core Libraries

NVIDIA

🌍 Europe 🏠 Remote ⏱ Part-time 💼 Senior 🗓 4 weeks ago

NVIDIA’s accelerated computing platform is the foundation of modern HPC and AI. At the core of this platform are the CUDA Core Libraries: C++ and Python libraries that enable developers to write fast, reliable, and scalable GPU-accelerated software! We are hiring a full-time Software Engineer to work on the CUDA Core Libraries that power GPU computing for both C++ and Python developers. This includes projects such as CCCL (Thrust, CUB, libcudacxx), cuda-python, and numba-cuda. You will join the team building the foundational libraries, algorithms, and language/runtime infrastructure that make CUDA a speed-of-light experience for developers across deep learning, scientific computing, and data analytics!

What you’ll be doing:

- Develop and implement CUDA Core Libraries in C++ and/or Python, including parallel algorithms and idiomatic language bindings for core CUDA functionality.
- Compose, optimize, and evolve GPU algorithms and APIs, from high-level interfaces down to low-level performance tuning involving memory, parallelism, and synchronization.
- Own features end-to-end: development, implementation, testing, benchmarking, documentation, and long-term maintenance.
- Improve developer experience across the stack: CI, tests, benchmarks, packaging, examples, and docs.
- Collaborate with senior CUDA engineers in design reviews, code reviews, and open-source-style workflows.
- Engage with real users through issues, performance investigations, and API feedback.

What we need to see:

- BS, MS, or PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience.
- 8+ years of related development experience.
- Strong programming skills in C++, Python, or both, with proven interest in systems-level software (performance, memory, concurrency, API design).
- Solid understanding of modern C++ (templates, generics, standard library) and/or Python library development and packaging.
- Practical experience with parallel or heterogeneous programming (CUDA, OpenMP, GPU-accel...
