H

Member of Technical Staff

H Company

🌍 Europe 🏠 Remote ⏱ Part-time 💼 Senior 🗓 1 week ago

About H:
H exists to push the boundaries of superintelligence with agentic AI. By automating complex, multi-step tasks typically performed by humans, AI agents will help unlock full human potential.

H is hiring the world’s best AI talent, seeking those who are dedicated as much to building safely and responsibly as to advancing disruptive agentic capabilities. We promote a mindset of openness, learning, and collaboration, where everyone has something to contribute.

About the Team: The Inference team develops and enhances the inference stack for serving the H models that power our agent technology. The team focuses on optimizing hardware utilization to achieve high throughput, low latency, and cost efficiency, delivering a seamless user experience.

Key Responsibilities:

- Develop scalable, low-latency, and cost-effective inference pipelines

- Optimize model performance (memory usage, throughput, and latency) using advanced techniques such as distributed computing, model compression, quantization, and caching

- Develop specialized GPU kernels for performance-critical operations such as attention mechanisms and matrix multiplications

- Collaborate with H research teams on model architectures to enhance efficiency during inference

- Review state-of-the-art papers to improve memory usage, throughput, and latency (FlashAttention, PagedAttention, continuous batching, etc.)

- Prioritize and implement state-of-the-art inference techniques

Requirements:

- Technical skills:

- MS or PhD in Computer Science, Machine Learning or related fields

- Proficient in at least one of the following programming languages: Python, Rust or C/C++

- Experience in GPU programming, e.g. CUDA, OpenAI Triton, or Metal

- Experience in model compression and quantization techniques

- Soft skills:

- Collaborative mindset, thriving in dynamic, multidisciplinary teams

- Strong communication and presentation skills

- Eager to ex...
