
Senior AI Engineer – Pre-training Data

Aleph Alpha

🌍 Europe 🏠 Remote ⏱ Part-time 💼 Senior 🗓 1 week ago

OUR MISSION

Aleph Alpha is one of the few companies in Europe doing serious foundation model pre-training. Our customers - in finance, manufacturing, public administration - need models that understand German, meet European regulatory requirements, and work reliably in high-stakes settings. We're building that in Heidelberg.

We're growing our pre-training team and hiring someone who is passionate about data: defining what goes into our models, building the systems that source and prepare it, and ensuring our training team has the highest-quality data to push model capabilities forward.

TEAM CULTURE

At Aleph Alpha, we foster a culture built on ownership, autonomy, and empowerment. Teams and individual contributors are trusted to take responsibility for their work and drive meaningful impact. We maintain a flat organizational structure with efficient, supportive management that enables quick decision-making, open communication, and a strong sense of shared purpose.

ABOUT THE ROLE

As a Senior AI Engineer in Pre-training Data, you will work across the full stack of data preparation - from sourcing and acquisition to processing, filtering, and mixture design. Some weeks you'll be deep in data quality analysis, understanding what makes a corpus valuable and how its composition affects downstream performance on public and bespoke evaluation tasks. Other weeks you'll be optimising large-scale processing pipelines or building tooling that gives the team visibility into what our models are actually training on. And some weeks you'll be reading the latest research on pre-training data methods, translating findings into experiments you can run against our stack.

We approach data work in an evidence-based way. Decisions about filtering strategies, data mixtures, and quality thresholds are backed by ablations - you'll design and run targeted experiments to validate that your data choices actually improve model outcomes.

We are looking for someone who combines significa...
