Senior AI Researcher - Pre-training Data

Aleph Alpha

🌍 Europe 🏠 Remote ⏱ Part-time 💼 Senior 🗓 4 days ago

OUR MISSION

Aleph Alpha is one of the few companies in Europe doing serious foundation model pre-training. Our customers - in finance, manufacturing, public administration - need models that understand German, meet European regulatory requirements, and work reliably in high-stakes settings. We're building that in Heidelberg.

We're growing our pre-training team and hiring someone passionate about data: defining what goes into our models, building the systems that source and prepare it, and ensuring our training team has the highest-quality data to push model capabilities forward.

Team Culture

At Aleph Alpha, we foster a culture built on ownership, autonomy, and empowerment. Teams and individual contributors are trusted to take responsibility for their work and drive meaningful impact. We maintain a flat organisational structure with efficient, supportive management that enables quick decision‑making, open communication, and a strong sense of shared purpose.

About the role

As a Senior AI Researcher for Pre-training Data, you will shape and improve the scientific methodology behind our pre-training corpora while co-engineering the software and systems that support it. Working with engineers and other researchers to build scalable pipelines, you will focus on the theoretical and empirical research needed to understand which data makes models perform best on our target capabilities.

This role is for you if you have a strong background in large-scale language modeling and the scientific drive to answer complex questions about data scaling laws, synthetic data generation, and curriculum learning.

In your day-to-day work, you will design targeted ablations across various scales, derive and test hypotheses from training dynamics, develop novel algorithms for estimating data quality and performing data curation, and contribute to a range of engineering tasks that support these research directions. Together with a collaborative team of e...
