Blog
January 15, 2026
Reflections on AI Safety Research in 2025
A comprehensive year-end review of the most impactful developments in AI safety, from Constitutional AI to scalable oversight. What we got right, what surprised us, and what still keeps me up at night.
November 3, 2025
Our New Open-Source Toolkit for RLHF Research
Today we are open-sourcing AlignKit, a comprehensive library for reward modeling, PPO training, and DPO fine-tuning. Here is the story behind its development and how to get started.
September 20, 2025
Tips for PhD Students Starting in AI Research
Practical advice for new graduate students on choosing research problems, managing advisor relationships, navigating the publication process, and building an academic profile in the age of large language models.