Joshua Clymer
Researcher, Redwood Research
-
Area: AI Control
I’m interested in prototyping safety cases (e.g. see my work here, which was cited in a Senate hearing testimony, and my work here with UK AISI). How would companies rigorously show that their models are aligned or controlled? (There are a lot of open questions here!)
I’m particularly interested in writing an ML paper prototyping white-box evaluations for scheming models, essentially a scaled-up version of this previous work I did (the title might be “How to catch a schemer: lessons from catching 1,000 schemers”).
I’m also researching how to verify international agreements, in particular with AI assistance.
-
I’m a technical AI Safety Researcher at Redwood Research.
Before that, I researched AI threat models and developed evaluations for self-improvement capabilities at METR.
-
I’m mostly looking for strong software engineers. ML engineering experience is a bonus but not required. I especially enjoy working with people who have some vision for, or at least curiosity about, research directions. I’d like to help people grow into researchers with their own independent views and agendas.