Research interests

“Don’t work on dumb things!” - Lennart Heim

I’m interested in researching how to reduce AI risk, and I’m currently developing a research agenda for my Master’s at Oregon State University.

I’m also documenting my key questions about each research direction below. I’d love to hear what you think of them!

Predicting and preventing adversarial vulnerability via developmental interpretability

I’m excited to understand the science behind how capabilities emerge through deep learning. The goal of this understanding is to be able to predict when new capabilities will arise, and ideally what they will be. One promising agenda in this direction is Developmental Interpretability, which aims to understand phase transitions in neural networks and to develop automated methods for detecting them. More specifically, I want to empirically study the training dynamics that lead to adversarial vulnerabilities.
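To make this concrete, here is a minimal sketch of the kind of experiment I have in mind (my own toy setup, not an established Developmental Interpretability method): train a small model and, at regular intervals, measure how often a one-step FGSM attack flips its predictions. A sharp change in that curve over training is the sort of transition I’d want to detect automatically.

```python
# Toy sketch: track adversarial vulnerability across training.
# All data, model sizes, and hyperparameters here are made up for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Synthetic two-class data as a stand-in for a real dataset.
X = torch.randn(2048, 20)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).long()

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def fgsm_success_rate(model, X, y, eps=0.25):
    """Fraction of correctly classified points whose prediction flips
    under a one-step FGSM perturbation of size eps."""
    X_adv = X.clone().requires_grad_(True)
    loss = F.cross_entropy(model(X_adv), y)
    loss.backward()
    X_pert = X_adv + eps * X_adv.grad.sign()
    with torch.no_grad():
        clean_ok = model(X).argmax(1) == y
        adv_wrong = model(X_pert).argmax(1) != y
    return (clean_ok & adv_wrong).float().mean().item()

for step in range(2001):
    opt.zero_grad()
    F.cross_entropy(model(X), y).backward()
    opt.step()
    if step % 200 == 0:
        rate = fgsm_success_rate(model, X, y)
        print(f"step {step:5d}  FGSM success rate: {rate:.3f}")
```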

Key Questions

  • Adversarial vulnerabilities exist because the learned function is not quite the one we want. How important is this for alignment? (Eliezer has some writings on this, but I don’t fully understand them.)
  • Beyond knowing when new capabilities or vulnerabilities arise during training, can we develop techniques to steer training?

Understanding human over-reliance on AI

I’m working on an AI assistant in the form of lightweight AR glasses with a predictive UI that helps the user get through their day, suggesting the right actions at the right time in the right context. This is tricky to get right, and bad suggestions could rob the user of their agency; the harm would only grow with AI capabilities. How do we prevent this? Perhaps by first understanding the existing dynamics of human+AI collaboration, distinguishing the elements that contribute to those dynamics, and identifying settings that are likely to be harmful. An example of such a setting: a tired programmer’s need to finish a project quickly + an LLM code assistant’s tendency to generate erroneous code that still looks right = code with non-obvious vulnerabilities running in production. The output of this research could be a design guideline that helps companies make better decisions about developing and deploying AI tools that interface with humans. It may also help alignment schemes that rely on human-AI collaboration or full-on AI supervision (e.g. superalignment) go better.
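To make the programmer example concrete, here is a hypothetical snippet of the kind an assistant might produce: it reads fine at a glance and runs, but it builds a SQL query by interpolating user input, a classic injection vulnerability a tired reviewer could wave through. (The function names and schema are made up for illustration.)

```python
# Hypothetical illustration of "looks right but is subtly wrong".
import sqlite3

def find_user(conn: sqlite3.Connection, username: str):
    # Vulnerable: f-string interpolation puts raw user input into the query.
    cur = conn.execute(f"SELECT id, email FROM users WHERE name = '{username}'")
    return cur.fetchone()

def find_user_safe(conn: sqlite3.Connection, username: str):
    # Safe: the driver escapes the bound parameter.
    cur = conn.execute("SELECT id, email FROM users WHERE name = ?", (username,))
    return cur.fetchone()
```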

Key Questions

  • In this picture of human over-reliance on AI, humans change slowly while AI changes rapidly. How useful will this research be as AI improves and gains unexpected capabilities?
    • I think this research will likely first propose different categories of AI systems based on how we interact with them, then theorize about over-reliance for each category. That should keep it useful for future AI systems as long as they fall into the proposed categories.

Finally, I’m always on the lookout for research agendas and projects that leverage my game dev experience. If you’ve got an idea, let’s get in touch!


Learn about my previous research interests, and what happened to those hopes and dreams.