AI Risk

What is artificial general intelligence safety / AI alignment?

AI alignment is a field focused on aligning the goals of future superintelligent artificial systems with human values, so that they behave in a way that is compatible with our survival and flourishing. This may be an extremely hard problem, especially with deep learning, and how well it is solved is likely to determine the outcome of the most important century. Alignment research is strongly interdisciplinary and can draw on computer science, mathematics, neuroscience, philosophy, and the social sciences.


The Treacherous Turn

Reading List

Frequently quoted textbooks

Superintelligence by Nick Bostrom, 2014

Human Compatible by Stuart Russell, 2019

China’s role in AI Risk

AI Superpowers: China, Silicon Valley, and the New World Order by Kai-Fu Lee (李开复)

Non-agentic AI risk

Reframing Superintelligence: Comprehensive AI Services As General Intelligence by Eric Drexler

Supposedly a smoother read than the academic books above

The Alignment Problem: Machine Learning and Human Values by Brian Christian, 2020

Reading What We Can <- Study guide for AI Safety.
