80,000 Hours Podcast

#58 - Pushmeet Kohli of DeepMind on designing robust & reliable AI systems and how to succeed in AI

80,000 Hours Podcast

When you're building a bridge, responsibility for making sure it won't fall over isn't handed over to a few 'bridge not falling down engineers'. Making sure a bridge is safe to use and remains standing in a storm is completely central to the design, and indeed the entire project.

When it comes to artificial intelligence, commentators often distinguish between enhancing the capabilities of machine learning systems and enhancing their safety. But to Pushmeet Kohli, principal scientist and research team leader at DeepMind, research to make AI robust and reliable is no more a side-project in AI design than keeping a bridge standing is a side-project in bridge design.

Far from being an overhead on the 'real' work, it’s an essential part of making AI systems work at all. We don’t want AI systems to be out of alignment with our intentions, and that consideration must arise throughout their development.

Professor Stuart Russell — co-author of the most popular AI textbook — has gone as far as to suggest that if this view is right, it may be time to retire the term ‘AI safety research’ altogether.

Want to be notified about high-impact opportunities to help ensure AI remains safe and beneficial? Tell us a bit about yourself and we’ll get in touch if an opportunity matches your background and interests.

Links to learn more, summary and full transcript.

And a few added thoughts on non-research roles.

With the goal of designing systems that are reliably consistent with desired specifications, DeepMind have recently published work on important technical challenges for the machine learning community.

For instance, Pushmeet is looking for efficient ways to test whether a system conforms to the desired specifications, even in peculiar situations, by creating an 'adversary' that proactively seeks out the worst failures possible. If the adversary can efficiently identify the worst-case input for a given model, DeepMind can catch rare failure cases before deploying a model in the real world. In the future single mistakes by autonomous systems may have very large consequences, which will make even small failure probabilities unacceptable.

He's also looking into 'training specification-consistent models' and formal verification', while other researchers at DeepMind working on their AI safety agenda are figuring out how to understand agent incentives, avoid side-effects, and model AI rewards.

In today’s interview, we focus on the convergence between broader AI research and robustness, as well as:

• DeepMind’s work on the protein folding problem
• Parallels between ML problems and past challenges in software development and computer security
• How can you analyse the thinking of a neural network?
• Unique challenges faced by DeepMind’s technical AGI safety team
• How do you communicate with a non-human intelligence?
• What are the biggest misunderstandings about AI safety and reliability?
• Are there actually a lot of disagreements within the field?
• The difficulty of forecasting AI development

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type 80,000 Hours into your podcasting app. Or read the transcript below.

The 80,000 Hours Podcast is produced by Keiran Harris.

Next Episodes