Clearer Thinking with Spencer Greenberg

How can AIs know what we want if *we* don't even know? (with Geoffrey Irving)

Read the full transcript here.

What does it really mean to align an AI system with human values? What would a powerful AI need to do in order to do "what we want"? How does being an assistant differ from being an agent? Could inter-AI debate work as an alignment strategy, or would it just result in arguments designed to manipulate humans via their cognitive and emotional biases? How can we make sure that all human values are learned by AIs, not just the values of humans in WEIRD (Western, Educated, Industrialized, Rich, Democratic) societies? Are our current state-of-the-art LLMs politically left-leaning? How can alignment strategies take into account the fact that our individual and collective values occasionally change over time?

Geoffrey Irving is an AI safety researcher at DeepMind. Before that, he led the Reflection Team at OpenAI, was involved in neural network theorem proving at Google Brain, cofounded Eddy Systems to autocorrect code as you type, and worked on computational physics and geometry at Otherlab, D. E. Shaw Research, Pixar, and Weta Digital. He has screen credits on Ratatouille, WALL•E, Up, and Tintin. Learn more about him at his website, naml.us.
