80,000 Hours Podcast

#158 – Holden Karnofsky on how AIs might take over even if they're no smarter than humans, and his 4-part playbook for AI risk

Back in 2007, Holden Karnofsky cofounded GiveWell, where he sought out the charities that most cost-effectively helped save lives. He then cofounded Open Philanthropy, where he oversaw a team making billions of dollars’ worth of grants across a range of areas: pandemic control, criminal justice reform, farmed animal welfare, and making AI safe, among others. This year, having spent years learning about AI and observing recent events, he's narrowing his focus once again, this time to making the transition to advanced AI go well.

In today's conversation, Holden returns to the show to share his overall understanding of the promise and the risks posed by machine intelligence, and what to do about it. That understanding has accumulated over roughly 14 years, during which he went from being sceptical that AI was important or risky to making AI risks the focus of his work.

Links to learn more, summary and full transcript.

(As Holden reminds us, his wife is also the president of one of the world's top AI labs, Anthropic, giving him both conflicts of interest and a front-row seat to recent events. For our part, Open Philanthropy is 80,000 Hours' largest financial supporter.)

One point he makes is that people are too narrowly focused on AI becoming 'superintelligent.' While that could happen and would be important, it's not necessary for AI to be transformative or perilous. Rather, machines with human-level intelligence could end up being enormously influential simply if the world's computer hardware proves able to run tens or hundreds of billions of them, in a sense making machine intelligences a majority of the global population, or at least a majority of global thought.

As Holden explains, he sees four key parts to the playbook humanity should use to guide the transition to very advanced AI in a positive direction: alignment research, standards and monitoring, creating a successful and careful AI lab, and finally, information security.

In today’s episode, host Rob Wiblin interviews return guest Holden Karnofsky about that playbook, as well as:

  • Why we can’t rely on just gradually solving those problems as they come up, the way we usually do with new technologies.
  • What multiple different groups can do to improve our chances of a good outcome — including listeners to this show, governments, computer security experts, and journalists.
  • Holden’s case against 'hardcore utilitarianism' and what actually motivates him to work hard for a better world.
  • What the ML and AI safety communities get wrong in Holden's view.
  • Ways we might succeed with AI just by dumb luck.
  • The value of laying out imaginable success stories.
  • Why information security is so important and underrated.
  • Whether it's good to work at an AI lab that you think is particularly careful.
  • The track record of futurists’ predictions.
  • And much more.

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript.

Producer: Keiran Harris
Audio Engineering Lead: Ben Cordell

Technical editing: Simon Monsour and Milo McGuire

Transcriptions: Katy Moore
