Linear Digressions

Linear Digressions is a podcast about machine learning and data science. Machine learning is being used to solve a ton of interesting problems, and to accomplish goals that were out of reach even a few short years ago.

Episodes

Title Duration Published Consumed
So long, and thanks for all the fish 00:35:44 2020-07-27 01:32
A Reality Check on AI-Driven Medical Assistants 00:14:00 2020-07-20 01:51
A Data Science Take on Open Policing Data 00:23:44 2020-07-13 04:02
Procella: YouTube's super-system for analytics data storage 00:29:48 2020-07-06 04:29
The Data Science Open Source Ecosystem 00:23:06 2020-06-29 04:34
Rock the ROC Curve 00:15:52 2020-06-22 01:34
Criminology and Data Science 00:30:57 2020-06-15 03:26
Racism, the criminal justice system, and data science 00:31:36 2020-06-08 01:33
An interstitial word from Ben 00:05:59 2020-06-05 03:38
Convolutional Neural Networks 00:21:55 2020-05-31 23:46
Stein's Paradox 00:27:02 2020-05-25 00:21
Protecting Individual-Level Census Data with Differential Privacy 00:21:19 2020-05-18 03:49
Causal Trees 00:15:27 2020-05-11 03:34
The Grammar Of Graphics 00:35:38 2020-05-04 03:12
Gaussian Processes 00:20:55 2020-04-27 03:33
Keeping ourselves honest when we work with observational healthcare data 00:19:08 2020-04-20 04:43
Changing our formulation of AI to avoid runaway risks: Interview with Prof. Stuart Russell 00:28:58 2020-04-13 03:55
Putting machine learning into a database 00:24:22 2020-04-06 03:51
The work-from-home episode 00:29:06 2020-03-30 00:23
Understanding Covid-19 transmission: what the data suggests about how the disease spreads 00:25:25 2020-03-23 02:03
Network effects re-release: when the power of a public health measure lies in widespread adoption 00:26:40 2020-03-15 23:43
Causal inference when you can't experiment: difference-in-differences and synthetic controls 00:20:48 2020-03-09 02:39
Better know a distribution: the Poisson distribution 00:31:51 2020-03-02 03:55
The Lottery Ticket Hypothesis 00:19:45 2020-02-24 00:03
Interesting technical issues prompted by GDPR and data privacy concerns 00:20:26 2020-02-17 02:50
Thinking of data science initiatives as innovation initiatives 00:17:27 2020-02-10 02:10
Building a curriculum for educating data scientists: Interview with Prof. Xiao-Li Meng 00:31:36 2020-02-03 00:36
Running experiments when there are network effects 00:24:45 2020-01-27 01:13
Zeroing in on what makes adversarial examples possible 00:22:51 2020-01-20 03:41
Unsupervised Dimensionality Reduction: UMAP vs t-SNE 00:29:34 2020-01-13 01:53
Data scientists: beware of simple metrics 00:24:47 2020-01-05 23:54
Communicating data science, from academia to industry 00:26:15 2019-12-30 02:53
Optimizing for the short-term vs. the long-term 00:19:24 2019-12-23 03:50
Interview with Prof. Andrew Lo, on using data science to inform complex business decisions 00:27:46 2019-12-16 04:15
Using machine learning to predict drug approvals 00:25:00 2019-12-08 23:56
Facial recognition, society, and the law 00:43:09 2019-12-02 04:14
Lessons learned from doing data science, at scale, in industry 00:28:00 2019-11-25 01:45
Varsity A/B Testing 00:36:00 2019-11-18 03:09
The Care and Feeding of Data Scientists: Growing Careers 00:25:19 2019-11-11 04:44
The Care and Feeding of Data Scientists: Recruiting and Hiring Data Scientists 00:20:16 2019-11-04 01:21
The Care and Feeding of Data Scientists: Becoming a Data Science Manager 00:24:45 2019-10-28 02:27
Procella: YouTube's super-system for analytics data storage 00:29:48 2019-10-21 03:27
Kalman Runners 00:15:59 2019-10-13 22:04
What's *really* so hard about feature engineering? 00:21:18 2019-10-07 00:37
Data storage for analytics: stars and snowflakes 00:15:22 2019-09-30 13:22
Data storage: transactions vs. analytics 00:16:08 2019-09-23 03:49
GROVER: an algorithm for making, and detecting, fake news 00:18:28 2019-09-16 05:21
Data science teams as innovation initiatives 00:15:21 2019-09-09 04:24
Can Fancy Running Shoes Cause You To Run Faster? 00:30:15 2019-09-02 01:44
Organizational Models for Data Scientists 00:23:09 2019-08-26 01:06
Data Shapley 00:16:55 2019-08-19 04:38
A Technical Deep Dive on Stanley, the First Self-Driving Car 00:41:32 2019-08-12 04:21
An Introduction to Stanley, the First Self-Driving Car 00:14:19 2019-08-05 02:28
Putting the "science" in data science: the scientific method, the null hypothesis, and p-hacking 00:24:11 2019-07-29 03:30
Interleaving 00:16:54 2019-07-22 14:20
Federated Learning 00:15:03 2019-07-15 01:00
Endogenous Variables and Measuring Protest Effectiveness 00:17:58 2019-07-08 00:59
Deepfakes 00:15:08 2019-07-01 03:25
Revisiting Biased Word Embeddings 00:18:09 2019-06-24 02:26
Attention in Neural Nets 00:26:32 2019-06-17 02:28
Interview with Joel Grus 00:39:46 2019-06-10 04:05
Re-Release: Factorization Machines 00:20:09 2019-06-03 03:32
Re-release: Auto-generating websites with deep learning 00:19:38 2019-05-27 04:01
Advice to those trying to get a first job in data science 00:17:33 2019-05-19 23:50
Re-Release: Machine Learning Technical Debt 00:22:29 2019-05-13 01:07
Estimating Software Projects, and Why It's Hard 00:19:07 2019-05-06 00:27
The Black Hole Algorithm 00:20:17 2019-04-29 02:55
Structure in AI 00:19:05 2019-04-22 00:29
The Great Data Science Specialist vs. Generalist Debate 00:14:10 2019-04-15 02:55
Google X, and Taking Risks the Smart Way 00:19:04 2019-04-08 03:10
Statistical Significance in Hypothesis Testing 00:22:34 2019-04-01 03:34
The Language Model Too Dangerous to Release 00:21:01 2019-03-25 02:39
The cathedral and the bazaar 00:32:36 2019-03-17 23:47
AlphaStar 00:22:03 2019-03-11 02:18
Are machine learning engineers the new data scientists? 00:20:46 2019-03-04 03:57
Interview with Alex Radovic, particle physicist turned machine learning researcher 00:35:42 2019-02-25 02:59
K Nearest Neighbors 00:16:25 2019-02-18 00:57
Not every deep learning paper is great. Is that a problem? 00:17:54 2019-02-11 01:06
The Assumptions of Ordinary Least Squares 00:25:07 2019-02-04 00:24
Quantile Regression 00:21:46 2019-01-28 02:27
Heterogeneous Treatment Effects 00:17:24 2019-01-21 00:57
Pre-training language models for natural language processing problems 00:27:35 2019-01-14 01:42
Facial Recognition, Society, and the Law 00:42:46 2019-01-07 03:03
Re-release: Word2Vec 00:17:59 2018-12-31 02:56
Re-Release: The Cold Start Problem 00:15:37 2018-12-23 21:23
Convex (and non-convex) Optimization 00:20:00 2018-12-17 04:06
The Normal Distribution and the Central Limit Theorem 00:27:11 2018-12-09 19:58
Software 2.0 00:17:22 2018-12-03 00:23
Limitations of Deep Nets for Computer Vision 00:27:20 2018-11-18 20:01
Building Data Science Teams 00:25:09 2018-11-12 04:16
Optimized Optimized Web Crawling 00:19:42 2018-11-04 22:38
Optimized Web Crawling 00:21:32 2018-10-29 00:56
Better Know a Distribution: The Poisson Distribution 00:31:51 2018-10-22 02:53
Searching for Datasets with Google 00:19:54 2018-10-15 03:11
It's our fourth birthday 00:22:06 2018-10-08 04:33
Gigantic Searches in Particle Physics 00:24:46 2018-09-30 20:52
Data Engineering 00:16:22 2018-09-24 03:10
Text Analysis for Guessing the NYTimes Op-Ed Author 00:18:37 2018-09-16 20:13
The Three Types of Data Scientists, and What They Actually Do 00:23:25 2018-09-09 21:00
Agile Development for Data Scientists, Part 2: Where Modifications Help 00:27:17 2018-08-26 21:59
Agile Development for Data Scientists, Part 1: The Good 00:25:56 2018-08-19 20:06
Re-Release: How To Lose At Kaggle 00:17:54 2018-08-13 04:31
Troubling Trends In Machine Learning Scholarship 00:29:35 2018-08-06 03:31
Can Fancy Running Shoes Cause You To Run Faster? 00:28:37 2018-07-29 21:12
Compliance Bias 00:23:28 2018-07-22 18:07
AI Winter 00:19:02 2018-07-15 22:11
Rerelease: How to Find New Things to Learn 00:18:32 2018-07-09 00:28
Rerelease: Space Codes 00:24:30 2018-07-02 06:36
Rerelease: Anscombe's Quartet 00:16:14 2018-06-25 03:20
Rerelease: Hurricanes Produced 00:28:12 2018-06-18 19:00
GDPR 00:18:24 2018-06-11 04:24
Git for Data Scientists 00:22:05 2018-06-03 19:52
Analytics Maturity 00:19:32 2018-05-20 17:09
SHAP: Shapley Values in Machine Learning 00:19:12 2018-05-13 16:24
Game Theory for Model Interpretability: Shapley Values 00:27:06 2018-05-07 04:17
AutoML 00:15:24 2018-04-30 04:50
CPUs, GPUs, TPUs: Hardware for Deep Learning 00:12:40 2018-04-23 04:52
A Technical Introduction to Capsule Networks 00:31:28 2018-04-16 03:12
A Conceptual Introduction to Capsule Networks 00:14:05 2018-04-09 03:59
Convolutional Neural Nets 00:21:55 2018-04-02 03:40
Google Flu Trends 00:12:46 2018-03-26 03:20
How to pick projects for a professional data science team 00:31:17 2018-03-19 04:07
Autoencoders 00:12:41 2018-03-12 02:47
When Private Data Isn't Private Anymore 00:26:20 2018-03-05 04:35
What makes a machine learning algorithm "superhuman"? 00:34:48 2018-02-26 05:52
Open Data and Open Science 00:16:54 2018-02-19 02:39
Defining the quality of a machine learning production system 00:20:29 2018-02-12 03:00
Auto-generating websites with deep learning 00:19:24 2018-02-05 00:02
The Case for Learned Index Structures, Part 2: Hash Maps and Bloom Filters 00:20:41 2018-01-29 03:15
The Case for Learned Index Structures, Part 1: B-Trees 00:18:50 2018-01-22 03:32
Challenges with Using Machine Learning to Classify Chest X-Rays 00:18:00 2018-01-15 02:57
The Fourier Transform 00:15:39 2018-01-08 03:07
Statistics of Beer 00:15:20 2018-01-02 02:57
Re-Release: Random Kanye 00:09:33 2017-12-24 20:07
Debiasing Word Embeddings 00:18:20 2017-12-18 03:31
The Kernel Trick and Support Vector Machines 00:17:48 2017-12-11 02:58
Maximal Margin Classifiers 00:14:21 2017-12-04 05:03
Re-Release: The Cocktail Party Problem 00:13:43 2017-11-27 03:11
Clustering with DBSCAN 00:16:14 2017-11-20 04:08
The Kaggle Survey on Data Science 00:25:20 2017-11-13 03:49
Machine Learning: The High Interest Credit Card of Technical Debt 00:22:18 2017-11-06 05:35
Improving Upon a First-Draft Data Science Analysis 00:15:01 2017-10-30 02:38
Survey Raking 00:17:23 2017-10-23 04:51
Happy Hacktoberfest 00:15:40 2017-10-16 03:46
Re-Release: Kalman Runners 00:17:53 2017-10-09 04:28
Neural Net Dropout 00:18:53 2017-10-02 05:32
Disciplined Data Science 00:29:34 2017-09-25 03:49
Hurricane Forecasting 00:27:57 2017-09-18 03:37
Finding Spy Planes with Machine Learning 00:18:09 2017-09-11 04:11
Data Provenance 00:22:48 2017-09-04 03:35
Adversarial Examples 00:16:11 2017-08-28 04:25
Jupyter Notebooks 00:15:50 2017-08-21 03:09
Curing Cancer with Machine Learning is Super Hard 00:19:20 2017-08-14 03:49
KL Divergence 00:25:38 2017-08-07 05:07
Sabermetrics 00:25:48 2017-07-31 03:15
What Data Scientists Can Learn from Software Engineers 00:23:46 2017-07-24 03:52
Software Engineering to Data Science 00:19:05 2017-07-17 04:36
Re-Release: Fighting Cholera with Data, 1854 00:12:04 2017-07-10 02:19
Re-Release: Data Mining Enron 00:32:16 2017-07-02 19:53
Factorization Machines 00:19:54 2017-06-26 04:23
Anscombe's Quartet 00:15:39 2017-06-19 04:19
Traffic Metering Algorithms 00:18:34 2017-06-12 05:01
Page Rank 00:19:58 2017-06-05 03:46
Fractional Dimensions 00:20:28 2017-05-29 04:54
Things You Learn When Building Models for Big Data 00:21:39 2017-05-22 03:44
How to Find New Things to Learn 00:17:54 2017-05-15 03:49
Federated Learning 00:14:03 2017-05-08 03:50
Word2Vec 00:17:59 2017-05-01 04:17
Feature Processing for Text Analytics 00:17:28 2017-04-24 04:17
Education Analytics 00:21:05 2017-04-17 04:09
A Technical Deep Dive on Stanley, the First Self-Driving Car 00:40:42 2017-04-10 03:50
An Introduction to Stanley, the First Self-Driving Car 00:13:07 2017-04-03 03:34
Feature Importance 00:20:15 2017-03-27 03:53
Space Codes! 00:23:56 2017-03-20 03:50
Finding (and Studying) Wikipedia Trolls 00:15:50 2017-03-13 02:44
A Sprint Through What's New in Neural Networks 00:16:56 2017-03-06 04:27
Stein's Paradox 00:27:02 2017-02-27 03:51
Empirical Bayes 00:18:57 2017-02-20 04:30
Endogenous Variables and Measuring Protest Effectiveness 00:16:28 2017-02-13 04:31
Calibrated Models 00:14:32 2017-02-06 02:56
Rock the ROC Curve 00:15:52 2017-01-30 04:38
Ensemble Algorithms 00:13:08 2017-01-23 03:31
How to evaluate a translation: BLEU scores 00:17:06 2017-01-16 02:59
Zero Shot Translation 00:25:32 2017-01-09 04:20
Google Neural Machine Translation 00:18:12 2017-01-02 02:44
Data and the Future of Medicine: Interview with Precision Medicine Initiative researcher Matt Might 00:34:54 2016-12-26 02:19
Special Crossover Episode: Partially Derivative interview with White House Data Scientist DJ Patil 00:46:09 2016-12-18 18:53
How to Lose at Kaggle 00:17:16 2016-12-12 05:28
Attacking Discrimination in Machine Learning 00:23:20 2016-12-05 04:38
Recurrent Neural Nets 00:12:36 2016-11-28 03:47
Stealing a PIN with signal processing and machine learning 00:16:55 2016-11-21 03:32
Neural Net Cryptography 00:16:16 2016-11-14 05:06
Deep Blue 00:20:05 2016-11-07 05:20
Organizing Google's Datasets 00:15:00 2016-10-31 03:17
Fighting Cancer with Data Science: Followup 00:25:48 2016-10-24 03:58
The 19-year-old determining the US election 00:12:28 2016-10-17 03:01
How to Steal a Model 00:13:36 2016-10-10 00:57
Regularization 00:17:27 2016-10-03 04:13
The Cold Start Problem 00:15:37 2016-09-26 04:24
Open Source Software for Data Science 00:20:05 2016-09-19 06:27
Scikit + Optimization = Scikit-Optimize 00:15:41 2016-09-12 03:54
Two Cultures: Machine Learning and Statistics 00:17:29 2016-09-05 03:50
Optimization Solutions 00:20:07 2016-08-29 04:01
Optimization Problems 00:17:50 2016-08-22 02:25
Multi-level modeling for understanding DEADLY RADIOACTIVE GAS 00:23:34 2016-08-15 03:49
How Polls Got Brexit "Wrong" 00:15:14 2016-08-08 03:37
Election Forecasting 00:28:59 2016-08-01 04:40
Machine Learning for Genomics 00:20:22 2016-07-25 04:14
Climate Modeling 00:19:49 2016-07-18 04:26
Reinforcement Learning Gone Wrong 00:28:16 2016-07-11 04:42
Reinforcement Learning for Artificial Intelligence 00:18:30 2016-07-03 20:28
Differential Privacy: how to study people without being weird and gross 00:18:17 2016-06-27 03:53
How the sausage gets made 00:29:13 2016-06-20 04:25
SMOTE: makin' yourself some fake minority data 00:14:37 2016-06-13 05:06
Conjoint Analysis: like AB testing, but on steroids 00:18:27 2016-06-06 04:13
Traffic Metering Algorithms 00:17:30 2016-05-30 03:57
Um Detector 2: The Dynamic Time Warp 00:14:00 2016-05-23 04:05
Inside a Data Analysis: Fraud Hunting at Enron 00:30:28 2016-05-16 04:36
What's the biggest #bigdata? 00:25:31 2016-05-09 03:28
Data Contamination 00:20:58 2016-05-02 04:24
Model Interpretation (and Trust Issues) 00:16:57 2016-04-25 02:45
Updates! Political Science Fraud and AlphaGo 00:31:43 2016-04-18 04:48
Ecological Inference and Simpson's Paradox 00:18:32 2016-04-11 04:43
Discriminatory Algorithms 00:15:21 2016-04-04 04:30
Recommendation Engines and Privacy 00:31:33 2016-03-28 04:46
Neural nets play cops and robbers (AKA generative adversarial networks) 00:18:56 2016-03-21 03:58
A Data Scientist's View of the Fight against Cancer 00:19:08 2016-03-14 04:26
Congress Bots and DeepDrumpf 00:20:47 2016-03-11 05:17
Multi-Armed Bandits 00:11:29 2016-03-07 03:44
Experiments and Messy, Tricky Causality 00:16:59 2016-03-04 04:54
Backpropagation 00:12:21 2016-02-29 04:58
Text Analysis on the State Of The Union 00:22:22 2016-02-26 04:51
Paradigms in Artificial Intelligence 00:17:20 2016-02-22 05:32
Survival Analysis 00:15:21 2016-02-19 04:44
Gravitational Waves 00:20:26 2016-02-15 03:46
The Turing Test 00:15:15 2016-02-12 05:11
Item Response Theory: how smart ARE you? 00:11:46 2016-02-08 04:37
Go! 00:19:59 2016-02-05 05:52
Great Social Networks in History 00:12:42 2016-02-01 05:22
How Much to Pay a Spy (and a lil' more auctions) 00:16:59 2016-01-29 06:36
Sold! Auctions (Part 2) 00:17:27 2016-01-25 03:58
Going Once, Going Twice: Auctions (Part 1) 00:12:39 2016-01-22 04:40
Chernoff Faces and Minard Maps 00:15:11 2016-01-18 04:38
t-SNE: Reduce Your Dimensions, Keep Your Clusters 00:16:55 2016-01-15 05:05
The [Expletive Deleted] Problem 00:09:54 2016-01-11 05:23
Unlabeled Supervised Learning--whaaa? 00:12:35 2016-01-08 04:26
Hacking Neural Nets 00:15:28 2016-01-05 03:56
Zipf's Law 00:11:43 2015-12-31 19:08
Indie Announcement 00:01:19 2015-12-30 16:57
Portrait Beauty 00:11:44 2015-12-27 14:34
The Cocktail Party Problem 00:12:04 2015-12-18 01:17
A Criminally Short Introduction to Semi Supervised Learning 00:09:12 2015-12-04 04:13
Thresholdout: Down with Overfitting 00:15:52 2015-11-27 18:55
The State of Data Science 00:15:40 2015-11-10 05:36
Data Science for Making the World a Better Place 00:09:31 2015-11-06 04:43
Kalman Runners 00:14:42 2015-10-29 04:10
Neural Net Inception 00:15:19 2015-10-23 04:25
Benford's Law 00:17:42 2015-10-16 05:30
Guinness 00:14:43 2015-10-07 05:30
PFun with P Values 00:17:07 2015-09-02 05:24
Watson 00:15:36 2015-08-25 04:26
Bayesian Psychics 00:11:44 2015-08-18 02:05
Troll Detection 00:12:57 2015-08-07 22:56
Yiddish Translation 00:12:15 2015-08-03 05:06
Modeling Particles in Atomic Bombs 00:15:38 2015-07-07 01:30
Random Number Generation 00:10:26 2015-06-19 20:49
Electoral Insights (Part 2) 00:21:18 2015-06-09 04:46
Electoral Insights (Part 1) 00:09:17 2015-06-05 22:38
Falsifying Data 00:17:46 2015-06-01 23:04
Reporter Bot 00:11:15 2015-05-21 01:16
Careers in Data Science 00:16:35 2015-05-16 07:43
That's "Dr Katie" to You 00:03:01 2015-05-14 19:37
Neural Nets (Part 2) 00:10:55 2015-05-11 16:37
Neural Nets (Part 1) 00:09:00 2015-05-01 20:59
Inferring Authorship (Part 2) 00:14:04 2015-04-28 18:56
Inferring Authorship (Part 1) 00:08:51 2015-04-16 19:25
Statistical Mistakes and the Challenger Disaster 00:13:09 2015-04-06 21:36
Genetics and Um Detection (HMM Part 2) 00:14:49 2015-03-25 18:29
Introducing Hidden Markov Models (HMM Part 1) 00:14:54 2015-03-24 16:57
Monte Carlo For Physicists 00:08:13 2015-03-13 00:18
Random Kanye 00:08:44 2015-03-05 00:04
Lie Detectors 00:09:17 2015-02-25 19:20
The Enron Dataset 00:12:27 2015-02-09 01:00
Labels and Where To Find Them 00:13:15 2015-02-04 03:30
Um Detector 1 00:13:19 2015-01-23 21:16
Better Facial Recognition with Fisherfaces 00:11:56 2015-01-07 02:33
Facial Recognition with Eigenfaces 00:10:01 2015-01-07 02:30
Stats of World Series Streaks 00:12:34 2014-12-17 01:41
Computers Try to Tell Jokes 00:09:08 2014-11-26 19:59
How Outliers Helped Defeat Cholera 00:10:54 2014-11-22 01:00
Hunting for the Higgs 00:10:16 2014-11-16 01:00