Linear Digressions

Inferring Authorship (Part 1)

This episode is inspired by one of our projects for Intro to Machine Learning: given a writing sample, can you use machine learning to identify who wrote it? It turns out that the answer is yes: a person’s writing style is as distinctive as their vocal inflection or their gait. By tracing the vocabulary used in a given piece and comparing its word choices to those in writing samples whose author we know, we can often tell with surprising clarity who is the more likely author. We’ll use a seminal paper from the 1960s as our example, in which the Naive Bayes algorithm was used to determine whether Alexander Hamilton or James Madison was the more likely author of a number of anonymous Federalist Papers.
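
For readers who want to see the idea in code, here is a minimal sketch of word-count Naive Bayes for authorship, written in plain Python. It is not the method from the 1960s paper or from the course project. The author labels, the toy sentences, and the function names (train, log_score, most_likely_author) are all made up for illustration, and it assumes equal prior probability for each author and add-one (Laplace) smoothing.

```python
import math
from collections import Counter


def train(samples):
    """Count word occurrences per author.

    samples: dict mapping author name -> list of known texts (hypothetical data).
    """
    return {
        author: Counter(word for text in texts for word in text.lower().split())
        for author, texts in samples.items()
    }


def log_score(text, counts, vocab_size):
    """Log-likelihood of the text under one author's word frequencies.

    Add-one smoothing keeps unseen words from producing log(0).
    """
    total = sum(counts.values())
    return sum(
        math.log((counts[word] + 1) / (total + vocab_size))
        for word in text.lower().split()
    )


def most_likely_author(text, model):
    """Pick the author with the highest score (equal priors assumed)."""
    vocab = set().union(*model.values())
    return max(model, key=lambda author: log_score(text, model[author], len(vocab)))


# Toy, invented writing samples; real use needs far more text per author.
samples = {
    "author_a": ["upon the whole the plan is sound",
                 "upon reflection the states agree"],
    "author_b": ["whilst the plan is sound the states differ",
                 "whilst we reflect the union holds"],
}
model = train(samples)
print(most_likely_author("upon the plan", model))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together, which matters once the disputed text is longer than a few words.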

Next Episodes

Statistical Mistakes and the Challenger Disaster @ Linear Digressions

📆 2015-04-06 21:36 / ⌛ 00:13:09


Genetics and Um Detection (HMM Part 2) @ Linear Digressions

📆 2015-03-25 18:29 / ⌛ 00:14:49


Introducing Hidden Markov Models (HMM Part 1) @ Linear Digressions

📆 2015-03-24 16:57 / ⌛ 00:14:54


Monte Carlo For Physicists @ Linear Digressions

📆 2015-03-13 00:18 / ⌛ 00:08:13


Random Kanye @ Linear Digressions

📆 2015-03-05 00:04 / ⌛ 00:08:44