Linear Digressions

Finding (and Studying) Wikipedia Trolls

You may be shocked to hear this, but sometimes, people on the internet can be mean. For some of us this is just a minor annoyance, but if you're a maintainer of or contributor to a large project like Wikipedia, abusive users can be a huge problem. Fighting the problem starts with understanding it, and understanding it starts with measuring it. The thing is, a huge website like Wikipedia can have millions of edits and comments where abuse might happen, so measurement isn't a simple task. That's where machine learning comes in: by building an "abuse classifier" and pointing it at the Wikipedia edit corpus, researchers at Jigsaw and the Wikimedia Foundation are, for the first time, able to estimate abuse rates and curate a dataset of abusive incidents. Those researchers, and others, can then use that dataset to study the pathologies and effects of Wikipedia trolls.

Next Episodes

Stein's Paradox @ Linear Digressions
📆 2017-02-27 03:51 / 00:27:02

Empirical Bayes @ Linear Digressions
📆 2017-02-20 04:30 / 00:18:57

Calibrated Models @ Linear Digressions
📆 2017-02-06 02:56 / 00:14:32