Linear Digressions

Things You Learn When Building Models for Big Data

As more data gets collected every day and data scientists use it for modeling, the technical limits of machine learning on big datasets keep getting pushed back. This week is a first-hand case study in using scikit-learn (a popular Python machine learning library) on multi-terabyte datasets, which is something Katie does a lot in her day job at Civis Analytics. There are a lot of considerations in doing something like this: cloud computing, artful use of parallelization, model complexity, and the computational demands of training vs. prediction, to name just a few.

Next Episodes

How to Find New Things to Learn @ Linear Digressions

📆 2017-05-15 03:49 / ⌛ 00:17:54


Federated Learning @ Linear Digressions

📆 2017-05-08 03:50 / ⌛ 00:14:03


Word2Vec @ Linear Digressions

📆 2017-05-01 04:17 / ⌛ 00:17:59


Feature Processing for Text Analytics @ Linear Digressions

📆 2017-04-24 04:17 / ⌛ 00:17:28


Education Analytics @ Linear Digressions

📆 2017-04-17 04:09 / ⌛ 00:21:05