Linear Digressions

Optimized Optimized Web Crawling

Last week’s episode, about methods for optimizing web crawling logic, left off on a bit of a cliffhanger: the data scientists had found a solution to the problem, but it wasn’t one that the engineers (who own the search codebase, remember) liked very much. It was black-boxy, hard to parallelize, and added a lot of complexity to their code. This episode takes a second crack at the problem: we formulate it a little differently and end up with a different, arguably more elegant solution.

Relevant links:
http://www.unofficialgoogledatascience.com/2018/07/by-bill-richoux-critical-decisions-are.html
http://www.csc.kth.se/utbildning/kth/kurser/DD3364/Lectures/KKT.pdf

Next Episodes

Optimized Web Crawling @ Linear Digressions

📆 2018-10-29 00:56 / 00:21:32

Searching for Datasets with Google @ Linear Digressions

📆 2018-10-15 03:11 / 00:19:54

It's our fourth birthday @ Linear Digressions

📆 2018-10-08 04:33 / 00:22:06

Gigantic Searches in Particle Physics @ Linear Digressions

📆 2018-09-30 20:52 / 00:24:46