Very Large Corpora and Zipf's Law

Data Skeptic

The earliest efforts to apply machine learning to natural language tended to convert every token (every word, more or less) into a unique feature. While techniques like stemming may have cut the number of unique tokens down, researchers always had to face a highly dimensional problem. The Naive Bayes algorithm was celebrated in NLP applications because of its ability to efficiently process highly dimensional data.
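As a rough illustration of the approach described above, here is a minimal sketch (not from the episode) of a multinomial Naive Bayes classifier in pure Python, where every token in a toy corpus becomes its own feature dimension. The corpus, labels, and smoothing choice are all illustrative assumptions:

```python
import math
from collections import Counter

# Toy labeled corpus: each unique token is treated as its own feature
# dimension (a bag-of-words representation). Purely illustrative data.
train = [
    ("buy cheap pills now", "spam"),
    ("cheap pills cheap deals", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the agenda", "ham"),
]

# Per-class word counts and class priors.
word_counts = {"spam": Counter(), "ham": Counter()}
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

# The shared vocabulary: one dimension per unique token.
vocab = {w for counts in word_counts.values() for w in counts}

def predict(text):
    """Multinomial Naive Bayes in log space with add-one smoothing."""
    tokens = text.split()
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        total = sum(word_counts[label].values())
        # Log prior for this class.
        score = math.log(class_counts[label] / sum(class_counts.values()))
        for w in tokens:
            # Laplace (add-one) smoothing keeps unseen words from
            # zeroing out the whole product.
            score += math.log(
                (word_counts[label][w] + 1) / (total + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Because each class-conditional word probability is estimated independently, the cost of scoring a document grows only with its length, not with the size of the vocabulary, which is why the method stays cheap even as the feature space explodes.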


📆 2019-01-11 17:16