Cast List: Very Large Corpora and Zipf's Law

Data Skeptic

Episode: Very Large Corpora and Zipf's Law
Website: Data Skeptic
Feed URL: https://dataskeptic.com/api/blog/rss
Published: 2019-01-17 01:00

The earliest efforts to apply machine learning to natural language tended to convert every token (roughly, every word) into a unique feature. Techniques like stemming could reduce the number of unique tokens, but researchers still had to face a high-dimensional problem. The Naive Bayes algorithm was celebrated in NLP applications because of its ability to process high-dimensional data efficiently.
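As a rough illustration of the one-feature-per-token idea, the sketch below builds a vocabulary from a toy corpus and turns a sentence into a count vector. The function names, the whitespace tokenizer, and the two example sentences are assumptions made for illustration and do not come from the episode.

```python
from collections import Counter


def build_vocabulary(documents):
    """Assign each distinct token its own feature index (hypothetical helper)."""
    vocabulary = {}
    for text in documents:
        # Naive tokenization: lowercase and split on whitespace.
        for token in text.lower().split():
            vocabulary.setdefault(token, len(vocabulary))
    return vocabulary


def to_feature_vector(text, vocabulary):
    """Return a count vector with one dimension per vocabulary token."""
    counts = Counter(text.lower().split())
    # Dicts keep insertion order, so positions match the assigned indices.
    return [counts.get(token, 0) for token in vocabulary]


# Toy corpus, invented for this example.
documents = [
    "the cat sat on the mat",
    "the dog chased the cat",
]

vocabulary = build_vocabulary(documents)
vector = to_feature_vector(documents[0], vocabulary)

print(len(vocabulary))  # number of features equals number of unique tokens
print(vector)
```

Even this two-sentence corpus yields seven features. With a real corpus the vocabulary, and so the dimensionality, grows into the tens or hundreds of thousands, with most entries of any single vector equal to zero.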

Next Episodes

Semantic Search at Github @ Data Skeptic

📆 2019-01-11 17:16

Let's Talk About Natural Language Processing @ Data Skeptic

📆 2019-01-04 17:07

Data Science Hiring Processes @ Data Skeptic

📆 2018-12-28 01:00
