Multi-Armed Bandit Problems

Data Skeptic

The multi-armed bandit problem takes its name from slot machines ("one-armed bandits"). Given the chance to play from a pool of slot machines, each with an unknown payout frequency, how can you maximize your reward? If you knew in advance which machine was best, you would play that machine exclusively. Any strategy that departs from this will, on average, earn a smaller payout, and the difference can be called the "regr
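To make the setup concrete, here is a minimal sketch of one common strategy, epsilon-greedy, together with a tally of how much expected payout is lost relative to always playing the best machine (the quantity usually called regret). The episode description does not specify any particular strategy, so the function name `run_epsilon_greedy`, the Bernoulli (win or lose) payouts, and the parameter values are all assumptions made for illustration.

```python
import random


def run_epsilon_greedy(payout_probs, steps, epsilon=0.1, seed=0):
    """Play Bernoulli slot machines with an epsilon-greedy strategy.

    payout_probs are the true win probabilities; the player never reads
    them directly, they are used only to simulate pulls and score regret.
    Returns (total_reward, expected_regret, pulls_per_arm).
    """
    rng = random.Random(seed)
    n_arms = len(payout_probs)
    pulls = [0] * n_arms            # how often each machine was played
    mean_reward = [0.0] * n_arms    # running estimate of each machine's payout
    best_prob = max(payout_probs)   # what an all-knowing player would earn per pull
    total_reward = 0
    expected_regret = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a machine at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: play the machine with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: mean_reward[a])

        reward = 1 if rng.random() < payout_probs[arm] else 0
        pulls[arm] += 1
        # Incremental update of the running mean for the chosen machine.
        mean_reward[arm] += (reward - mean_reward[arm]) / pulls[arm]
        total_reward += reward
        # Expected payout given up by not playing the best machine.
        expected_regret += best_prob - payout_probs[arm]

    return total_reward, expected_regret, pulls


if __name__ == "__main__":
    # Hypothetical payout rates, chosen only for the demonstration.
    total, regret, pulls = run_epsilon_greedy([0.2, 0.5, 0.7], steps=1000)
    print("reward:", total, "regret:", round(regret, 1), "pulls:", pulls)
```

Raising `epsilon` spends more pulls exploring, which helps find the best machine sooner but keeps paying regret on random pulls afterwards; lowering it does the opposite. That tension between exploring and exploiting is the heart of the problem.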

Next Episodes


Sample Sizes @ Data Skeptic

📆 2015-09-18 02:00


The Model Complexity Myth @ Data Skeptic

📆 2015-09-11 02:00


Distance Measures @ Data Skeptic

📆 2015-09-04 02:00


Content Mine @ Data Skeptic

📆 2015-08-28 02:00