Lucas Baker
@alpha
Suppose life is a multi-armed bandit (general stochastic, to be precise). That is a *solved problem*: the theoretical optimum is to pull whichever arm worked best most recently. Agree or disagree?
Varun Srinivasan
@v
what is the strategy exactly? reading it literally, it sounds like it would lead to pulling one arm and then repeating it forever. i assume there is also some explore component to it?
Lucas Baker
@alpha
Bad phrasing on my part; I meant Follow-the-Leader: always pull the arm with the largest observed average reward so far. https://www.di.ens.fr/appstat/fall-2018/TP/Bandits.pdf
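The Follow-the-Leader strategy under discussion can be sketched in a few lines. This is a minimal illustration, not anyone's reference implementation: it assumes Bernoulli arms (each arm pays 1 with a fixed probability), pulls every arm once to initialize, and then always picks the arm with the highest empirical average. The function name and interface are made up for this sketch.

```python
import random

def follow_the_leader(arms, horizon, seed=0):
    """Play a Bernoulli bandit with the Follow-the-Leader rule:
    pull each arm once, then always pull the arm whose observed
    average reward is largest so far. `arms` is a list of
    success probabilities (a hypothetical toy setup)."""
    rng = random.Random(seed)
    n = len(arms)
    counts = [0] * n      # pulls per arm
    sums = [0.0] * n      # cumulative reward per arm
    total = 0.0
    for t in range(horizon):
        if t < n:
            a = t  # initialization round: try every arm once
        else:
            # the "leader": arm with the best empirical mean
            a = max(range(n), key=lambda i: sums[i] / counts[i])
        reward = 1.0 if rng.random() < arms[a] else 0.0
        counts[a] += 1
        sums[a] += reward
        total += reward
    return total, counts

# Note FTL has no explicit exploration: an unlucky first pull of the
# best arm can lock it onto a suboptimal arm forever, which is the
# failure mode the reply above is pointing at.
total, counts = follow_the_leader([0.2, 0.8], horizon=1000)
```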