Lucas Baker
@alpha
Suppose life is a multi-armed bandit (general stochastic, to be precise). That is a *solved problem*: the theoretical optimum is to pull whichever arm worked best most recently. Agree or disagree?

Nick Chow
@nicholasachow
In a multi-round game, you’d want to mix in some randomness, wouldn’t you? Some variation on an epsilon-greedy strategy seems to work well.
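
For anyone who wants to poke at this, here is a rough sketch of plain epsilon-greedy on a Bernoulli bandit. Everything in it is an assumption for illustration and not something from the thread: the function name `epsilon_greedy`, the fixed `epsilon=0.1`, the arm payout probabilities, and the seed. With probability epsilon it pulls a random arm (explore), otherwise it pulls the arm with the best running average so far (exploit).

```python
import random


def epsilon_greedy(arm_probs, steps=10_000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with epsilon-greedy.

    arm_probs: hypothetical payout probability of each arm (unknown to the player).
    Returns (pull counts per arm, estimated mean reward per arm).
    """
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    counts = [0] * n_arms    # how many times each arm was pulled
    values = [0.0] * n_arms  # running average reward per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimate so far.
            arm = max(range(n_arms), key=lambda a: values[a])

        # Simulated payout: 1 with probability arm_probs[arm], else 0.
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0

        # Incremental update of the running average for the pulled arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]

    return counts, values


if __name__ == "__main__":
    counts, values = epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("estimated means:", [round(v, 3) for v in values])
```

Setting `epsilon=0` turns this into pure greedy, which can lock onto a mediocre arm after an early lucky streak; the random pulls are what let the estimates for the other arms keep improving. Decaying epsilon over time is one of the common variations.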