Hacker News

Totally agree. Stopping at 10% and then claiming the effect size is 10% would be silly. But seeing a giant effect and stopping is totally cool in my book. The bigger the effect difference, the fewer samples you need to judge it. So I think it can be fine to peek and halt. Nothing forces us to use a static number of samples other than an old statistics formula.


The point is, you don't know what that number is unless you do the math. It's not a matter of "judging it". It is a matter of calculating it.

If you "peek and halt" without doing the math, you might as well get a lucky result in the first 10 samples and say "look! Positive results!" You recognize that that is ridiculous. So when does peeking and stopping stop being ridiculous?

A: when the statistical power is sufficient for the observed effect. In the examples, that's 1600 or 6000 samples for a 10% or 5% effect, respectively, and much less for a 20% or 40% effect! But you don't know the number required unless you do the math.
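To make "do the math" concrete, here's a minimal sketch of the standard normal-approximation sample-size formula for a two-proportion test (the function name and the 50% baseline conversion rate are my own assumptions, not from the thread):

```python
import math
from statistics import NormalDist

def samples_per_arm(p_base, relative_lift, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-proportion z-test (normal approximation).

    Hypothetical helper: p_base is the control conversion rate and
    relative_lift is the relative improvement you want to detect.
    """
    p_new = p_base * (1 + relative_lift)
    p_bar = (p_base + p_new) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    n = (z_alpha + z_power) ** 2 * 2 * p_bar * (1 - p_bar) / (p_base - p_new) ** 2
    return math.ceil(n)

# With a 50% baseline, these land near the figures quoted above:
# a 10% lift needs roughly 1600 per arm, a 5% lift roughly 6300,
# and a 40% lift needs under 100.
```

Note how the required n grows with the inverse square of the effect size: halving the detectable lift roughly quadruples the samples needed, which is exactly why "peeking" at a small sample only works for huge effects.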


Again, this limitation doesn't apply to good bandit approaches. You see big effects quickly and smaller effects more slowly, and you don't need to do any pre-computation about power at all.
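The simplest bandit of this flavor is Thompson sampling. A sketch, assuming Bernoulli rewards and uniform Beta(1,1) priors (the counts and function name are illustrative, not the parent's actual method):

```python
import random

def thompson_pick(successes, failures):
    """One Thompson-sampling step for Bernoulli arms.

    Draw a sample from each arm's Beta posterior and play the arm with
    the best draw. An arm with a clearly better posterior wins almost
    every draw, so big effects capture traffic quickly; arms that are
    close keep getting explored until the data separates them.
    """
    draws = [random.betavariate(s + 1, f + 1)
             for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)
```

After each pick you observe a reward and increment that arm's success or failure count, so the allocation adapts automatically with no upfront power calculation.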

You can even get an economic estimate of the expected amount of value you are leaving on the table by stopping at any point.
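That "value left on the table" is the posterior expected loss of committing to the current leader. A Monte-Carlo sketch for two Bernoulli arms, under Beta(1,1) priors of my choosing (the function name is hypothetical):

```python
import random

def expected_loss_of_stopping(succ_a, fail_a, succ_b, fail_b, n_draws=20000):
    """Expected conversion-rate loss from declaring B the winner now.

    Samples both arms' Beta posteriors and averages how much A beats B
    by in the scenarios where A is actually better. When this number
    falls below the cost of running the experiment longer, stopping is
    economically justified.
    """
    total = 0.0
    for _ in range(n_draws):
        theta_a = random.betavariate(succ_a + 1, fail_a + 1)
        theta_b = random.betavariate(succ_b + 1, fail_b + 1)
        total += max(theta_a - theta_b, 0.0)  # regret only when A was better
    return total / n_draws
```

With lopsided data the expected loss collapses toward zero, which is the formal version of "seeing a giant effect and stopping"; with scant, ambiguous data it stays large, telling you to keep collecting.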



