AlphaGo's meltdown in game 4 was not due to flaws in either the value or policy networks.
Created by lavalamp on 2016-03-14; known on 2017-06-01; judged wrong by sole21000 on 2017-06-01.
- lavalamp estimated 55% on 2016-03-14
- lavalamp said “So things like bad data in the cache, bug in MCTS code, etc. make this ‘right’ and ‘just train more’ makes this wrong.” on 2016-03-14
- PseudonymousUser estimated 35% on 2016-03-15
- PseudonymousUser said “I’m on the ‘just train more’ side” on 2016-03-15
- Raahul_Kumar said “Only people inside Google Deepmind know what really happened. How will we know either way?” on 2016-03-15
- artir estimated 5% on 2016-03-15
- Raahul_Kumar estimated 70% and said “Co-training with the other version of AlphaGo that only learns from self play will fix this bug. https://en.wikipedia.org/wiki/Co-training” on 2016-03-15
- lavalamp said “If you think co-training will fix this, then you meant 30% :)” on 2016-03-16
- sole21000 estimated 13% on 2016-03-16
- Houshalter estimated 5% and said “I guess bugs exist in every system, but it’d be super weird if it happened just that one game after all those tests. Whereas a neural net making a minor mistake at a very difficult board game is not that implausible at all.” on 2016-03-16
- Raahul_Kumar estimated 30% and said “I don’t think, I know, that is the entire reason co-training was invented.” on 2016-03-17
- mrmrpotatohead estimated 1% and said “This will simply be due to move 78 not having been explored much if at all by the MCTS, due to the (SL) policy network assigning extremely low probability mass to that move, and hence very few rollouts being run on that node.” on 2016-03-23 (see the MCTS sketch after this list)
- sole21000 initially judged this prediction right on 2017-06-01.
- sole21000 re-judged this prediction wrong on 2017-06-01.
- sole21000 said “Judged wrong due to it being an unexpected move that AlphaGo couldn’t recover from as opposed to any actual bug or fault.” on 2017-06-01
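mrmrpotatohead’s comment refers to AlphaGo-style PUCT selection in MCTS, where the policy network’s prior probability scales each move’s exploration bonus, so a move assigned near-zero prior mass receives almost no rollouts. Below is a minimal sketch of that starvation effect; the c_puct constant, the priors, and the move names are illustrative assumptions, not AlphaGo’s actual parameters.

```python
import math

def puct_score(q, prior, parent_visits, child_visits, c_puct=1.0):
    # PUCT: the exploration bonus is scaled by the policy prior, so a move
    # the policy network considers near-impossible is almost never selected.
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

# Two candidate moves with identical (unknown) value but very different priors.
priors = {"ordinary move": 0.20, "move 78": 0.0001}
visits = {m: 0 for m in priors}

for parent_visits in range(1, 10_001):
    # Assume both moves currently have the same estimated value q = 0.
    best = max(priors, key=lambda m: puct_score(0.0, priors[m],
                                                parent_visits, visits[m]))
    visits[best] += 1

print(visits)  # {'ordinary move': 9996, 'move 78': 4}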