[At decision squares, the 5×5 rand-region cheese maze network will put max cumulative probability on the maximal-advantage action at least] 25% of the time

Created by TurnTrout on 2023-02-09; known on 2023-02-16

  • TurnTrout estimated 90% on 2023-02-09
  • peligrietzer estimated 95% on 2023-02-09
  • uli estimated 97% and said “Decision squares usually have 3 different choices, so a-priori 33% chance, and it seems unlikely to get less likely under pressure from gradient descent” on 2023-02-12
  • TurnTrout changed the deadline from “on 2023-02-16” on 2023-03-01
  • rhaps0dy estimated 98% and said “Thanks uli for precomputing priors for me” on 2023-03-02
  • TurnTrout said “Although note that there are, technically, always 5 different actions available, but some of them might have the same effects (eg going into wall is same as no-op).” on 2023-03-02
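
One way to read "cumulative probability" in light of the comments above is to add up the policy's probability over all actions that have the same effect, so that walking into a wall is pooled with the no-op. The sketch below is only an illustration of that reading, not the code used to resolve the prediction. The action names, the grid encoding (0 = open, 1 = wall), and the helpers `next_cell`, `cumulative_probs`, and `top_effect` are all hypothetical, and the example policy values are made up.

```python
from collections import defaultdict

# Hypothetical action set; the comments only say there are always 5 actions.
MOVES = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
    "noop": (0, 0),
}


def next_cell(grid, pos, action):
    """Return the cell reached by `action`; hitting a wall or edge leaves pos unchanged."""
    r, c = pos
    dr, dc = MOVES[action]
    nr, nc = r + dr, c + dc
    in_bounds = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
    if in_bounds and grid[nr][nc] == 0:
        return (nr, nc)
    return pos


def cumulative_probs(grid, pos, policy):
    """Sum policy probability over actions that lead to the same resulting cell."""
    totals = defaultdict(float)
    for action, prob in policy.items():
        totals[next_cell(grid, pos, action)] += prob
    return dict(totals)


def top_effect(grid, pos, policy):
    """Return the resulting cell that receives the most cumulative probability."""
    totals = cumulative_probs(grid, pos, policy)
    return max(totals, key=totals.get)


# Assumed toy layout: a square with three open neighbours and a wall below it.
grid = [
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
]
pos = (1, 1)
# Made-up action probabilities, for illustration only.
policy = {"up": 0.4, "down": 0.1, "left": 0.2, "right": 0.2, "noop": 0.1}

effects = cumulative_probs(grid, pos, policy)
best = top_effect(grid, pos, policy)
```

Here "down" hits a wall, so its probability is pooled with "noop". The five actions collapse to four distinct effects, three of which are real moves, which matches uli's "3 different choices" prior. Checking the prediction would then mean comparing `best` against the maximal-advantage move at each decision square, which this sketch does not attempt.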