Giuliano Giacaglia
@giu
30% drop in accuracy on Putnam problems when the problems are slightly varied: https://openreview.net/forum?id=YXnwlZe0yf&noteId=yrsGpHd0Sf
androidsixteen
@androidsixteen.eth
Does this mean that models are “overfitting” to benchmarks and becoming less capable of solving problems outside of those benchmarks?