Comment Re:How does it compare to a human? (Score 1) 10
The purpose of this benchmark is to track and steer AI improvement. They want to start with a low success rate so there is room for improvement; how well humans score on it doesn't matter.