They call them "reasoning models" but that is pure marketing.
The data doesn't agree with you. Looking at benchmark trends, reasoning models show faster improvement than non-reasoning models did. See https://epoch.ai/blog/have-ai-capabilities-accelerated for a summary. This is hard to explain as pure marketing.
It is fascinating which people get math degrees these days. Apparently logical thinking is not a requirement anymore.
"Logical thinking" is not synonymous with "agrees with gweihir" as much as you would like them to be.
Ok, let me spell it out for you: The information was out there and could be combined in a purely mechanical, no-insight-required way to provide the answer. Nobody cared enough to find it and try that. Is that clear enough or are you still bereft of understanding?
What you are spelling out and claiming here is just not true. The approach the AI took was *different* from the approach in the literature. You don't need just my opinion on this. Terry Tao (who is a Fields Medalist) and Jared Lichtman (who is one of the best of the young new number theorists and had previously published work on this and related problems) both disagree with you: https://www.scientificamerican.com/article/amateur-armed-with-chatgpt-vibe-maths-a-60-year-old-problem/. If the people who have looked at this in detail and are subject matter experts disagree with you, maybe it should occur to you that you are wrong here?
We know from Terry Tao's interest in AI that previous attempts at claiming successful AI-generated novelty with Erdos problems were wrong.
Two of the previous examples turned out to be in the literature. We've had other examples where this was not the case. This example is more noteworthy not because it is the first of its kind but because it is the first substantial one. (I think of Erdos problems as coming in three categories. First, pretty obscure ones which almost no one has heard of. Second, ones where subject matter experts have heard of them even if they aren't that famous. Third, things like the Erdos-Straus conjecture, which are well known enough to have a Wikipedia page. What makes this one more interesting is not that it is the first Erdos problem to fall, but that it is the first to fall that really is in the second category.)
As in life, the adage "I can't think of a reason against, so it must be true" is not a useful one.
In this case, that really is not what is going on here. The fact that the system found multiple distinct proofs in different runs is pretty strong evidence of that. But I'm curious: if this isn't evidence enough to strongly suggest the solution is original to the AI, what sort of evidence would convince you an AI genuinely came up with a novel solution?
If you are a mathematician, you should be able to see the difference between "nobody cared enough" (my claim) and "no one cared" (your gross mis-statement of my claim).
Sigh. Why am I not surprised that you've responded this way. Let's be clear, then: you can take my earlier comment, add the word "enough" at the end, and everything I wrote would still be true. Your statement is just wrong, and it is wrong for exactly the reasons I outlined.
The nice thing about standards is that there are so many of them to choose from. -- Andrew S. Tanenbaum