I tested the first image generators. They were horrible. If I wanted a sailing vessel, I got a floating rock with one mast and a weird spider web of ropes. If I was lucky, I got water.
Now? You can get a video of a sailing vessel morphing into the Nautilus, if you want.
It's the same with coding LLMs. Two years ago they messed up my code. Three weeks ago I fed the code they messed up to Codex 5.3 and asked it to find and fix the issue. Within a minute it fixed my issues, ran the code, detected a problem in the sound playback, and checked the binary file. It used the library to double-check, then found four bugs in the library itself, which it also fixed. The whole thing took me three minutes. Last year I spent a day trying to fix it and gave up. The bugs were all easy to see once pointed out but hard to detect when you don't know where they are: off-by-one errors that only surfaced in special circumstances, duplicate code in the wrong place, and so on.
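To make the "easy to see when pointed out, hard to detect otherwise" point concrete, here is a hypothetical toy example (not the author's actual code) of the kind of off-by-one bug that only surfaces in special circumstances: the loop below gives correct answers on most inputs and fails only when the peak happens to be the very last sample.

```python
def find_peak(samples):
    """Return the index of the loudest sample in an audio buffer."""
    peak = 0
    # Bug: the upper bound should be len(samples), not len(samples) - 1.
    # The loop silently skips the final sample, so the function is wrong
    # only when the loudest sample sits at the very end of the buffer --
    # the "special circumstances" that make this class of bug so hard to spot.
    for i in range(1, len(samples) - 1):
        if abs(samples[i]) > abs(samples[peak]):
            peak = i
    return peak
```

On `[0, 5, 1]` this correctly returns 1, but on `[0, 1, 5]` it returns 1 instead of 2, because the final sample is never examined.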
Maybe you've paid "attention to this space". What you perhaps didn't do was buy a computer that can run LLMs locally and test them, like I did. You don't even need to do that: just get Visual Studio Code, install Continue, buy some credits for the Claude and OpenAI APIs, and compare the latest models. Or use Qwen Code from Alibaba. Last year that didn't work very well. This year is very different.
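For anyone who wants to try the Continue route: it reads a config file where you list the models and API keys to compare. The sketch below is illustrative only; the exact schema has changed across Continue versions (newer releases use a YAML config), and the model names shown are assumptions, so check Continue's current documentation before copying it.

```json
{
  "models": [
    {
      "title": "Claude",
      "provider": "anthropic",
      "model": "claude-3-5-sonnet-latest",
      "apiKey": "YOUR_ANTHROPIC_KEY"
    },
    {
      "title": "GPT-4o",
      "provider": "openai",
      "model": "gpt-4o",
      "apiKey": "YOUR_OPENAI_KEY"
    }
  ]
}
```

With both entries in place you can switch models from the Continue sidebar in VS Code and run the same prompt against each, which is all the comparison described above requires.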
Also note that people with hostile prompting get nowhere. At that point it's not the AI, it's you. Let it cut the task into small pieces, maintain a task list, and keep a readme.md to document its solutions, and you will see a huge improvement in how things work. Search for the ATLAS Framework on YouTube. Correct prompts and correct use of the AI's strengths make an incredible difference.