The LLM and the compiler and the formatter will get the low-level details right.
Maybe about 90% of the time, if you are lucky. That still leaves an error rate of about 10%, which is way too much.
Not remotely similar to my experience. Granted, I'm writing Rust, and the Rust compiler is *really* picky, so by the time the agent gets something that compiles, it's a lot closer to correct than it would be in other languages. Particularly if you know how to use the type system to enforce correctness.
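To make "use the type system to enforce correctness" concrete, here is a minimal sketch of the general idea, not code from any real project. The names `message`, `Unverified`, `Verified`, `verify` and `process` are made up for illustration, and the one-byte checksum is a toy stand-in for a real integrity check.

```rust
// Hypothetical module for illustration only; a toy checksum stands in for a
// real integrity check such as a MAC or signature.
pub mod message {
    /// Bytes as received; nothing has been checked yet.
    pub struct Unverified(Vec<u8>);

    /// Bytes that passed the check. The field is private to this module, so
    /// code outside it can only get a `Verified` by calling `verify`.
    pub struct Verified(Vec<u8>);

    impl Unverified {
        pub fn new(bytes: Vec<u8>) -> Self {
            Unverified(bytes)
        }

        /// Toy rule: the last byte must equal the wrapping sum of the bytes
        /// before it. Consumes the unverified value either way.
        pub fn verify(self) -> Result<Verified, &'static str> {
            let (tag, body) = self.0.split_last().ok_or("empty message")?;
            let sum = body.iter().fold(0u8, |acc, b| acc.wrapping_add(*b));
            if sum == *tag {
                Ok(Verified(body.to_vec()))
            } else {
                Err("bad checksum")
            }
        }
    }

    impl Verified {
        pub fn payload(&self) -> &[u8] {
            &self.0
        }
    }

    /// Downstream logic accepts only `Verified`, so passing unchecked bytes
    /// is a compile error rather than a runtime bug.
    pub fn process(msg: &Verified) -> usize {
        msg.payload().len()
    }
}
```

With this shape, `message::process(&message::Unverified::new(bytes))` does not compile. The caller has to go through `verify()` and handle the `Err` case first, which is the kind of mistake the compiler then catches regardless of who or what wrote the calling code.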
Your job is to make sure the structure is correct and maintainable, and that the test suites cover all the bases,
Depends on the definition of "bases". A passing test suite does not show that your program is correct. And if your test suite is also AI-generated, then you are back at the problem of whether the tests themselves are correct.
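A small illustration of the point about passing tests, as a sketch with made-up function names (`is_leap_year_buggy`, `is_leap_year`): the buggy version below agrees with the correct one on every "obvious" input a quickly written test suite might include, and is still wrong.

```rust
/// Buggy: ignores the century rule of the Gregorian calendar.
pub fn is_leap_year_buggy(year: u32) -> bool {
    year % 4 == 0
}

/// Correct Gregorian rule: divisible by 4, except centuries, unless the
/// century is divisible by 400.
pub fn is_leap_year(year: u32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// A plausible but incomplete test suite: every case here passes for both
/// versions, so a green run says nothing about the century bug.
pub fn naive_suite_passes(f: fn(u32) -> bool) -> bool {
    f(2024) && !f(2023) && f(2000) && !f(2019)
}
```

Both functions pass `naive_suite_passes`, yet they disagree on 1900 and 2100. Whether the suite "covers all the bases" depends entirely on somebody knowing that the century rule exists.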
Yes, you have to know how to write tests. A few decades of experience helps a lot. I find I actually spend a lot more time focused on the details of APIs and data structures than the details of tests, though. Getting APIs or data structures wrong will cost you down the road.
Also, I suppose it helps a bit that my work is in cryptography (protocols, not algorithms). The great thing about crypto code is that if you get a single bit wrong, it doesn't work at all. If you screw up the business logic just a little bit, you get completely wrong answers. The terrible thing is that if you get a single bit wrong, it doesn't work at all and gives you no clue where your problem might be.
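A rough sketch of the "single bit wrong" effect, using only the Rust standard library. Note the assumption: `DefaultHasher` (a SipHash variant) is used here purely as a convenient stand-in for a keyed digest, it is not what a real protocol would use, and the helper names `digest`, `flip_bit` and `differing_bits` are invented for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in digest: feeds the bytes to std's DefaultHasher.
/// Assumption: this only illustrates the effect; it is not a secure MAC.
pub fn digest(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(bytes);
    h.finish()
}

/// Returns a copy of `bytes` with exactly one bit flipped.
/// Panics if `bit` is outside the input.
pub fn flip_bit(bytes: &[u8], bit: usize) -> Vec<u8> {
    let mut out = bytes.to_vec();
    out[bit / 8] ^= 1u8 << (bit % 8);
    out
}

/// Number of bit positions in which two digests differ.
pub fn differing_bits(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}
```

Comparing `digest(msg)` with `digest(&flip_bit(msg, 0))` via `differing_bits` typically shows the two values differing in many bit positions, not just one. That is why a one-bit mistake shows up as total failure, and also why the failure says nothing about which bit was wrong.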
Of course that's just functional correctness. With cryptography, the really hard part is making sure that the implementation is actually secure. The AI can't help much with that. That requires lots of knowledge and lots of experience.
and then to scan the code for anomalies that make your antennas twitch,
Vibe error detection goes nicely with vibe programming. That being said, experienced programmers have a talent for detecting errors. But detecting some errors here and there is far from a full code review. Well, you can ask the LLM to do that as well, and many of the proposals it provides are good. Greg Kroah-Hartman estimates that about 2/3 are good and the rest are marginally usable.
Deep experience is absolutely required. My antennas are quite good after 40 years.
then dig into those and start asking questions -- not of product managers and developers, usually, but of the LLM!
Nothing goes as nicely as a discussion with an LLM. The longer you are at it, the more askew it goes.
You really have to know what questions to ask, and what answers not to accept. It also helps to know what kinds of errors the LLM makes. It never outright lies, but it will guess rather than look, so you have to know when and how to push it, and how to manage its context window. When stuff starts falling out of the context window the machine starts guessing, approximating, justifying. Sometimes this means you need to make it spawn a bunch of focused subagents each responsible for a small piece of the problem. There are a lot of techniques to learn to maximize the benefit and minimize the errors.
My point is that 25k LOC a month (god forbid a week) is a lot. It may look like it works from the outside, but it is likely full of (hopefully only small) errors. Especially when you decide that you do not need a human to review all the LLM-generated code. But if you consider e.g. the lines of an XML file defining your UI (which you have drawn in some GUI designer) to be valid LOC, then yeah, 25k is not a big deal. Not all LOC are equal.
Yeah, I am definitely not doing UI work.