For complex issues I've found it starts circling the drain.
First answer: obviously wrong. I reply that doesn't seem correct, are you sure? The LLM responds with kiss-ups, tells me I'm so smart for checking, admits it was obviously wrong, and presents the correct answer.
Second answer looks self-consistent and seems to solve the case. But after moving forward with it, I see the algorithm is extremely fragile to edge cases and blows up. When I point out the failing cases, instead of fixing the core issue it invents a separate patch for each unique edge case I present, producing a massive amount of spaghetti code.
Third push and I get the first answer back again.
The biggest value I see is that by then I'm very well informed about the issue, and I code the thing myself in a manner that is efficient and resilient to edge cases.
Honestly it reminds me of grad school, when I had to grade undergraduate papers. I spent the bulk of my time trying to figure out what rationale a student used to arrive at a wrong answer so I could decide how much partial credit to award. It gave me a very thorough understanding of the core concepts. I still remember one hard mid-term where every student gave a different answer to a problem that had only one solution. I had to dig deep into all the concepts behind that question, only to realize the subtle calculations of production efficiencies were a moot point: the parameters specified in the question would have caused a massive explosion in real life, obliterating all the products.