It can tell whether it was right or wrong the same way a dev does: by running tests, analyzing the failures, making adjustments, and rerunning until they pass, or until it gives up. That won't stop it from going off the rails at times, so you also need a checkpoint system you can roll back to.
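For what it's worth, that loop is simple enough to sketch. This is only a toy illustration in Python, not how any particular tool does it: `fix_until_green`, `run_tests`, `propose_fix`, and `max_attempts` are all made-up names, the "code" is just a string, and the checkpoints are an in-memory list where a real setup would more likely use something like git commits.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures, assumed for this sketch:
#   run_tests(code)             -> list of failure messages (empty means all green)
#   propose_fix(code, failures) -> a new version of the code
RunTests = Callable[[str], List[str]]
ProposeFix = Callable[[str, List[str]], str]


def fix_until_green(
    code: str,
    run_tests: RunTests,
    propose_fix: ProposeFix,
    max_attempts: int = 5,
) -> Tuple[str, bool, List[str]]:
    """Run tests, patch, rerun; stop on success or after max_attempts fixes.

    Returns (code, passed, checkpoints). On giving up, the code returned is
    the untouched original so a bad run cannot leave a mess behind.
    """
    checkpoints = [code]  # checkpoint 0 is the original, always safe to restore
    for attempt in range(max_attempts + 1):
        failures = run_tests(code)
        if not failures:
            return code, True, checkpoints
        if attempt == max_attempts:
            break  # out of attempts: give up
        code = propose_fix(code, failures)
        checkpoints.append(code)  # snapshot every change so it can be undone
    return checkpoints[0], False, checkpoints


# Toy stand-ins for a real test runner and a real model call.
def toy_tests(code: str) -> List[str]:
    return [] if "a + b" in code else ["add(2, 3) returned the wrong value"]


def toy_fix(code: str, failures: List[str]) -> str:
    return code.replace("a - b", "a + b")


final, ok, history = fix_until_green(
    "def add(a, b): return a - b", toy_tests, toy_fix
)
```

The point of keeping every snapshot is that when it does go off the rails, you (or the loop itself) can drop back to the last version you trusted instead of untangling the damage by hand.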
"I don't know what I'm doing, please suggest something that might or might not improve my company." Sounds like the entire industry is a perfect fit for AI.
You're using it wrong. If you are trying to vibe code at such a high level that ideas like "security" matter, you're going to have a bad day.
The ideal case is an existing codebase that needs a tweak. You already know pretty much exactly what you want to do to it. Doing it yourself would take an hour or two; using an AI might take a minute or two. And since you already know what you want, it's easy to verify the result. Plus, many of them are pretty good at "now add tests for that".