One issue with the overall architecture (which is just statistical prediction) is that it can't really provide useful insight into why it did what it did.
I think you're describing the models from a year ago. Most of the capability improvements since then (and they have been very large) come directly from having the model talk itself through its reasoning before producing a response, and one result of that is that most of the time they can in fact explain why they did what they did. There are exceptions, but they are the exception, not the rule. A rough sketch of the idea is below.
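(To make "talk to itself" concrete, here's a minimal sketch of that kind of reason-then-answer prompting. The `generate` function is a hypothetical stand-in for whatever model API you're calling, and the prompt wording is just an illustration, not any particular vendor's implementation.)

```python
# Minimal sketch of "reason before answering" via prompting.
# `generate` is a hypothetical placeholder for any LLM completion call.

def generate(prompt: str) -> str:
    raise NotImplementedError("plug in your model API here")

def answer_with_reasoning(question: str) -> dict:
    # Ask the model to write out its reasoning first, then a final answer.
    prompt = (
        "Think through the following question step by step, "
        "then give a final answer on its own line starting with 'Answer:'.\n\n"
        f"Question: {question}\n"
    )
    completion = generate(prompt)

    # Separate the visible reasoning from the final answer so the
    # reasoning can be surfaced as an explanation of the result.
    reasoning, _, final = completion.rpartition("Answer:")
    return {"reasoning": reasoning.strip(), "answer": final.strip()}
```

The point is only that the intermediate reasoning is produced before the answer and can be shown afterward as an account of how the model got there.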
It's interesting to compare this with humans. Humans can generally give you an explanation for why they did what they did, but research has demonstrated pretty conclusively that a large majority of the time those explanations are made up after the fact: they're post-hoc justifications for decisions that were actually made by some subconscious process. Researchers have shown that people are just as good at coming up with explanations for decisions they didn't make as for decisions they did! The bottom line is that people can't really provide useful insight into why they did what they did; they're just really good at inventing post-hoc rationales.