It's a matter of what the LLM operator is pointing it at.
The LLM operator submitting the bugs aren't paying attention nor feeding their instance of LLM anything about others' submissions. So they are flooding with dupes, and the LLM has no reason to detect duplicate submissions, since it's not fed that data.
An LLM fed the mailing list and new submissions could credibly find dupes. If it fails, oh well, a dupe made it through and was annoying. If it erroneously detects a dupe, oh well, the submitter has to re-assert that it is not a dupe and is somewhat annoyed.
LLM ability to identify roughly duplicate bugs is decent enough. I don't like the hand waving of "AI can write the code, AI can review the code, AI can test the code" to absolute confidence (finding ways to expend more tokens does improve it's success a bit, especially if you can give it a 100% perfect pass/fail test to run and and let it retry), but here it's a pretty straightforward application, just a better fuzzy match at finding duplicate reports.