Like you, I put more stock in my own experience than in OpenAI's or any other AI vendor's claims. And my experience has been that plain-text instructions successfully get LLMs to:
- generate poetry -- even blending unrelated genres that I am sure no one has attempted before
- summarize documents
- research diseases, medications and side effects
- review CT scan images for issues of concern
- generate tour itineraries
- translate languages (Hebrew, Korean, ...)
- explain code
- generate code (Perl, Bash, VBA...)
- generate an Excel document with ISBNs and book summaries from pictures of my physical library
That's general enough for me, and it should be general enough for you too -- LLMs pass the Turing test.
Yes, LLM AIs tend to behave like a skilled, over-confident intern whose work is useful but must be double-checked. Call it slop if you want, but it's useful slop. It got me past the initial inertial friction with VBA for my test-analysis project. I had to correct flaws in the code structure and parsing strategy, but it ended my 20-year procrastination. Now I have a 1000-line VBA codebase that I'm familiar enough with to reuse in other projects.
You ask how "checking multiple slop hallucinations against each other" helps. Comparing two hallucinations will cause the "disagree" alert to light up, won't it? It's not like the two AIs are coordinating via séance.
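That "disagree alert" can be sketched in a few lines. This is a toy illustration, not a real pipeline: the answers are made-up stand-ins for what two independently prompted models would return (the API calls themselves are omitted since they vary by vendor), and the crude normalized string match stands in for whatever smarter comparison you would actually use.

```python
# Toy sketch of the "disagree alert": compare answers from two
# independently prompted models and flag mismatches for human review.
# The sample answers below are invented; the comparison is a crude
# normalized string match, not a real semantic check.

def disagree_alert(answer_a: str, answer_b: str) -> bool:
    """Return True when the two answers conflict and need a human look."""
    def normalize(s: str) -> str:
        # Collapse case and whitespace so trivial formatting differences
        # don't trigger the alert.
        return " ".join(s.lower().split())
    return normalize(answer_a) != normalize(answer_b)

print(disagree_alert("Perl 5 was released in 1994.",
                     "perl 5 was  released in 1994."))   # no alert
print(disagree_alert("Released in 1994.",
                     "Released in 1995."))               # alert
```

The point is only that two models hallucinating the *same* wrong answer independently is much less likely than one of them slipping, so a cheap disagreement check catches a useful fraction of the slop.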