The People Paid To Train AI Are Outsourcing Their Work To AI (technologyreview.com) 105

Posted by BeauHD on Thursday June 22, 2023 @10:30PM from the AI-feedback-loop dept.

An anonymous reader quotes a report from MIT Technology Review: A significant proportion of people paid to train AI models may be themselves outsourcing that work to AI, a new study has found. It takes an incredible amount of data to train AI systems to perform specific tasks accurately and reliably. Many companies pay gig workers on platforms like Mechanical Turk to complete tasks that are typically hard to automate, such as solving CAPTCHAs, labeling data and annotating text. This data is then fed into AI models to train them. The workers are poorly paid and are often expected to complete lots of tasks very quickly.

No wonder some of them may be turning to tools like ChatGPT to maximize their earning potential. But how many? To find out, a team of researchers from the Swiss Federal Institute of Technology (EPFL) hired 44 people on the gig work platform Amazon Mechanical Turk to summarize 16 extracts from medical research papers. Then they analyzed their responses using an AI model they'd trained themselves that looks for telltale signals of ChatGPT output, such as lack of variety in choice of words. They also extracted the workers' keystrokes in a bid to work out whether they'd copied and pasted their answers, an indicator that they'd generated their responses elsewhere. They estimated that somewhere between 33% and 46% of the workers had used AI models like OpenAI's ChatGPT. It's a percentage that's likely to grow even higher as ChatGPT and other AI systems become more powerful and easily accessible, according to the authors of the study, which has been shared on arXiv (PDF) and is yet to be peer-reviewed.

Using AI-generated data to train AI could introduce further errors into already error-prone models. Large language models regularly present false information as fact. If they generate incorrect output that is itself used to train other AI models, the errors can be absorbed by those models and amplified over time, making it more and more difficult to work out their origins, says Ilia Shumailov, a junior research fellow in computer science at Oxford University, who was not involved in the project. Even worse, there's no simple fix. "The problem is, when you're using artificial data, you acquire the errors from the misunderstandings of the models and statistical errors," he says. "You need to make sure that your errors are not biasing the output of other models, and there's no simple way to do that."

The People Paid To Train AI Are Outsourcing Their Work To AI

This discussion has been archived. No new comments can be posted.

Load All Comments

Search 105 Comments Log In/Create an Account

Comments Filter:

Turtles! (Score:5, Funny)

by SeaFox ( 739806 ) writes: on Thursday June 22, 2023 @10:42PM (#63625228)

It's AI all the way down!

- Re: (Score:1)
  
  by Motleypuss ( 10291831 ) writes:
  
  Actually kind of interesting. There was an article in Quanta Magazine about training AI on AI-generated training sets recently. Anyway, yeah, I'd prefer turtles.
  - Re: (Score:2)
    
    by Too Late for Cool ID ( 1794870 ) writes:
    
    If only we had a way to distinguish turtles from stones or tree stumps...
  - Re: Turtles! (Score:2)
    
    by pitch2cv ( 1473939 ) writes:
    
    It's way worse than anyone would imagine: AI recursion causes irreversible damage to the models.
    https://venturebeat.com/ai/the... [venturebeat.com]
  - Re: (Score:2)
    
    by sound+vision ( 884283 ) writes:
    
    Turtles are great. I try to hit the bike paths every day, and checking up on my turts in the drainage ditch is kind of my motivation. There was a whole family of them, probably 7 or 8 lil' turts, who would sit on top of the drainage pipe and start sliding off the side one by one when they saw someone approaching.
    For a few days there were ducks hanging out with them on the pipe as well. Young ducks, hints of yellow still around the facial area. I think they have moved on now, though.
    Turtles - Great!
    Ducks - G
- Re: (Score:2)
  
  by gtall ( 79522 ) writes:
  
  Ya, I wish the planet would stop supporting that propaganda with its increasing temperatures. You cannot even ignore the state of the planet in peace anymore.
Maybe they should use AI (Score:5, Funny)

by Tony Isaac ( 1301187 ) writes: on Thursday June 22, 2023 @10:54PM (#63625250) Homepage

to detect when the trainers are using AI, or maybe to add a watermark indicating that the results were trained via AI.
I feel like I'm in a hall of mirrors.

- Re: (Score:3)
  
  by dfghjk ( 711126 ) writes:
  
  The article is not about "results" that "were trained via AI". "are using" and "were trained" are not the same.
  "I feel like I'm in a hall of mirrors."
  Education will fix that.
- Re: Maybe they should use AI (Score:1)
  
  by peragrin ( 659227 ) writes:
  
  It gets worse. It is proven that machine learning training is like a photocopier. You can make one good copy. But each successive one gets worse. Or for Republicans, Inbreeding results in stupidity Even for AI.
Kind of like how compilers are recursive (Score:2)

by Tony Isaac ( 1301187 ) writes:

Many compilers are written in the language they compile. C# and C++ are two compilers that are, in part at least, written in the language they output.
- Re:Kind of like how compilers are recursive (Score:5, Insightful)
  
  by narcc ( 412956 ) writes: on Friday June 23, 2023 @12:12AM (#63625352) Journal
  
  It's nothing like that at all. Not even a little bit.
  A self-hosting compiler won't degrade when you compile it. In contrast, if you train a model on the output from another model, the new model will necessarily be worse than the original. I've explained this problem and why it happens many times before. The main point is helpfully provided in the summary.
  
  - Re: (Score:2)
    
    by christoban ( 3028573 ) writes:
    
    But a bug in a compiler might manifest in a compiler compiled via that compiler, and might introduce more bugs...
    - Re: (Score:2)
      
      by Entrope ( 68843 ) writes:
      
      That might happen, but a well-designed compiler can be tested pretty thoroughly, and well-implemented compilers are so tested, so it is pretty rare.
      In contrast, poor performance of an AI model by incestuous training is almost inevitable, only detectable at all after you've trained a model, only easy to detect in proportion to its effects, and hard to remedy.
      - Re: (Score:2)
        
        by burtosis ( 1124179 ) writes:
        
        In contrast, poor performance of an AI model by incestuous training is almost inevitable
        Ahh, so the answer is just a more diverse set of AI models to train from. Then it’s not incestuous! /s
        
        Re: (Score:2)
        
        by narcc ( 412956 ) writes:
        
        You'd think so, but it turns out that any model is going to contain error, some of which will be captured by models trained on its output.
      - Re: (Score:2)
        
        by dfghjk ( 711126 ) writes:
        
        "In contrast, poor performance of an AI model by incestuous training is almost inevitable"
        Sounds like an opportunity, and one for which you will certainly not be competition.
      - Re: (Score:1)
        
        by christoban ( 3028573 ) writes:
        
        I realize it's different in scale, but it's hyperbole to say "It's nothing like that at all. Not even a little bit." They both Inherit bugs. That qualifies as similar, at least a little bit.
      - Re: Kind of like how compilers are recursive (Score:2)
        
        by ArmoredDragon ( 3450605 ) writes:
        
        I think he's alluding to Ken Thompson's hack.
  - Re: (Score:3)
    
    by dfghjk ( 711126 ) writes:
    
    "if you train a model on the output from another model, the new model will necessarily be worse than the original."
    That's not what's being described in the article. Here, "the output of another model" is being used to label new content, the "new model" is trained with the new content. The labels are not the content and they are either correct or they are not.
    Also, your assertion that "the new model will necessarily be worse than the original" is false. First, there is no "original" relative to a "new mod
    - Re: (Score:2)
      
      by crunchygranola ( 1954152 ) writes:
      
      "if you train a model on the output from another model, the new model will necessarily be worse than the original."
      That's not what's being described in the article. Here, "the output of another model" is being used to label new content, the "new model" is trained with the new content. The labels are not the content and they are either correct or they are not.
      The labeled content is the the "new content" for the new model, and whether the label is correct is the whole problem. Is AI labeling as reliable as humans? Also, the idea that the label is "either correct or not" is maybe true if they are labeling "hot dog" and "not a hot dog" for the "not a hot dog app". If they are labeling pictures of many different things the likelihood of mislabeling by AI is obvious.
    - Re: (Score:2)
      
      by narcc ( 412956 ) writes:
      
      That's not what's being described in the article.
      
      You're confusing the problem with the scenario. I chose that example specifically because it makes the problem easy to understand. Regardless of the scenario, you'll find that the problem is the same: AI generated content is poison to future models.
      The labels are not the content and they are either correct or they are not.
      Nonsense. If you're using labeled data, then the labels are absolutely essential! While some AI generated labels will be correct, others will be incorrect or only partially correct.
      Also, your assertion that "the new model will necessarily be worse than the original" is false. First, there is no "original" relative to a "new model", they are not related.
      In this scenario, the "original" is the model (or models) used to label the da
      - Re: (Score:2)
        
        by narcc ( 412956 ) writes:
        
        Bah, I should have checked all the links. This is the post [slashdot.org] where I explain "model collapse" months before the paper that coined the term. Like I said, the problem was well-known.
    - Re: (Score:1)
      
      by christoban ( 3028573 ) writes:
      
      What'd be even more amazing is if you could hear the doozies from MSNBC!
  - Re: (Score:2)
    
    by istartedi ( 132515 ) writes:
    
    In theory, yes but compilers are the subject of the kind of exploit in the binary [stackexchange.com] that makes it a little harder to sleep at night.
    That's obviously not the same as the compiler rotting from being self-hosted; it's just that you reminded me of it. There's always this thinking that you can be secure by auditing source; but if you're boot-strapped with a binary then you're only as secure as the binary. You have to audit at least a simple binary at some point to be sure, perhaps a minimal compiler to bootstrap
    - Re: (Score:1)
      
      by christoban ( 3028573 ) writes:
      
      It's "intents and purposes."
- Re: Kind of like how compilers are recursive (Score:2)
  
  by Fons_de_spons ( 1311177 ) writes:
  
  I think this is simar to a group of flat earthers who locked themselfs out of reality and only feed on each other's nonsense.
  A small perturbation from reality may grow in an un stable feedback loop to complete nonsense.
  An hysterical AI.
  AI may not be that different from human intelligence after all.
- Re: (Score:2)
  
  by The Evil Atheist ( 2484676 ) writes:
  
  Not even that's true. Most languages and compilers run on runtimes and backends written in C++. Other scripting languages, Python, Perl, Ruby etc are written in C.
  
  There are some language specific compilers out there for certain implementations of LISP and, say, FreePascal, but even dialects of those languages have a front-end for a C++ compiler infrastructure like gcc or clang. Else, they run on Java, .NET, V8 (and other JS) VMs, almost all written in C++.
- Re: (Score:2)
  
  by dfghjk ( 711126 ) writes:
  
  So what? The language of the compiler and the code it compiles are entirely unrelated. If a compiler is written in the same language it compiles, then it can be used to generate future versions of itself. But again, so what?
  Also, that does not mean that compilers are "recursive".
  Your observations are based on ignorance and lack of basic insight.
  - Re: (Score:2)
    
    by Tony Isaac ( 1301187 ) writes:
    
    You have no idea what my observations are based on.
    I'll bet you never built a compiler. I have. I've been part of the dev team whose product was a compiler, that targeted Linux, Unix Windows, AIX, and System 360. Yeah, I do know something about compilers.
    It's nice to have discussions, even disagreements, on slashdot. Let's not ruin it by degrading someone you disagree with. Just disagree and be done with it. Thank you.
- Re: (Score:3)
  
  by HiThere ( 15173 ) writes:
  
  You can copy CDs for generation after generation without degradation, but when you try that with an analog signal, the noise dominates after only a few generations.
  Training AIs is an process of teaching them to abstract from the data, but an abstraction of an abstraction always loses information.
  Think to "training an AI" as lossy compression, and then whey you try to get the answer back, you are decompressing it. It fills in the missing pieces with what it things should be there.
  These are all metaphors. D
- Re: (Score:2)
  
  by ceoyoyo ( 59147 ) writes:
  
  There are two differences: compilation is deterministic, and you don't compile compiler B with compiler A, then compile compiler C with compiler B, etc. through a long chain.
  The goal of any generative system is to produce output that matches the true distribution of its training data. You can demonstrate the drift that comes from training with generated output easily enough:
  1) Draw a sample from a normal distribution with a particular mean and standard deviation,
  2) compute the sample mean and standard devia
A surprisingly accurate summary (Score:4, Insightful)

by narcc ( 412956 ) writes: on Thursday June 22, 2023 @11:04PM (#63625278) Journal

Using AI-generated data to train AI could introduce further errors into already error-prone models. Large language models regularly present false information as fact. If they generate incorrect output that is itself used to train other AI models, the errors can be absorbed by those models and amplified over time
It's nice to see a pop-sci article that actually acknowledges reality. The wide-eyed optimism was getting really old. I wonder if this is a sign that we're finally past the peak of inflated expectations...

- Re:A surprisingly accurate summary (Score:4, Informative)
  
  by physicsphairy ( 720718 ) writes: on Friday June 23, 2023 @01:05AM (#63625446)
  
  While expectations may well be overstated, I don't see that humans on Mechanical Turk taking shortcuts is an example of a fundamental limitation. You can literally just pay people more.
  The point of generative models for person X to be able to get instant answers to their dozen questions. But the human-reinforcement part of building the underlying model can be PhDs taking their time to carefully curate responses. Even now ChatGPT is running on years old data.
  
  - Re: (Score:2)
    
    by jenningsthecat ( 1525947 ) writes:
    
    While expectations may well be overstated, I don't see that humans on Mechanical Turk taking shortcuts is an example of a fundamental limitation. You can literally just pay people more.
    But if we have multiple instances of LLM's with access to the entire Web, won't they inevitably be 'feeding' off of each other? I think that's especially true when you consider the number of human intermediaries in the loop.
    Somebody uses ChatGPT to construct a piece of writing - either factual or fictional - then publishes it. But that human has changed the wording and polished it up a bit, so the origins of the piece are less clear and harder to detect. But it still has LLM characteristics which will then
    - Re: (Score:2)
      
      by dfghjk ( 711126 ) writes:
      
      "Somebody uses ChatGPT to construct a piece of writing - either factual or fictional - then publishes it. But that human has changed the wording and polished it up a bit, so the origins of the piece are less clear and harder to detect. But it still has LLM characteristics which will then be fed to an LLM - another instance and/or the same one that authored the initial piece."
      Yes, but nothing bad has happened yet. All that exists now is more information, without any understanding of what that information is
    - Re: (Score:3)
      
      by gweihir ( 88907 ) writes:
      
      But if we have multiple instances of LLM's with access to the entire Web, won't they inevitably be 'feeding' off of each other?
      Yep, with "model collapse" as result. Apparently it is neither possible to avoid that nor to fix the model when it has happened. Training LLMs on web-data will likely not be possible for long now. Maybe for another year or so. Which _also_ means that training LLMs that are somewhat current on things will get excessively hard and may well become impossible economically. This "revolutionary" tech may turn out to be even more short-lived than other bogus tech revolutions.
      - Re: (Score:2)
        
        by swillden ( 191260 ) writes:
        
        This "revolutionary" tech may turn out to be even more short-lived than other bogus tech revolutions.
        I think the more likely outcome is the development of new AI capabilities that prevent model collapse through error correction, which will increase AI capabilities generally and move AI much closer to AGI. https://slashdot.org/comments.... [slashdot.org]
        
        Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        LLMs cannot do AGI. Not possible at all, no matter what you train them on. They have no reasoning ability. You also cannot prevent model collapse through "error correction", because for error correction you need ground truth and reasoning ability.
        
        Re: (Score:2)
        
        by swillden ( 191260 ) writes:
        
        LLMs cannot do AGI. Not possible at all, no matter what you train them on. They have no reasoning ability. You also cannot prevent model collapse through "error correction", because for error correction you need ground truth and reasoning ability.
        Yes, reasoning will be a key piece of error correction, clearly. Reasoning is not hard, though. Logical inference is something we learned how to make machines do decades ago, and LLMs clearly have implicit models of the necessary conceptual relationships, else they wouldn't be able to generate text that uses words in ways that make sense according to our models of the conceptual relationships. So, it seems to me that the structure is in place, and the next obvious direction for research is in building syste
        
        Re: (Score:3)
        
        by swillden ( 191260 ) writes:
        
        FWIW, I think what I posted above is a pretty good description of our best current understanding of how human brains work. Most of our thinking is not reason-based or reason-directed, it's much more similar to what LLMs do. In fact, most of our actual decisionmaking comes from this unreasoning, model-driven thinking. Layered on top of that, our brains have a reasoning engine whose primary job is to generate rational explanations for the decisions that the unreasoning brain already made. Nearly all of our re
        
        Re: (Score:1)
        
        by narcc ( 412956 ) writes:
        
        In fact,
        "Fact" is the wrong word for it. What you've posted here is pure speculation.
        For example:
        Most of our thinking is not reason-based or reason-directed, it's much more similar to what LLMs do.
        This is not a fact. It is a completely baseless claim, lacking both reason and evidence.
        People have been claiming that the brain is 'just' the latest technology for ages. The reason, of course, is that they just can't think of anything else it could be, so irrationally conclude that that is what it is. You've fallen into the same trap.
        
        Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        People have been claiming that the brain is 'just' the latest technology for ages. The reason, of course, is that they just can't think of anything else it could be, so irrationally conclude that that is what it is. You've fallen into the same trap.
        Indeed. That is basically the "physicalist fallacy". And then these morons have the audacity to claim they have Science on their side, when nothing like that is true. Just like any other fanatical (quasi-) religious fuckup.
        
        Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        So it makes perfect sense to me that if we can layer a reason-based, introspective error correction subsystem on top of an LLM, we may actually have replicated the most important cognitive structure of the human brain... and that may be all that's needed to achieve AGI.
        That just tells me you have a deep desire but no general intelligence to speak off.
        
        Re: (Score:2)
        
        by swillden ( 191260 ) writes:
        
        Bad faith response.
        
        Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        Yes, reasoning will be a key piece of error correction, clearly. Reasoning is not hard, though.
        Au contraire. Reasoning is infeasible in machines to any meaningful dept because of combinatoric explosion. Apparently only experts know this, but it is an extremely solid finding at this time. Verification of a reasoning chain is possible (see, for example, proof-assist systems), but finding it is not if it is just a tiny bit longer. And hence model error correction is not possible.
  - Re: (Score:2)
    
    by HiThere ( 15173 ) writes:
    
    Paying people more is no guarantee that they won't take shortcuts. You'd literally need to manage them. Now you're paying multiple people, and some of them at a higher rate. It don't think that can work for this task.
    - Re: (Score:2)
      
      by alexgieg ( 948359 ) writes:
      
      You'd literally need to manage them.
      Yep. The article says there's no easy solution, but that's clearly incorrect. There are plenty of easy solutions. What there isn't is a cheap easy solution.
      Here's the easiest solution, step-by-step:
      1. Setup real, in-person summarizing offices.
      2. Hire people to do the summarizing on premise. Remember to pay them enough so you're hiring high-quality summarizers.
      3. Put systems in place to prevent summarizers from accessing AI tools while summarizing. This very likely involves, among other things, no access to
  - Re: (Score:2)
    
    by dfghjk ( 711126 ) writes:
    
    Of course that's right. Training relies on the reliability of training data, but it is well understood that it cannot be perfect. Training does not take every bit of information as absolute gospel. Imagine if the brain could not cope with cognitive dissonance, yet the people here cannot imagine that AI might also be able to.
    The reactions here are by ignorant lay-people.
    - Re: (Score:2)
      
      by narcc ( 412956 ) writes:
      
      The reactions here are by ignorant lay-people.
      Um... I've seen your posts. You are very obviously an ignorant lay-person.
- Re: (Score:2)
  
  by dfghjk ( 711126 ) writes:
  
  Pleased to get your confirmation bias tickled? As you've already said, you've already explained your wrong-minded opinions "many times before", but at least you can be pleased that someone else is just as mistaken, or at least in your misguided reading.
  - Re: (Score:2)
    
    by narcc ( 412956 ) writes:
    
    The difference between my opinions and yours, when it comes to AI, is that I have the benefit of a formal education. When I post something here, it's not baseless speculation, it's rooted in established fact.
    Given your hostility, I'm guessing you have quite a bit emotionally invested in your AI fantasy. I'm sorry, but you won't be able to hang on to that for much longer.
    Still, if you think that the experts are wrong and that you know better, why haven't you published?
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Hopefully. I really do not get why this happens so often. Apparently many people are a lot more excitable than they are smart.
- Re: (Score:2)
  
  by swillden ( 191260 ) writes:
  
  Using AI-generated data to train AI could introduce further errors into already error-prone models. Large language models regularly present false information as fact. If they generate incorrect output that is itself used to train other AI models, the errors can be absorbed by those models and amplified over time
  It's nice to see a pop-sci article that actually acknowledges reality. The wide-eyed optimism was getting really old. I wonder if this is a sign that we're finally past the peak of inflated expectations...
  I think this will just push the next generation of needed innovation in AI as it progresses towards AGI. A capability that humans have (though not to the degree that we'd like) but current AIs do not is the ability to perform self-correction, to identify inconsistencies between beliefs, or even to seek out evidence to confirm or deny beliefs, and then update beliefs accordingly. In the process of figuring out how to prevent model collapse from AI training on AI-generated data, it seems likely that researche
  - Re: (Score:2)
    
    by narcc ( 412956 ) writes:
    
    AGI is science fiction. We are not anywhere close to even understanding the problem, let alone solving it!
Old Saying (Score:2, Insightful)

by Anonymous Coward writes:

Junk data in, junk data out.
- Re: Old Saying (Score:2)
  
  by Ronin Developer ( 67677 ) writes:
  
  Yet, people will pay for it and believe it as gospel.
  Just wait until the machines have enough of us humans because of these amplified biases.
Warms my heart (Score:4, Interesting)

by jargonburn ( 1950578 ) writes: on Thursday June 22, 2023 @11:18PM (#63625292)

"AI", the great equalizer.
It's like training monkeys to train other monkeys, and it make me laugh haha!

- Re: (Score:1)
  
  by Motleypuss ( 10291831 ) writes:
  
  Agreed! Had I mod points, you'd have them! Just imagining a pile of monkeys trying to teach each other how to use sticks; that's basically the state of AI right now!
GIGO (Score:5, Insightful)

by awwshit ( 6214476 ) writes: on Thursday June 22, 2023 @11:20PM (#63625296)

Garbage in, garbage out.

- Re: (Score:2)
  
  by TwistedGreen ( 80055 ) writes:
  
  Surprising how few people know this these days...
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    Seeing that requires you to be able to recognize garbage. That seems to be more and more a specialist skill that is vanishing in the general population.
- Re: (Score:2)
  
  by fahrbot-bot ( 874524 ) writes:
  
  Like making a copy of a copy ... of a videotape.
  [ For you youngsters: videotape [wikipedia.org]. :-) ]
  - Ob xkcd (Score:2)
    
    by DrYak ( 748999 ) writes:
    
    And the ob. XKCD about generational loss [xkcd.com].
- Re: (Score:2)
  
  by Errol backfiring ( 1280012 ) writes:
  
  The workers are poorly paid and are often expected to complete lots of tasks very quickly.
  If I read the summary, it is more like "capitalism in, garbage out".
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    Capitalism (the stupid-greedy variant generally practiced) turns things into trash.
    So it is
    Capitalism in -> garbage in -> garbage out.
- Re: (Score:1)
  
  by Motleypuss ( 10291831 ) writes:
  
  Ah, yes. I grew up with that one. Still just as true today as it always was!
- Re: (Score:2)
  
  by swillden ( 191260 ) writes:
  
  Garbage in, garbage out.
  Humans also get a lot of garbage in, and yet we're able (to some degree) to identify and correct errors, and avoid the garbage out problem. Clearly, our AI systems need to acquire this capability to prevent model collapse, and that introspective error correction will massively increase their capabilities in other ways, e.g. by reducing or perhaps eliminating AI hallucination. It seems plausible to me that this is the missing element for AGI.
  - Re: (Score:2)
    
    by awwshit ( 6214476 ) writes:
    
    AI isn't really intelligent at all. LLMs can produce a series of words but cannot understand the overall meaning of the words together. Until it can understand its own output, and thus be able to correct it, the problem of garbage in, garbage out will continue.
    - Re: (Score:2)
      
      by swillden ( 191260 ) writes:
      
      AI isn't really intelligent at all. LLMs can produce a series of words but cannot understand the overall meaning of the words together.
      
      This a widely-repeated error. It's not possible to produce the sort of output that LLMs do without having some implicit model of the relationships between the words, which is exactly what "meaning" is. It's not clear that this relationship modeling is actually any different at all from what we term "understanding", though we have some additional experiential models that LLMs inherently lack -- but not for everything. Many of the higher order concepts we manipulate have no physical reality, so our understand
      - Re: (Score:2)
        
        by awwshit ( 6214476 ) writes:
        
        > what's lacking is the ability to introspectively error correct
        This line of thinking is the issue. It is not right to anthropomorphize this software. We call it AI, it is an advanced statistical model. We say things like 'it hallucinates' and yet it has no imagination, no thoughts. Obviously from the mistakes that LLMs make, our models need work and understanding the relationships between words is not enough for understanding.
The ouroboros effect (Score:5, Funny)

by Anonymouse Cowtard ( 6211666 ) writes: on Thursday June 22, 2023 @11:20PM (#63625298) Homepage

Otherwise known as disappearing up your own asshole.

- Re: (Score:1)
  
  by Motleypuss ( 10291831 ) writes:
  
  +69 Funny!
Audit (Score:2)

by backslashdot ( 95548 ) writes:

Random audit could solve this. And hey stop giving humans repetitive work, rotate them to different teams and tasks.
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  No. The problem is you do not have a replacement and people are doing this the easiest way possible because they have to. You could do random audit by experts combined with paying a lot more to workers. But the bean-counters are not capable of doing something like this because they have no insight into how things actually work.
Low-background steel (Score:3)

by Rumagent ( 86695 ) writes: on Friday June 23, 2023 @12:56AM (#63625438)

In a short while we will have to treat data like we do steel. Data before the "detonation" of AI will be precious and in ever shorter supply.

- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Good for all real experts though, because we can generate that kind data from scratch. Not cheap or fast, but as we are currently seeing, there is no replacement for real data based on real insight. Anybody faking it is pretty much screwed though and AI will eat their meal-ticket.
A solveable problem (Score:3)

by TheMiddleRoad ( 1153113 ) writes: on Friday June 23, 2023 @01:39AM (#63625498)

Just have AI, programmed by AI, watch what may be people or may be AI to make sure they don't use AI to make AI. Then the people who use AI to make AI will use AI to fake being people. Pretty soon it will be like crypto, with billions of dollars pouring into utter shit.
Or just pay people show up for work at an office and use corporate computers without internet access, and you'd better do it at corporate HQ or you know there will be shenanigans. Lulz.

- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
  - Re: (Score:2)
    
    by dfghjk ( 711126 ) writes:
    
    "LLMs are very complex which makes them extremely sensitive to pollution."
    Citation please.
    "The more valid, precise, & reliable the data & the training, the more useful the model."
    Citation please.
    "If you don't do effective quality control, you get garbage."
    Citation please.
    "LLMs are relying on human cognitive biases to train their models to be more human-like in the patterns they generate."
    Citation please.
    "So the LLMs from this article are essentially outsourcing their training to the cognitive biases
    - Re: (Score:2)
      
      by TheMiddleRoad ( 1153113 ) writes:
      
      "Citation please" over and over. And then you call bullshit?
      Make an argument, you fuckwit, of fuck off.
    - Re: (Score:2)
      
      by account_deleted ( 4530225 ) writes:
      
      Comment removed based on user account deletion
- Re: (Score:2)
  
  by Luckyo ( 1726890 ) writes:
  
  >Or just pay people show up for work at an office and use corporate computers without internet access, and you'd better do it at corporate HQ or you know there will be shenanigans. Lulz.
  You make it sound like people don't do shenanigans to cut corners while at corporate HQ.
  - Re: (Score:2)
    
    by TheMiddleRoad ( 1153113 ) writes:
    
    I'm just saying that if the developers are trying to make AI and the staff who make material that trains that AI are in the same office then at least the devs can have some say in guiding the process. If anything, staff who train AIs are devs themselves. When you outsource development to some distant country and it's AI so you can't really know what's going on from afar, then having everything in house is more likely to get good results.
    - Re: (Score:2)
      
      by Luckyo ( 1726890 ) writes:
      
      >If anything, staff who train AIs are devs themselves.
      That's not how any of this works. People who train AI are QC. QC positions are always far less compensated than developer positions, because QC is easy and mindnumbingly boring in most cases. Unlike development, which is hard and just plain boring in most cases, but can be actually exciting in others.
the future of logical reasoning premises (Score:1)

by Walt Dismal ( 534799 ) writes:

Great. Build your AI from data laid down by 3rd world workers.
"Chatbot, should I worship cows?"
"Certainly. Do the needful. Also eat lots of curry. And Kali says, kill your boss."
Researchers Warn of 'Model Collapse' As AI Trains (Score:3)

by sprins ( 717461 ) writes: on Friday June 23, 2023 @02:49AM (#63625568)

Researchers Warn of 'Model Collapse' As AI Trains On AI-Generated Content
https://slashdot.org/story/23/... [slashdot.org]

Good (Score:1)

by Somefriend ( 10288263 ) writes:

Generative "AI" suck and is just there to exploit people most of the time. They could have sourced their data legally and ethically on top of paying people decently to train their models but they took the evil corporate path instead.
Capitalism (Score:2)

by Savage-Rabbit ( 308260 ) writes:

A significant proportion of people paid to train AI models may be themselves outsourcing that work to AI, a new study has found. It takes an incredible amount of data to train AI systems to perform specific tasks accurately and reliably. Many companies pay gig workers on platforms like Mechanical Turk to complete tasks that are typically hard to automate, such as solving CAPTCHAs, labeling data and annotating text. This data is then fed into AI models to train them. The workers are poorly paid and are often expected to complete lots of tasks very quickly.
Shitty pay equals shitty work? ... Whodathunkit? ... CEOs of AI companies, I bid you welcome to free market capitalism.
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Well, pay peanuts, get monkeys. It is really in no way surprising to anybody with an actual clue. Of course, these people tend to not raise very high in hierarchies.
  - Re: (Score:2)
    
    by Savage-Rabbit ( 308260 ) writes:
    
    Well, pay peanuts, get monkeys. It is really in no way surprising to anybody with an actual clue. Of course, these people tend to not raise very high in hierarchies.
    That rules out Ben Shapiro then. According to him, if your pay for your day job is so shitty you can't live off of it you should just get two more jobs, not question why you are being paid shit while your bosses have 12 cars, four houses and a yacht.
Shocker. Lazy people are lazy. (Score:2)

by Eunomion ( 8640039 ) writes:

AI is a perfect recipe for brewing new methods of cult brainwashing and propaganda though. Every authoritarian political state and psychotic thief-corporation is licking their chops.
Model collapse (Score:2)

by pitch2cv ( 1473939 ) writes:

This study reveals how AI training AI causes irreversible damage to the AI models: https://venturebeat.com/ai/the... [venturebeat.com]
Specifically looking at probability distributions for text-to-text and image-to-image AI generative models, the researchers concluded that âoelearning from data produced by other models causes model collapse â" a degenerative process whereby, over time, models forget the true underlying data distribution â¦ this process is inevitable, even for cases with almost ideal condit
- Re: (Score:3)
  
  by gweihir ( 88907 ) writes:
  
  This isn't just bad. This is catastrophic.
  Not really. It just tells us what we can realistically expect from these tools and it is a lot less than many people thought. To anybody that actually looked and kept an open mind, this is in no way surprising.
Pay peanuts, get monkeys (Score:2)

by gweihir ( 88907 ) writes:

Really not a surprise. Trying to do things cheaper than possible will result in failure.
Overunity in AI? (Score:2)

by istartedi ( 132515 ) writes:

If AI were really smart, you wouldn't get degradation using it like that. Instead you'd get what you expect from humans doing the job--new ideas, suggestions for experiments that prove those ideas out, and a general advancement in the state of the art for whatever field of endeavor you apply it to. It seems like there's a consensus we're not there yet. Simply training an "AI" on a huge corpus of data doesn't yield what we think of as true intelligence?
The first "over unity" AI in this regard seems like
surprised? (Score:2)

by kyoko21 ( 198413 ) writes:

I mean, is anyone surprised at all? If you can automate your work, why not?
It's just like that movie "Multiplicity" with Michael Keaton. As he made more copies of himself they got dumber and dumber. :D
I think I know people IRL... (Score:1)

by jonadab ( 583620 ) writes:

> looks for telltale signals of ChatGPT output,
> such as lack of variety in choice of words

Oh, man, I think I might know some people IRL who outsource their in-person communications to ChatGPT, if lack of variety in choice of words is an indicator of it. I know a couple of people whose entire working vocabulary consists of a few hundred words. They *recognize* more words than that but never use most of them (e.g., they know what "small" and "tiny" and "pint-sized" mean but don't use these words the
Rudy Wells, I knew he'd go too far! (Score:2)

by Impy the Impiuos Imp ( 442658 ) writes:

AI designing its own successors.
What could possibly go wrong? [youtu.be]

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Turtles! (Score:5, Funny)

Re: (Score:1)

Re: (Score:2)

Re: Turtles! (Score:2)

Re: (Score:2)

Re: (Score:2)

Maybe they should use AI (Score:5, Funny)

Re: (Score:3)

Re: Maybe they should use AI (Score:1)

Kind of like how compilers are recursive (Score:2)

Re:Kind of like how compilers are recursive (Score:5, Insightful)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:1)

Re: Kind of like how compilers are recursive (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:1)

Re: (Score:2)

Re: (Score:1)

Re: Kind of like how compilers are recursive (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

A surprisingly accurate summary (Score:4, Insightful)

Re:A surprisingly accurate summary (Score:4, Informative)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: (Score:1)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Old Saying (Score:2, Insightful)

Re: Old Saying (Score:2)

Warms my heart (Score:4, Interesting)

Re: (Score:1)

GIGO (Score:5, Insightful)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Ob xkcd (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:1)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

The ouroboros effect (Score:5, Funny)

Re: (Score:1)

Audit (Score:2)

Re: (Score:2)

Low-background steel (Score:3)

Re: (Score:2)

A solveable problem (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)