Our Brains React Differently to Deepfake Voices, Researchers Find (news.uzh.ch) 14

"University of Zurich researchers have discovered that our brains process natural human voices and "deepfake" voices differently," writes Slashdot reader jenningsthecat.

From the University's announcement: The researchers first used psychoacoustical methods to test how well human voice identity is preserved in deepfake voices. To do this, they recorded the voices of four male speakers and then used a conversion algorithm to generate deepfake voices. In the main experiment, 25 participants listened to multiple voices and were asked to decide whether or not the identities of two voices were the same. Participants either had to match the identity of two natural voices, or of one natural and one deepfake voice.
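
As a rough illustration of how such same/different judgments can be scored, here is a minimal Python sketch; the trial data and the accuracy helper are invented for illustration and are not the study's code:

    # Scoring a same/different voice-identity task.
    # Trial data are hypothetical.
    trials = [
        # (pair_type, listener_said_same, identities_actually_same)
        ("natural-natural",  True,  True),
        ("natural-deepfake", True,  False),  # deepfake fooled the listener
        ("natural-deepfake", False, False),  # deepfake detected
        ("natural-natural",  False, True),
    ]

    def accuracy(pair_type):
        subset = [t for t in trials if t[0] == pair_type]
        correct = sum(said == truth for _, said, truth in subset)
        return correct / len(subset)

    for kind in ("natural-natural", "natural-deepfake"):
        print(f"{kind} accuracy: {accuracy(kind):.2f}")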

The deepfakes were correctly identified in two-thirds of cases. "This illustrates that current deepfake voices might not perfectly mimic an identity, but do have the potential to deceive people," says Claudia Roswandowitz, first author and a postdoc at the Department of Computational Linguistics.

The researchers then used imaging techniques to examine which brain regions responded differently to deepfake voices compared to natural voices. They successfully identified two regions that were able to recognize the fake voices: the nucleus accumbens and the auditory cortex. "The nucleus accumbens is a crucial part of the brain's reward system. It was less active when participants were tasked with matching the identity between deepfakes and natural voices," says Claudia Roswandowitz. In contrast, the nucleus accumbens showed much more activity when it came to comparing two natural voices.
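
The region-level contrast described here can be illustrated loosely with a paired test on per-participant responses in a region of interest. A minimal sketch, with invented numbers rather than the study's data or analysis pipeline:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 25  # participants, matching the study's sample size

    # Invented mean nucleus-accumbens responses per matching condition
    natural_vs_natural  = rng.normal(1.0, 0.3, n)
    natural_vs_deepfake = rng.normal(0.6, 0.3, n)

    t, p = stats.ttest_rel(natural_vs_natural, natural_vs_deepfake)
    print(f"paired t = {t:.2f}, p = {p:.4f}")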

The complete paper appears in Communications Biology, a Nature Portfolio journal.
Comments Filter:
  • by YetAnotherDrew ( 664604 ) on Sunday June 23, 2024 @10:18AM (#64571245)

    Because people sound like people but deepfakes sound like Scarlett Johansson for some reason?

    Maybe one day science will explain it.

  • When we speak, we expel air. While we generally don't notice it (except in the case of Mark Wahlberg), that extra tidbit of information is something we've grown accustomed to over the centuries. It's embedded as part of us without our realizing it. So far, deepfake voices have not been able to successfully replicate that act. We may not be able to explain why a voice doesn't sound right; we just know that it doesn't.

    • In my experience, the problem with AI-generated content is that there's too much air expelled, and its temperature is way too high.
  • Chatbots are incredibly ineffective. They fail at every real-world application, and the media has to keep screaming metrics at the market in order to hide that reality. LLMs aren't innovative or productive.

    The only real-world use-cases I can see where LLMs would become genuinely productive are spam, extortion, and generating fake content. At best, these algorithmically generated voices have approached the uncanny valley, but they have never been able to cross it. If you watch any online call-in show you
    • by Big Hairy Gorilla ( 9839972 ) on Sunday June 23, 2024 @12:25PM (#64571597)
      Wait til version Xyz, where they fix that.

      The world of digital audio and synthesizers was at the imitation stage in the 1980s, then improved incrementally over time. Now, many acoustic instruments are reproduced to a level that is *nearly* indistinguishable from the actual acoustic instruments. I say *nearly* because expert musicians could perhaps spot small problems with the sound, but in practical use the "average person" couldn't tell them apart.

      Today's electronic pianos reproduce one of the more difficult instruments to get right, yet they are mostly indistinguishable from acoustic. Direct A/B comparison might reveal the differences, but in a recording you wouldn't be able to tell...

      I have no doubt that things like stammering and other human foibles will be matched ... soon.
      • I think we're being dishonest about the use-cases for these synthesized voices, and that use-case is why the tech isn't actually productive. An institution that spends resources hooking an LLM up to a synthetic voice in order to create a "fake person on the internet" does it for a reason. That reason, generally a profit-motive, makes that fake person act in an uncanny and low-resolution fashion. It turns out that letting a profit-motive write the script creates a bunch of terrible that can't pas
  • by timeOday ( 582209 ) on Sunday June 23, 2024 @12:59PM (#64571709)
    1) A 2/3 success rate is not very high when the chance baseline is 1/2 (guessing between two equally likely options); see the sketch after point 3.

    2) The headline claims this is a study of deepfakes. But it's a study of a particular set of models on particular speakers, using a particular version of some deepfake software applied by particular modelers. It might or might not be the best model that can be made today, and there's likely to be a new version tomorrow. It's like reviewing one car and concluding that "cars have a bumpy ride."

    3) They tested the deepfake back-to-back (A/B) against the recording it was created from. That's the most difficult setup for a deepfake to survive. If somebody is deepfaking your supposedly-kidnapped granddaughter [theguardian.com] crying for help over the phone, you won't have the luxury of hearing it back-to-back with live speech (or screaming, which you've probably never heard from that person in real life). In fact, they'll probably mask any deficiencies with bad audio quality, short clips, etc.
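
    To put point 1 in numbers: whether two-thirds accuracy is meaningfully above chance depends on the trial count. A minimal sketch with hypothetical trial counts (the paper's actual numbers may differ):

      from scipy.stats import binomtest

      # One-sided test of ~2/3 accuracy against the 50% chance baseline;
      # the trial counts below are hypothetical.
      for n_trials in (30, 120, 480):
          k = round(n_trials * 2 / 3)
          p = binomtest(k, n_trials, p=0.5, alternative="greater").pvalue
          print(f"n={n_trials:4d}  correct={k:3d}  one-sided p={p:.4g}")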

    • They're using ancient technology.

      "To synthesize deepfake voices, we used the open-source voice conversion (VC) software SPROCKET16, which revealed the second-best sound quality scores for same-speaker pairs and the sixth-best quality for speaker similarity rating among 23 conversion systems submitted to the VC challenge in 2018"

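      For context, GMM-based voice conversion of that era works roughly like the sketch below: fit a joint Gaussian mixture over time-aligned source/target spectral features, then map new source frames through the conditional mean of the dominant component. This is a conceptual illustration with synthetic data, not sprocket's actual API or the study's pipeline.

        # Conceptual sketch of joint-density GMM voice conversion;
        # synthetic data, not sprocket's actual code.
        import numpy as np
        from scipy.stats import multivariate_normal
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(0)
        d = 24                                  # spectral features per frame
        src = rng.normal(size=(500, d))         # source-speaker frames
        tgt = src * 0.8 + rng.normal(size=(500, d)) * 0.3  # aligned target

        # Fit a GMM on stacked [source | target] frames.
        gmm = GaussianMixture(n_components=8, covariance_type="full")
        gmm.fit(np.hstack([src, tgt]))

        def convert(frames):
            # Responsibility of each component under the source marginal.
            dens = np.column_stack([
                w * multivariate_normal.pdf(frames, m[:d], S[:d, :d])
                for w, m, S in zip(gmm.weights_, gmm.means_,
                                   gmm.covariances_)
            ])
            out = np.empty_like(frames)
            for i, k in enumerate(dens.argmax(axis=1)):
                m, S = gmm.means_[k], gmm.covariances_[k]
                A = S[d:, :d] @ np.linalg.inv(S[:d, :d])  # regression map
                out[i] = m[d:] + A @ (frames[i] - m[:d])  # conditional mean
            return out

        print(convert(src[:10]).shape)  # (10, 24): frames mapped to target

      Modern neural conversion systems have largely replaced this kind of pipeline, which is the point about the technology being dated.
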
  • There's a pretty clear Darwinian benefit to seeing through camouflage and counterfeit. At least when people are engaged with their environment, although not so clearly when they're being dimwitted consumers.
    • by Lehk228 ( 705449 )
      OK, but what's out there that created enough evolutionary pressure to develop such a precisely tuned "that is a fake human" detector?
      • It's not finely-tuned at all, because it's secondary. We developed senses that penetrate camouflage to see predators. As human society evolved, other humans started to fill the niches vacated by predators we could now easily defeat. They didn't eat the weak, but they starved them. The result was the same in the eyes of Darwin. A deepfake or an AI is just the latest incarnation of advertising or propaganda applied by people like that to deceive the masses.
  • The Deep Fake Flatulence is a bug (stink)
