
Google Researchers Create TV Audio Analysis System 108

segphault writes "Ars Technica reports on a paper (PDF) about ambient audio analysis authored by Google researchers. The system described in the paper can effectively determine what television show a user is watching just by capturing a short audio clip. The paper explains how a regular computer microphone can be used to record an audio clip that is then converted into a statistical data summary and transmitted to a remote server which matches the clip against archived data in order to ascertain which TV show it is associated with. Apparently, the system is fully viable, and other kinds of ambient noise don't negatively impact its accuracy. The paper also describes how web services can provide contextually relevant information based on a consumer's television viewing activities."
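As a rough illustration of the record, summarize, and match pipeline the summary describes, here is a toy sketch in Python. It is not the method from the paper: the one-bit-per-frame energy fingerprint, the brute-force sliding search, and the names `synth_show`, `fingerprint`, and `best_match` are all assumptions made up for this example, and the "shows" are synthetic noise rather than real TV audio.

```python
import random


def synth_show(seed, n_frames, frame_size=256):
    """Hypothetical stand-in for TV audio: noise whose loudness changes per frame."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_frames):
        amp = rng.uniform(0.1, 1.0)
        samples.extend(amp * rng.uniform(-1.0, 1.0) for _ in range(frame_size))
    return samples


def fingerprint(samples, frame_size=256):
    """Compact summary: one bit per pair of adjacent frames (1 if energy rises)."""
    energies = [
        sum(x * x for x in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    return [1 if b > a else 0 for a, b in zip(energies, energies[1:])]


def best_match(clip_bits, archive):
    """Slide the clip over every archived fingerprint.

    Returns (name, offset, bit_errors) for the alignment with the fewest
    mismatched bits, or None if no archived fingerprint is long enough.
    """
    best = None
    n = len(clip_bits)
    for name, bits in archive.items():
        for off in range(len(bits) - n + 1):
            errors = sum(a != b for a, b in zip(clip_bits, bits[off:off + n]))
            if best is None or errors < best[2]:
                best = (name, off, errors)
    return best
```

In this sketch only the bit list would need to leave the listener's machine; the "server" side is the `archive` dictionary mapping show names to fingerprints. A real system would need a summary that survives room noise and clips that do not line up with frame boundaries, plus an index instead of a linear scan.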
This discussion has been archived. No new comments can be posted.

  • This already exists? (Score:4, Interesting)

    by abigsmurf ( 919188 ) on Saturday June 10, 2006 @10:36AM (#15508974)
    There's a system in the UK where you can go out clubbing, hear a song you like, dial a number, hold the phone out to the music, and it'll text you the name of the song. Assuming they don't hire scores of extremely knowledgeable music buffs with quick fingers, surely it's a very similar system. TV dialogue may be less distinctive to the human ear, but to a computer it just means a larger amount of data to search through.
  • Re:Great... (Score:2, Interesting)

    by Rytis ( 907427 ) on Saturday June 10, 2006 @10:41AM (#15508999)
    For displaying ads. There's some info in TechCrunch [techcrunch.com].
    [...]to listen to the ambient audio in a room, determine what is being watched on TV and offer web-based supplemental information, services and shopping contextual to each program being watched.
  • Privacy Maximization (Score:3, Interesting)

    by twitter ( 104583 ) on Saturday June 10, 2006 @11:37AM (#15509195) Homepage Journal
    How about outlawing electronic eavesdropping without written consent? I won't use Macromedia Flash because it turns the microphone on. That's creepy, and all non-free software with a microphone can do the same thing. It would be better if that kind of thing were against the law.

    In the meantime, I avoid non-free software and even have bad thoughts about my cell phone.

  • by Anonymous Coward on Saturday June 10, 2006 @11:41AM (#15509215)
    Two acquaintances of mine work at the company you mentioned. What is also amazing is that they record the CDs to disc, creating a fingerprint of each CD. They then sell the CDs off at cost price. They are really cheap, as they buy in bulk and even get tons of freebies. They are legally allowed to do this, as keeping a fingerprint of the CD does not constitute keeping the actual CD. What is also amazing is that the fingerprint still matches when the song is playing at a slower or faster rate. Impressive mathematics to scan such a large database so fast. I must say, though, that they first check against songs currently playing on major radio stations to reduce the search time. The server farm they are running is massive. They are now expanding into other countries too.

    Hope that was helpful.

  • Re:Nielsen (Score:3, Interesting)

    by apnielsen ( 981522 ) on Saturday June 10, 2006 @01:35PM (#15509709)
    Portable People Meters belong to Arbitron, not Nielsen Media.

    Not sure about PPM's tech, but Nielsen's A/P meter does exactly what TFA describes. That's the only way Nielsen Media could roll out Time Shifted Viewing [slashdot.org] at all (disclosure: I work for them). To say that Google "created" it is an insult to the people I work with every day.

    I see a patent suit in Google's future. As much as I hate patents and like Google, I'd like to at least see some full disclosure here. By (erroneously) stating on one hand that they invented the technology and then admitting (on page 4 of the PDF) that they intend to compete with the actual inventors, they're begging to get sued anyway.

  • eyes wide shout (Score:3, Interesting)

    by NetSettler ( 460623 ) <kent-slashdot@nhplace.com> on Saturday June 10, 2006 @01:39PM (#15509722) Homepage Journal

    other kinds of ambient noise don't negatively impact its accuracy

    This very statement presupposes that other noise is irrelevant, which seems bogus.
    Snoring is background noise, and suggests non-watching.
    Laughter is background noise, and suggests careful watching.
    Of course, the laughter might not be about what's on TV...

    watch [reference.com] v. tr. 1. To look at steadily; observe, carefully or continuously: watch a parade.
    look [reference.com] v. To employ one's sight, especially in a given direction or on a given object:
    --The American Heritage (R) Dictionary

    It seems to me that watching is an activity involving the eyes and mental processing. It seems to me that audio of what is coming out of the TV is not a statement about either the eyes or about mental processing. This technology of Google's may be an advance in something, but I hope the advertisers paying for this data have their eyes open about the nature of what they are buying because (to re-mix a metaphor) to my eyes this sounds a bit suspect.

    Sociologically, it sounds like a foot in the door to get harmless censors in place. Oops, Freudian slip there. That's sensors, I mean. Google would never involve itself with censorship.

    Once the sensors are in place, when "we" realize that it's not getting "us" the data "we" want, we'll just do a few "harmless" downloads of "upgrades", perhaps causing a minor tweak to look at the video data rather than the audio, or perhaps doing language processing after all, and ... With user-friendly software like this, who needs spyware?

    I also question the claim that, because no information is transmitted back to Google, this by definition does not invade privacy. How is this fundamentally different from the claim that if the police search your house but find nothing, they have not invaded your privacy because they've not placed any record of illegal activity on your permanent record?

    It seems to me that once you place a Turing Machine into someone's environment, capable of doing arbitrary processing, and all it sends is a sanitized report, you have all the mechanism in place for abuse. What if the Turing Machine, capable of arbitrary processing, decides that it doesn't want to send a sanitized report? Who is auditing what is sanitized and what is not?

    What if it turns out to later be possible to lift information from the supposedly cleansed records? Who will audit the use of that data?

    There seem to me to be a lot of slippery slopes here.

  • by po_boy ( 69692 ) on Saturday June 10, 2006 @03:26PM (#15510107)
    I'd like to implement something like this for myself, but with conversational noise instead of TV. I sometimes use my laptop as a visual aid during conversations in my living room. If we're talking about a particular topic, I may pull up a relevant Wikipedia article, or something like that. I wouldn't mind if this were more automated.

    I can envision running a speech-to-text translator on my laptop mic and then piping that text into my Beagle desktop searcher, or maybe even one of those Google desktop search tools on Windows. I'd rather not send this data to Google, for privacy reasons, though.

    I could see this being useful at work, or in a conference or class, too. I could stand to have relevant pieces of notes that I took in previous classes pulled up when my professor mentions a particular topic.

    Anyone know of a tool or project like this?
