Music Media

Cantametrix Plans To Track All MP3s On The Web 166

Akilesh Rajan writes: "A Stereophile article reports that Cantametrix is further developing its MusicDNA system for identifying and tracking all MP3s on the Internet. MusicDNA's use of DSP (Digital Signal Processing) technology and psychoacoustic modeling allows it to analyze an MP3 and immediately tell what song it is, and so also recognize who, if anyone, owns its copyright. Company reps explain one possible application: 'A MusicDNA Analyzer can be located, for example, on the Web crawler of a large search engine, to ensure that the search engine only points to legal music.'" I could see this working a lot better if all the music on the Web was pristine and complete -- which it's not.
This discussion has been archived. No new comments can be posted.

  • I'd like to see them do this, and encompass the myriad of different protocols and formats that abound on the web today, plus the ones that will be designed just to break it. It seems to me that they'd only need to supply an appropriate codec to decompress it to raw audio before they could analyze it. You're absolutely correct, though, that all it would take would be to wrap it in a zip file with the password in the file name.
  • No laughs, this would make a very good service. I hear many good songs on radio. The catch is that songs and announcements are all in Spanish, and I don't understand it :-( This service would allow me to identify music and then buy it.
  • Reminds me of Markwatch, [markwatch.com] a web trademark-monitoring service with an, um, entertaining website. You've probably seen Markwatch's bots, breather and snorkel, all over your weblogs... I have a student on a server at the college where I work who wrote a short story that mentions a Coke machine - they're all over it... how efficient...

    So now we're going to have this kind of thing for sound files??? Oh good - bots downloading all the music students' work constantly... er uh, does this thing respect robots.txt files?? (and if it does, then what good is it... um, scratch that question....)

  • I hope in between all their havering about 'monetizing the artists' they manage to implement one little simple obvious thing- the ability for any random person to submit a song and be directed to the official website of the artist- whether that means a major label site, a 'You're Under Arrest' page, or a page that simply says 'Here is more music like that for you to download'.

    Nothing in this technology _stops_ it from being usable in the last sense- a way to quickly be pointed to the rest of an artist's freely available (typically low bit rate) catalog. That, not 'monetization', is the new concept: the idea that for the first time a good but poorly resourced artist would have the same information distribution resources as the majors- the majors fight and spend billions to try and get some produced 'artist's' name into people's ears, so that the consumer knows what they're hearing and where to buy more (at your local CD store, of course). For the first time this might be truly decentralised so that anyone, anywhere, who was listening to some anonymous and obscure song they liked, would be able to get the information. So it's on the radio? Hold up a mike and tape the radio. Pirate radio? Same deal. Old cassette tape that never had a label? No problem. mp3 marked "Metallica-One.mp3" erroneously? No problem...

    At that point, you start having a free market again- at that point good local or indie bands or musicians, or really specialised musicians (noise, trad jazz, ragtime, serial composition) can begin shortcircuiting the lines of distribution and undercutting the majors by the simple expedient of 'who cares if I can't make money at this, nobody does but _I_ can afford to voluntarily give out mp3s etc. and FIND MY AUDIENCE'. At a stroke, the barriers to entry for an entire industry fall, and genres like jazz can survive (contrast this with the major labels, which not only will not support jazz but are known to actually destroy irreplaceable master tapes to save storage costs- refusing to allow anyone to salvage the masters).

    I really hope these people have enough sense to become this type of general resource. The risk is that the majors will not permit information to be stored for any music other than major label 'protected' music, and so the more obscure or indie stuff will turn up as 'no matches'.

  • It can distinguish between the original and bootlegs because they are entirely different recordings, with obvious differences (such as the presence of audience applause, and in the case of Pearl Jam, yelling and screaming).

    A much more interesting question would be to ask if it could distinguish between two different 're-masters' of the same original recorded piece of music.
  • couldn't they send out threatening emails to anybody who has an MP3 with (for instance) Metallica in the file name, with roughly the same effect?

    They did that already. Didn't work all that well (because lots of things other than Metallica files have that in the name, and lots of Metallica MP3s aren't named so simply). It also caused a firestorm.

    Distinguishing between what is real and what's not is probably only useful in court... (correct me if I'm wrong...)

    Whether it's a real Metallica song or an unlicensed cover of it doesn't matter. They're both copyright violations.

    If the technology really does work - or even works moderately well as a bird-dog - using it for a webcrawler to hunt for infringers may work - and be within existing law. They can check manually before going to court. If it's good enough, they can weasel-word a cease-and-desist order and not get much problem from occasionally sending one to a host of a misidentified file.

    No law changes required. If they finally buy a clue and go after the >hostsindexers, they'd be on completely solid legal ground, and the litigation would be reduced to:
    - Did the defendant knowingly host a copyrighted work without obtaining the proper license?
    - Did he refuse to take it down in response to the cease-and-desist order?
    - Does the plaintiff hold the copyright (or otherwise have standing)?
    - (In the first few cases) is such hosting fair use?

    But even if it's NEARLY perfect it will sometimes misidentify a non-infringing work. If it does this even once, it opens any subscription hosting service that uses it to civil action for contract violation by its customers.

    As we've seen, in free competition the indexing services that use a filter will lose to those that don't - because they'll lose the portion of the customer base that doesn't care whether they're downloading a copyrighted work. And any false-positive flakiness in the technology would produce the same sort of flap as the nannyware web filters. This should preclude attempts to pass and enforce a legal mandate, on First Amendment grounds.
  • Chill out, mate. Just using Windows doesn't make them a fool. Just unenlightened!
  • For the first time this might be truly decentralised so that anyone, anywhere, who was listening to some anonymous and obscure song they liked, would be able to get the information.

    You might be interested in the Tropus project, at http://tropus.sourceforge.net [sourceforge.net]. We're still in the vaporware stage, but what you're talking about is one of the things on our wishlist.

    --William Dye

  • If this formula is widespread enough, the program will be able to adapt and ignore seconds 3-5 of the piece.

    If the formula is not widespread enough, then it is useless.
  • OK, Cantametrix has 12 references to other people talking about 'Music DNA (TM)', but *no* information about it on its own site.

    Their online presentation is simply about yet another stunningly crude music genre classification technique.

    Now usually if someone has something of substance to say, they are happy to back it up, no?

    My reading is this:
    1. Cantametrix is a company with not particularly special technology looking to up its profile.
    2. The SDMI competition and the Napster / Bertelsmann agreement mean that music copyright issues are particularly newsworthy at the moment.
    3. Cantametrix has *privately* spread around the idea that its technology could be used to help enforce copyrights.
    4. Note that they make no formal claim for themselves beyond "For labels and artists, CantaMetrix fingerprinting technology can be a valuable component in the song identification process."

    Therefore, this is PR fluff. Ignore with confidence.
  • You really think that just because the government says it's illegal, it's bad. You must have been anti gay rights in the past and probably still think weed is bad. You are just ignorant!
  • If they finally buy a clue and go after the >hostsindexers, they'd be on completely solid legal ground ...

    Make that:

    If they finally buy a clue and go after the hosts rather than the indexers, they'd be on completely solid legal ground ...
  • Classical music is not the big money spinner that popular music is. They probably won't bother with that!
  • Maybe I'm not understanding correctly, but every time they grab a music file to scan, aren't they making an illegal copy? As it stands, the only people who are copying music illegally are those who go and get it. Now anyone who uses this scanning software will also be a pirate.

    Don't forget, the RIAA is only powerful because we have given them a shitload of our money. The way to take their power away is to stop giving them money!

    - MFN
  • Refer to the subject!
  • Doesn't this plan rather ignorantly assume that the MP3 files will be traded purely? What if you were to gzip it first, or even just rename it to .gz? They can't possibly scan EVERY file, as that would take an absurd amount of computer power. Anyway, this seems like a complete and utter waste of effort because people will ALWAYS work around this kind of crap.
  • Or if you really want to confuse their algorithm, edit together different songs to make medley mixes. Like 20 sec of Britney Spears, 20 sec of Metallica, 20 sec of Lawrence Welk, 20 sec of techno, etc. etc. It shouldn't be too hard to automate the process.

  • What if...
    This technology was used for good and not evil?
    They claim that they can identify an mp3 just by its code by analyzing it.

    What if this technology was used to find all the crappy mp3s on the net.
    Just think about it. I could have some program running in the background on my computer that could search through gigs and gigs of my MP3's and find all the little "blips" and crackles from a poorly encoded MP3.

    Just start the program at night time when you go to bed, and wake up to a full .log file saying something like:


    /home/brad3378/MP3directory/Song_X.mp3 - 2:22 problem found. - verify the quality of this song
    /home/brad3378/MP3directory/Song_Y.mp3 - 2:49 problem found. - verify the quality of this song
    /home/brad3378/MP3directory/Song_Z.mp3 - 4:48 problem found. - verify the quality of this song


    I think this could be a big timesaver to folks like me that spend too much time verifying the quality of their MP3 collection.
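
A minimal sketch of the kind of overnight scan described above, assuming each MP3 has already been decoded to a list of PCM samples in the range -1.0 to 1.0 (the decoding step is not shown). The names `find_clicks` and `format_log_line`, the jump threshold, and the idea that a "blip" shows up as a large sample-to-sample jump are all illustrative assumptions; nothing here reflects how Cantametrix's analysis works, and a real detector would need to be considerably smarter.

```python
def find_clicks(samples, sample_rate, jump_threshold=0.8):
    """Return timestamps (seconds) where consecutive samples differ by
    more than jump_threshold, a crude stand-in for an audible 'blip'."""
    hits = []
    for i in range(1, len(samples)):
        if abs(samples[i] - samples[i - 1]) > jump_threshold:
            hits.append(i / sample_rate)
    return hits


def format_log_line(path, seconds):
    """Format one log entry in the style of the example log above."""
    minutes, secs = divmod(int(seconds), 60)
    return (f"{path} - {minutes}:{secs:02d} problem found. "
            "- verify the quality of this song")
```

Run over a directory at night, the output of `format_log_line` for each hit could simply be appended to a .log file.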
  • A much easier way to get it in place would be to just buy the search engine. Considering that Disney owns GO Networks, it's not a stretch.
  • The same law that prevents you from putting the mp3's on the server in the first place? :-)
  • I was wondering about that too, but I figured that, assuming it is based on both harmony/melody and a fairly exact timing, it could probably distinguish between two different versions of the same song. No two pianists will play all the corresponding notes with the same duration, even given a similar tempo. (This is especially true of pieces from the romantic era and later.) But naturally I have no idea how the analysis is really done...

  • Worry about the RIAA, not AltaVista, scanning your files. You better believe that they'll have it coming from multiple subnets.
  • Wouldn't it be cool (for someone who has a large collection of mp3s on their computer) to gain access to this database and use a program to rename the files correctly, based on each file's "DNA", to *name of band* - *name of song* ?

    That would correct wrongly named and labeled files and make people's collections look neater.
    --

  • OK, so this 'may' work with files posted on an internet site, but what about streaming?

    For this to work the DSP filters would have to be fitted to either the backbones, or to clients.

    The former stands 0%, just look at the outcry with Carnivore.

    The latter..... hehehehe I can see the Open Source developers rushing to add this 'feature' :)

    Phil

    The Linux MP3-HOWTO [mp3-howto.com]

  • I totally agree. The reason they came late was not because they had no idea but because they don't want to change. Like oil companies. If I was in charge of one I would look into other technologies so that when the oil runs out you can still be in control. But they don't, because they are greedy and only see tomorrow!!!!
  • Do you really think a judge, whose interest is in the status quo, would really not fine an mp3 'pirate' even if they beat a confession out of him?
  • oh for christ sake what are you talking about? there is such a blaringly obvious fundamental difference between your personal mp3s and your books. the fact that people continue to allude that "censoring" your right to share files is somehow a gesture of government control and domination is absolutely ridiculous. here's why:

    writing a book is free speech
    making copies of something that isn't yours and distributing it for free when someone's trying to make money off of it is just rude and ridiculous. jesus christ think about what you're talking about before you start getting paranoid.
  • The article says that software on the end user's computer could scan the mp3s. Firstly, they couldn't get me to install this software, let alone most people who use mp3s. And I bet this could be hacked to scan hard drives for more than mp3s. -- Challenge the system --
  • let me know exactly which search engines and I'll be damned sure NOT to use those ones, LET'S HAVE A REBEL SEARCH ENGINE WITH A FUNNY LOGO and NO BANNERS... oh wait google's already around... my bad :P
  • You're fucked! Just because the government says it's illegal you think it is. You probably agreed with anti gay laws and still think weed is bad. You suck. Do you really think that artists can afford to release CDs themselves? You don't know what market forces are. The RIAA is not about competition, it's about monopolies. Ever wonder why CDs cost the same no matter where you go? That's because the 4 or 5 major companies are involved in price fixing!
  • 2 -- an expert system powerful enough to comprehend and categorize musical information, that could tell a licensed recording of Mozart from a bootleg NIN concert, i.e. practically full-blown Artificial Intelligence.

    1. Convert the .ogg, .mp3, etc. to .wav or some other transparent linear PCM format.
    2. Use a low-pass filter to go down to the low frequencies where the bass line lives.
    3. Use a similar algorithm to the "beat finder" in many XMMS/Winamp plugins, along with the Fourier transform, to reduce the wave to a list of notes being played.
    4. Taking into account transpositions and speed changes, pattern match with MIDI files from the ASCAP, SESAC, BMI, and RIAA libraries.
    5. Sue.
    This technique also would have caught "Ice Ice Baby" (really "Under Pressure") and "Come As You Are" (really "Eighties").
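
Step 4 above can be sketched as follows, assuming steps 1 to 3 have already reduced both the download and the reference file to a list of (MIDI pitch, duration) notes. The function names and the note representation are hypothetical, and the note values in any test are arbitrary rather than taken from a real song. Comparing the interval between successive pitches cancels out transposition, and comparing the ratio of successive durations cancels out a uniform speed change.

```python
def melody_signature(notes):
    """notes: list of (midi_pitch, duration_seconds) tuples.
    Returns (pitch interval, duration ratio) pairs, which stay the same
    when the melody is transposed or uniformly sped up or slowed down."""
    signature = []
    for (p0, d0), (p1, d1) in zip(notes, notes[1:]):
        signature.append((p1 - p0, round(d1 / d0, 2)))
    return signature


def contains_melody(reference, candidate):
    """True if the candidate's signature occurs anywhere inside the
    reference's signature (a plain sliding-window comparison)."""
    ref = melody_signature(reference)
    cand = melody_signature(candidate)
    if not cand:
        return False
    n = len(cand)
    return any(ref[i:i + n] == cand for i in range(len(ref) - n + 1))
```

This only handles exact matches on cleanly extracted notes; noisy note extraction would call for approximate matching instead of `==`.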
  • One possible workaround: prepend a random barely-audible rumble to each MP3.
    --
  • I think what he was alluding to is the "slippery slope" phenomenon. The basic parts of this technology can probably be modified to examine content OTHER than MP3s - like the aforementioned example: books.

    There are those in this world who seek control in order to destroy. This sort of technology is what they dream of possessing. That article the other week about that company which had mapped every IP address on the Internet by geographic location is a similar case.

  • by Peter Dyck ( 201979 ) on Sunday November 12, 2000 @12:48AM (#629460)
    And I can see this quote in the future:

    "...to ensure that the search engine only points to legal, government approved books."
  • This is just another way that the corporate world destroys our freedom. The greed-dominated record companies, as we all know, are afraid of MP3s, as they are the one thing that can finally challenge their monopoly. Ever wondered why CDs are the same price no matter what store you go to? That's because the major labels are all involved in robber-baron-type price fixing. They crap on about market forces being destroyed, but in reality they are the market. I don't see it as stealing, inasmuch as I am not making money off the MP3s. That would be stealing. I support bands by going to concerts. I am not talking about Ticketmaster-type concerts but small ones at universities and pay-at-the-door ones. This battle is indicative of a great struggle in society. The corporate control of our lives and the destruction of the environment are all part of this. If you have any views or thoughts on this, instead of just venting your anger here, do it elsewhere as well. Join a protest like those at Seattle last year or Melbourne (I did). Join a party, speak out. The way to throw off the chains of corporate control is through direct action. Hack corporate websites such as Nike's and expose their treatment of workers. The movement is building; join in!
  • I guess you can test that theory about infinite monkeys being able to reproduce the complete works of Britney Spears. :-) "I swear to god, that link was /dev/random. I have no idea how it completely reproduced that song!"
  • If any one is interested drop me a line!
  • by pb ( 1020 ) on Sunday November 12, 2000 @12:50AM (#629464)
    Yeah, right.

    I'd like to see them do this, and encompass the myriad of different protocols and formats that abound on the web today, plus the ones that will be designed just to break it.

    I think that simple passwords, encryption, steganography, and file-sharing will each be enough to defeat this, but who knows, maybe we'll have to go to something really sophisticated, like trading over IRC, or ratioed ftp...

    Companies that base their business model over scare tactics just crack me up...
    ---
    pb Reply or e-mail; don't vaguely moderate [ncsu.edu].
  • Ohhh, maybe they will sell this technology too. Then I can finally categorize all my mp3's. I've got a couple that I can't tell whether they're Christina Aguilera or Britney Spears. With this cool technology I could just run it and it'd go grab some info from cddb.com and voila. This rules! heh. ;-)
  • by Anonymous Coward on Sunday November 12, 2000 @12:51AM (#629466)
    Time to convert *.mp3 to *.ogg.
  • and they don't discover serious flaws while doing it (say, discover 100 different tracks that have the same fingerprint)

    More likely to end up with 100 different fingerprints for the same track. Well the same track so far as humans are concerned, different tracks because of the way they have been ripped and processed. Remember that MP3 uses a lossy compression.
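
One common way around the "100 different fingerprints for the same track" problem is to treat fingerprints as close or far rather than equal or unequal. A minimal sketch, assuming a fingerprint is a fixed-width bit string stored as an integer; the tolerance of 4 bits and the function names are made up for illustration and say nothing about what MusicDNA actually does.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two integer fingerprints differ."""
    return bin(a ^ b).count("1")


def same_track(fp_a, fp_b, max_bits_different=4):
    """Treat two fingerprints as the same track if they differ in at
    most max_bits_different bits (an arbitrary, illustrative tolerance)."""
    return hamming_distance(fp_a, fp_b) <= max_bits_different
```

The tolerance is the trade-off the two comments above are arguing about: set it too tight and every rip looks like a new song, too loose and different songs collide.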
  • Yeah, one of the most common ways for "pirates" to survive is to break themselves into a bunch of little islands, so there's lots of small targets rather than one large target. "Islands" can be back alleys, secret societies, separated napster/gnutella networks, little-known alternative protocols (eg. Hotline or Napster when they first came out), dorm buildings, VPNs, etc... The idea is to be small or unknown enough to keep under the radar of the authorities but still recruit enough people that their collective scavenging makes it worthwhile (which, with zero-cost copying, is almost always).

    But after a while, the pirates get greedy and form larger clumps, which makes them more visible to the authorities. Eventually the clump gets raided and everyone scatters. Some form small islands again and grow over time and the cycle continues.

    As for me, my particular island got raided and somehow I've never gotten back into it. Other than the casual Napster use, which doesn't count because Napster is an island that was allowed unmitigated growth for long enough via unanswered legal questions that it grew to an immense size and its members are now powerful enough to openly do damage to the authorities (eg. Metallica vs. The Fans) and maybe even force The Rules Of The Game to change.

    That said, pirate islands will still be able to play by whatever rules they want.
    --

  • Quick Solution, Quick Program

    1. make a program to insert Junk From 3 to 5 seconds
    2. make a winamp plugin to remove the 3rd to 5th second.
    3. Wash Rinse Repeat...
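
Steps 1 and 2 of that scheme can be sketched on decoded audio, assuming the song is a plain list of 16-bit PCM samples; a real version would have to work on the MP3 itself and live inside a player plugin. The function names and the fixed 3 to 5 second window are illustrative only.

```python
import random


def insert_junk(samples, sample_rate, start_s=3, end_s=5, seed=None):
    """Splice random noise into the stream so it occupies seconds
    start_s to end_s; everything after start_s is pushed back."""
    rng = random.Random(seed)
    start = start_s * sample_rate
    length = (end_s - start_s) * sample_rate
    junk = [rng.randint(-32768, 32767) for _ in range(length)]
    return samples[:start] + junk + samples[start:]


def remove_junk(samples, sample_rate, start_s=3, end_s=5):
    """Player side: cut seconds start_s to end_s back out."""
    return samples[:start_s * sample_rate] + samples[end_s * sample_rate:]
```

As a later reply points out, a fixed window like this is only useful until the analyzer learns to skip the same two seconds.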

  • Well NOW you tell them. It may have also saved us from the "Oops I did it again" album (really "crap").
  • great. now they can track our music, what sites we visit(think doubleclick), and all the emails we send. i can just imagine: "he downloaded a korn song, he must be planning on shooting up his school!"
  • "A MusicDNA Analyzer can be located, for example, on the Web crawler of a large search engine, to ensure that the search engine only points to legal music."

    First of all, is this software going to have its own database of every copyrighted work in existence? No. It's going to use some form of CRC or HASH checking, which will further limit its functionality. Take into account the number of songs it is going to be required to search through and the number of songs it is going to be required to compare against - and all of a sudden the methods used to discern one song from another become more simple, and less accurate.
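
To make the CRC/hash point concrete, here is a toy comparison, assuming audio is represented as a list of byte-sized sample values. `exact_hash` uses the standard library's SHA-256, so any re-encode that changes a single sample yields a different value. `coarse_fingerprint` is a deliberately simple, made-up alternative (one bit per block, set when the block is louder than the overall average) that survives small changes but, as the comment predicts, is far less discriminating.

```python
import hashlib


def exact_hash(samples):
    """Byte-exact hash of the sample data (each sample must be 0..255)."""
    return hashlib.sha256(bytes(samples)).hexdigest()


def coarse_fingerprint(samples, block=4):
    """One bit per block: 1 if the block's mean exceeds the overall mean.
    A toy illustration only, not a real audio fingerprint."""
    overall = sum(samples) / len(samples)
    bits = []
    for i in range(0, len(samples) - block + 1, block):
        chunk = samples[i:i + block]
        bits.append(1 if sum(chunk) / block > overall else 0)
    return bits
```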

    They should call it NetNazi (tm)

  • by sith ( 15384 ) on Sunday November 12, 2000 @12:52AM (#629473)
    So this means that a search engine is now going to need to download every mp3 file it finds each time it crawls the net? Boy I feel sorry for mp3.com when one day some machine @ inktomi decides to pull down *every* mp3 on the site. That's gonna be an expensive bill...

  • is make music copyright infringement enforceable.

    What it will do is create a new genre of music, "Sonographic Clone Rock." Creating a program that can identify sonic patterns across encoding formats, bit rate qualities and whatever slight effects can be added to a copyrighted song will make it broad enough to set off false alarms with something as simple as a spoof song.

    Or at least that's my prediction.

  • by kris ( 824 ) <kris-slashdot@koehntopp.de> on Sunday November 12, 2000 @02:40AM (#629475) Homepage
    I could see this working a lot better if all the music on the Web was pristine and complete -- which it's not.

    Now, that's funny. I could see this working a lot better if all the music on the radio was new and original. How would you tell one Britney Spears song from another, or from any Ace Of Base title?


    © Copyright 2000 Kristian Köhntopp [slashdot.org]
  • What's the problem? Metallica proved that bands are more than happy to go to quite a bit of trouble to find people illegally distributing their music. I'd much rather see this used in one of those searches. If it works as advertised it would cut down on false positives: instead of getting five thousand people who just had a cover or remix of the song, you'd limit the list of IPs/Users to those with copies of your music. Since distributing music that's not yours is illegal I'd have no problem with this tech being used to single people out on Gnutella.
  • In theory that would be a non-trivial use, but I get the feeling that this technology is not for public consumption -- you could use it to verify that your MP3's were undetectable, and that would defeat the purpose of it. So, I imagine there would be hefty licensing fees.

    However if somebody wanted to create an independent freeware version for this purpose... well, that seems like a lot of work to go to for not much gain.

  • Sounds to me like someone has found a creative way to make money off the fear that big record companies have towards mp3. Sell them some fancy system that will basically just be a big waste of time.

    oh i thought you were talking about SDMI...

  • best freaking idea i've heard all day.
  • Okay, this idea has basically broken the bounds of good sense in my mind here. At this point, it has become clear to me that the RIAA and the whole music industry who are against the transmission of copyrighted work over the Internet are totally without a clue. Think I'm being overly harsh, judgemental and trollish? Well, look at what's gone on through jaded eyes, shall we? First they see that people are getting copyrighted music for free on the Net, and rightfully see it as being dangerous to their capital, so they threaten to sue everyone and actually sue the biggest offenders. Then they realize that they were basically the LAST ones to show up at the party per se, since everyone and their cat now downloads MP3s (unless you simply don't, which of course is possible, but everyone I know does, and everyone they know does too). So then they start a guilt/FUD campaign (ads like that "Artists against music piracy" or whatever it is) and in the meantime decide that they have to compete in the "digital arena" rather than in court, so they develop SDMI. Right off, we know that SDMI is a joke, no one will use it when they can have just as good for free, but let's humor them. They develop SDMI, thereby trying to trump MP3 in terms of technological brilliance, and in the process stick a bandaid on a gushing headwound. So now it's time to REWORK OUR STRATEGY FOLKS!! Yup, when lawsuits and technology don't work, let's combine the two so that we can fight MP3s all over the net and censor search engines! That'll keep human nature from manifesting!

    Okay, I'm sorry about the vehemence, but this whole issue has gotten very ridiculous. The RIAA/Music Industry couldn't have gone about this in a worse way if they had gone to the Supreme Court and asked for a law to make it a capital offense to distribute copyrighted material, punishable by death. I'm not going to lie and say that getting copyrighted material for free is not stealing, but the RIAA has screwed this whole issue up so badly, that it has become a laughingstock and an object of ridicule. If they had acted in a manner befitting of supply and demand in a consumer-friendly fashion, MP3s would never have caught on so well, and they might have been prepared for online digital music ahead of time. But this is it, in my mind. Pack up your suitcases and lawyers boys, you've lost. You took out your guns, pointed them at your respective heads and fired, and you deserved every bit of it. The Music Industry as a whole will survive, in some form or another, without you. And stay the hell away from my search engines and my Internet, because you don't know how to play our game.

  • Well then the engine won't show any of your files, including potentially illegal ones, so the engine will work as intended in the sense that it'll still only be legal mp3s.. so it won't include yours, but it won't include all anyway... there's no law against people using another engine, so what's the problem? Actually I think this would sort itself out just fine: if the US requires it to be installed, go European. If that fails, go Japanese, or wherever the hell else needed.. all it takes is one country refusing to implement it
  • Why does everyone forget that:

    a) they don't have to send out a cease and desist letter directly from the output of the program. Lawyer for RIAA will use the output of the program to find where an infringing MP3 might be, use that to go look for him/herself, and then decide if the C&D letter is appropriate. The program will just be used to improve lawyer efficiency.

    b) they don't have to catch every bootleg MP3. Just enough to put a chill on the free speech issue.

    c) they only have to search for a subset of their copyrighted songs. Those top 100 that are currently popular. They aren't losing much money on the rest, and those are scarce on the Net anyway.

    d) the RIAA et al. can run their own damn search engines on mainframes with all the money they've ripped from artists.

    Hasn't Microsoft proven time and again that software technology doesn't have to be good to fool the sheople, just good enough?

  • oh, that was me. you can download my mp3's here [geocities.com].
  • So this means that a search engine is now going to need to download every mp3 file it finds each time it crawls the net?

    That might be true, but a more logical use of the tech would be to integrate it with Napster. That would allow Napster to only share "legal" songs, or songs that are owned by companies that made a deal with them.

    Songs don't need to be re-encoded, in fact, the tech works on any song format. It looks for key elements of a sound and compares that data with a master database to ID a song. So it won't make things lossy, and it should be able to overcome attempts to bypass it by renaming or using a different song format. Encryption or zipping still might work, but Napster could stop files with certain headers from being sent. Of course there is no way to ever stop the illegal sharing of music, but this could make it more difficult for the common folk.
  • NOT in Texas prisons! I read a long article trying to prove that the incredible level of rape in Texas prisons forces inmates to band together for protection, and that is how the two morons who dragged that black man were introduced to that kind of violent racism. They were forced to join the Aryan Brotherhood or something, and get disgusting racist tattoos to prove their allegiance. That kind of coercion can break a weak mind.
  • so, is there gonna be an online jail cell for all the violators of copyright infringement, or will we arrest people and stick them in already overcrowded jail cells over the country while there are pedophiles out there wreaking havoc, or, will the government milk money out of us, and pay the overly rich musicians. (don't get me wrong there are a lot of musicians that i like and stuff!) WHAT IS THE WORLD COMING TO...HOUSE ARREST FOR YOUR COMPUTER?!?!?!!
  • This and that stupid map of the Internet that was on /. the other day are more amusing than not. I have gigs of MP3's and various other files that are probably questionable and I certainly haven't seen anyone that shouldn't be there in my iplogs scanning those files. People scp the files from me all the time so it does make me wonder what exactly they are tracking.
  • the essence of this plan is to identify which of the mp3's are copyrighted (most likely by certain record labels who pay for a service) so that unrecognized mp3's can be distinguished from those that are recognized as copyrighted. Any use of this stagnates the development of the web as a medium for music sales as they're not suggesting being able to recognize specific rips, only specific songs. So if Artist X releases track Y as an mp3 and on an album, there will be no way of determining whether an mp3 floating about is from the released mp3 or ripped from the album. So if this tracking is ever utilized, it will act as an impediment to the web's capability for legitimate sales of music.
  • They probably wouldn't give it away free IMHO
  • What about remixes of songs? These can be identical in places to the original, while still being a work by an independent artist. Are unauthorized remixes legal? What would this software do if it found one?
  • Freetantrum [freetantrum.org]. Free, open-source song fingerprinting with a sample client that can decode WAV, MP3, and Vorbis. They have an online database of all known songs with their fingerprints, and the client automatically looks the song up in the database.
  • By basing the pattern matching strictly on low-range frequencies, your FFT/"beat finder" algo isn't going to catch any of the patterns in the higher frequencies (thus being b0rked by songs with sounds strictly >= 1000Hz or so, or different songs that use the same base sounds, like, say, every rap song in existence). Further, you're planning on running this algorithm (which requires doing a digital format change and computing an FFT, neither of which is cheap in terms of disk or CPU) on every song retrieved by a search engine (more resources in terms of bandwidth)? The search engines would laugh at you if you proposed they spend money on this to actually put it into service on their machines. Oh, and now any search with music terms in it takes a leisurely 12 hours to complete if you match more than about 3 songs. Also, given the inherently distorted nature of the found song once you've bandpassed it, wouldn't you have to do the same thing to the MIDI files in the auth lib?

    So even if your non-AI algo were to work reliably (which I highly doubt), it would be prohibitively expensive in terms of system resources (now or a decade from now).


    --
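The low-frequency objection above can be illustrated with a toy example. This is only a sketch under loud assumptions: the "fingerprint" here is just the set of strong DFT bins, the two "songs" are made-up sine mixes, and none of the names below come from MusicDNA or any real fingerprinting library.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns the magnitude of every frequency bin."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n)
    ]

def peak_bins(samples, max_bin, threshold=1.0):
    """Toy fingerprint (hypothetical): indices of strong bins up to max_bin."""
    mags = dft_magnitudes(samples)
    return tuple(k for k in range(max_bin + 1) if mags[k] > threshold)

def tone_mix(freqs, n=64):
    """Sum of sine waves; with n samples per second, bin k corresponds to k Hz."""
    return [sum(math.sin(2 * math.pi * f * t / n) for f in freqs)
            for t in range(n)]

song_a = tone_mix([2, 20])  # shared "bass" at bin 2, melody at bin 20
song_b = tone_mix([2, 25])  # same bass, different melody at bin 25

# Looking only at the low band, the two songs are indistinguishable.
low_a, low_b = peak_bins(song_a, max_bin=8), peak_bins(song_b, max_bin=8)
# Looking at the full band (up to Nyquist), they differ.
full_a, full_b = peak_bins(song_a, max_bin=32), peak_bins(song_b, max_bin=32)
```

Even this toy version uses a quadratic-time transform per clip, which hints at why the poster worries about running such analysis on every file a crawler finds.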

  • If they can have code that automatically scans music, and can accurately identify it, its use will NOT be in licencing it to existing search engines. The search engines simply wouldn't buy it unless some law forced them to. It could only serve to alienate users.

    It's far more likely that they'll get hired (by the RIAA, or certain artists I can think of) to write their own spiders that go out and seek music, and write script-generated cease-and-desist E-Mails to webmasters and ISPs.

    It's almost certainly possible to plug something like this into Napster or Gnutella as well.

    If this kind of technology is both efficient and accurate, it *could* actually change things.

    -Lux
  • by theDigitizer ( 239913 ) on Sunday November 12, 2000 @02:57AM (#629494) Homepage Journal
    I am so tired of corporations/government scanning everything and everyone. Sure, we have privacy legislation, but it's not doing enough.

    A privacy amendment will allow us to quote it like we do now, such as "take the 5th" or "1st amendment rights". So we need an amendment that gives the people basic privacy rights that pertain to the 21st century, and while we're in there, we could probably solve some copyright use issues as well.

  • it occurs to me that google already does this. They have a cache of every page found by their search engine.
  • I remember not long after I got an internet connection (through the U, august of 94), this big brouhaha happened about some people (Unisys? Lawyers acting for them? How quickly brain cells die when soaked with hard alcohol...) that were supposedly releasing a worm onto the Internet to "ferret out" patent-infringing GIFs...

    The small problem with that was, it was impossible. Even if some secret header code existed in "licensed" gifs, which to my semi-sketchy knowledge about graphics file formats does not (unless maybe gifs from "licensed" authoring tools had some sort of characteristic fingerprint like "made by gimp" or whatever), imagine for a second the difficulty of finding, cataloging, and determining the ownership of every gif on the net.

    Now take all of the previous difficulties of this type of InfringeWare, undiminished and in fact probably heightened, and add to them the fact that now instead of being concerned about the file format (a relatively fixed thing), you're trying to judge infringe/not-infringe based on the content itself. This would require one of two things to work (from what I can tell talking out my ass on slashdot @ 5am whilst drinking):

    • 1 -- a complete DB of every song ever recorded in any digital format, possibly at different bitrates. This would be for a "dumb" approach using pattern matching/comparison (like a global regular expression for mp3 contents, which might actually be a nifty hack for a local program ("computer, play me something hardcore, with lots of drums" and the machine looks for patterns in the files themselves to spit something out into the soundstream)).
    • 2 -- an expert system powerful enough to comprehend and categorize musical information, that could tell a licensed recording of Mozart from a bootleg NIN concert, i.e. practically full-blown Artificial Intelligence.
    Both of these things seem less than likely to occur anytime soon. (If they had the former they'd be whoring it out a la terraserver's approach to space imagery, if they had the latter I hope as a human being that we could find more meaningful things to do with true AI than searching for mp3z! :-) )

    No, I think that this is just hot air intended to scare people into thinking the Big Bad Patent/Copyright-Holding Wolf is Just Around The Corner, so It's Time To Shape Up And Quit Trading Mp3s You Little Monsters... Another option is this is a vaporware company trying to feed off the greed and stupidity of the record labels...


    --

  • by Uncle Jimmy ( 253443 ) on Sunday November 12, 2000 @03:18AM (#629497)
    Let's say it can only match perfect rips (ok, of course it will have some sort of tolerance, but anyway), then this would be useful to weed out bad copies. How many times have you downloaded an mp3 to find that it has been encoded really bad, or worse still recorded off the radio, complete with back-announcements? Very, very annoying, because they never manage to announce the next song right.
  • I get your point, but... couldn't they send out threatening emails to anybody who has an MP3 with (for instance) Metallica in the file name, with roughly the same effect? Distinguishing between what is real and what's not is probably only useful in court... (correct me if I'm wrong...) But which is more convincing to a judge? A printout that says "This MP3 REALLY is a Metallica song," or listening to the CD version and the MP3 version in turn? At the very least I guess it could help the RIAA decide who to file charges against... but I've yet to see actual trials.

    Threatening emails are one thing; publicized court action is quite another. As soon as we start seeing some martyrs, I imagine we'll think twice before signing up for the next Napster-clone.

  • Well then the engine won't show any of your files, including potentially illegal ones, so the engine will work as intended in the sense that it'll still only be legal mp3s.

    OK, what about configuring your server to return random noise to the search engine subnets, but the real MP3s to other people?
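A minimal sketch of the "different answer for the search engine subnets" idea, purely for illustration. The address ranges are reserved documentation blocks standing in for a crawler's real networks, and `choose_response`, `CRAWLER_NETS`, and the decoy filename are hypothetical names, not part of any real server.

```python
import ipaddress

# Hypothetical crawler ranges (reserved documentation blocks, not real crawlers).
CRAWLER_NETS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

def is_crawler(client_ip):
    """True if the client address falls inside any listed crawler subnet."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in CRAWLER_NETS)

def choose_response(client_ip, real_path, decoy_path="noise.mp3"):
    """Pick which file to serve: the decoy for crawlers, the real one otherwise."""
    return decoy_path if is_crawler(client_ip) else real_path
```

For example, `choose_response("192.0.2.17", "song.mp3")` picks the decoy, while a request from any address outside the listed ranges gets `song.mp3`. It only works as long as the crawler keeps using addresses the server operator knows about.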

  • A much easier way to get it in place would be to just buy the search engine. Considering that Disney owns GO Networks, it's not a stretch.

    Nope. They would still be destroying the value of the search engine with their actions. The next week everyone will be using the latest startup search engine, because it will have greater utility, and now that one will be worth billions. They can't go on buying them up and destroying their value for long. A law change is required.

  • This technique also would have caught "Ice Ice Baby" (really "Under Pressure") and "Come As You Are" (really "Eighties").

    Which pretty much invalidates any confidence you might have as to knowing what file you have. Not to mention covers, and how does this thing deal with real bootlegs (i.e. live recordings)?
    --
  • Sounds like a hi-tech version of name that tune. I can name it in 3 TCP packets, any one else?
  • Or could we just "remix" all our mp3z by adding a few seconds of silence (or static) at the front and end, thus changing the signature?
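Whether padding helps depends on what kind of signature is being computed. The sketch below assumes toy 8-bit PCM samples and two made-up signature functions: an exact hash of the bytes, and a hash taken after trimming silence. Nothing here reflects how MusicDNA actually works.

```python
import hashlib

def file_hash(samples):
    """Exact-match signature: hash of the raw sample bytes."""
    return hashlib.sha256(bytes(samples)).hexdigest()

def trim_silence(samples, silence=128):
    """Drop leading and trailing samples equal to the silence level."""
    start, end = 0, len(samples)
    while start < end and samples[start] == silence:
        start += 1
    while end > start and samples[end - 1] == silence:
        end -= 1
    return samples[start:end]

def content_hash(samples):
    """Hypothetical content-based signature: hash after trimming silence."""
    return file_hash(trim_silence(samples))

song = [130, 140, 120, 100, 150]             # toy 8-bit unsigned PCM
padded = [128] * 1000 + song + [128] * 1000  # same song with silence added
```

Here `file_hash` differs between `song` and `padded`, but `content_hash` is identical, so silence padding only beats the naive exact-match approach. Static padding would get past this particular trim step, though a fingerprint built on what the listener actually hears presumably would not care about it either.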
  • I once took a piece of music (un-named) and using certain sound editing software sped up the music without screwing up the pitch.

    The result was a piece of music, a performance, that had never existed before, done at a tempo that had more punch and groove.

    This worked out really well. But now I have a bit of music that is something the original artist never recorded.

    Who owns the copyright to that, and how would it sort out according to this proposed technology?

  • Yes, it probably would contribute to the disappearance of free music. (At least any which is not approved of by the copyright holder).

    Yes, a percentage of it would slip underground to be forgotten. But it would slow stuff like Napster considerably since it could, in theory, sample available music from each subscriber and remove any subscribers with copyrighted material.

    None of this is a surprise to me. I was telling my girlfriend I expected to see something like this show up soon.

    Whether or not we think RIAA and its members overcharge for CD's (they do!), they do have the right to protect their stuff.

    I would rather see them use this to limit something like Napster, than to hunt down and sue individuals.

    Which they could also do.

    I won't make people happy saying this, but I prefer they do this than SDMI. SDMI limits fair use, this limits distribution.

  • The article mentions that it will block all sites that have illegal MP3's and only show sites that have good legal MP3's. With all major record labels denouncing the trading of MP3's altogether, it would be next to impossible to have a site with "legal MP3's". In order for this to work as planned, the record labels would have to flip-flop and allow MP3's to be traded, which is unlikely. So who is this software going to please?
  • by Masem ( 1171 ) on Sunday November 12, 2000 @03:39AM (#629508)
    Stuff like this, and the earlier /. stories on geographically mapping the web and such, make me upset. We have that cybercrime treaty that's pending that would probably make all these things illegal, yet if Hacker X were to do what this article talks about they'd get jail time, while if Corporation Y does it, they get praised.

    The other problem is, will they adhere to robots.txt files? If they do, then bypassing the mp3 'sniffer' is a joke; if not, then they should be considered to be violating the explicit denial of a site to allow 'hacking tools' such as a search bot and are still in the wrong. In other words, this will either be ineffective, or treading into illegal territory (and not necessarily in the vein of copyright infringement).
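To make the robots.txt point concrete, here is a small sketch using Python's standard `urllib.robotparser`. The crawler name `MusicSniffer` and the example URLs are invented for illustration; what user-agent string such a crawler would really send is unknown.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named crawler, restrict everyone else
# only from /private/.
ROBOTS_TXT = """\
User-agent: MusicSniffer
Disallow: /

User-agent: *
Disallow: /private/
"""

def allowed(agent, url, robots_txt=ROBOTS_TXT):
    """Ask whether a well-behaved crawler named `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

With this file, `allowed("MusicSniffer", "http://example.com/songs/a.mp3")` is refused while other bots may fetch the same URL. That is the whole "bypass": it works only if the crawler chooses to honor the file.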

  • From the I can't believe they're this stupid dept.

    Could this be the Technolibertarian's dream come true and the end for constant vigilance and street corner prophesying as we know it? FuckedFromTheOutset has announced a preliminary effort to start the planning process on some more vaporware: Music DNA, which the company claims *cough* is capable of identifying and tracking billions of existing and new MP3 files on the internet, providing (get this) exact accounting for the copyright. "Thus enabling file sharing and linking value added data to songs" Fucked said in a pathetic attempt to spin. When asked if they were suggesting that it is currently illegal to share files, Fucked said "No Comment."

    Fucked also announced that, in order to cover its massive burn rate, it has duped some brainless Europeans (similar to brainless Americans, but they know more than one language) into throwing money at Fucked. Musician Eric Clapton has been starving in recent months due to the evils of Napster, but still managed to scrape up a few million dollars to throw into the furnace. "Mr. Clapton's investment in the company speaks of the importance Music DNA will have in returning to the record labels their rightful monopolies, I mean, I saw the guy, he's all skin and bones." Someone said in another interestingly unattributed quote.

    The company anticipates that with industry-wide adoption of its music registry, acceptance by every node on the internet, a constitutional amendment, a UN Resolution, and a few minor acts of God, the system will enable copyright holders to identify their content usage through at least a portion of the internet, thus ensuring that ownership and royalty rights are fully "exploited, oops, don't print that, I meant 'monetized'". According to Fucked, Music DNA doesn't have an official ship date but should come out "in a few months".

    Music DNA is an extension of other FuckedFromTheOutset products which have already made a huge impact on the distribution of copyrighted material across the Internet, which include a bunch of neat sounding jargon and buzzwords. "I assure you, we have tons of buzzwords. MCSE's bow to our buzzword dominance".

    FuckedFromTheOutset bullshits about how the process works: "Ok, see, it's sorta like this. Songs have patterns, right? And these don't change much if you have an exact digital copy, like a compressed 40kbps mp3 recorded through an analog bridge, see? So bitrate doesn't matter because this is about the information carried in it. All codecs have the same information, they don't try to eliminate information and guess at what's in the gaps." Our weakly attributed source continued making a fool of himself for a few minutes, then said "Search engines can increase by at least tenfold the amount of time and bandwidth their spiders crawl through to make sure they're not linking to copyrighted material, they're really gung ho about that. Plus, an analyzer can be incorporated into a piece of client software residing on the PC to er, make sure the music is complete? Apparently, one can't figure that out by listening to it. We've talked with the XMMS people, they're all over that."

    Mor E. Assplease, an investor in the company, fumbles: "Obviously, copyright protection racket maintenance is a seminal issue confronting the Cyber-eNew iEconomy.com at the moment, and music is at the heart of the matter. With Music DNA, Napster and Scour could cover their asses by putting a lame block that doesn't work to appease the courts. We can now account to the artists and songwriters who have been shortchanged by the labels for long before the eInternet iEconomy.com, or wait, I didn't mean that". The company's Olsen Wells expresses his hopes for the process, adding that "as the industry transitions from music as a product to music as a service, Music DNA could conceivably have the greatest single impact on the music business since the creation of the MP3 format". When asked if he could clarify that statement, remove a few buzzwords, or somehow make it make sense, Wells replied "No Comment".

    Richard Stallman, leader of the Free Software Foundation, and proponent of free music, corrected our use of the word 'Linux' (apparently, it's GNU/Linux) but then began to laugh hysterically as we attempted to explain what Music DNA was. "I can just mess it up with dd on my Linux box" He continued, "GNU! GNU/Linux box I mean! Please don't print that".

    Lawrence Lessig, a Technolibertarian known for his book Code and Other Laws of Cyberspace, when asked about it, fought to keep an amused look off of his face and said "Well, we've obviously overestimated the enemy here, I'll have to drastically restructure my 'invisible hand' theory, it assumes a much higher caliber opponent than that with which we are dealing".

  • I am so tired of corporations/government scanning everything and everyone.
    Is scanning your publicly-available files an invasion of privacy? If I run an ad in the newspaper offering pirate CDs, should I be able to put small print that says "Under the Privacy Act, no RIAA member or law enforcement agent or anyone acting on their behalf may reply to this advert"?
  • by interiot ( 50685 ) on Sunday November 12, 2000 @06:01AM (#629516) Homepage
    How 'bout just using political pressure? "Ladies and Gentlemen, this company could easily ensure that it doesn't trade in illegal wares. We have refined the program to a point where anyone can use it, in fact, Joe Schmoe here, a kindergarten teacher, was able to install it in his school's library in two hours. If search engine X isn't willing to take such easy steps, then the statement they're intending to send is that they wish to support trafficking in illegal wares."

    An argument similar to this was used to get the mandatory-porn-filters-in-schools-n-libraries amendment [68k.org] included in the House Appropriations bill that has a good possibility of being passed in the next week or two:

    • Mr. McCain: Internet filtering system work[s], and they need not be blunt instruments that unduly constrain the availability of legitimately instructional material. Today they are adaptable, capable of being fine-tuned to accommodate changes in websites as well as the evolving needs of individual schools and even individual lesson-plans. ...

      As we have seen through an increasing flurry of shocking media reports, the Internet has become the tool of choice for pedophiles who utilize the Internet to lure and seduce children into illegal and abusive sexual activity. ...As we wire America's children to the Internet, we are inviting these lowlifes to prey upon our children in every classroom and library in America.

    From porn filtering to copyright filtering. Not a large leap.
    --
  • or any other lossless compression mechanism. Wouldn't be too hard to develop 3pm, either, which stores the files backwards, or shuffles every n seconds, where n is a number between say 1 and 10 (depending on bandwidth).

    When will they realise (like BMG) that working with this new paradigm is much better than trying to defeat it? Oops, I guess they still haven't figured it out, witness the losing 'War on Drugs'
  • I want my lossy MP3s to become even lossier by running them through another codec. Right.

    - A.P.

    --
    * CmdrTaco is an idiot.

  • ...when, upon reading the article, I saw they used the "word" "monetized".

    Please. Anybody who thinks that's a word obviously has about 3 working brain cells (i.e. marketing.)

    - A.P.

    --
    * CmdrTaco is an idiot.

  • If this works as it should, how is it going to distinguish between covers/bootlegs and the original? This is particularly important with bands like Pearl Jam allowing bootlegs to be distributed for non profit purposes freely on the internet.
  • A workaround that's even easier to implement, and doesn't affect existing software: Use Apache's mod_rewrite so that every visit from the evil crawler gets fed the same small file, no matter what is requested.

    If the file contains a brief obscenity, so much the better.

    Those not running Apache, well, they need their own solution. Or to upgrade to Apache!

  • by stevens ( 84346 ) on Sunday November 12, 2000 @01:05AM (#629545) Homepage

    Their "application" (a webcrawler not logging 'illegal' mp3s) is a load of crap. Let's say I have cut in the first 15 seconds of a copyrighted song--without permission--as a sample that I go on to critique in the audio file. I think that's fair use.

    IANAL, but neither is the webcrawler a lawyer. It doesn't have the ability to judge fair use.

    Worse, think if this 'webcrawler' is an RIAA bot looking for people to sue. It could lead to lots of frivolous actions.

    Steve
  • by EasyTarget ( 43516 ) on Sunday November 12, 2000 @01:06AM (#629548) Journal
    mp3 and pron are by far the two biggest search categories out there, if the shlock-horror headlines in most rags are true.

    "Dear Mrs search engine owner, please may we install something on your search engine servers to cut out a sizeable proportion of your customer base?"

    What a great business model? It will -require- a law change to work, unless they think UCITA/DMCA can already be used to intimidate big players like AltaVista.

    EZ
    -'Press Ctrl + Alt + Delete to log on..'
  • This could be a great system. A friend and I are looking for a way of synchronising our mp3 collections, but without copying the same song again if there's a few seconds' difference in length, or a different name.

    this system could be very useful :)

  • Create a page with 15,000 links to /dev/random, each with a different .mp3 filename.

    Could be great for hours of fun.

  • #1. It's easy to do
    It seems fairly easy to me to do a good "fingerprint" of a song by doing the math, determining the notes of the song, and the tempo, and maybe even determining who is singing based on voice sample matches once you're close.

    #2. It's hard to defeat
    Once you've got the code to do it, you can tweak the engine to work with different bit rates, streaming, etc.

    Because they base it on the psychoacoustic model, it pays attention only to the parts you want to hear anyway. It will ignore the various means you use to tweak the files, as long as they sound the same, which is the main goal for the consumer of the files in the first place.

    #3. It's hard/impossible to implement
    What's also obvious is that the "search engine" would now have to download every instance of MP3 file it happens to encounter. This would result in a massive increase in the amount of traffic for an already futile system of indexing the web.

    We've already seen that the spiders that back search engines just don't have a prayer of keeping up with everything that is available. This is just dealing with the text part of web pages. Imagine trying to deal with millions of 3-10 Megabyte files that change every day!

    #4. It must surface in a different model
    It's just not feasible to download all of the MP3s that are available to do this, which means that the system is going to have to be selective in its downloading, and will, by necessity, result in "selective enforcement" of whatever laws it is used to detect violations of.

    If lawmakers decide to run with this approach, they'll have to settle for selective enforcement (with the resulting requirement of making the penalty huge to compensate for the odds of getting caught), or they will have to resort to the insipid approach of requiring ISPs to run the program against their own servers. (The FBI could also be even more insidious and build it into Carnivore). Let's also consider it might get built as a feature into the web servers. (Good thing Apache is open source!)

    Mike Warot, Hoosier
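The bandwidth worry in point #3 above can be put into rough numbers. Every figure here is an assumption chosen for illustration (5 million files, 5 MB average from the "3-10 Megabyte" range, one 45 Mbit/s link); the thread gives no actual counts, and `crawl_days` is a hypothetical helper.

```python
def crawl_days(num_files, avg_mb, link_mbit_per_s):
    """Days needed to download num_files of avg_mb megabytes each
    over a link of link_mbit_per_s megabits per second."""
    total_megabits = num_files * avg_mb * 8   # 8 bits per byte
    seconds = total_megabits / link_mbit_per_s
    return seconds / 86_400                   # seconds in a day

# Assumed figures: 5 million files, 5 MB average, a single 45 Mbit/s link.
days = crawl_days(5_000_000, 5, 45)
```

Under those assumptions one pass takes a bit over 51 days of saturated transfer, before any decoding or analysis, which is the sense in which a crawler would have to sample rather than fetch everything.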

  • Yep, this is obvious tech ... but, the genie is out of the bottle.

    How is this going to deal with gnutella, freenet, mojonation etc?

    Me, I like the 'private networking' option in Gnotella and others. Me and my buddies set up private little sharing networks. I believe that Groove and others have taken this P2P thing to new heights also.

    Sure, this may well work for all those geocities accounts and stuff, but at last count there were about, what? 20million+ Napster users ...

    When will these turkeys wake up and stop trying to prosecute their customers.

  • by Anonymous Coward
    This system is gonna have a really hard time with classical music. How can it know whether it's an amateur interpretation or a CD rip? The system can either accept everything or accept nothing, thereby insulting either the music labels or common sense.
  • And the MP3's of my band are going to start the flashing lights saying "sounds like stevie ray, but can't make out which song. wait! it's all of them!"

    It's as good an idea as the Strategic Defense Initiative.
    And it will be as successful.

    FatPhil
  • by frenchs ( 42465 ) on Sunday November 12, 2000 @01:41AM (#629573) Homepage
    I was thinking... (which is sometimes a dangerous thing)

    Let's just say you have a server with a bunch of MP3's on it. And let's say this analysis of mp3's becomes a viable technology. Well then what is to prevent me from configuring *my* server (banning the ip) to ignore the search engine that implements this? :)

    Steve

  • You are right.

    The only working method (taking things like freenet into account) that I can think of would be the closing up of both the hardware and software that's used in connecting to the net.

    Integrate the network adapter into the motherboard and make it add a unique and traceable (who bought it, physical location, packet contents hint,...) ID into every packet.

    No more self-assembled computers. Access to the stuff inside the chassis would be allowed only to authorized personnel. Just like heroin can be manufactured and sold legally today but only by the authorized people in the drug industry. Any unauthorized access would be a criminal offence.

    Only authorized Operating Systems and device drivers allowed. Programming tools would also become controlled material.

  • Sounds to me like someone has found a creative way to make money off the fear that big record companies have towards mp3. Sell them some fancy system that will basically just be a big waste of time.

    Let's think about just a few of the simple ways to defeat such a system.

    Firstly - password protect mp3 download sites - Duh. In which case if the robot gets unauthorized access to the site, the ppl running it would be liable to break & enter charges.

    Secondly, it would be a very simple matter to have an mp3 encoder shift a lot of the audio values around so that any track appears quite differently from the perspective of a binary analysis, but doesn't alter the end sound remarkably.

    Yet another example of how AI isn't. And how it is always much simpler to fool an AI than it is to improve it. Think of the Iraqi techniques to fool American smart-bombs - current AI systems are all incredibly stupid when put against even moderate human ingenuity.
