Slashdot
Ogg Vorbis - The Free Alternative To MP3
from the blinded-me-with-science dept.
Christopher Montgomery:
Vorbis is a hybrid time/frequency transform coder like mp3, but the similarity really ends there; it's more similar to TwinVQ in some ways (many shared mechanisms, albeit used somewhat differently).
Like mp3 (and virtually every other useful transform coder), we first look for strong changes and natural breaks in the input audio, and can use this information to break up the incoming audio into different sized blocks. When you lose information in the frequency domain, the resulting noise spreads throughout the time domain. A very strong spike in time will get smoothed out by frequency quantization, so the larger the block, the more audible it is. You want to isolate these strong, sharp events in smaller blocks.
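As a rough illustration of the block-switching idea above (not libvorbis code), the sketch below picks long blocks by default and falls back to short blocks wherever the energy jumps sharply. The block sizes, the threshold and the function name are assumptions made for the example, and real encoders also have to deal with overlapping windows, which this ignores.

```python
def choose_block_sizes(samples, short=256, long=2048, threshold=8.0):
    """Return a list of (start, size) blocks covering `samples`.

    A long block is kept unless one short window inside it carries far more
    energy than the window before it, which hints at a sharp attack.
    All sizes and the threshold are illustrative assumptions.
    """
    def energy(start, size):
        # Small constant avoids dividing by zero on digital silence.
        return sum(x * x for x in samples[start:start + size]) + 1e-12

    blocks = []
    pos = 0
    while pos < len(samples):
        span = min(long, len(samples) - pos)
        # Energies of consecutive short windows inside the candidate long block.
        energies = [energy(p, short) for p in range(pos, pos + span, short)]
        transient = any(b / a > threshold for a, b in zip(energies, energies[1:]))
        if transient:
            # Isolate the sharp event by splitting the span into short blocks.
            blocks.extend((p, min(short, pos + span - p))
                          for p in range(pos, pos + span, short))
        else:
            blocks.append((pos, span))
        pos += span
    return blocks
```

Feeding it silence followed by a sudden loud passage should give one long block for the quiet part and a run of short blocks around the attack.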
Past this point, the similarities with mp3 end. Vorbis can do a time-domain pre-encoding using wavelets to further reduce spreading of time events and non-tone data. The current libvorbis doesn't have the code to do this yet, but the hooks are there for when we do finish this code (this feature will be post 1.0. Wavelets are still something novel that no one else is using in serious production yet, and we need to do more real R&D before it's ready).
Vorbis takes the time data directly to the frequency domain with an MDCT, where mp3 first subbands the data. The polyphase pseudo-QMF filter that mp3 uses for subbanding is not completely orthogonal; no matter how good the implementation, there will always be some aliasing. For this reason, Vorbis dispenses with subbanding altogether and just uses a large MDCT.
Vorbis then computes line-by-line masking curves for local peaks, long-distance simultaneous tone masking, simultaneous noise masking and temporal masking. These curves are used to separate inaudible tones from audible tones, and then choose a frequency domain amplitude curve that represents the 'base energy' of that audio frame. The base energy curve (I call it a floor) is subtracted from the MDCT data (like a whitening filter), which produces 'frequency residue'. The floor is converted to an LSP (line spectral pair) representation and then it and the MDCT residue are vector quantized into the final output codewords by a cascade of custom VQ codebooks that are packed along in the header of the bitstream. The result is one vorbis audio packet.
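The floor/residue split can be pictured with a toy example. This is only a sketch under loud assumptions: the floor here is a plain moving average of the log magnitudes rather than a masking-derived curve, there is no LSP conversion or vector quantization, and "subtracting" the floor is done as a division in the linear domain (the same thing as a subtraction of logs) so the sign of each coefficient survives. All function names are made up for the illustration.

```python
import math

def spectral_floor(coeffs, radius=2, eps=1e-9):
    """Smooth envelope of |coeffs|: a moving average in the log domain.

    Stand-in for the psychoacoustic floor curve; `radius` is an assumption.
    """
    logs = [math.log(abs(c) + eps) for c in coeffs]
    floor = []
    for k in range(len(logs)):
        lo, hi = max(0, k - radius), min(len(logs), k + radius + 1)
        floor.append(math.exp(sum(logs[lo:hi]) / (hi - lo)))
    return floor

def split(coeffs):
    """Encoder side: return (floor, residue) with residue = coeffs / floor."""
    floor = spectral_floor(coeffs)
    residue = [c / f for c, f in zip(coeffs, floor)]
    return floor, residue

def combine(floor, residue):
    """Decoder side: multiply floor and residue back together, line by line."""
    return [f * r for f, r in zip(floor, residue)]
```

Without quantization the split is exactly reversible; the bitrate savings in a real codec come from how coarsely the two parts can then be coded.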
The audio packet is then embedded into an Ogg bitstream page and the page (when full of packets) is shipped out in the stream.
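To give a feel for how packets end up in pages, here is a minimal sketch of Ogg-style lacing: each packet length is written as a run of 255-valued segments ending in a value below 255, and a page's segment table holds at most 255 such values. The sketch assumes every packet fits in one page and leaves out page headers, checksums and packets continued across pages; the function names are hypothetical.

```python
def lacing_values(packet_len):
    """Segment sizes for one packet: 255, 255, ..., then a final value < 255."""
    full, rest = divmod(packet_len, 255)
    return [255] * full + [rest]

def paginate(packet_lengths, max_segments=255):
    """Group packet lengths into pages without exceeding the segment limit.

    Simplification (assumption): a packet never spans two pages.
    """
    pages, current, used = [], [], 0
    for length in packet_lengths:
        need = len(lacing_values(length))
        if need > max_segments:
            raise ValueError("packet would span pages; not handled in this sketch")
        if current and used + need > max_segments:
            pages.append(current)      # page is full, ship it
            current, used = [], 0
        current.append(length)
        used += need
    if current:
        pages.append(current)
    return pages
```

Note that a packet whose length is an exact multiple of 255 still needs a trailing zero segment so the decoder can tell where it ends.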
The decode side does the reverse, but without all the masking analysis. We extract the string of packets from the Ogg bitstream, and for each packet unpack the floor and residue, take the dot product and then do an inverse MDCT to recover the time-audio frame. Each frame is lapped and added to the previous frames and we get the original audio out.
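The lap-and-add step can be demonstrated with a small textbook MDCT. The sketch below is not the libvorbis transform: it uses a plain sine window, a direct O(N^2) sum and no quantization, all of which are assumptions chosen to keep the example short. It shows that each frame on its own contains time-domain aliasing, and that overlapping consecutive frames by half and adding them cancels it.

```python
import math

def sine_window(n2):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (t + 0.5) / n2) for t in range(n2)]

def mdct(block):
    """Windowed forward MDCT: 2N time samples -> N coefficients."""
    n2 = len(block)
    n = n2 // 2
    w = sine_window(n2)
    return [sum(w[t] * block[t] *
                math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                for t in range(n2))
            for k in range(n)]

def imdct(coeffs):
    """Windowed inverse MDCT: N coefficients -> 2N aliased time samples."""
    n = len(coeffs)
    n2 = 2 * n
    w = sine_window(n2)
    return [w[t] * (2.0 / n) *
            sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                for k in range(n))
            for t in range(n2)]

def roundtrip(signal, n):
    """Transform frames that overlap by N samples, then lap and add them."""
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n + 1, n):
        frame = imdct(mdct(signal[start:start + 2 * n]))
        for t, value in enumerate(frame):
            out[start + t] += value
    return out
```

Only samples covered by two frames are reconstructed, so the first and last N samples of the output stay aliased unless the signal is padded.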
Very simple, see? :-) To be fair, the masking analysis is the only real black magic. What I'm doing is almost entirely based on the masking curve data published in the late 50's by Robert Ehmer.
One thing the current release of Vorbis does not have is channel coupling (like mid-side stereo, although we'll be doing it differently). Beta 1 and beta 2 actually include multiple totally separate channels. The fact that we equal and better mp3's quality missing this huge piece is exciting. Mid/side stereo in mp3 drops the final bitrate of a stereo stream by 30-50kbps. To get a real comparison of Vorbis vs. mp3, compare mono streams or force the mp3 encoder not to use joint/intensity stereo (eg, -m m in LAME 3.84). Vorbis at 56kbps mono beats mp3 at 80kbps. At equal bitrate there's no comparison at all.
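For readers unfamiliar with mid/side coding, a minimal sketch of the general idea follows. It illustrates the generic technique only, not how any mp3 encoder implements it and not the coupling planned for Vorbis: when the two channels are similar, the side signal is close to zero and costs very few bits.

```python
def to_mid_side(left, right):
    """Convert left/right sample lists to mid (average) and side (half difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def from_mid_side(mid, side):
    """Invert the transform: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```

Coding the channels separately, as the current betas do, spends bits on the shared content twice, which is why the comparison against joint-stereo mp3 is tilted.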
Slashdot: For those just tuning in, what's the project all about, and how did it get started?
The Vorbis codec is a lossy audio compression codec similar to mp3, but we're shooting for better performance (lower bitrates for a given level of quality) as well as keeping it totally Free as in Beer and Speech. I started work on Vorbis a week or two after Fraunhofer sent out 'cease and desist' letters to several free mp3 encoder projects in the fall of '98. At that point, it was clear the worst case was happening; the squeeze was on by commercial entities to not only dominate the legal distribution of music, but the underlying technology as well. A 'free license' to owned technology means nothing (and that's why Real and Windows Media are also worthless as infrastructure to us).
Fraunhofer (and MPEG in general) and the RIAA are also a bit too friendly behind the scenes, if not entirely in bed together. If you really believe SDMI is about protecting the artists, well, I have some wonderful Oklahoma beachfront property for sale at prices that are a steal, but you'd better act fast!
It's ironic that at the same time mp3 has been an agent to open up music distribution, it's becoming a tool for commercial interests to reclaim control. If online music is to fulfill its potential, an oligarchy can't be allowed to control its distribution or the technology behind it. The Internet would not have reached critical mass if it was a product of Microsoft or AOL or Oracle... It wouldn't ever have happened. Corporate control of every facet of online music will just strangle it in the cradle. The inventors of the Internet 'gave it away,' and that's been a great thing for business. However, the important lesson here is that the foundations were set in stone and wrought from iron before any company had self-interested influence. TCP/IP (brought to you by research laboratories) is elegant and farsighted; it's taken thirty years for it to begin wearing thin. E-mail is similarly brought to you from academia. HTML, on the other hand, (as ultimately brought to you by Netscape and Microsoft) makes good engineers weep and gnash their teeth.
We need to have unbreakable free music foundations in place before letting the commercial interests have their way with the infrastructure. I wouldn't rely on any infrastructure they build themselves.
Ogg and Vorbis are trying to continue the principles for which we in the open world see mp3 standing.
Slashdot: What are you working on right now?
Vorbis second beta. General quality improvements, additional bitrate modes in the encoder (96-350kbps stereo, mono modes), bugfixes, etc. After beta 2 (look for it on Tuesday at about the time LinuxWorld Expo in San Jose opens), we have low bitrate modes to finish, channel coupling (joint stereo and joint surround) and constant bitrate modes (Vorbis by default is VBR).
Others in the project are working on tools... Mike Smith, Kenneth Arnold and others are knee deep in utils, Jack and Chad of Icecast are adding Ogg streaming to Icecast, Ralph Giles and Rob Kaye are working on stream mixing, metadata streams (Ralph is also hacking on MNG over Ogg). Kim, Tori and Emily at iCast are writing documentation...
The project has also outgrown our group. There are now Vorbis news sites (like govorbis.com and vorbiszone.com), an all-vorbis music label (vorbisonic.com) and other vorbis related sites popping up. angrycoffee.com is working on Vorbis tutorials for beginners.
Within the core team, we need to get more people who are up on signal processing aspects like in the community around LAME.
Slashdot: Is this your full-time thing?
Yes. Ogg and Vorbis development are sponsored by iCast and they're also deploying it internally. In addition to paying salaries, they're pitching it to the industry and providing legal assistance.
Slashdot: Xiphophorus is a collection of people, projects and tools. What's going on with the collective?
Vorbis is a 'serious' project now, so we're expensing the massive espresso consumption ;-) The few of us who are now getting paid to do this can afford to be extremely intense about it. Other contributors still come and go. Right now, we're all pretty much focused on Ogg Vorbis; I have to apologize to all the cdparanoia users out there. I'll be working on it again in the future, but right now I only have so many cycles.
Ogg and Vorbis are currently getting more outside attention than we can really gracefully handle (well, handle and still get work done at the rate we're used to, which was still always slower than we want ;-) Apparently someone on some list claimed 'Vorbis was dead' because we hadn't updated the Web site in a month. Ha! If we were 'dead' we'd have plenty of time to write HTML :-) And answer mail. Anyone who sent me personally mail in the past month and a half, I'll answer it eventually, I promise...
Slashdot: Are you out to replace mp3 as the sound format of choice? If not, why not, and if so, what are the challenges?
We're out to keep things Free (capital F intentional). If MPEG turned around and made the mp3 spec and patents public domain, we'd definitely declare victory (and then continue coding to improve Vorbis). But we all know that isn't going to happen. More likely, if Fraunhofer decides we're a threat, they'll just delay licensing (remember kids: free licenses to binaries aren't worth jack) until the competition dies down. Then they'll squeeze again.
Honestly, I don't think we're going to 100% replace mp3 (people still use RAR for Christ's sake). I lay better than even odds on us eclipsing mp3 in the next year if the licensing picture stays the same. We also intend to have 80-96 kbps stereo streams that sound better than mp3 128 by that point, so people (and businesses) won't exactly have to give anything up to save money. Also expect hardware support soon, possibly by end of year if things go smoothly.
Slashdot: You talk a lot on your Web site about Open software. Which came first, the desire to deliver multimedia, or the drive to develop it openly?
My real hacking skills germinated at the MIT Lab for Computer Science. I'd coded practically all my life before getting to MIT, but I'd always been the best coder I knew, so I hadn't really learned much. When I got to MIT, I didn't feel stupid but it drove home that I had a lot of catching up to do. Most of my mentors were from the previous generation (all open source people) but a few of the very hardcore people were younger than me, too.
I've been a musician all my life too, albeit not a very good one (I feel a bit like Salieri in Amadeus) and Ogg was born in '93 when I bought a 1 Gig hard drive and a sound card and thought 'this is unlimited space! I can put music on this! And do things with it!'. I quickly found out that a Gig wasn't unlimited by a long shot, not even in '93 (I filled it with mail eventually), so I started muddling with compression. Greg Hudson made an offhand remark about there not being any good, free, music compression libs at the time, and Squish was born. I got a letter from a lawyer a few months later politely informing me that 'Squish' was a registered trademark and if I didn't change the name of my software, I could forget ever owning anything in the Western World ever again. Mike Whitson renamed the codec 'OggSquish'. The Ogg project was born. Oh, and we plan to release an updated Squish codec again sometime in the next year.
yes but... (Score:3)
Will Vorbis be the next VQF? (Score:4)
Hardware Hardware Hardware!! (Score:5)
When we get the first wavelet-enabled version, I would love to see Ars Technica (or somebody else) do an independent technical review of the audio quality vs. mp3 (and maybe vqf, aac, windows media, and whatever else there is...)
vs LAME (Score:3)
The LAME team takes extra care in analyzing the output and comparing it to the FhG encoder and the previous version of LAME (just in case something broke). How does the Ogg team compare results? Is it with listening tests?
-mark
How about sync issues? (Score:4)
When MP3 is illegal (Score:5)
(Sure, you'll be able to find them on the Net, but if RedHat can't legally put them on their CDs, they're in the same twilight zone as arcade ROMs.)
As there is a single point of control for MP3, the RIAA could easily pay Fraunhofer a few billion (or even buy them outright through a front company), and get open MP3 pulled, forcing everybody to upgrade to encrypted SDMI formats.
Owning the patents for a technology you wish to bury can be very powerful. When Macrovision developed the copy protection mechanism embedded in all DVD players, they also created and patented a device for removing the protection. This enables them to sue anyone attempting to sell such a device or distribute the details of constructing one. (Not that it eliminates said information, but it drives it sufficiently underground to keep the ordinary people from seeing it.)
Once Fraunhofer start getting heavy with MP3 licensing, the penguinhead army will adopt Vorbis in a flash, and hopefully so will Windows-using music fans. Then the battle lines shift to hardware players.
The way to win the market (Score:4)
"Hey, this piece of music is available in both mp3 and ogg format, but oggs are a lot smaller, so I think I'll download that one."
if it does, that attitude will be why.. (Score:3)
as for me, i plan on investigating it more as soon as i get home today. it sounds (of course) great, we'll see how it
still, just like mp3, it won't be a replacement for CDs.. i haven't downloaded an mp3 in months, actually. i can't play them in my car, and why would i want to? CDs sound better. sure, there's the hassle of switching CDs, but really, with a nice sized disc-changer, that's just a once-a-month switch.
at one time i had 13gigs of mp3s available on my machine. that was over a year ago. right now i have none.
it's useful, but only for songs you don't really care about, you have an urge to hear them, download it, listen to it, and you're done. maybe check out a new album. but songs you care about, ones you want in a collection (i hope i'm speaking for more than myself here), are worth having on CD.
...dave
VQF got eaten by being proprietary (Score:5)
It may not go mainstream, but it will not be defeated like vqf was.
And if we get hardware players, you can bet I'll be moving all my music to vorbis!
Re:When MP3 is illegal (Score:3)
Just like we all use .png instead of .gif today you mean?
Sorry, I'm in a pessimistic mode today
Why wouldn't it? (Score:4)
Furthermore, if this new format proves to have better quality for lower bitrates then there is an additional incentive to use it. Even if it didn't, people don't really have a loyalty to Codecs. People talk about MP3's because that's the only tech out there right now that provides the quality for the space constraints. It could be WAV, or AIFF, or RealAudio for all they care. Since they don't have to buy new hardware to support new codecs it doesn't matter to them.
---
Try minidisc (Score:3)
I went out and bought a minidisc player for about $200. It is a dream since discs are only $1.50 a piece and hold as much as a CD. It supports digital recording and is smaller than almost all mp3 players except maybe that mp3 player in a pen from sony. Anyway, it will save you on flash costs and on long trips, having ALL of your music instead of 32 megs worth is a real plus. Only downside that I have found: realtime recording, but you only record once so it hasn't been a hassle. Check out minidisc.org [minidisc.org] if you are interested. Some players even support text transfer from the computer etc., and it saves so much on media costs.
Re:I'm hearing FUD or a FUD-like substance (Score:5)
> RIAA. MP3 is VHS, Ogg is Beta (a bit better - but is it worth the switch?), and the only way
> that it is going to catch on is through scaring people away from MP3.
If you are a musician, and you want to put a recording up on your web site, you owe $15,000 per year according to www.mp3licensing.com.
That is scary.
Re:Try minidisc (Score:3)
Blank CDRs are going for about $0.50 / disk. Using ogg or mp3 format, they hold roughly 10 CDs worth of music on one disk.
MP3 players which double as CD Players are the perfect solution -- you can burn your own music collection onto CDs and listen to them anywhere. As the original poster correctly points out, we need ogg support on these hardware mp3 players! Fortunately, most are flashable, so upgrading to new, better formats (such as ogg) shouldn't be a problem.
Re:yes but... (Score:5)
Re:URL? (Score:3)
and I expect you to actually make it a link, like this: http://www.vorbis.com/ [vorbis.com]. You shouldn't lambaste /. for being lazy if you are too :)
i just can't shut up today.. (Score:3)
but this isn't all good, it's a question of quantity vs. quality
> MP3 players which double as CD Players are the perfect solution -- you can burn your own music collection onto CDs and listen to them anywhere. As the original poster correctly points out, we need ogg support on these hardware mp3 players! Fortunately, most are flashable, so upgrading to new, better formats (such as ogg) shouldn't be a problem.
the whole hardware-mp3-player hype is mind boggling to me. i mean, i understand it, as a fad and gadgetry thing, but.. you're basically making a downgrade in your audio system when you go from a CD player to a straight-up mp3 player. the hybrids are the obvious, good, middle-ground.. as long as the price difference is reasonable.
the hardware fad is so bizarre though,.. there was never this kind of reaction over MOD or S3M
...dave
Re:How about sync issues? (Score:3)
Ogg is designed to allow seeking to sample precision, and indeed the current library can do exactly that (the players only seek to a given time, rather than a given sample, but that's a limitation of the players rather than the format). All the info is there, and the library gives you everything you need to use it.
Ogg also allows for multiplexing of streams (so an (as yet non-existent) ogg video codec could give you one stream, and vorbis give you another, all within the same file, and all with the syncing info you need.
Vorbis support in CDex (Score:4)
Re:Damn.. OGG *is* good! (Score:3)
Also, I'm listening to Trio's quintessential "Da Da Da" song, which the encoder actually made _larger_ in OGG format. Quality's decent, but you have to wonder what causes some files to become larger. Most of the files I've encoded get smaller -- around 60-80% of the original size.
This guy's got it all wrong. (Score:3)
<beep>
<frequency>50hz</frequency>
</beep>
<guitar>
<style>bass</style>
<note>high C</note>
</guitar>
...
Re:The way to win the market (Score:3)
Also they have encoders for Linux X86, BEOS and WIN.
Forget Napster; Gnutella already has support for every freaking format there is or any new formats you could ever come up with.
Holy CRAP! (Score:3)
But wait, it gets better! A minimum of one cent per download??? Just how much do they freakin' think I _make_ from the downloads at mp3.com? I don't CHARGE people for those, it's part of the incentive program. As near as I can figure this would be in the thousands of dollars! O_O
Oh, man, you are right, this _is_ scary. Hopefully I can get paid what I'm currently owed while still remaining 'blissfully ignorant' of this situation. To me, the one cent minimum royalty on each download is the CRUSHER- that is an absolute showstopper. Makes me glad I never invested in an encoder beyond BladeEnc (say what you want, if you have the capacity of pre-emphasising the highs you can get a really smooth full sound out of it that's not over-dull).
Who knew? I could see a scenario in which mp3.com itself switches to Vorbis- that is, if it is actually possible to levy an mp3-distributing tax of a penny a download. Oy... this is nuts... looks like I have to teach myself C programming just to be able to compile and build Vorbis binaries for my Mac just to be able to operate as a musician... talk about 'not in the job description'! o_O
Hardware MOD (Score:5)
I guess the lack of press coverage for AmigaRut's products is just another lamentable sign of the media conspiracy against forward-thinking Amiga-friendly companies striving to keep the hype alive for the latest, most bleeding-edge 80's technology.
Unfortunately, our MOD player has been delayed because we have been working to incorporate not only S3M files, but also the old Apple II faux-stereo PCM files, complete with a codec that faithfully reproduces the wonderful warm, buzzy sound of the Apple II system speaker.
You're not going to find value like that in any johnny-come-lately MP3 player, bucko!
--
Re:Horse and Buggy (Score:4)
Bandwidth *is* increasing, but not at the geometric rate that storage has been (~50x over the last 5 years). Not only that, we've hit the first of the infrastructure bottlenecks with broadband (Quick show of hands - How many of you are getting your Broadband via DSL _through a DSLAM_? Not many, huh? Didn't think so.) If you can't get broadband now, odds are good that you won't have it for at least another year. (18-30 months where I live, an hour from Austin in the I-35 corridor. God help you if you're in the sticks.)
But that's just last mile stuff. 3 years ago, the combined US backbones (OC-48s at best) could handle ~300,000 simultaneous 16kbps RA streams given the exclusion of all other traffic. It's way larger than that now, but so is the online population, and if you want to stream audio and have enough listeners to make it worthwhile, you're gonna need a thin stream and a shitload of bandwidth. In real terms we still don't have the infrastructure. A T-3 will only handle ~2800 16kbps streams, and that's not a lot of listeners for the money. Many novel solutions have been proposed to this dilemma, but you always end up looking at a bandwidth problem at the backbone or the last mile.
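The T-3 figure is easy to check. The sketch below assumes the nominal T-3 line rate of 44.736 Mbit/s and ignores all protocol overhead, so real-world numbers would be somewhat lower; the function name is made up for the illustration.

```python
T3_KBPS = 44_736  # nominal T-3 line rate in kbit/s, overhead ignored

def max_streams(link_kbps, stream_kbps):
    """Whole number of constant-bitrate streams that fit on a link."""
    return int(link_kbps / stream_kbps)

# 16 kbps streams on a T-3: the "~2800" quoted above.
t3_streams = max_streams(T3_KBPS, 16)

# Aggregate bandwidth for 300,000 such streams, in Gbit/s.
backbone_gbps = 300_000 * 16 / 1_000_000

# The same T-3 if a codec needs 30% fewer bits per stream (11.2 kbps).
t3_streams_smaller = max_streams(T3_KBPS, 16 * 0.7)
```

A 30% thinner stream lifts the T-3 from roughly 2,800 to roughly 4,000 listeners, which is the economic point the next paragraph makes.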
Any technology that reduces the bandwidth required will quickly find a home. If an ogg/vorbis encoded song is 30% smaller, streaming or absolute, it significantly changes the economics of the bandwidth equation and lowers the time cost for any user who's not fortunate enough to have broadband.
Don Negro
Re:Possibly... (Score:4)
Let's look at the situation for mp3...
If you want to sell a program that decodes mp3's, it'll cost you 50 cents/unit shipped, with a $15,000 annual minimum royalty (if you give it away no licensing fee is necessary). Should you want to ship an encoder, free or not, it'll cost you $2.50/unit if you develop your own software, or $5.00/unit if you use Fraunhofer's, again with a $15,000 annual minimum.
So, what if you're a musician and you want to sell your music on the net in mp3 format? Hey, it'll only cost you 1% of the price you charge per mp3 (1 cent minimum), again with that pesky $15,000 annual minimum. Such a bargain.
Now, compare that with an encoder/decoder that costs you exactly $0, and makes better sounding files to boot. The musician makes an extra $15,000/year minimum just for switching formats. Diamond gets to keep another $0.50 per Rio (and this in an industry where they spend millions to whittle a few cents off the per unit price); those selling encoders get to keep an extra $2.50 - $5.00 per unit. Do you really think the Vorbis guys will have a hard time getting people to use their format?
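To make the arithmetic concrete, here is a small sketch using only the figures quoted in this comment. It assumes the "annual minimum" works as a floor on the yearly total, which is how the comment reads; the actual license terms may differ, and the function names are invented for the example.

```python
ANNUAL_MINIMUM = 15_000.0  # dollars per year, as quoted in the comment

def software_fee(units, per_unit):
    """Yearly royalty for shipped decoders/encoders, floored at the minimum."""
    return max(units * per_unit, ANNUAL_MINIMUM)

def music_fee(downloads, price, rate=0.01, per_download_min=0.01):
    """Yearly royalty for selling tracks: 1% of price, at least 1 cent each."""
    per_download = max(price * rate, per_download_min)
    return max(downloads * per_download, ANNUAL_MINIMUM)

# A small decoder vendor shipping 10,000 units at $0.50 still owes the minimum.
small_vendor = software_fee(10_000, 0.50)

# A musician selling 1,000 downloads at $0.99 also owes the minimum.
musician = music_fee(1_000, 0.99)
```

Under these assumptions the per-unit rate only matters above 30,000 decoder units or 1.5 million one-cent downloads a year; below that, the flat $15,000 dominates.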
It may take a while for the finale to come, but I think it's a good bet that mp3 is already the walking dead.
Answering a few of the readers' replies (Score:5)
First, Beta 2 is not out yet! It'll be on www.vorbis.org and www.vorbis.com in inch high letters when it is :-) We're still shooting for releasing tomorrow.
Consumers are not going to switch, the industry is. Both small and large industry players are going to try to avoid mp3 because of the licensing. For a small artist, $15,000 is a lot of money. For the big companies, a flat percentage/per track fee is a huge chunk of cash. I stand to save my sponsor, iCast, around eight figures next year and they're not even one of the industry 'heavyweights' (yet). Now, the Slashdot crowd is not the typical herd of consumer sheep, but we're also a drop in the consumer bucket (we have more weight as techies than marketing segments). Ogg Vorbis will achieve market penetration top down because it saves everyone a ton of money and frees business plans from a large, uncontrollable external influence. And if *companies* will use Vorbis to eliminate being yanked around (who says mp3 prices aren't going to go up? Remember, FhG reserves the right to set licensing case-by-case; MusicMatch gave away around 20% of their company for a free encoder license), the Right Thing for individuals is even more clear.
"Ogg Vorbis: Don't sell your Soul (or your equity)"
Actually, this is backwards. Beta died exactly because Sony strangled the format with licensing in order to keep complete control of it. VHS won because of relatively open licensing.
First off, you're probably comparing beta 1; there were several analysis bugs that are fixed in CVS and beta 2. Secondly, I mentioned, although I did not emphasize, that Ogg Vorbis does not currently have channel coupling. If you're comparing Ogg Vorbis to LAME, you're comparing an essentially 'bundled mono' compression [today] to mid/side stereo in mp3. If you tell lame to compress two mono channels (like *current* Ogg Vorbis is), you'll see that you need to hike LAME up to about 192-225kbps to compare to Vorbis 128kbps in non-mid/side stereo. The fact that Vorbis (l/r stereo) still often beats LAME (m/s stereo) is astounding.
Yeah, it's not fair to say 'we're better than mp3 if you cripple mp3'. The point is that this is the next feature we're implementing and at that point, our bitrate, for a given stereo quality, will drop by about 40% just like in mp3. From Segher, a hardcore mp3 hacker and friend:
(I was being conservative with 30-50kbps)
You're right. There's no secret conspiracy. It's all very out in the open. MPEG (FhG especially) is fundamental in developing SDMI, and RIAA-mandated SDMI is an integral part of AAC/MPEG4. Courtney Love and others go off on this particular rant much better than I do, so I'll let it go at that ;-)
Correct, but that is not the case here. MPEG is not non-commercial (why do people think they are?). It is an industry standards consortium. The aim of the RIAA and MPEG is to *make money* and maintain the necessary control to do so. That does not mean that they will act abusively, however the chances of them doing so are greater without any moderating agent.
Who here remembers the old phone company joke [back when AT&T had a monopoly in the US]: "We don't care; we don't have to. We're the phone company."? Extrapolate and roleplay accordingly. Why do people get up in arms about Echelon controlling/monitoring email when it's perfectly OK for MPEG/RIAA/SDMI to do the same thing?
Also true for now. Vorbis decode is *not* more complex than mp3, I'm simply a better engineer than I am an optimizer. Decode is bound on the iMDCT and iDRFT transforms I wrote (couldn't find any open source for them at the time) and they're not particularly speedy. Segher, Takehiro from GOGO/LAME and others are looking at making my solid but slow code a little less station-wagon-like. (BTW, if you're using top to see CPU usage, you're suffering from undersampling inaccuracy. At a minimum, compare mp3 decoders to vorbis decoders using 'time' not 'top' ;-)
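One way to follow the 'time' advice from a script is to read the CPU time the operating system has accumulated for a finished child process, instead of sampling a busy process the way top does. The sketch below is an assumption-laden illustration for Unix-like systems (on other platforms the child fields of `os.times()` may stay at zero), and the helper name is made up; pass it whatever decoder command lines you actually use.

```python
import os
import subprocess
import sys

def child_cpu_seconds(command):
    """Run `command` to completion and return its user + system CPU time.

    Uses the children_* counters from os.times(), which the OS updates
    when a child process has been waited for (Unix-like systems).
    """
    before = os.times()
    subprocess.run(command, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    after = os.times()
    return ((after.children_user - before.children_user) +
            (after.children_system - before.children_system))

# Stand-in workload so the example runs anywhere Python does; replace the
# list with a real decoder invocation to compare decoders.
cpu = child_cpu_seconds([sys.executable, "-c", "sum(range(10**6))"])
```

Running each decoder on the same input several times and comparing the totals gives a much steadier number than watching a percentage flicker in top.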
IP patents are slowly turning into "watch the USPTO go clinically insane", so there are no guarantees. However, iCAST is footing the bill for an independent patent review of Ogg Vorbis. I'm probably not allowed to make an official statement at this time, but I will say (whether I should or not)...
We'll have an official statement eventually, but the Wheels of Justice are already grinding much faster than the lawyers involved are used to ;-)
I will also say I've been pleasantly surprised at how technically sharp the lawyers we're working with are.
plu$ ca change (Score:3)
Streaming is nowhere near economical as is. The existing commercial formats take way too big a bite of a would-be streaming company's budget. To make way for small broadcasters, you need Ogg or something like it.
Running a company like iCast, you see your margins get shaved to hairs by the likes of Rob Glaser and Bill Gates. But it's not feasible to engineer a new format from scratch and force its adoption on the market. So what do you do?
Find a smart, dedicated believer in an alternative format, and fund his dream project. In the end, you give away the product, but you also liberate yourself from crushing per-stream licensing and expensive, unstable, coercive OS choices. You give away the milk, and in return, you get the cow. If the standard takes off, they will be able to compete against Real, and iCast's investors will be slapping each other's backs over the best investment they ever made. Just keep the smart guys who made it possible, treat them well, and if the format takes over, you can get all the business for outsourced streaming (and very few companies want to do it themselves) since you are, after all, the source of the streaming server software that powers it.
That's why you should pay very close attention to the article and parent comment's mentions of hardware players and industry support. These guys have made a very smart play, one I wish I was in a position to make because I saw it coming.
Boss of nothin. Big deal.
Son, go get daddy's hard plastic eyes.