The Internet

Usenet Encoding: yEnc

Motor writes "Anyone remotely interested in usenet binary newsgroups must have noticed the spread of yEnc. yEnc is an encoding scheme for usenet binaries which avoids the enormous (30-40%) bloat associated with the schemes currently in use - which all have to produce 7-bit data to stop ancient newsservers from choking. A good thing, surely? Well, not according to some people. The guy has some good points about yEnc and standards, but I can't help thinking that "standards" people have endlessly discussed better encoding schemes, and nothing has come out of it. yEnc may not be perfect, but it works and it's here - hence the rapid adoption. What do you think?"

  • by ksw2 ( 520093 ) <obeyeaterNO@SPAMgmail.com> on Saturday March 23, 2002 @09:28PM (#3214633) Homepage
    The article makes some interesting points about why yEnc shouldn't be adopted... none of which will probably keep the community from adopting it, however. If it's here, and being used, that is a whole lot more inertia than common sense can usually gain. Er, betamax, anybody?
    • by Zeinfeld ( 263942 ) on Saturday March 23, 2002 @09:52PM (#3214699) Homepage
      I have some sympathy with the article author, but not when it comes to the MIME issues. I have written plenty of IETF and other standards, I know the value of going through a standards process, however the IETF is not a place to do research, it is a place to standardise and improve existing protocols. The idea is that you start from code.

      Breaking MIME is not something I would (do) lose sleep over. People in the MIME community screamed at us when we had the temerity to introduce the text/html content type, rather than use application/binary. They were completely obstructionist when it came to insisting on 8-bit clean transport for HTTP. In the end we treated them as damage and routed around them. HTTP uses several headers that the MIME people vilified.

      The functional issues raised are significant and it would be good to see them addressed. In particular using the subject line is pretty lame. Either you want the encoding format to be completely independent of MIME or you don't. I think that MIME independence would be the better route since then it would be easier to move to a more modern protocol such as BEEP. But using magic numbers and MD5 inside the encoding does not seem like a bad move.

      The more interesting 'meta-point', however, is that tweaking the encoding format only scratches the surface when it comes to fixing Usenet. The main problem with Usenet is that it still has to route every single article to every single node, whether it is going to be read or not. While flood-fill routing was a good scheme when NNTP was developed and the number of nodes was small, it is needlessly wasteful now that we have hundreds of thousands of NNTP servers; it is just not necessary to have that level of redundancy to route around censorship.
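For illustration, the flood-fill behaviour described above can be sketched as a toy model. This is not NNTP itself: the peer map and server names are made up, and the "already have it" check stands in for a server declining an article whose Message-ID it has already seen.

```python
from collections import deque


def flood(peers: dict, origin: str) -> tuple:
    """Toy flood fill: every server offers the article to all of its peers.

    `peers` maps a (hypothetical) server name to the list of servers it feeds.
    Returns (set of servers that end up holding the article, transfer count).
    """
    have = {origin}
    transfers = 0
    queue = deque([origin])
    while queue:
        server = queue.popleft()
        for peer in peers[server]:
            if peer not in have:  # peer declines articles it already holds
                have.add(peer)
                transfers += 1
                queue.append(peer)
    return have, transfers


# Hypothetical four-server mesh; no server here has any readers for the group.
example_peers = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c"],
}
holders, count = flood(example_peers, "a")
```

In this model the article reaches every server whether or not anyone there reads it, which is both the redundancy and the waste being argued about.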

      • by Chasing Amy ( 450778 ) <asdfijoaisdf@askdfjpasodf.com> on Sunday March 24, 2002 @01:30AM (#3215248) Homepage
        > it is needlessly wasteful now that we have hundreds of thousands of NNTP
        > servers, it is just not necessary to have that level of redundancy to route
        > around censorship.

        I disagree entirely. Never underestimate the government's ability to stretch censorship to new levels.

        Unless the very way NNTP servers operate is to gulp down and pass on each article for each newsgroup, the government would easily target those servers that specifically carried groups or posts it doesn't like.

        Pressure for news providers to drop certain groups began several years ago when the Vacco busts of people trading in child pornography led a news service to be criminally charged for the content of some groups and led other news servers in that state and elsewhere to drop certain groups thanks to their content. The charged news service took a plea even though they clearly would have won at trial or on appeal by claiming common carrier status, but hey, nobody wants to be the expensive test case.

        Some may not see the problem with news servers being coerced by the government to drop those particular groups thanks to their contents, but the principle it sets is horrid. Certain "content owners" have of late been threatening to use the DMCA as a club to get news servers to drop groups which share TV shows and other such copyrighted material. If groups were more "localized" to a set of specific servers, or articles were localized to their originating servers, that would make it exceptionally easy for the DMCA to be used to require the "closure" of groups or removal of articles from USENET.

        Furthermore, in this time of anti-terrorist hysteria, the government has gotten away with the USA/PATRIOT mess already and is continually making some questionable choices. If it finds a newsgroup dedicated to dissent, or more specifically dedicated to anti-globalism, for example, it cannot easily destroy such a group because of the nature of USENET--the damage would be routed around by servers in other countries, even if every U.S. server could be forced to remove a group or article (not that they could be).

        However, if the architecture of USENET were redesigned to localize groups or articles to subsets of servers--the likelihood of a government censoring USENET speech is magnified considerably.

        It is the redundant architecture of USENET which will keep it free of censorship long after the WWW has been tamed--as it will be. Just look at the broiling mess within ICANN over officials trying to hand control of the WWW over to government-appointed reps. Eventually something like that will happen, and governments will cooperate with each other to make censorship in their mutual interests easier. Thanks to the architecture and nature of USENET, it will remain free and uncensored long after the WWW has fallen to censorship.

        Just my 2 pence, though...
    • by Spy Hunter ( 317220 ) on Saturday March 23, 2002 @09:58PM (#3214718) Journal
      I would argue that the adoption of a standard is a much better indication of its "goodness" than its technical features. yEnc has been adopted by lots of people because it solves problems that they have, therefore it is proven to be good. If someone fixes the flaws that this author talks about and makes a new scheme that works better, then it might get adopted. If it does, it will be because it solves real problems people have with yEnc. If it doesn't, that means that it is too much of a pain for people to switch and that the problems yEnc has are not that much of a problem for real users. I think this is probably the case. So you can't use filenames with double quotes. Big deal! Change them to single quotes or something! So one out of a thousand posts will be corrupted because of mis-recognized magic strings or something. It's not any worse than it was before, and the downloads are smaller! If the problems really are THAT bad, a solution will come and people will use it.

    • Yes. A shipping product beats theoretical vapor every time, or, as previous generations would have said it, "A bird in the hand is better than two in the bush." But, that doesn't mean the product couldn't have been done right the first time (particularly with the apparently large number of folks who had made recommendations that would have improved the spec, which the yENC author is only now hacking into it). I believe this is the article author's point.

      < tofuhead >

      • Er, betamax, anybody?

      Not really comparable to the betamax vs. VHS debate. I've not seen anyone arguing that the alternatives, uuencode or base64, are better than yEnc, just that yEnc has serious deficiencies.

      Perhaps Mr. Nixon is arguing that yEnc is worse than some wholly theoretical alternative.

      Some of Mr. Nixon's points do seem interesting, but if he is convinced that there is a better alternative to be put forward, he should get the code out there. Anything else is just sniping.

    • Betamax failed because it was a proprietary format. If Sony had allowed other companies to make it (like JVC did with VHS) it would still be around (20 years ago it was better than VHS is today). Betacam still dominates the broadcast world.

      Now, yEnc looks like it was created by Microsoft. It's not a standard, it's a hack. The only way it can become a standard is by pushing it down people's throats and then using "public pressure" to force applications to support it.

      To use a videotape analogy, it's like releasing magnetic tape reels after people had been using cassettes for years, just because the reels use slightly lighter tape.

      I hope yEnc in its current form is *not* supported by the industry. I think a company such as Forté could create a real standard using an encoding method similar to yEnc (it wasn't "invented" by yEnc's author, anyway). I think Agent's programmers, of all people, should know how hard it is to deal with these (non)standards, and could save themselves a lot of work in the future by making sure it's done right.

      As it stands, yEnc is the same as UUEncode, only in smaller portions (actually it's worse, because you can sort of wrap UUE in MIME; you can't with yEnc).
  • Yenc is great! (Score:4, Informative)

    by Mean_Nishka ( 543399 ) on Saturday March 23, 2002 @09:31PM (#3214641) Homepage Journal
    I for one am happy about Yenc's rapid adoption. My newsreader software (Xnews [3dnews.net]) supports it, and I have noticed no difference in using Yenc over traditional binary encoding.

    In fact, Yenc will help pay-per-gigabyte Usenet users achieve a greater bang for their buck. Anything that saves money is a good thing!!

    • ...I have noticed no difference in using Yenc over traditional binary encoding.

      Did you read the article? One of its major points was that traditional binary encoding sucks, and instead of yEnc, people should come up with something better.
      • Dammit! They should do something better! What exactly, I dunno. But they gotta do something.

        This is what I took from the article: a rant about the use of a new type of encoding because it is a new take on the old. So what if it is kludgy? It seems to be working. I don't see any new type of encoding coming down the pipe that would be considered a next-generation-USENET-type-of-thingy.

        Hell, maybe USENET has outlived its usefulness and does need an overhaul, but I seem to remember a time when people were bitching "Stop posting in base64! The rest of us can't read these files!" or even "Stop posting in JPG format! It takes too long for my poor 286 to decode JPGs instead of GIFs"

        All this appears to be is more of the same. The haves versus the have-nots.
  • instead of breaking the standard, code it right, wait till it matures, use it then.
  • DIME? (Score:3, Informative)

    by leighklotz ( 192300 ) on Saturday March 23, 2002 @09:36PM (#3214652) Homepage
    There are some proposals in the XML and Web Services arena for dealing with some of the problems that yEnc is skirting.

    One, called DIME, is a MIME-like system that handles binary content, chunks, etc.

    http://www.oasis-open.org/cover/dime.html
  • yEnc is terrible. (Score:2, Insightful)

    by Anonymous Coward
    I had scripts that automatically got some a.b* newsgroups, but after the invention of this bastardized yEnc piece of crap, all my scripts are broken, and I'm 2 months behind on data for our clients.

    BTW, I work for a pr0n site :) Only e-biz that's thriving still.
  • Agent (Score:4, Informative)

    by Account 10 ( 565119 ) on Saturday March 23, 2002 @09:37PM (#3214658)
    Forte [forteinc.com] released Agent 1.91 [forteinc.com] 2 days ago with yEnc support. It looks like Mr. Nixon is fighting a losing battle.
    • Pan (Score:2, Informative)

      by Alien Being ( 18488 )
      And Pan's got it too [rebelbase.com]. Tastes great, less filling!

      I wonder if it's in CPAN yet...
      Module Convert::yEnc (P/PN/PNE/Convert-yEnc-0.03.tar.gz)

      Yep. Works for me!
    • Either it is /.ed to hell or the server is having major problems (not that they are exclusive). A lot of people are having the download stop at 10-15%. All this for a simple 2 meg file. If someone already has it, throw it to a binary group and provide a link.
  • NOTE: This is actually a question, not a troll

    Does anybody really use usenet anymore? Every time I've poked around on my ISP's NNTP server, it seems to be filled with 90% spam, and non-spam posts seem to always be grossly offtopic. And no, I'm not just talking about the alt.binaries groups either.
    • Brings to mind Yogi Berra: "The trouble with that place is: it's so crowded that nobody goes there anymore."
    • Usenet is very much alive and a great way to bring together communities who like to share original material, even in binary newsgroups. My favorite is alt.binaries.pictures.motorcycles.sportbike where I can see many insane pictures posted on an impulse not found on websites.

      The binaries groups I have seen have been pretty much noise free. If the users are vigilant and like to use physical means to squash spammers, the forum is void of abuse.
    • Does anybody really use usenet anymore?

      I do. Some groups are mostly noise, but others are still pretty good.

      As for the spam problem, maybe you can suggest to your ISP that they install Cleanfeed [exit109.com] or similar. (Yes, that's the same site as the anti-yEnc page, which BTW I agree with.)
    • I use Usenet all the time (pun intended). While there is a good deal of spam in some groups others are pretty light on the spam. For me Usenet is the place to turn when I run into a problem in Linux because I've found there is a lower proliferation of idiots using it unlike IRC which is filled with 15 year olds blabbering about RTFM. It also has some pretty good discussions on several topics if you hunt around a bit.
    • Does anybody really use usenet anymore?

      Yes.

      That's why folks care about the yEnc issue. If it were a fight over, say, a Gopher implementation do you imagine there'd be much discussion?

      Usenet is in trouble, it may be mortally wounded, but it'll be around for awhile yet and in that time lots of folks use it.

    • Yeah...but once you get outside the porn groups the signal-to-noise ratio gets better :)

      As someone else said, there are plenty of idiots posting the same questions over and over (groups.google.com anyone?), and topics tend to degrade, but for the most part, the groups that are actually dedicated to something are pretty decent. The Tolkien groups, the linux groups, some of rec.arts groups.
    • I never post to it. (And after finding my ten year old posts to alt.drugs sitting in plain sight on Google, I doubt I ever will again!) But I do searches through newsgroups all the time when I'm looking for answers to obscure technical questions, or when I want to know if anyone's come across the same bug that I'm having trouble with at the moment. USENET might have more crap, but sometimes crap is what you want! You don't always want your search results to get choked up with corporate stuff. If you're doing Java programming for example, and you want to find out why class X doesn't work, a normal web search is difficult to control because all you get is bullcrap from Sun, and a zillion identical javadocs for class X that people leave around on their HTTP servers. I want to find out what people are complaining about, or whether anyone actually USES a certain API I'm considering. For getting a feel for what's going on in the field, without getting snowed under by marketing materials from a vendor, USENET is great. It isn't as corporatized.

      Of course, the existence of alternate web based bulletin board systems like this one decreases its relevance for search purposes. And it's suffocating under the weight of all that spam. But USENET is still the biggest forum out there, and it's still the one that's the most easily searched.
    • I've heard this "is usenet dead" crap one too many times. Comp.lang.python isn't dead. Soc.religion.quaker isn't dead. comp.protocols.tcp-ip.ibmpc isn't dead (okay, so it doesn't have a lot of traffic, but the principals read it, which is what matters).
      -russ
  • no no no (Score:3, Insightful)

    by SETY ( 46845 ) on Saturday March 23, 2002 @09:38PM (#3214666)
    The author of the article seems to keep saying there isn't a problem with USENET encoding, but then goes on to complain that yenc shouldn't be used. He points out flaws in how this encoding scheme is implemented. fine fine fine.


    There was a market for this thing, and it spread like wildfire. It's too bad that no one made a better spec and program (the author alludes that there was plenty of time to do this). yenc meets the "GOOD ENOUGH" criteria, thus it will be used, shitty, non-robust standard or not.

    • It's this kind of thinking that has made MSIE's broken HTML and Outlook's poor security the industry standards. Just because people can stand shitty software doesn't mean we need to give it to them.
  • yENC (Score:4, Insightful)

    by arsaspe ( 539022 ) on Saturday March 23, 2002 @09:40PM (#3214668)
    I'm all for standardisation... but sometimes it takes _forever_ to get something standardized. If someone writes a better product, they generally don't want to wait for it to be declared a standard, especially with something like uuencoding which has been around as long as usenet, and isn't going to be replaced in a hurry unless someone comes out and waves a product around yelling "hey try this. it works better". Ogg Vorbis isn't a standard by any means. Hell, it is still on RC3. _but_ a lot of people are using it because it has far better sound compression than mp3. You don't hear people complaining that Vorbis has jumped the standardisation process do you?

    Personally I can't see why we can't just send the data as 8-bit binary. uuencode and similar encoding formats should have died out with UUCP years ago, since there is no physical reason why 8bits can't be sent over the wire anymore.
    • by MSG ( 12810 )
      Ogg Vorbis isn't a standard by any means

      Ogg Vorbis has a specification for its file format that doesn't break any previously specified standard. Therefore, Ogg Vorbis can be called a standard.

      yEnc, on the other hand, breaks the NNTP standards, and will likely break the MIME standard. That's the difference. yEnc must fit within the standards already specified for the transmission methods it uses, and does not.
    • Re:yENC (Score:3, Insightful)

      A "standard" is not the same as "a common thing", or even "something that most programs can read". For example, most programs can't read YCrCb TIFF files (Photoshop included), but these files do follow the TIFF specification, hence they are standard.

      When you create an encoding method that is going to be used to transfer data across a network, you need to ensure that this method is compatible with everything along the way. When you send an e-mail with an Ogg file attached, this file is encoded in a way that enables the servers and the client at the other end to identify it, check its integrity, reconstruct it, process it, delete it independently from the rest of the message, etc. It doesn't matter what the file itself is (Ogg, MP3, TIFF, Doc, XYZ, etc.); these methods work with any file, regardless of its type or contents.

      8 bits can be sent over the wire, and in fact are sent over the wire. But to make sure the servers (and the clients) can tell where one file (or part of the file) ends and the next one begins, you need to "wrap" the data in a package that programs can understand. That's what MIME does. It says "this part is the message text in HTML", "this part is the message text in plain text", "that part is an image", "that part is an executable file".
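To make the "wrapping" concrete, here is a minimal sketch using Python's standard `email` package. The subject, filename and payload are made up for illustration, and the attachment goes out base64-encoded (the library default for binary parts), so this shows MIME's labelling and part boundaries rather than any 8-bit saving.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_post(body: str, filename: str, payload: bytes) -> bytes:
    """Wrap a text body plus one binary attachment as a multipart MIME message."""
    msg = EmailMessage()
    msg["Subject"] = "example binary post"  # illustrative header only
    msg.set_content(body)  # the text/plain part
    # Adding an attachment turns the message into multipart/mixed.
    msg.add_attachment(
        payload,
        maintype="application",
        subtype="octet-stream",
        filename=filename,
    )
    return msg.as_bytes()


def extract_attachments(raw: bytes) -> list:
    """Return (filename, bytes) pairs without guessing at magic strings."""
    msg = message_from_bytes(raw, policy=policy.default)
    return [(part.get_filename(), part.get_content())
            for part in msg.iter_attachments()]


wire = build_post("picture attached", "pic.jpg", bytes(range(256)))
parts = extract_attachments(wire)
```

The reader never has to scan the body for marker lines: the part boundaries, the content type and the filename all come from the MIME headers.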

      Instead of using this "universal" packaging system, yEnc forces programs to look for specific strings and try to guess where things begin and end. And it has no mechanism for identifying individual parts in a multi-part post (again, programs must look at the text or message subject and try to guess).

      Doing something right is usually not much harder than doing something wrong. And when people get used to something that's broken, they won't want it fixed.

    • Re:yENC (Score:3, Insightful)

      by JamieF ( 16832 )
      It's interesting that it took this long for an ancient problem to be sorta kinda attacked, if not solved. Maybe this is because the Usenet infrastructure has changed to the point where 7-bit is silly, but why is this something that's just being addressed now? Don't tell me this is a brand new problem that the best minds of the news admin community couldn't have figured out by now.

      One approach to short-circuit standards is to take the rogue approach that Netscape did, which is very different from the one folks keep accusing Microsoft of. That approach is to take an almost-suitable standard and extend it, in the same spirit, with the intent that your well-thought-out extension will be adopted later. Of course they did a less than perfect job, but the idea is sound: don't wait for the spec, lead it, while making it possible for it to remain open. That's different from the Microsoft approach of "Embrace, Extend [and Exterminate]" wherein you add undocumented proprietary features that lock you into a single vendor's solution. A hell of a lot of what's in the HTML 4 standard was released as a non-standard but documented extension by Netscape in 1.1 and 2.0. Some of that was bad (blink, and arguably frames), but when you judge Netscape, try to separate the new HTML tags from the bugs, because it's the bugs (and having to code around them) that make the web such a tower of Babel.

      So basically what I'm saying is, if MIME is almost there, CAREFULLY THINK THROUGH and implement the extensions you want, and implement it already. Or, pick something other than MIME. But, don't start from scratch and make mistakes that have already been learned from and solved. Build on the good decisions of the past. The real crime is solving the same problem, badly, over and over; getting ahead of a standards body while keeping intentions pure and decisions wise is not nearly as bad.
  • by mmusn ( 567069 ) on Saturday March 23, 2002 @09:43PM (#3214676)
    Uuencoded text will compress down to nearly the same size as its corresponding binary (or less, if the binary can be compressed). That kind of compression is now a standard part of modems, Internet protocols, and many file systems. Even the CPU overhead of compressing and decompressing that kind of data is negligible. If yEnc doesn't end up using less space on disk and doesn't end up using any less bandwidth than uuencode, indeed, "why encode" in yEnc and break a lot of software that expects USENET posts to be text-only?
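The claim above is easy to check in a rough way. The sketch below uuencodes some pseudo-random bytes (standing in for an already-compressed binary such as an MP3) and then runs zlib over the result; zlib here is only a stand-in for whatever modem or link compression is actually in use, and the 45,000-byte sample size is arbitrary.

```python
import binascii
import random
import zlib


def uuencode_body(data: bytes) -> bytes:
    """uuencode payload lines only (no begin/end framing), 45 bytes per line."""
    return b"".join(
        binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)
    )


def sizes(raw: bytes) -> dict:
    """Compare raw, uuencoded, and uuencoded-then-compressed sizes in bytes."""
    encoded = uuencode_body(raw)
    return {
        "raw": len(raw),
        "uuencoded": len(encoded),
        "uuencoded+zlib": len(zlib.compress(encoded, 9)),
    }


# Pseudo-random bytes model an incompressible binary; the seed is arbitrary.
rng = random.Random(0)
sample = bytes(rng.getrandbits(8) for _ in range(45_000))
report = sizes(sample)
```

On incompressible input the compressed uuencoded size should land close to the raw size, which supports the point about the wire. It says nothing about per-megabyte quotas or server disk space, which are counted on the uncompressed article.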
    • Because people with premium news services (AKA, ANYONE that's serious about large binary downloads, the people at which yEnc is aimed):
      - have megabyte quotas, both upload and download
      - pay by the megabyte for their downloads

      Also, this saves space on the server hard drive. NO WAY are usenet servers compressing data on their hard drives. It's one of the most challenging situations for a hard drive, they're not going to wreck their performance by using compression. Having less data means more retention on the server.

      I personally have a Newsguy extra account, grandfathered at 1GB per day. I EXHAUST MY QUOTA by about 10AM, most days. I still do, but by then I've gotten 30% more data.

      It's not all about transfer speed. yEnc means people with download quotas get 30% more stuff per day.
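The quota arithmetic can be sketched as follows. The overhead figures are assumptions for illustration: 35% for uuencode (inside the 30-40% range quoted in the story) and about 2% for yEnc.

```python
def payload_mb(quota_mb: float, overhead: float) -> float:
    """Decoded data that fits in a transfer quota, given encoding overhead.

    `overhead` is the fractional size increase of the encoding (0.35 = 35%).
    """
    return quota_mb / (1.0 + overhead)


QUOTA_MB = 1024.0     # the 1GB-per-day quota mentioned above
UU_OVERHEAD = 0.35    # assumed uuencode overhead
YENC_OVERHEAD = 0.02  # assumed yEnc overhead

uu_payload = payload_mb(QUOTA_MB, UU_OVERHEAD)      # roughly 759 MB
yenc_payload = payload_mb(QUOTA_MB, YENC_OVERHEAD)  # roughly 1004 MB
gain = yenc_payload / uu_payload - 1.0              # roughly 0.32
```

Under these assumed overheads the same quota carries about a third more decoded data, in line with the "30% more stuff per day" figure.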
    • Uuencoded text will compress down to nearly the same size as its corresponding binary (or less, if the binary can be compressed). That kind of compression is now a standard part of modems, Internet protocols, and many file systems.

      I have yet to see any DSL or cable modem with compression. So for most of the heavy binary users, uuencode data will not be compressed. And on regular modems it won't be smaller than the yEnc, since if, as you say, the binary can be compressed, then the yEnc will be compressed.
      Even the CPU overhead of compressing and decompressing that kind of data is negligible.

      Do you run a heavily used news server, to provide proof that the CPU overhead is 'negligible'?
  • Adoption of yEnc (Score:4, Interesting)

    by jonbrewer ( 11894 ) on Saturday March 23, 2002 @09:43PM (#3214678) Homepage
    I am seeing smaller binaries as a result of yEnc. This is fine. The problem is, my favorite binaries grabber has no idea what to do with the files, and won't even download them. I figured out how to make Agent download them, but A. I hate Agent (and don't understand why anyone likes it!) and B. the binaries don't always decode.

    As I'm a lifetime lurker (well eight years, but it seems a lifetime!) I can only choose not to download yEnc encoded binaries. And no one will know! (my news server doesn't log downloads) It's all up to the posters to adopt or not.
    • Re:Adoption of yEnc (Score:2, Informative)

      by Backov ( 138944 )
      Check out Newspro, it's better than most binary grabbers.. I wouldn't use it for reading text newsgroups, but XNews is good for that.. Newspro lets you combine headers from multiple servers. I download from my paid usenet server plus the @home servers and regularly saturate my connection at 500k/sec.. A buddy across town gets 700k/sec.

      Newspro - http://www.usenetopia.com

      XNews - http://xnews.3dnews.net/

      Newspro is shareware, XNews is freeware.. They're both good, for different things.

      Cheers,
      Backov
  • I've d/l some MP3's in the past few weeks with it and have not formed an opinion either way. I have not seen any compatibility problems and everything so far has decoded correctly. The only thing I noticed was that it does not seem to compress as much as normal encoding does. I d/l all of my news binaries over a compressed SSH connection. Comparing bytes into and out of the interface, SSH compression saves me about 35% with normal encoding: 500KB of data from the news server to the middleman becomes 300-350KB from the middleman to me. With Yenc encoded binaries the saving seems to be about 0-5%. I have not compared two of the same binaries encoded with different methods to see how much smaller the Yenc version really is. Remember, this is only based on my very small, completely unscientific numbers; the Yenc binary would have to be roughly 30% smaller to see an overall improvement. I'm sure this only applies to people who normally d/l with some type of line compression, like modem users, or another means of compression. I assume the source material plays a huge role in the final file size also. The only things I've d/l and compared so far were MP3's, which are already compressed.
    From a news server standpoint, smaller physical size files of Yenc would allow more binaries and longer times on the server before being pushed off.
    • Bytes are the wrong metric, in this case.

      Traditional encodings first inflate the binary by 30% or so. Then your compression scheme compresses that expansion right back down.

      yEnc inflates by much less, and your compression scheme deflates it right back down.

      Upshot: In real terms, you're probably looking at the same amount of time to download 300KB of real, decoded data. The old system looks faster because it has a bigger bytes/second number, but in real data per second, after compression, they're roughly equal.

      This will probably hold for nearly any simple encoding; encode in Base64 and watch your compression achieve 3-4x speed increase.

      Consider this next time you are looking at marketing materials; numbers can say anything, but it's real performance that matters. yEnc gains by having less data sit on the news server, and costs nothing to those using compressed connections.
      • Oops, typo (well, braino) on the Base 64 number. Let's pretend I said "Base4" encoding, defined the same way Base64 is. Now you can get 3-4x speed increases...
      • Upshot: In real terms, you're probably looking at the same amount of time to download 300KB of real, decoded data.

        I agree, and that's why I brought this point up. The article linked in the story written by Jeremy states:

        yEnc creates significantly smaller encoded binaries than either uuencode,
        base64, or binhex. That means faster downloads and faster uploads.


        He uses that as the only real advantage of using yEnc. I tend to disagree that it is faster.

    • Re:Does it compress? (Score:3, Interesting)

      by Reziac ( 43301 )
      It takes more overhead to compress something that's not really compressible -- compare how long it takes to ZIP textfiles (highly compressible) vs. to ZIP the same byte volume of say, MP3s or JPGs (almost incompressible).

      Download an ISO vs a ZIP of an ISO from the same server on the same connexion, and you can see this at work with modem compression too. Chances are fair that the ISO download will complete first. (Well, unless you have a WinModem :)

      I have to agree, the only real benefit is longer server retention. I don't see any benefit in having to juggle newsreaders every time yEnc gets updated. Maybe once it's a mature standard, then it'll be worth the effort to find another newsreader (I *loathe* Agent, and unfortunately my preferred NewsXpress is an abandoned project with no source available); meanwhile, I choose to ignore yEnc'd posts. And I'm thankful that they're marked in the subject line.

  • by phr2 ( 545169 ) on Saturday March 23, 2002 @09:55PM (#3214708)
    In the old CP/M days there was no standard way to transfer files over serial connections, except maybe Kermit. Kermit was slow because of its ping-pong protocol (no packet window--that was added later) and because it encoded binaries as printing characters. Ward Christensen invented XMODEM, which basically dumped the file through the wire as 8-bit characters, with very crude error checking and file headers. yEnc does something pretty similar for Usenet articles. It's a crude method for posting binary files as 8-bit characters instead of 6-bit characters. That of course cuts down transmission time considerably.
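For reference, the core byte transform behind yEnc's 8-bit approach can be sketched as below. This follows the published yEnc draft as I understand it (add 42 modulo 256, then escape NUL, LF, CR and '=' with a leading '=' and a further offset of 64); it deliberately leaves out the =ybegin/=yend framing, line wrapping and CRC32 check, so it is not a complete encoder.

```python
ESCAPE = 0x3D                        # '='
CRITICAL = {0x00, 0x0A, 0x0D, 0x3D}  # NUL, LF, CR, '=' must not appear raw


def yenc_encode(data: bytes) -> bytes:
    """Core yEnc transform only: no headers, no line wrapping, no CRC."""
    out = bytearray()
    for b in data:
        e = (b + 42) % 256
        if e in CRITICAL:
            out.append(ESCAPE)   # escape marker
            e = (e + 64) % 256   # shifted value is never a critical byte
        out.append(e)
    return bytes(out)


def yenc_decode(encoded: bytes) -> bytes:
    """Inverse of yenc_encode for unframed, unwrapped data."""
    out = bytearray()
    it = iter(encoded)
    for e in it:
        if e == ESCAPE:
            e = (next(it) - 64) % 256  # undo the escape offset
        out.append((e - 42) % 256)
    return bytes(out)


all_bytes = bytes(range(256))
encoded_all = yenc_encode(all_bytes)
```

Only four of the 256 possible byte values need a two-byte escape, which is where the low overhead comes from, compared with uuencode and base64 spending four characters on every three bytes.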

    Despite its problems, XMODEM took off because it filled a need, just as yEnc does. Nixon's complaint that shrinking files by 35% won't make Usenet any smaller because people will just post more files is beside the point; it's like saying getting a 35% salary increase won't help your finances because you'll just buy more stuff with the extra money. Most people want that extra 35%, and Jürgen stepped up to the plate and delivered it.

    Thankfully, as far as I know, nobody railed against Ward Christensen the way Nixon does against Helbing. XMODEM's problems became obvious and the solution was to introduce YMODEM and then ZMODEM. XMODEM is still around, but its successors (and of course serial IP) have pretty much supplanted it. Ward's initial efforts are still deeply appreciated.

    Yes there's the problem of legacy software, but a protocol that's only been around for a few weeks or months can't have that much of a legacy. The only programs that currently support yEnc are the ones whose maintainers react pretty fast to new developments, and those maintainers are likely to also quickly pick up any revisions/fixes to yEnc.

    So the solution Nixon should be calling for is not a years-long bureaucratic standardization process that will get yEnc 1.3 entrenched while the standardization is happening. The solution is to fix yEnc's problems and release a new version as fast as possible, before the old version gets spread around too widely.

    • Yes there's the problem of legacy software, but a protocol that's only been around for a few weeks or months can't have that much of a legacy. The only programs that currently support yEnc are the ones whose maintainers react pretty fast to new developments, and those maintainers are likely to also quickly pick up any revisions/fixes to yEnc.
      The incompatibility problem isn't with clients. The problem is with the NNTP (newsgroup) servers. Some ancient servers will choke if there are eight-bit characters in the message.

      I still feel that moving to an eight-bit encoding system is fine, but let's be clear about what the issues are, okay? :)
  • by smoondog ( 85133 ) on Saturday March 23, 2002 @10:04PM (#3214739)
    Hey, if this guy [slashdot.org] adopts it, he could save slashdot some money...

    -Sean

  • What he's whining about is that it didn't fix every other problem in addition to overhead, and if anyone should actually bother making some huge new mime standard, now they won't have that carrot.

    Obviously, if the rest of the problems were as big as he's trying to claim, yEnc would only be a minor setback for a new and more comprehensive standard, but the fact is that the 35-40% overhead of current standards is by far what's most annoying to usenet users. After we got PAR (parchive.sourceforge.net), reposts have been reduced drastically (except for pr0n and partly warez groups, where the dumb people with shitty servers rule).

    Also, he's trying to say that because the increase in volume will outgrow the savings, there really is no savings. What kind of logic is that? Let's stop making processors faster, we'll just find bigger problems for them to be too slow for anyway, so what's the point?

    After the introduction of PAR and yEnc, as a long time binaries downloader, I'll say the actual content of multimedia groups has more than doubled, and probably tripled, the last 6-9 months. That's progress to me.
  • by gambit3 ( 463693 ) on Saturday March 23, 2002 @10:10PM (#3214753) Homepage Journal
    In one sentence, standards ARE important because they allow for the most people to get the most benefit.

    I work in an industry that relies heavily on standards, and my job deals specifically with standards. Making sure that WE follow standards, and making sure that other vendors follow standards.

    Sure, they're slow to develop. But they're the best for interoperability, and that's crucial. In my line of work (for a major Mobile Phone System NSS provider), I have to deal with other providers that have to follow the same standards we do. That allows both of our products to communicate. This gives the end consumer (e.g., Cingular, Sprint, etc.) the option to buy from different vendors. This forces us to make better products. This forces us to be more efficient. This forces our competitors to do the same thing. In the end, everybody wins.

    The other alternative is what I see as the Micro$oft approach: Standards be damned, I'm going to do it this way, and f*ck everybody else. It's the same approach that gives you security holes in your browser, because, well, who needs the standards?

    I can't believe I'm reading comments like "well, it's here and it works so what's the problem?"
    The problem is the future.
    The problem is the inability to send an SMS from a CDMA service like Sprint to a GSM one like Voicestream. That's what happens when you blow off standards.
    The problem is the inability to read an M$ Word doc that was sent to a Linux user.
    Ignoring standards and going off on your own (especially, going off BADLY on your own) just divides us.
    Good standards help us all. They give us better products. They lower costs.
    CD-Rs. FireWire. PCI. Countless others.

    Besides, as the article begins by asking: Just what problem were they trying to solve?
  • I see nobody has said it yet:

    Standards are crucial. And the best thing about standards is: there are so many to choose from!

    Sig: What Happened To The Censorware Project (censorware.org) [sethf.com]

  • by Anonymous Coward
    The big savings on binaries is coming from .PAR files.

    If you don't know what they are, then you haven't been on usenet for a while.

    But essentially, it allows you to stripe your sets with parity so that you can lose up to "n" posts and the PAR programs can rebuild the missing pieces.

    I believe this has helped the backbone tremendously.
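
    The parity idea can be sketched in a few lines of Python. This is a deliberate simplification: a single XOR parity block can rebuild only one missing part, whereas the real PAR tools use Reed-Solomon coding to survive several losses. The part contents and the helper name are invented for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three equal-sized parts of a hypothetical post, plus one parity block
parts = [b"part-one", b"part-two", b"part-3!!"]
parity = xor_blocks(parts)

# Suppose the second part never reaches your server:
# XOR of everything that did arrive gives back the missing part.
survivors = [parts[0], parts[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt)
```

    Because the parity block is as large as one part, the poster trades a little extra upload for not having to repost whenever a single part goes missing.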

  • Uuencoding relies on searching for "magic strings" in the message body of a Usenet post. This is unreliable, error-prone, and has already led to problems with certain client software. It is absolutely the wrong way to go about tagging message content, because what you really want is something reliably machine-readable and precisely specified. However, yEnc also relies upon magic strings in the body.

    There is no reason to despise magic strings. They work, and cannot ever occur in the user data. All yEnc magic strings start with =y, = being the escape character. Ctrl-Y does not need to be encoded, so yEnc is free to use =y for its own purposes (e.g. =ybegin, =yend). Jeremy Nixon continues his misguided rant...
    With a uuencoded multi-part post, client software typically uses the Subject line of the post to attempt to determine the filename, and to tell where the segment falls in the sequence. This is obviously a terrible way to do it.

    No, using the subject line is not obviously a terrible way to determine filenames, segments, and anything else. I find it very convenient to know exactly what my yEnc files will be saved as, how big they are, and how many parts they are in inside the subject line. Nixon says "Sure, it works out most of the time, but it is imprecise and error prone (especially when spaces are used in filenames)" This is blatantly false nonsense. Quotes reliably allow clients to discern the filename. It's not "imprecise and error prone" by any stretch of imagination.

    When non-ascii characters are used in message headers, software currently just has to guess what they mean. Jürgen's filename specification cannot even be used to reliably reproduce his own name.

    I give them that. Non-USASCII data in headers is a pain, and a large, powerful organizational body needs to agree on a character encoding standard. Oh wait, they already did - Unicode!


    but gives no method to specify a filename which happens to contain quotes, which is not uncommon

    False again. I've never had a filename containing quotes on my Windows box. If we expect newsgroups standards to reach everyone, we must use the lowest common denominator. Similar to how ISO9660 used 8.3 filenames, but on a higher level.

    And the bandwidth savings? That's an illusion. A smaller encoding scheme gives us exactly one benefit: faster downloads and uploads for the users

    Which is exactly what the creators of yEnc intended.

    Meanwhile, the transition creates confusion for the users

    They mean "AOL users" of course. Usenet hasn't had a new encoding format in 6 years, it's about time. Adopting this format should be as easy as switching from Napster to OpenNap to Morpheus to Grokster to Blubster and so on.



    When Jürgen found that going through an actual standardization process within MIME would take time, he chose to ignore MIME in favor of getting something out there right away.

    I don't blame him. Jürgen is a coder, not a politician. I would have done the same thing.


    In short, yes I agree yEnc needs to be more polished. But the point is it works right now, and it's working great. It filled a gap in Usenet, scratched an itch, to borrow an ESRism. Once yEnc is standardized as Y.32049 Annex D or whatever those standard organizations call it, we will use it. Until then, yEnc forever!

  • by Tofuhead ( 40727 ) on Saturday March 23, 2002 @10:55PM (#3214869)

    It should be pointed out that this site [faerber.muc.de], linked from yENC's own website, goes into more technical detail regarding the technical flaws of yENC. The fact that it's linked from yENC's own site is proof that the author is at least familiar with the concerns that people have with his implementation.

    I personally still find it difficult to argue against the article author's point that THERE WAS NO RUSH to force yENC out the door in such an unpolished form. After so many years of waiting for something better, why ignore the recommendations of those you are trying to help?

    < tofuhead >

  • by Ruddygore ( 109218 ) on Saturday March 23, 2002 @10:57PM (#3214879)
    Well then. When I put that page up, I honestly didn't expect many people to read it outside news.software.nntp and a few curious folks in alt.binaries.news-server-comparison. I certainly wasn't expecting to get Slashdotted. Well, that's fine, except that the uproar might have waited a little bit.

    In my essay, I state that what Usenet needs is "a better way to post Binaries". The next piece of the puzzle, of course, is to answer the question, "What IS a better way to post binaries?" I was thinking about finishing that page up tonight, but I am writing code at the moment instead.

    So, when reading my comments, just keep in mind that, yes, I DO have some answers to that question, too. It's just that it's a bit of a more time-consuming question, so that page isn't done yet.

    This time around, though, I will make sure to include a prominent warning to NOT run off and implement the ideas as quickly as possible, and to please not use all of Usenet as beta-testers. The idea that whatever gets done fastest is best just doesn't work for me. There were good reasons I didn't go and get people to implement my smaller encoding ideas when I first wrote the code. If only the yEnc implementor had continued where I left off rather than going down his rather misguided path...

    All the comments are welcome. I've been getting some interesting email, too, of course. Many programmers of Usenet client software absolutely despise the thing and are quite annoyed at the amount of their time it is wasting. I guess it's just more of that never-ending divide between the users and the techies. So it goes.

    yEnc is here, that's for sure. Now we just have to try to deal with it.

    Jeremy
    • Your essay is the best summary I've seen so far of the reasons not to use yEnc. You have done a service to those of us who have been annoyed with yEnc -- now we don't have to explain it to anyone, we can just point them to your essay.

      So, be it resolved that yEnc leaves much to be desired.

      However, if yEnc is the impetus which actually gets the community moving toward implementing a good, solid standard, then it will have served its purpose. Perhaps if we had had yEnc 5 years ago, we would have a standard already. But we didn't, and now we must pay the piper.

      Since people aren't going to give up the advantages of yEnc without a substitute, the priority going forward is clear: to develop a better standard. If it truly is better (and not simply another hack) then ensuring its wide adoption shouldn't be too much of a problem. If, however, people can't be persuaded to switch, so much the worse for Usenet -- but no point in dwelling on doomsday scenarios. As you say, the cat is out of the bag, and all we can do is damage control.

  • by JohnA ( 131062 )
    I'm not sure if it is still true, but I know that Jeremy Nixon (the author of the article) worked at Supernews (now ReMarq) as one of their chief engineers. Not to be jaded, but it stands to reason that he would be against a technology that will decrease the data transferred by customers who pay by the gigabyte.
    • Actually, Supernews is now part of Critical Path (the RemarQ name went away). And yes, Jeremy Nixon does work for Supernews. He also wrote Cleanfeed, the anti-spam program that Usenet admins everywhere use to make news bearable to read.

      The "he's against it because it saves bandwidth" argument makes no sense. If it saves users a little bandwidth, it saves Supernews many many times that much bandwidth, lowering their costs (which means they don't have to charge users as much to provide the same service). It also saves disk space, meaning Supernews doesn't have to buy new disks quite as soon. And a good bit of Supernews' business is in the corporate (outsourced ISP) service, which they don't charge by the gigabyte (they have speed caps, not monthly download quotas).

      The problem is that any savings are just an illusion; this is just a momentary blip in the growth of Usenet. Since yEnc doesn't have the 100% market penetration that uuencode and MIME have, people are more likely to post binaries in multiple formats, causing storage and bandwidth needs to increase, not decrease.

    • If you think this is going to decrease the amount of data transferred by customers, you need to take off those rose-colored glasses. I suspect it won't affect that at all.

      Smaller encoding helps us (the providers), it doesn't hurt us. (Besides, one look at the Supernews price list versus the competition will reveal that metered individual accounts are hardly the main focus.) With the customers the actual money comes from, fewer downloads would increase the margins, not decrease them. In fact, to a lesser extent, it would do so with the individual accounts as well -- the pricing on individual Usenet accounts (at almost all of the top providers) is such that the margins are lower at higher download levels. At the highest levels, it's nearly break-even. So less downloading would even be somewhat better there.

      If you really think that I'm against a smaller binary encoding scheme, then you completely missed the point of my essay -- and you also must have missed the part about how it was my first experimental code, implementing smaller encoding, from which yEnc was hatched. If I truly wanted to avoid smaller encoding, why would I have implemented it in the first place? Why would I have done it in public and then sent the code to several people, explicitly releasing it into the public domain? You think I would have done that to prevent the spread of smaller encoding?

      Had Jürgen picked up where I left off and done the thing right, I would be singing an entirely different tune.

      Jeremy

  • by Sloppy ( 14984 ) on Saturday March 23, 2002 @11:07PM (#3214909) Homepage Journal

    If, by any chance, you're transferring things over a modem (v42bis' lzw) or ssh vpn (zlib's deflate) or possibly other types of links, then you're probably not going to notice a difference anyway. The systematic encoding inefficiency that goes with base64 and uuencoding results in a substantial lack of entropy that will be picked up on and exploited by good compression algorithms. The end result won't be quite as good as having efficient encoding to begin with, of course, but it will be in the same ballpark. There's no way it'll be anywhere near a 33% difference.

    This sounds like something that would have been useful 15 years ago before compression was widely used, and when people were still writing newsreaders. Now it looks like a waste of time and an excuse to get people to "upgrade" their software.
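
    The compression argument above can be tried on a small scale. In this Python sketch zlib's deflate stands in for whatever the link uses, and random bytes stand in for an already-compressed binary; both are assumptions, so treat the numbers as an indication rather than a measurement of any real modem or VPN.

```python
import base64
import os
import zlib

raw = os.urandom(200_000)            # stand-in for an already-compressed binary
b64 = base64.b64encode(raw)          # 6 bits of payload per 8-bit character
packed = zlib.compress(b64, 9)       # what a deflate-compressed link would carry

print(f"base64, uncompressed link: {len(b64) / len(raw):.2f}x the raw size")
print(f"base64, deflate on link:   {len(packed) / len(raw):.2f}x the raw size")
```

    The second figure should land close to 1.0, since deflate's Huffman stage recovers most of the two wasted bits per character. This says nothing about storage on the news server or about links with no compression, where the full overhead still applies.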

  • by roystgnr ( 4015 ) <roystgnrNO@SPAMticam.utexas.edu> on Saturday March 23, 2002 @11:51PM (#3215006) Homepage
    Can *anyone* look at the uuencoded, mime encoded, and other similarly mangled into 6bit, 70 character-per-line standards, and honestly tell me that Usenet was designed with binary file transmission in mind?

    There are no Usenet binary transmission standards, just a few different hacks to make it work. If this guy's new hack makes it work better, good for him.
  • I lost sympathy about here:

    A smaller encoding scheme gives us exactly one benefit: faster downloads and uploads for the users. It is not going to make Usenet smaller. It is not going to allow servers to increase retention. Do you really think people aren't going to post more, if they can do it faster? Of course they are. They're always going to post more, with or without yEnc [...] big deal.

    So what he's saying is, in effect: "this system changes nothing, and is of no benefit, except that it makes more data available on the Usenet and gives users faster uploads and downloads. So it's worthless."

    This guy obviously hasn't had to use a metered dial-up account for a while. A 33% saving on transfer times is an enormous benefit. I feel quite insulted by the way he seems to think it's of no importance, as if my time and money aren't worth anything. "What's the rush" indeed! I'd happily tear up MIME and MD5 tomorrow if it would speed up my transfers by a third.

    If yEnc is so widespread, it can only be because there's a demand for it. And if there's a demand for it, why the hell shouldn't programmers support it? Last time I checked, RFCs weren't enforced by law. The Net has seen a million non-standard hacks, and has, for the most part, assimilated the good ones and outlived the bad. yEnc is by no means the worst, and it brings real benefits to tens of thousands of people every day. I say leave it alone - or if you have to oppose it, at least oppose it constructively, for Christ's sake!
    • You just gotta love the speculation that runs rampant around here.

      FYI, I am using a metered dialup account (actually ISDN, which isn't much faster, just more reliable). I cannot get unlimited, high-speed access (yet, I keep hoping). I pay well over $100 per month for Internet access at a fraction of the speed of peoples' cable modems and DSL lines, and it would be quite a lot more if I left it nailed up 24x7.

      You, like several others here, seem to have somehow gotten the impression that I am opposed to smaller encoding. I can only conclude that someone is spoofing my web host's DNS and some of you are reading a different page than the one I wrote.

      Jeremy
  • If everyone is as anal as the poster of this story and cares about elegance and "pretty code" and whatnot, we all would not be using an x86 processor because its architecture is just "sooo damn ugly!!". Get over it. If it works, it works, does it really matter how anything is implemented if it does its job just perfectly fine?

    Personally, I will adopt anything that will make my life (not the developer's life) easier. yenc allows me to download 30-40% less, so I use it. Why? because if I don't use it, I won't be able to obtain the mp3s, the games, the warez, the pr0n, etc, etc, etc that I wanted :) Real life is not pretty, deal with it... stop living in fantasy world.
  • Good idea, but... (Score:3, Informative)

    by Firecaster ( 169752 ) <firecaster@aCHEETAHlcedea.net minus cat> on Sunday March 24, 2002 @12:09AM (#3215060) Homepage

    ... the implementation sucked.

    I leech regularly from alt.binaries.anime [animeusenet.org] and the related newsgroups. When the yEnc posts started coming in, I simply upgraded my newsreader [3dnews.net] to the newest version. But a LOT of people out there use Agent [forteinc.com], and it was an absolute pain to combine/decode all the yEnc posts that started popping up all over the place. The worst of it is that the yEnc posters were basically saying, "Start living in the present and upgrade". Never mind that at the time the only yEnc-capable newsreaders were for Windows...

    I mean, I don't know, but this sounds a lot like the OS wars that have been going on for quite some time. Some people simply don't want to have to switch newsreaders. Some people don't want to have to switch OSes. And that's fine, because it's a free world out there. I like the idea of yEnc (I get more out of my Easynews account), but I really don't think it should have been introduced so quickly.

    ~ Firecaster ~
  • Two Problems: (Score:5, Interesting)

    by NeuroManson ( 214835 ) on Sunday March 24, 2002 @12:09AM (#3215061) Homepage
    One: yENC, when it was unveiled, did not really allow most conventional newsreaders any opportunity to adapt, til after the fact. This is akin to perhaps releasing zip files long before any archival software was actually available to open them... So do most of the folks using usenet for binaries get the opportunity to at least *choose* the way they do their downloads? Nope, they also are forced to adapt, or lose out...

    Two: Loss in transmission... I've been downloading yENC attachments for the last month, and out of them, found over 50% loss/corruption in posting... Not due to retention/propagation either... Just files missing large chunks... Now this *could* be due to some problems on the senders' end, but it seems just a little *too* coincidental that almost all of the losses have occurred with yENC uploads...
  • by topham ( 32406 ) on Sunday March 24, 2002 @12:28AM (#3215110) Homepage
    This just reminds me of the napster data format. Anybody ever read the reverse engineered specs? It's scary. It looks like it was designed by a monkey. And not a smart one.

    yEnc sounds like a good idea, and a horribly bad implementation.
  • by Charles Kerr ( 568574 ) on Sunday March 24, 2002 @01:00AM (#3215183) Homepage
    I'm one of the authors of the Pan newsreader [rebelbase.com] and agree with Jeremy's analysis of yEnc. yEnc repeats many of uu's mistakes, so news clients have to search text/plain messages for =ybegin and =yend blocks instead of looking in the headers.
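
    For readers wondering what that body search looks like, here is a minimal Python sketch. It is not Pan's code. It assumes header lines of the form "=ybegin line=128 size=5 name=some file.bin" with name= as the last field, skips =ypart lines, and does no decoding or CRC checking; the function name and the sample article are made up.

```python
def find_yenc_blocks(body: bytes):
    """Yield (header_fields, encoded_lines) for each =ybegin ... =yend block."""
    fields, lines = None, []
    for line in body.splitlines():
        if line.startswith(b"=ybegin "):
            text = line[len(b"=ybegin "):].decode("latin-1")
            # name= is taken to be the last field, so the filename may hold spaces
            head, _, name = text.partition(" name=")
            fields = dict(kv.split("=", 1) for kv in head.split())
            fields["name"] = name
            lines = []
        elif fields is None:
            continue                  # ordinary text outside any block
        elif line.startswith(b"=yend"):
            yield fields, lines
            fields = None
        elif not line.startswith(b"=ypart "):
            lines.append(line)        # still-encoded data line

article = (
    b"Here is the file.\r\n"
    b"=ybegin line=128 size=5 name=hello world.txt\r\n"
    b"\x92\x8f\x96\x96\x99\r\n"
    b"=yend size=5\r\n"
)
blocks = list(find_yenc_blocks(article))
print(blocks[0][0]["name"], len(blocks[0][1]))
```

    The point of contention is visible in the sketch: the client has to read every line of an otherwise plain-text body to discover that an attachment exists at all, where a MIME part would announce itself in the headers.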

    But yEnc's bandwidth savings are real, which is a huge win for alt.binaries users. yEnc has been the most-requested feature for Pan over the last month. (0.11.2.90 [rebelbase.com] supports it.) IMO yEnc is the format to use for multiparts right now.

    Hopefully yEnc will motivate others to come up with a mime-friendly alternative encoding for Usenet. yEnc Considered Harmful [faerber.muc.de] is another yEnc opposition page that suggests mzip compression [faerber.muc.de], but I haven't seen any public discussion of it yet.

    If/when such a replacement comes along, Pan will support it too and add an are-you-sure dialog for yEnc postings.

  • by medcalf ( 68293 ) on Sunday March 24, 2002 @01:28AM (#3215244) Homepage
    It used to be that someone did something useful, then the community, through use choices, adopted it as standard. Then, if there were flaws, these would be ironed out with an updated standard, usually all or mostly backwards-compatible with the original implementation. It's gotten to where new standards are useless, either because companies (like, say, RealNetworks or MS) refuse to submit their protocols/formats for public use/review, or because the standards committees (say, for Java (before it was pulled) or the W3C) argue for years without actually doing anything.

    I, for one, am happy to see a useful format publicly available.

"Trust me. I know what I'm doing." -- Sledge Hammer
