ZeoSync Makes Claim of Compression Breakthrough 989

dsb42 writes: "Reuters is reporting that ZeoSync has announced a breakthrough in data compression that allows for 100:1 lossless compression of random data. If this is true, our bandwidth problems just got a lot smaller (or our streaming video just became a lot clearer)..." This story has been submitted many times due to the astounding claims - ZeoSync explicitly claims to have superseded Claude Shannon's work. The "technical description" on their website is less than impressive. I think the odds of this being true are slim to none, but here you go, math majors and EEs - something to liven up your drab, dull existence today. Update: 01/08 13:18 GMT by M : I should include a link to their press release.
This discussion has been archived. No new comments can be posted.

  • Current ratio? (Score:2, Interesting)

    by L-Wave ( 515413 )
    Excuse my lack of compression knowledge, but what's the current ratio? I'm assuming 100:1 is pretty damn good. =) BTW, even though this *might* be a good compression algorithm and all that, how long would it take to decompress a file using your Joe Average computer?
    • Re:Current ratio? (Score:3, Informative)

      by CaseyB ( 1105 )
      but what's the current ratio?

      For truly random data? 1:1 at the absolute best.

      • Re:Current ratio? (Score:2, Redundant)

        by CaseyB ( 1105 )
        That's not right. A 1:1 average for a large sample of random data is the best you can ever do. On a case by case basis, you can get lucky and do better, but no algorithm can compress arbitrary random data at better than 1:1 in the long run.
    • Re:Current ratio? (Score:5, Informative)

      by radish ( 98371 ) on Tuesday January 08, 2002 @08:30AM (#2803216) Homepage

      For lossless (e.g. zip, not jpg, mpg, divx, mp3, etc.) you are looking at about 2:1 for typical 8-bit binary data, and much better (50:1?) for ASCII text (i.e. 7-bit, non-random).

      If you're willing to accept loss, then the sky's the limit; MP3 at 128 kbps is about 12:1 compared to a 44 kHz, 16-bit WAV.
    • The maximum compression ratio for random data is 1. That's no compression at all.
    • Re:Current ratio? (Score:5, Informative)

      by markmoss ( 301064 ) on Tuesday January 08, 2002 @09:01AM (#2803418)
      whats the current ratio? I would take the *zip algorithms as a standard. (I've seen commercial backup software that takes twice as long to compress the data as Winzip but leaves it 1/3 larger.) Zip will compress text files (ASCII such as source code, not MS Word) at least 50% (2:1) if the files are long enough for the most efficient algorithms to work. Some highly repetitive text formats will compress by over 90% (10:1). Executable code compresses by 30 to 50%. AutoCAD .DWG (vector graphics, binary format) compresses around 30%. Back when it was practical to use PKzip to compress my whole hard drive for backup, I expected about 50% average compression. This was before I had much bit-mapped graphics on it.

      Bit-mapped graphic files (BMP) vary widely in compressibility depending on the complexity of the graphics, and whether you are willing to lose more-or-less invisible details. A BMP of black text on white paper is likely to zip (losslessly) by close to 100:1 -- and fax machines perform a very simple compression algorithm (sending white*number of pixels, black*number of pixels, etc.) that also approaches 100:1 ratios for typical memos. Photographs (where every pixel is colored a little differently) don't compress nearly as well; the JPEG format exceeds 10:1 compression, but I think it loses a little fine detail. And JPEGs compress by less than 10% when zipped.

      IMHO, 100:1 as an average (compressing your whole hard drive, for example) is far beyond "pretty damn good" and well into "unbelievable". I know of only two situations where I'd expect 100:1. One is the case of a bit-map of black and white text (e.g., faxes); the other is lossy compression of video when you apply enough CPU power to use every trick known.
  • how can this be? (Score:3, Informative)

    by posmon ( 516207 ) on Tuesday January 08, 2002 @08:14AM (#2803128) Homepage
    Even lossless compression still relies on redundancy within the data, normally repeating patterns. Surely 100:1 on TRUE random data is impossible?
    • by jrockway ( 229604 ) <jon-nospam@jrock.us> on Tuesday January 08, 2002 @08:21AM (#2803159) Homepage Journal
      I'm going to agree with you here. If there's no pattern in the data, how can you find one and compress it? The reason things like gzip work well on C files (for instance) is because C code is far from random. How many times do you use void or int in a C file? A lot :)

      Try compressing a wav or mpeg file with gzip. It doesn't work too well, because the data is "random", at least in the sense of the raw numbers. When you look at the patterns that the data forms (i.e. pictures, and relative motion) then you can "compress" that.
      Here's my test for random compression :)

      $ dd if=/dev/urandom of=random bs=1M count=10
      $ du random
      11M random
      11M total
      $ gzip -9 random
      $ du random.gz
      11M random.gz
      11M total
      $

      no pattern == no compression
      prove me wrong, please :)
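      For anyone who wants to repeat that test without a shell handy, here is a minimal Python sketch of the same experiment (standard library only; the exact sizes vary a little from run to run):

      import os
      import zlib

      # 10 MB of OS-supplied randomness, compressed as hard as zlib allows.
      data = os.urandom(10 * 1024 * 1024)
      packed = zlib.compress(data, 9)

      print(len(data), len(packed))
      # Typical result: the "compressed" output is a few hundred bytes LARGER
      # than the input, since zlib has to add its own framing around data it
      # cannot shrink.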
      • They just threw out information theory entirely... too restrictive. They came up with their own theory... disinformation theory! Everyone seems to be jumping on the bandwagon, too... these guys [wired.com] even compiled a list of the pioneers!
      • by Rentar ( 168939 ) on Tuesday January 08, 2002 @08:26AM (#2803185)
        I'm going to agree with you here. If there's no pattern in the data, how can you find one and compress it. The reason things like gzip work well on c files (for instance) is because C code is far from random. How many times do you use void or int in a C file? a lot :)

        So a Perl program can't be compressed?

      • Well, I can think of two ways that "random" data might be compressed without an obvious pattern:

        * If the data were represented a different way (say, using bits instead of bytesize data) then patterns might emerge, which would then be compressible. Of course, the $64k question is: will it be smaller than the original data?

        * If the set of data doesn't cover all possibilities of the encoding (i.e. only 50 characters out of 256 are actually present), then a recoding might be able to compress the data using a smaller "byte" size. In this case, 6 bits per character instead of 8. The problem with this one is that you have to scan through all of the data before you can determine the optimal bytesize... and then it still may end up being 8.
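        A minimal sketch of that second idea, assuming a made-up 50-symbol alphabet packed at 6 bits per symbol (the names here are purely illustrative):

        from math import ceil, log2

        def pack(symbols, alphabet):
            """Re-encode symbols at ceil(log2(len(alphabet))) bits each."""
            bits_per = max(1, ceil(log2(len(alphabet))))
            index = {s: i for i, s in enumerate(alphabet)}
            out = 0
            for s in symbols:
                out = (out << bits_per) | index[s]
            # Return the packed integer and its size rounded up to whole bytes.
            return out, ceil(bits_per * len(symbols) / 8)

        alphabet = [bytes([c]) for c in range(50)]          # only 50 of 256 byte values in use
        message = [alphabet[i % 50] for i in range(1000)]   # 1000 original bytes
        _, packed_size = pack(message, alphabet)
        print(packed_size, "bytes instead of", len(message))  # 750 bytes instead of 1000

        As the parent points out, you still have to scan the data first and ship the alphabet along with it, and for data that actually uses all 256 byte values this buys nothing.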
        • by Dr_Cheeks ( 110261 )
          If the data was represented a different way (say, using bits instead of bytesize data) then patterns might emerge...
          With truly random data [random.org] there's no pattern to find, assuming you're looking at a large enough sample, which is why everyone else on this thread is talking about the maximum compression for such data being 1:1. However, since "ZeoSync said its scientific team had succeeded on a small scale" it's likely that whatever algorithm they're using works only in limited cases.

          Shannon's work on information theory is over 1/2 a century old and has been re-examined by thousands of extremely well-qualified people, so I'm finding it rather hard to accept that ZeoSync aren't talking BS.

      • yes, but, /dev/urandom isn't really random... if gzip was 'smart' enough, it could figure out the seed & algorithm for /dev/urandom and just save the output data that way. We don't really have any good way of generating really random data, so theoretically all data is not random and therefore arbitrarily compressible. In practice, of course, this is bullshit, and I think this press release will prove to be as well.
      • by nusuth ( 520833 ) <(moc.oohay) (ta) (su0000_oooo)> on Tuesday January 08, 2002 @01:19PM (#2804742) Homepage
        I have been pretty late to this thread, and I'm sorry if this is redundant. I just can't read all 700 posts.

        1:100 average compression on all data is just impossible. And I don't mean "improbable" or "I don't believe that"; it is impossible. The reason is the pigeonhole principle. For simplicity, assume that we are talking about 1000-bit files: although you can compress some of these 1000-bit files to just 10 bits, you cannot possibly compress all of them to 10 bits, as 10 bits give just 1024 different configurations while 1000 bits call for representations of 2^1000 different configurations. If you can compress the first 1024, there is simply no room to represent the remaining 2^1000 - 1024 files.

        ...And that is assuming the compression header takes no space at all...

        So every lossless compression algorithm that can represent some files with other, shorter files must expand some other files. Higher compression on some files means the number of files that do not compress at all is also greater. An average compression ratio other than 1 is only achievable if there is some redundancy in the original encoding. I guess you can call that redundancy "a pattern." Rar, zip, gzip etc. all achieve less than a 1 compressed/original length ratio on average because there is redundancy in the originals: programs with instructions and prefixes that occur frequently, pictures that are represented with a full dword although they use only a few thousand colors, sound files almost devoid of very low and very high values because of recording conditions, etc. No compression algorithm can achieve a ratio of less than 1 averaged over all possible strings. It is a simple consequence of the pigeonhole principle and cannot be tricked.
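        The arithmetic behind that argument is easy to check directly; a quick sketch using the same numbers (1000-bit files squeezed into 10 bits):

        n_in, n_out = 1000, 10

        inputs = 2 ** n_in      # distinct 1000-bit files
        outputs = 2 ** n_out    # distinct 10-bit files

        print(outputs)          # 1024
        print(len(str(inputs))) # 302 digits, vs. 4 digits for the outputs

        # More generally, the number of strings strictly shorter than n bits is
        # 1 + 2 + 4 + ... + 2**(n-1) == 2**n - 1, one fewer than the number of
        # n-bit strings, so no lossless scheme can shrink every n-bit input.
        shorter = sum(2 ** k for k in range(n_in))
        print(shorter == 2 ** n_in - 1)   # True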

    • Re:how can this be? (Score:3, Interesting)

      by Shimbo ( 100005 )
      They don't claim they can compress TRUE random data, only 'practically random' data. Now, the digits of Pi are a good source of 'practically random' data, for some definition of the phrase 'practically random'.
    • Re:how can this be? (Score:2, Informative)

      by mccalli ( 323026 )
      even lossless compression still relies on...normally repeating patterns of data. surely 100-1 on TRUE random data is impossible?

      However, in truly random data such patterns will exist from time to time. For example, I'm going to randomly type on my keyboard now (promise this isn't fixed...):

      oqierg qjn.amdn vpaoef oqleafv z

      Look at the data. No patterns. Again....

      oejgkjnfv,cm v;aslek [p'wk/v,c

      Now look - two occurrences of 'v,c'. Patterns have occurred in truly random data.

      Personally, I'd tend to agree with you and consider this not possible. But I can see how patterns might crop up in random data, given a sufficiently large amount of source data to work with.

      Cheers,
      Ian

      • Re:how can this be? (Score:3, Informative)

        by s20451 ( 410424 )
        Of course patterns occur in random data. For example, if you toss a fair coin for a long time, you will get runs of three, four, or five heads which recur from time to time. The point is that in random, incompressible data, the probability of occurrence for any given pattern is the same as for any other pattern of the same length.
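        A small sketch of that point, using Python's random module as a stand-in for a fair coin: every 3-bit pattern turns up, and each turns up about equally often.

        import random
        from collections import Counter

        random.seed(1)   # any seed; the counts stay roughly uniform
        bits = [random.randint(0, 1) for _ in range(100_000)]

        # Count every overlapping 3-bit window.
        counts = Counter(tuple(bits[i:i + 3]) for i in range(len(bits) - 2))
        for pattern, count in sorted(counts.items()):
            print(pattern, count)   # each of the 8 patterns lands near 12,500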
    • I realize that what I'm about to propose does not work. The challenge is to figure out why.

      Here's a proposal for a compression scheme that has the following properties:

      1. It works on all bit strings of more than one bit.

      2. It is lossless and reversible.

      3. It never makes the string larger. There are some strings that don't get smaller, but see item #4.

      4. You can iterate it, to reduce any string down to 1 bit! You can use this to deal with pesky strings that don't get smaller. After enough iterations, they will be compressed.

      OK, here's my algorithm:

      Input: a string of N bits, numbered 0 to N-1.

      If all N bits are 0, the output is a string of N-1 1's. Otherwise, find the lowest numbered 1 bit. Let its position be i. The output string consists of N bits, as follows:

      Bits 0, 1, ... i-1 are 1's. Bit i is 0. Bits i+1, ..., N-1 are the same as the corresponding input bits.

      Again, let me emphasize that this is not a usable compression method! The fun is in finding the flaw.
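      For anyone who wants to play with it, here is a direct sketch of the proposed transform (the step counts it prints are a pretty big hint about where the flaw lives):

      def step(bits):
          """One application of the proposed scheme; bits is a list of 0/1."""
          if all(b == 0 for b in bits):
              return [1] * (len(bits) - 1)   # the only case that gets shorter
          i = bits.index(1)                  # lowest-numbered 1 bit
          return [1] * i + [0] + bits[i + 1:]

      def compress_to_one_bit(bits):
          """Iterate until a single bit remains; return that bit and the step count."""
          steps = 0
          while len(bits) > 1:
              bits = step(bits)
              steps += 1
          return bits, steps

      print(compress_to_one_bit([0, 1, 1, 0]))   # ([1], 19)
      print(compress_to_one_bit([1] * 16)[1])    # 131068 steps for a 16-bit input

      Note that every input eventually becomes the single bit 1; think about what you would have to write down to get your data back.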

    • True random data, however, is extremely rare. Even the random number generator algorithms used on PCs don't generate truly random numbers, but rather "semirandom" numbers resulting from a number of operations applied to the current timestamp. If you pull bytes out of /dev/random at specified intervals for a long enough time, you will eventually be able to discern what pattern connects these semirandom numbers to the time.

      As far as we can tell, the digits of Pi are random. They are also, however, based on mathematical relationships which can be modeled to find patterns in the digits. There are formulae to calculate any individual digit of Pi in both hexadecimal and decimal number systems, as well as known relations like e^(i*Pi) = -1.

      Anyway, the press release says that the algorithm is effective for practically random data. I'm not sure exactly what this means, but I would guess that it applies to data that is in some way human-generated. Text files might contain, say, many instances of the text strings "and" and "the", no matter what their overall content. Even media files have loads of patterns, both in their structure (16 bit chunks of audio, or VGA-sized frames) and in their content (the same background from image to image in a video, for example). Even in something as complex as a high resolution video (which we'll take to be "practically random"), there are many patterns which can be exploited for compression.
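      That distinction is easy to see empirically. A rough sketch with zlib (the repeated filler below compresses far harder than real prose, which usually lands nearer 2:1 or 3:1, but the contrast with random bytes is the point):

      import os
      import zlib

      text = (b"the quick brown fox jumps over the lazy dog and then "
              b"the other dog jumps over the first one ") * 2000
      rand = os.urandom(len(text))

      for label, blob in (("text-like", text), ("random", rand)):
          out = zlib.compress(blob, 9)
          print(label, len(blob), "->", len(out))
      # The text-like input shrinks enormously; the random input does not shrink at all.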
    • Re:how can this be? (Score:5, Informative)

      by ergo98 ( 9391 ) on Tuesday January 08, 2002 @09:17AM (#2803505) Homepage Journal

      Well, firstly, I'd say the press release gives a pretty clear picture of the reality of their technology: it has such an overuse of supposedly TM'd "technoterms" like "TunerAccelerator" and "BinaryAccelerator" (anyone want to double-check the filings? I'm going to guess that there are none) that it just screams hoax (or creative deception), not to mention a use of Flash that makes you want to punch something. Note that they give themselves huge openings, such as always saying "practically random" data: what the hell does that mean?

      I think one way to understand it (because all of us at some point or another have thought up some half-assed, ridiculous way of compressing any data down to 1/10th: "Maybe I'll find a denominator and store that with a floating point representation of..."), and I'm saying this not as a mathematician or compression expert: Let's say for instance that this compression ratio is 10 to 1 on random data, and I have every possible random document 100 bytes long. That means I have 6.6680144328798542740798517907213e+240 different random documents (256^100). So I compress them all into 10-byte documents, but the maximum number of variations of a 10-byte document is 1208925819614629174706176: there isn't the entropy in a 10-byte document to store 6.6680144328798542740798517907213e+240 different possibilities (it is simply impossible, no matter how many QuantumStreamTM HyperTechTM TechnoBabbleTM TermsTM). You end up needing, tada, 100 bytes to have the entropy to possibly store all variants of a 100-byte document, but of course most compression routines put in various logic codes and actually increase the size of the document. In the case of the ZeoSync claim, though, they're apparently claiming that somehow you'll represent 6.6680144328798542740798517907213e+240 different variations in a single byte: so somehow 64 tells you "Oh yeah, that's variation 5.5958572359823958293589253e+236!". Maybe they're using SubSpatialQuantumBitsTM.

      • The output from a pseudo-random number generator is usually considered "random enough for practical purposes." So if you define "practically random data" as "data that is random enough for practical purposes," you can compress it by storing the random seed and the string length. ;-)

        I think I can beat their 100:1 compression ratio with this scheme.
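        In the same tongue-in-cheek spirit, a sketch of that "compressor" for pseudo-random data, assuming both ends agree on the generator (Python 3.9+ for randbytes):

        import random

        def compress_prng_output(seed, n_bytes):
            """'Compress' n_bytes of PRNG output down to just (seed, n_bytes)."""
            return (seed, n_bytes)

        def decompress(seed, n_bytes):
            return random.Random(seed).randbytes(n_bytes)

        original = random.Random(42).randbytes(1_000_000)   # 1 MB of "practically random" data
        archive = compress_prng_output(42, 1_000_000)        # a handful of bytes

        assert decompress(*archive) == original              # bit-perfect, way past 100:1
        # The catch, of course: genuinely random data has no seed to recover.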
    • by FlatEarther ( 549227 ) on Tuesday January 08, 2002 @11:41AM (#2804236)
      It is possible despite the many (uninformed) negative comments that have appeared concerning this truly amazing breakthrough in compression technology. I, myself, using my own patented compression technology - The Shannon-Transmogrificator (TM) - have managed to compress the entire Reuters article to a mere 4 ASCII characters (!), with essentially no loss in meaning: 'C', 'R', 'A', 'P'. I wonder if anyone can improve on this?
    • by Alsee ( 515537 ) on Tuesday January 08, 2002 @12:55PM (#2804624) Homepage
      Note the results are "BitPerfectTM", rather than simply saying "perfect". They try to hide it, but they are using lossy compression. That is why repeated compression makes it smaller, more loss.

      "Singular-bit-variance" and "single-point-variance" mean errors.

      The trick is that they aren't randomly throwing away data. They are introducing a carefully selected error to change the data to a version that happens to compress really well. If you have 3 bits, and introduce a 1 bit error in just the right spot, it will easily compress to 1 bit.

      000 and 111 both happen to compress really well, so...

      000: leave as is. Store it as a single zero bit
      001: add error in bit 3 turns it into 000
      010: add error in bit 2 turns it into 000
      011: add error in bit 1 turns it into 111
      100: add error in bit 1 turns it into 000
      101: add error in bit 2 turns it into 111
      110: add error in bit 3 turns it into 111
      111: leave as is. Store it as a single one bit.

      They are using some pretty hairy math for their list of strings that compress the best. The problem is that there is no easy way to find the string that is almost the same as your data and just happens to be really compressible. That is why they are having "temporal" problems for anything except short test cases.

      Basically it means they *might* have a breakthrough for audio/video, but it's useless for executables etc.

      -
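      The table above is just a majority vote over each 3-bit block; a toy sketch of that reading (an illustration of the parent's interpretation, not anything ZeoSync has published):

      def encode(block):
          """Lossily map a 3-bit block to one bit by majority vote."""
          return 1 if sum(block) >= 2 else 0

      def decode(bit):
          """Expand the stored bit back into a 3-bit block; at most one bit is wrong."""
          return (bit, bit, bit)

      for block in [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1),
                    (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]:
          restored = decode(encode(block))
          errors = sum(a != b for a, b in zip(block, restored))
          print(block, "->", encode(block), "->", restored, errors, "bit(s) wrong")
      # 3:1 "compression", but six of the eight inputs come back with a one-bit error.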
  • by Mr Thinly Sliced ( 73041 ) on Tuesday January 08, 2002 @08:14AM (#2803129) Journal
    They claim 100:1 compression for random data. The thing is, if that's true, then let's say we have data A of size 1000.

    compress(A) = B

    Now B is 1/100th the size of A, right? But it, too, is random (size 100).

    On we go:
    compress(B) = C (size is now 10)
    compress(C) = D (size 1).

    So everything compresses into 1 byte.

    Or am I missing something?

    Mr Thinly Sliced
    • by oyenstikker ( 536040 ) <slashdot @ s b y rne.org> on Tuesday January 08, 2002 @08:19AM (#2803147) Homepage Journal
      Maybe they'll be able to compress their debt to $1 when they go under.
    • No...the compressed data is almost certainly NOT random, so it couldn't be compressed the same way. It's also highly unlikely any other compression scheme could reduce it either.

      I'm very, very skeptical of 100:1 claims on "random" data -- it must either be large enough that even being random, there are lots of repeated sequences, or the test data is rigged.

      Or, of course, it could all be a big pile of BS designed to encourage some funding/publicity.

      Xentax
    • by arkanes ( 521690 ) <(arkanes) (at) (gmail.com)> on Tuesday January 08, 2002 @08:21AM (#2803161) Homepage
      I suspect that when they say "random" data, they are using marketing-speak random, not math-speak random. Therefore, by 'random', they mean "data with lots of repetition like music or video files, which we'll CALL random because none of you copyright-infringing IP thieving pirates will know the difference"

      • I suspect that when they say "random" data, they are using marketing-speak random, not math-speak random. Therefore, by 'random', they mean "data with lots of repetition like music or video files, which we'll CALL random because none of you copyright-infringing IP thieving pirates will know the difference"


        Actually, if you change the domain you can get what appears to be impressive compression. Consider a bitmapped picture of a child's line drawing of a house. Replace that by a description of the drawing commands. Of course you have not violated Shannon's theorem because the amount of information in the original drawing is actually low.

        At one time commercial codes were common. They were not used for secrecy, but to transmit large amounts of information when telegrams were charged by the word. The recipient looked up the code number in his codebook and reconstructed a lengthy message: "Don't buy widgets from this bozo. He does not know what he is doing."

        If you have a restricted set of outputs that appear to be random but are not, i.e. white noise sample #1, white noise sample #2 ... all you need to do is send 1, 2... and voila!
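        A telegraph codebook is really just a shared dictionary; a tiny sketch (the entries are made up):

        # Agreed on ahead of time by sender and recipient -- the "codebook".
        CODEBOOK = {
            17: "Don't buy widgets from this bozo. He does not know what he is doing.",
            18: "Shipment delayed two weeks. Please advise the buyer.",
            19: "Price accepted. Wire funds at once.",
        }

        def send(message_number):
            return str(message_number)      # a couple of characters on the wire

        def receive(wire_text):
            return CODEBOOK[int(wire_text)]

        print(receive(send(17)))
        # The "compression" is enormous, but only because the set of possible messages
        # was fixed in advance -- no violation of Shannon anywhere.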

    • This is a proof (though I doubt it is a scientifically correct one) that you can't get lossless compression with a constant compression factor! What they claim would be theoretically possible if 100:1 were an average, but I still don't think this is possible.

    • B is not random. It is a description (in some format) of A.

      But what you say does have merit, and this is why compressing a ZIP doesn't do much - there is a limit on repeated compression, because the particular algorithm will output data which it itself is very bad at compressing further (if it didn't, why not iterate once more and produce a smaller file internally?).
    • by Mr Thinly Sliced ( 73041 ) on Tuesday January 08, 2002 @08:34AM (#2803248) Journal
      Not only that, but I just hacked their site and downloaded the entire source tree. Here it is:

      01101011

      Pop that baby in an executable shell script. It's a self-extracting
      ./configure
      ./make
      ./make install

      Shh. Don't tell anyone.

      Mr Thinly Sliced
    • From their press release:
      Current technologies that enable the compression of data for transmission and storage are generally limited to compression ratios of ten-to-one. ZeoSync's Zero Space Tuner(TM) and BinaryAccelerator(TM) solutions, once fully developed, will offer compression ratios that are anticipated to approach the hundreds-to-one range
      What I read this to mean is that for some data sets, they anticipate 100:1 (or more) compression. For 'random' data, they will get some compression. Also note the 'once fully developed' phrase and the word 'anticipated'; they haven't actually achieved these results as yet; until they do, this is vapourware.

      BTW, someone shoot them for using so many TMs...

    • by swillden ( 191260 ) <shawn-ds@willden.org> on Tuesday January 08, 2002 @08:48AM (#2803350) Journal

      So everything compresses into 1 byte.

      Duh, are you like an idiot or something?

      When you send me a one-byte copy of, say, The Matrix, you also have to tell me how many times it was compressed so I know how many times to run the decompressor!

      So everything compresses to *two* bytes. Maybe even three bytes if something is compressed more than 256 times. That's only required for files whose initial size is more than 100^256, though, so two bytes should do it for most applications.

      Jeez, the quality of math and CS education has really gone down the tubes.

      • by pmc ( 40532 ) on Tuesday January 08, 2002 @09:12AM (#2803479) Homepage
        Duh, are you like an idiot or something?

        You're the moron, moron. When you get the one byte compressed file, you run the decompressor once to get the number of additional times to run the decompressor.

        What are they teaching the kids today? Shannon-shmannon nonsense, no doubt. They should be doing useful things, like Marketing and Management Science. There's no point in being able to count if you don't have any money.
    • by Bandman ( 86149 ) <bandman@gm a i l .com> on Tuesday January 08, 2002 @09:12AM (#2803480) Homepage
      I get the idea that this part of the algorithm is perfected by them... it's the decompressor that's giving them fits...

      Step 1: Steal Underpants
      Step 3: Profit!

      We're still working on step 2
  • Maybe they just needed more bandwidth for their terrible site?
  • by Anonymous Coward on Tuesday January 08, 2002 @08:15AM (#2803134)
    The odds on a compression claim turning out to be true are always identical to the compression ratio claimed?
  • by bleeeeck ( 190906 ) on Tuesday January 08, 2002 @08:15AM (#2803135)
    ZeoSynch's Technical Process: The Pigeonhole Principle and Data Encoding Dr. Claude Shannon's dissertation on Information Theory in 1948 and his following work on run-length encoding confidently established the understanding that compression technologies are "all" predisposed to limitation. With this foundation behind us we can conclude that the effort to accelerate the transmission of information past the permutation load capacity of the binary system, and past the naturally occurring singular-bit-variances of nature can not be accomplished through compression. Rather, this problem can only be successfully resolved through the solution of what is commonly understood within the mathematical community as the "Pigeonhole Principle."

    Given a number of pigeons within a sealed room that has a single hole, and which allows only one pigeon at a time to escape the room, how many unique markers are required to individually mark all of the pigeons as each escapes, one pigeon at a time?

    After some time a person will reasonably conclude that:
    "One unique marker is required for each pigeon that flies through the hole, if there are one hundred pigeons in the group then the answer is one hundred markers". In our three dimensional world we can visualize an example. If we were to take a three-dimensional cube and collapse it into a two-dimensional edge, and then again reduce it into a one-dimensional point, and believe that we are going to successfully recover either the square or cube from the single edge, we would be sorely mistaken.

    This three-dimensional world limitation can however be resolved in higher dimensional space. In higher, multi-dimensional projective theory, it is possible to create string nodes that describe significant components of simultaneously identically yet different mathematical entities. Within this space it is possible and is not a theoretical impossibility to create a point that is simultaneously a square and also a cube. In our example all three substantially exist as unique entities yet are linked together. This simultaneous yet differentiated occurrence is the foundation of ZeoSync's Relational Differentiation Encoding(TM) (RDE(TM)) technology. This proprietary methodology is capable of intentionally introducing a multi-dimensional patterning so that the nodes of a target binary string simultaneously and/or substantially occupy the space of a Low Kolmogorov Complexity construct. The difference between these occurrences is so small that we will have for all intents and purposes successfully encoded lossley universal compression. The limitation to this Pigeonhole Principle circumvention is that the multi-dimensional space can never be super saturated, and that all of the pigeons can not be simultaneously present at which point our multi-dimensional circumvention of the pigeonhole problem breaks down.

    • If I recall my set theory properly the "Pigeon Hole Principle" simply states that if you have 100 holes and 101 pigeons, when you distribute all the pigeons into all holes, there will be at least one hole with at least two pigeons.

      I don't recall any of this crap about pigeons flying out of boxes. Or am I getting old?

  • Is this April 1st? (Score:3, Informative)

    by tshoppa ( 513863 ) on Tuesday January 08, 2002 @08:16AM (#2803136)
    This has *long* been an April 1st joke published in such hallowed rags as BYTE and Datamation for at least as long as I've been reading them (20 years).

    The punchline to the joke was always along the lines of

    Of course, since this compression works on random data, you can repeatedly apply it to previously compressed data. So if you get 100:1 on the first compression, you get 10000:1 on the second and 1000000:1 on the third.
    • But this is no joke.

      Please note they claim to be able to compress data 100:1, but do not say they can decompress the resultant data back to the original.

      By the way, so can I.
      Give me your data, of any sort, of any size, and I will make it take up zero space.

      Just don't ask for it back.

  • Press Release here (Score:2, Informative)

    by thing12 ( 45050 )
    If you don't want to wade through the flash animations...

    http://www.zeosync.com/flash/pressrelease.htm [zeosync.com]

  • a breakthrough in data compression that allows for 100:1 lossless compression of random data.
    That's fine if you only have random [tuxedo.org] data - but a lot of mine is non-random ;o)
    - Derwen

  • No Way... (Score:2, Redundant)

    Pure random data is impossible to compress - if you compress 1MB of random data (proper random data, not pseudo-random) and you get, say, 100K's worth of compressed output, what's stopping you feeding this 100K back through the algorithm and reducing it down even more... again, and again, until the whole 1MB is squashed into a byte! (Which, obviously, is a load of rubbish.)
  • by neo ( 4625 ) on Tuesday January 08, 2002 @08:19AM (#2803150)
    ZeoSync said its scientific team had succeeded on a small scale in compressing random information sequences in such a way as to allow the same data to be compressed more than 100 times over -- with no data loss. That would be at least an order of magnitude beyond current known algorithms for compacting data.

    ZeoSync announced today that the "random data" they were referencing is a string of all zeros. Technically this could be produced randomly, and our algorithm reduces this to just a couple of characters - a 100 times compression!!
  • The pressrelease (Score:4, Informative)

    by grazzy ( 56382 ) <grazzy AT quake DOT swe DOT net> on Tuesday January 08, 2002 @08:20AM (#2803153) Homepage Journal
    ZEOSYNC'S MATHEMATICAL BREAKTHROUGH OVERCOMES LIMITATIONS OF DATA COMPRESSION THEORY

    International Team of Scientists Have Discovered
    How to Reduce the Expression of Practically Random Information Sequences

    WEST PALM BEACH, Fla. - January 7, 2001 - ZeoSync Corp., a Florida-based scientific research company, today announced that it has succeeded in reducing the expression of practically random information sequences. Although currently demonstrating its technology on very small bit strings, ZeoSync expects to overcome the existing temporal restraints of its technology and optimize its algorithms to lead to significant changes in how data is stored and transmitted.

    Existing compression technologies are currently dependent upon the mapping and encoding of redundantly occurring mathematical structures, which are limited in application to single or several pass reduction. ZeoSync's approach to the encoding of practically random sequences is expected to evolve into the reduction of already reduced information across many reduction iterations, producing a previously unattainable reduction capability. ZeoSync intentionally randomizes naturally occurring patterns to form entropy-like random sequences through its patent pending technology known as Zero Space Tuner(TM). Once randomized, ZeoSync's BinaryAccelerator(TM) encodes these singular-bit-variance strings within complex combinatorial series to result in massively reduced BitPerfect(TM) equivalents. The combined TunerAccelerator(TM) is expected to be commercially available during 2003.

    According to Peter St. George, founder and CEO of ZeoSync and lead developer of the technology: "What we've developed is a new plateau in communications theory. Through the manipulation of binary information and translation to complex multidimensional mathematical entities, we are expecting to produce the enormous capacity of analogue signaling, with the benefit of the noise free integrity of digital communications. We perceive this advancement as a significant breakthrough to the historical limitations of digital communications as it was originally detailed by Dr. Claude Shannon in his treatise on Information Theory." [C.E. Shannon. A Mathematical Theory of Communication. Bell System Technical Journal, 27:379-423, 623-656, 1948]

    "There are potentially fantastic ramifications of this new approach in both communications and storage," St. George continued. "By significantly reducing the size of data strings, we can envision products that will reduce the cost of communications and, more importantly, improve the quality of life for people around the world regardless of where they live."

    Current technologies that enable the compression of data for transmission and storage are generally limited to compression ratios of ten-to-one. ZeoSync's Zero Space Tuner(TM) and BinaryAccelerator(TM) solutions, once fully developed, will offer compression ratios that are anticipated to approach the hundreds-to-one range.

    Many types of digital communications channels and computing systems could benefit from this discovery. The technology could enable the telecommunications industry to massively reduce huge amounts of information for delivery over limited bandwidth channels while preserving perfect quality of information.

    ZeoSync has developed the TunerAccelerator(TM) in conjunction with some traditional state-of-the-art compression methodologies. This work includes the advancement of Fractals, Wavelets, DCT, FFT, Subband Coding, and Acoustic Compression that utilizes synthetic instruments. These are methods that are derived from classical physics and statistical mechanics and quantum theory, and at the highest level, this mathematical breakthrough has enabled two classical scientific methods to be improved, Huffman Compression and Arithmetic Compression, both industry standards for the past fifty years.

    All of these traditional methods are being enhanced by ZeoSync through collaboration with top experts from Harvard University, MIT, University of California at Berkley, Stanford University, University of Florida, University of Michigan, Florida Atlantic University, Warsaw Polytechnic, Moscow State University and Nankin and Peking Universities in China, Johannes Kepler University in Lintz Austria, and the University of Arkansas, among others.

    Dr. Piotr Blass, chief technology advisor at ZeoSync, said "Our recent accomplishment is so significant that highly randomized information sequences, which were once considered non-reducible by the scientific community, are now massively reducible using advanced single-bit- variance encoding and supporting technologies."

    "The technologies that are being developed at ZeoSync are anticipated to ultimately provide a means to perform multi-pass data encoding and compression on practically random data sets with applicability to nearly every industry," said Jim Slemp, president of Radical Systems, Inc. "The evaluation of the complex algorithms is currently being performed with small practically random data sets due to the analysis times on standard computers. Based on our internally validated test results of these components, we have demonstrated a single-point-variance when encoding random data into a smaller data set. The ability to encode single-point-variance data is expected to yield multi-pass capable systems after temporal issues are addressed."

    "We would like to invite additional members of the scientific community to join us in our efforts to revolutionize digital technology," said St. George. "There is a lot of exciting work to be done."

    About ZeoSync

    Headquartered in West Palm Beach, Florida, ZeoSync is a scientific research company dedicated to advancements in communications theory and application. Additional information can be found on the company's Web site at www.ZeoSync.com or can be obtained from the company at +1 (561) 640-8464.

    This press release may contain forward-looking statements. Investors are cautioned that such forward-looking statements involve risks and uncertainties, including, without limitation, financing, completion of technology development, product demand, competition, and other risks and uncertainties.
  • Buzzwordtastic (Score:2, Interesting)

    by Steve Cox ( 207680 )
    I got bored reading the press release after finding the fourth trademarked buzzword in the second paragraph.


    I simply can't believe that this method of compression/encoding is so new that it requires a completely new dictionary (of words we presumably are not allowed to use).

  • 100 to 1? Bah, that's only 99%.
    The _real_ trick is getting 100% compression. It's actually really easy; there's a module built in to do it on your average Unix.
    Simply run all your backups to the New Universal Logical Loader and perfect compression is achieved. The device driver is, of course, loaded as /dev/null.
  • by tshoppa ( 513863 ) on Tuesday January 08, 2002 @08:26AM (#2803183)
    From the Press Release [zeosync.com]:
    This press release may contain forward-looking statements. Investors are cautioned that such forward-looking statements involve risks and uncertainties, including, without limitation, financing, completion of technology development, product demand, competition, and other risks and uncertainties.
    They left out Disobeying the 2nd law of Thermodynamics! [secondlaw.com]
  • Many people may say this is bull, but think of it in another way.

    Instead of assuming that data is static, think of it as constantly moving. Even in random data, moving data can be compressed because it is constantly moving along. It is sort of like when a herd of people files into a hall. Sure, everyone is unique, but you could organize them and say, "Hey, five red shirts now", "ten blue shirts now".

    And I think that is what they are trying to achieve: move the dimensions into a different plane. However, and this is what I wonder about: how fast will it actually be? I am not referring to the mathematical requirements, but the data will stream and hence you will attempt to organize it. Does that organization mean that some bytes have to wait?
      For lossless compression, simply saying "There were 5 red shirts and 7 blue shirts" isn't enough: you'd also have to store exactly where those 5 red shirts and 7 blue shirts were in the sample to be able to recreate the situation exactly as it was. Because of this, it is impossible to "compress" truly random data without, on average, actually increasing the size of the file.

      Of course, if you're talking lossy then everything changes: who cares where the shirts are, just tell 'em how many there were. Unfortunately, lossy is only relevant for images and sounds.

  • What're they talking about? 20Gb of rand() output?

    If so, they're a bunch of twits.
  • by color of static ( 16129 ) <smasters@@@ieee...org> on Tuesday January 08, 2002 @08:31AM (#2803226) Homepage Journal
    There seems to be a company claiming to exceed, go around, or obliterate Shannon every few years. In the early '90s there was a company called WEB (a year or so before the WWW was really around). They made claims of compressing any data, even data that had already been compressed. It is a sad story that you should be able to find in either the comp.compression FAQ or the renewed Deja archives. It basically boils down to this: as they got closer to market, they found some problems... you can guess the rest.
    This isn't limited to the field of compression, of course. There are people who come up with "unbreakable" encryption, infinite-gain amplifiers (is that gain in V and I?), and all sorts of perpetual motion machines. The sad fact is that compression and encryption are not well enough understood for these ideas to be killed before a company is started, or staked, on the claims.
  • Blah! (Score:2, Funny)

    by jsse ( 254124 )
    We already have lzip [slashdot.org] to compress files down to 0% of their original size. ZeoSync doesn't keep up with the latest technologies on /., it seems.
  • If you read the press release carefully, they claim to be able to compress practically random data, such as pictures of green grass, 100:1. They never claim to be able to do the same with truly random data, since this is impossible.

    There may be something to that. However, there are also many points that make me sceptical, but maybe the press release has not been reviewed carefully enough.
    This new algorithm does not break Shannon's limit, which is impossible, so the phrase about the "historical limitations" is a hoax...
  • ZeoSync intentionally randomizes naturally occurring patterns to form entropy-like random sequences through its patent pending technology known as Zero Space Tuner(TM). Once randomized, ZeoSync's BinaryAccelerator(TM) encodes these singular-bit-variance strings within complex combinatorial series to result in massively reduced BitPerfect(TM) equivalents. The combined TunerAccelerator(TM) is expected to be commercially available during 2003.

    I think they have made a buzzword compression routine; even our sales people would have difficulty putting this many buzzwords in a press release :)
  • by Quixote ( 154172 ) on Tuesday January 08, 2002 @08:34AM (#2803245) Homepage Journal
    Section 1.9 of the comp.compression FAQ [faqs.org] is good background reading on this stuff. In particular, read the "WEB story".
  • Most random generation uses bytes as its unit.
    Now, what if they look for bit sequences (not only 8-bit sequences, but maybe odd lengths) in order to find patterns?
    I guess this could be a way to significantly compress data, but it would imply reading a huge amount of data in order to achieve the best possible result.
    Note they may also do this in more than one pass, but then their compression would be really lengthy.
  • Never, *EVER* accept any advice from the Aberdeen Group. Apparently their analysts don't know shit.

    "Either this research is the next 'Cold Fusion' scam that dies away or it's the foundation for a Nobel Prize. I don't have an answer to which one it is yet," said David Hill, a data storage analyst with Boston-based Aberdeen Group.

    Wonder which category he expects them to win in...

    Physics, Chemistry, Economics, Physiology / Medicine, Peace or Literature

    There is no Nobel category for pure mathematics, or computing theory.
  • ... by compressing some VC's bank account, by a factor of greater than 100!

    "It was just data, you know," the sobbing wretch was reportedly told, "just ones and zeros. And hey - you can look at it as a proof of principle. We'll have the general application out ... real soon now, real soon".
  • Not random data (Score:4, Redundant)

    by edp ( 171151 ) on Tuesday January 08, 2002 @08:36AM (#2803258) Homepage

    ZeoSync is not claiming to reduce random data 100-to-1. They are claiming to reduce "practically random" data 100-to-1, and Reuters appears to have misreported it. What "practically random" data should mean is data randomly selected from that used in practice. What ZeoSync may mean by "practically random" is data randomly selected from that used in their intended applications. So their press release is not mathematically impossible; it just means they've found a good way to remove more information redundancy in some data.

    The proof that 100-to-1 compression of random data is impossible is so simple as to be trivial: There are 2^N files of length N bits. There are 2^(N/100) files of length N/100 bits. Clearly not all 2^N files can be compressed to length N/100.

  • Egads... (Score:5, Funny)

    by RareHeintz ( 244414 ) on Tuesday January 08, 2002 @08:36AM (#2803261) Homepage Journal
    ZeoSync said its scientific team had succeeded on a small scale...

    The company's claims, which are yet to be demonstrated in any public forum...

    ...if ZeoSync's formulae succeed in scaling up...

    Call the editors at Wired... I think we have an early nominee for the 2k2 vaporware list.

    ZeoSync expects to overcome the existing temporal restraints of its technology

    Ah... So even if it's not outright bullshit, it's too slow to use?

    "Either this research is the next 'Cold Fusion' scam that dies away or it's the foundation for a Nobel Prize," said David Hill...

    Somehow I think this is going to turn out more Pons-and-Fleischmann than Watson-and-Crick. Almost anytime there's a press release with such startling claims but no peer review or public demonstration, someone has forgotten to stir the jar.

    When they become laughingstocks, and their careers are forever wrecked, I hope they realize they deserve it. And I hope their investors sue them.

    I should really post after I've had my coffee... I sound mean...

    OK,
    - B

  • What is compression (Score:3, Interesting)

    by Vapula ( 14703 ) on Tuesday January 08, 2002 @08:37AM (#2803266)
    Compression, after all, is removing all redundancy from the original data.

    So, if there is no redundancy, there is nothing to remove (if you want to remain lossless).

    When you use some text, you may compres by remving some letter evn if tht lead to bad ortogrph. That is because English (as other langages) is redundant. When compressing some periodical signal, you may give only one period and tell that the signal is then repeated. When compressing bytes, there are specific methods (RLE, Huffman's trees,...)

    But, in all these situations, there was some redundancy to remove...

    A compression algorithm may not be perfect (it usually has to add some info to tell how the original data was compressed). Then, recompressing with another compression algorithm (or sometimes, the same will do the trick) may improve the compression. But the information quantity inside the data is the lower limit.

    Now, take a truly random data stream of n+1 bits. Even if you know the values of the first n bits, you can't predict the value of bit n+1. In other words, there is no way to express these n+1 bits with n (or fewer) bits. By definition, truly random data can't be compressed.

    And, to finish, a 100:1 compression ratio can easily be achieved with some data... take a sequence of 200 bytes of 0x00... It may be compressed to 0xC8 0x00. Compression ratio is really only meaningful when comparing different algorithms compressing the same data stream.
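    That 200-byte example is plain run-length encoding; a minimal sketch using (count, value) byte pairs, so runs are capped at 255:

    def rle_encode(data):
        """Encode data as (count, value) byte pairs; long runs are split at 255."""
        out = bytearray()
        i = 0
        while i < len(data):
            run = 1
            while i + run < len(data) and data[i + run] == data[i] and run < 255:
                run += 1
            out += bytes([run, data[i]])
            i += run
        return bytes(out)

    print(rle_encode(b"\x00" * 200).hex())      # 'c800' -- the 0xC8 0x00 above
    print(len(rle_encode(bytes(range(200)))))   # 400: with no runs, RLE doubles the data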
  • From the press release:
    [...] once fully developed, will offer compression ratios that are anticipated to approach the hundreds-to-one range
    Hundreds to one! Someone help me breathe!! :)
  • by Zocalo ( 252965 ) on Tuesday January 08, 2002 @08:39AM (#2803278) Homepage
    Reading through the press release it seems to imply that they take the "random" data, massage the data with the "Tuner" part, then compress it with the "Accelerator" part. This spits out "BitPerfect" which I assume is their data format. It's this "massaging" of the figures where it's going to sink or swim.

    Take very large prime numbers and the like: huge strings of almost-random digits that can sometimes be written as a trivial (2^n)-1 type formula (Mersenne primes, for instance). Maybe the massaging of the figures is simply finding a very large number that can be expressed like the above, with an offset other than "-1", to get the correct "BitPerfect" data. I was toying around with this idea when there was a fad for expressing DeCSS code in unusual ways, but ran out of math before I could get it to work.

    The above theory may be bull when it comes to the crunch, but if it could be made to work, then the compression figures are bang in the ballpark for this. They laughed at Goddard, remember? But I have to admit, I think replacing Einstein with the Monty Python foot better fits my take on this at present...
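    For what it's worth, the "(2^n) plus an offset" idea is easy to try, and the offset is what gives the game away. A quick sketch that treats a random blob as one big integer:

    import os

    # Express a random 1 KB blob as 2**(n-1) + offset, where n is its bit length.
    x = int.from_bytes(os.urandom(1024), "big")
    n = x.bit_length()                 # ~8192 bits
    offset = x - (1 << (n - 1))        # everything below the leading 1 bit

    print(n, offset.bit_length())
    # The exponent costs almost nothing to store, but for random data the offset
    # is nearly as long as the number you started with, so nothing is saved.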

  • Is it possible, at all, to trust a company whose home page [zeosync.com] has silly javascript that resizes your browser window?
  • by sprag ( 38460 ) on Tuesday January 08, 2002 @08:42AM (#2803303)
    A thought just occurred to me: If you can do 100:1 compression and compress something down to, say, 2 bytes, what would 'ab' expand to? My thought is "ZeoSync Rulz, Suckas"
  • by harlows_monkeys ( 106428 ) on Tuesday January 08, 2002 @08:44AM (#2803313) Homepage
    From one of the things on their site: Although currently demonstrating its technology on very small bit strings, ZeoSync expects to overcome the existing temporal restraints of its technology and optimize its algorithms to lead to significant changes in how data is stored and transmitted (emphasis added).

    Using time travel, high compression of arbitrary data is trivial. Simply record the location (in both space and time) of the computer with the data, and the name of the file, and then replace the file with a note saying when and where it existed. To decompress, you just pop back in time and space to before the time of the deletion and copy the file.

  • by HalfFlat ( 121672 ) on Tuesday January 08, 2002 @08:47AM (#2803334)
    They're looking for investment money?

    Just think of it as an innumeracy tax on venture capitalists.
  • by dannyspanner ( 135912 ) on Tuesday January 08, 2002 @08:48AM (#2803349) Homepage
    For example, at the top of the list Dr. Piotr Blass is listed as Chief Technical Adviser from Florida Atlantic University [fau.edu]. But he seems to be missing [fau.edu] from the faculty. Google doesn't turn up much on the guy either. Hmmm.

    I've not even had time to check the rest yet.
  • by Mr Z ( 6791 ) on Tuesday January 08, 2002 @09:05AM (#2803441) Homepage Journal

    Their claims are 100% accurate (they can compress random data 100:1) only if (by their definition) random data comprises a very small percentage of all possible data sequences. The other 99.9999% of "non-random" sequences would need to expand. You can show this by a simple counting argument.

    This is covered in great detail in the comp.compression [faqs.org] FAQ. Take a look at the information on the WEB Technologies DataFiles/16 compressor (notice the similarity of claims!) if you're unconvinced. You can find it in Section 8 of Part 1 [faqs.org] of the FAQ.

    --Joe
  • team members (Score:3, Interesting)

    by loudici ( 27971 ) on Tuesday January 08, 2002 @09:07AM (#2803451) Homepage
    Navigating through the Flash rubbish you can reach a list of team members that includes Steve Smale from Berkeley and Richard Stanley from MIT, both of whom are genuine senior academics.

    So either someone has lent their name to weirdos without paying attention, or there is something of substance hidden behind the PR ugliness. After all, the PR is aimed at investors, not at sentient human beings, and is most probably not under the control of the scientific team.
  • by jd ( 1658 ) <imipak AT yahoo DOT com> on Tuesday January 08, 2002 @09:08AM (#2803459) Homepage Journal
    Simply have the bit big enough. Let's say you're using one of those old-fashioned binary computers, and want to compress everything to 1/Nth the size. No problem, you simply need a bit with 2^N states. Everything then fits on that single bit.


    (Of course, this DOES create all sorts of other problems, but I'm going to ignore those, because they'd go and spoil things.)

  • by Sobrique ( 543255 ) on Tuesday January 08, 2002 @09:09AM (#2803464) Homepage
    Don't bother compressing it, just delete it, and then get an infinite number of monkeys on an infinite number of typewriters to reproduce the original.
  • by Thagg ( 9904 ) <thadbeier@gmail.com> on Tuesday January 08, 2002 @09:10AM (#2803467) Journal
    I was wondering as I read the headline and summary on slashdot "how can these sleazeballs possibly promote this scam, because it would be easy to show counterexamples?" This shows, once again, that I lack the imagination and chutzpah of a real con artist.

    The beauty of this scam is that ZeoSync claims that they can't even do it themselves yet. They've only managed to compress very short strings. So they can't be called on to compress large random files because, well gosh, they just haven't gotten the big-file compressor to work yet. So you can't prove that they are full of shit.

    Beautiful flash animation, though. I particularly like the fact that clicking the 'skip intro' button does absolutely nothing -- you get the flash garbage anyway.

    thad
  • Not possible (Score:5, Informative)

    by Eivind ( 15695 ) <eivindorama@gmail.com> on Tuesday January 08, 2002 @09:12AM (#2803477) Homepage
    Someone already pointed out that repeated compression would give infinite compression with this method. But there's another easy way to show that no compressor can ever manage to shrink all messages.

    The proof goes like this:

    • Assume someone claims a compressor that will compress any X-byte message to Y bytes where Y<X
    • There are 2^(8*X) possible messages X bytes long.
    • There are 2^(8*Y) possible messages Y bytes long.
    • Since Y is smaller than X, this means that no 1 to 1 mapping between the two sets can exist, because they're not equally large.
    You can see this simply if I claim a compressor that can compress any 2-byte message to 1 byte.

    There are then 65536 possible input messages, but only 256 possible outputs. So it is mathematically certain that over 99.6% of the messages cannot be represented in 1 byte (regardless of how I choose to encode them).

    These claims surface every so often. They're bullshit every time. It's even a FAQ entry on comp.compression.

  • by mblase ( 200735 ) on Tuesday January 08, 2002 @09:22AM (#2803538)
    Existing compression technologies are currently dependent upon the mapping and encoding of redundantly occurring mathematical structures, which are limited in application to single or several pass reduction. ZeoSync's approach to the encoding of practically random sequences is expected to evolve into the reduction of already reduced information across many reduction iterations, producing a previously unattainable reduction capability. ZeoSync intentionally randomizes naturally occurring patterns to form entropy-like random sequences through its patent pending technology known as Zero Space Tuner(TM). Once randomized, ZeoSync's BinaryAccelerator(TM) encodes these singular-bit-variance strings within complex combinatorial series to result in massively reduced BitPerfect(TM) equivalents. The combined TunerAccelerator(TM) is expected to be commercially available during 2003.
    Now, I'm not as geeky as some, but this looks suspiciously like technobabble designed to impress a bunch of investors and provide long-term promises which can easily be evaded by the end of the next fiscal year. I mean, if they really did have such a technology available today, why is it going to take them an entire twelve months to integrate it into a piece of commercial software?
  • by wberry ( 549228 ) on Tuesday January 08, 2002 @12:02PM (#2804321) Homepage

    Back in 1991 or 1992, in the days of 2400 bps modems, MS-DOS 5.0, and BBS'es, a "radical new compression tool" called OWS made the rounds. It claimed to have been written by some guy in Japan and use breakthroughs in fractal compression, often achieving 99% compression! "Better than ARJ! Better than PKzip!" Of course all my friends and I downloaded it immediately. Now we can send gam^H^H^Hfiles to each other in 10 minutes instead of 10 hours!

    Now I was in the ninth grade, and compression technology was a complete mystery to me then, so I suspected nothing at first. I installed it and read the docs. The commands and such were pretty much like PKzip. I promptly took one of my favorite ga^H^Hdirectories, *copied it to a different place*, compressed it, deleted it, and uncompressed it without problems. The compressed file was exactly 1024 bytes. Hmm, what a coincidence!

    The output looked kind of funny though:
    Compressing file abc.wad by 99%.
    Compressing file cde.wad by 99%.
    Compressing file start.bat by 99%.
    etc. Wait, start.bat is only 10 characters, that's like one bit! And why is *every* file compressed by 99%? Oh well, must be a display bug.

    So I called my friend and arranged to send him this g^Hfile via Zmodem, and it took only a few seconds. But he couldn't uncompress it on the other side. "Sector Not Found", he said. Oh well, try it again. Same result. Another bug.

    So I decided that this wasn't working out and stopped using OWS. Their user interface needed some work anyway, plus I was a little suspicious of compression bugs. The evidence was right there for me to make the now-obvious conclusion, but it didn't hit me until a few *weeks* later when all the BBS sysops were posting bulletins warning that OWS was a hoax.

    As it turns out, OWS was storing the FAT information in the compressed files, so that when people did reality checks it would appear to re-create the deleted files, as it did for me. But when they tried to uncompress a file that actually wasn't there or had had its FAT entries moved around, they got the "Sector Not Found" error and were screwed. If I hadn't tried to send a compressed file to a friend I might have been duped into "compressing" and deleting half my software or more.

    All in all, a pretty cruel but effective joke. If it happened today somebody would be in federal pound-me-in-the-ass prison. Maybe it happened then too...

    (Yes, this is slightly off-topic, but where else am I going to post this?)
