Security

Hydan: Steganography in Executables 235

An anonymous reader says "Ever wanted to hide a message in an executable? Now you can with Hydan. Presented recently by Rakan El-Khalil at Defcon and Blackhat, this tool lets you embed data into an application without changing its functionality or filesize! Check it out. Uses include steganography as well as embedding a program's signature into itself to verify it hasn't been tampered with."
This discussion has been archived. No new comments can be posted.

  • What ... (Score:4, Funny)

    by Anonymous Coward on Thursday August 12, 2004 @04:05PM (#9952695)
    "What are you doing?"

    "Oh, hydan out."
  • If steganography is now in the hands of joe user, how useful is it really? It's not exactly a secret anymore, is it? ;P
    • by Ioldanach ( 88584 ) on Thursday August 12, 2004 @04:14PM (#9952801)
      If steganography is now in the hands of joe user, how useful is it really? It's not exactly a secret anymore, is it? ;P

      If I transmit files out to my friends that include encrypted data using steganography, then the extra data should be indistinguishable, effectively hiding within the noise of random crap on the web/usenet/email. Thus, without the key, an intercepted message is difficult to detect, and even if detected, I have sufficient plausible deniability to say "nothing there".

      In order to detect a message encrypted and included inside another file, you either need to know it's there and be looking for it, compare it to an existing file which should be identical, or statistically detect some aspect of the file. If you know it should be there, you just need to grab any file that looks like the file you're seeking, grab the relevant bits, and attempt decryption. If you have a file that should be identical (say, an image that looks the same that was posted to usenet a couple days earlier), you can take the bits that are different and try to make some sense of them. If you are just doing statistical analysis, you might be able to find files which have a set of bits whose randomness is just shy of where it should be, and maybe those bits mean something.

      In short, unencrypted steganography isn't particularly useful, but encrypted, you can really hide things.

      • Here's my question: How can you hide a message in an executable containing a single NOOP, or in the Perl program "" (without the quotes). You can't hide much in there.
        ----
        The Procrastinating Monkey [blogspot.com]
        • by Dun Malg ( 230075 ) on Thursday August 12, 2004 @06:20PM (#9953920) Homepage
          Here's my question: How can you hide a message in an executable containing a single NOOP, or in the Perl program "" (without the quotes). You can't hide much in there.

          The answer there is "you can't". You need a compiled executable large enough to have multiple instances of "alterable sequences". The way I understand it, they fiddle with reversible/interchangeable opcodes to create "bits". Say a program has 500 mixed instances of: (this is all made-up assembly)

          JNZ $(foo) ; jump if not zero to address (foo)

          JMP $(bar) ; otherwise jump to address (bar)

          and
          JZ $(bar) ; jump if zero to address (bar)

          JMP $(foo) ; otherwise jump to (foo)

          As you can see, a sequence of a JNZ followed by a JMP can easily be re-written as a JZ followed by a JMP. The program only needs to go through and change each instance to match bitwise value of the "message", treating JNZ-JMP as a bitwise 0 and JZ-JMP as a 1. There are probably more instances of "two ways to do it" one can exploit in a given executable to yield even bigger "message spaces".
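A rough Python sketch of the idea in this comment, not Hydan's actual code. The tuple representation of instructions, the helper names, and the choice to leave unused sites at 0 are all assumptions made for illustration.

```python
# Toy model: each "site" is a jump pair that can be written two equivalent ways.
# Form 0 is JNZ foo / JMP bar, form 1 is JZ bar / JMP foo (made-up assembly).

def pair_for_bit(bit, foo, bar):
    """Return the two-instruction sequence that encodes one bit."""
    if bit == 0:
        return [("JNZ", foo), ("JMP", bar)]
    return [("JZ", bar), ("JMP", foo)]


def bit_for_pair(pair):
    """Recover the bit from the mnemonic of the conditional jump."""
    return 0 if pair[0][0] == "JNZ" else 1


def embed(bits, sites):
    """sites is a list of (foo, bar) targets, one per rewritable pair."""
    if len(bits) > len(sites):
        raise ValueError("message longer than available capacity")
    program = []
    for i, (foo, bar) in enumerate(sites):
        bit = bits[i] if i < len(bits) else 0  # unused sites stay in form 0
        program.append(pair_for_bit(bit, foo, bar))
    return program


def extract(program, nbits):
    """Read the first nbits sites back into a list of bits."""
    return [bit_for_pair(pair) for pair in program[:nbits]]


def next_address(pair, zero_flag):
    """Simulate the pair to show both forms jump to the same place."""
    for mnemonic, target in pair:
        if mnemonic == "JMP":
            return target
        if mnemonic == "JNZ" and not zero_flag:
            return target
        if mnemonic == "JZ" and zero_flag:
            return target
    return None
```

For either value of the zero flag, next_address gives the same target for both forms of a pair, which is the property the rewrite depends on.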
          • As you can see, a sequence of a JNZ followed by a JMP can easily be re-written as a JZ followed by a JMP. The program only needs to go through and change each instance to match bitwise value of the "message", treating JNZ-JMP as a bitwise 0 and JZ-JMP as a 1.

            So long as none of the jumps is itself the target of a jump. I suspect detecting this in the general case of handwritten code is impossible. Presumably the assumption is that you never have a case like that. I'll have to read the paper.

            Of course

            • One of the most obvious examples of a jump -> jump is in the interrupt vector table, back in the DOS days. Also, some processors (such as intel 386, which I learned assembly programming on) have short and long jump instructions, so you might need something like a conditional short jump to a location that contains an unconditional long jump. IIRC, i386 didn't have conditional long jump instructions.
    • How do you know there's information in a given executable?
      If you know what compiler it was compiled under you could look for opcodes that aren't generated by that compiler. But what if you don't know what compiler generated the executable?
      And what if the information isn't hidden in the opcodes at all, but merely in the ordering of rearrangeable instructions?
      Take the following two instructions for example:
      mov ax, 5
      mov bx, 6
      What if your steganography program would set them in alphabetical order by register for
      • Neither have I, but it seems like it would be easy to scatter bits of a message throughout a data segment. After all, it would be difficult (or impossible?) to check that every bit of the data segment gets referenced in a non-trivial way by the code. Figuring out how to tip off the receiver as to the location(s) of the message is more tricky, but not impossible.
      • I'm pretty certain that Hydan has some kind of signature in its method of hiding the message in the executable that will be easily identifiable in a short period of time. Even with encryption, all you would need to do is check any executable for the signature, then you only have the encryption standing in your way. This is concealment by obscurity at best.
  • I did this once .... (Score:4, Interesting)

    by taniwha ( 70410 ) on Thursday August 12, 2004 @04:05PM (#9952704) Homepage Journal
    Discovered a copyright string that was also executable 68k code .... and included it in my main initialization routines
  • by Barondude ( 245739 ) on Thursday August 12, 2004 @04:06PM (#9952712)
    I am 1337.
  • by gl4ss ( 559668 ) on Thursday August 12, 2004 @04:06PM (#9952719) Homepage Journal
    if you blurt something like that out in the blurb maybe it would be nice to mention how the hell it happens. especially when the site gets slashed so fast.

    executable packing or actually increasing the filesize? either one has to happen.

    • by Carnildo ( 712617 ) on Thursday August 12, 2004 @04:10PM (#9952761) Homepage Journal
      Many executable formats include unused space for alignment purposes. For example, I've been working on a Mach-O equivalent of the super-tiny ELF executable mentioned a few days back. The executable produced by GCC includes 300 bytes of code and headers, and 8000 bytes of padding.
    • by jdray ( 645332 ) on Thursday August 12, 2004 @04:11PM (#9952774) Homepage Journal
      From the article:

      Hydan steganographically conceals a message into an application. It exploits redundancy in the i386 instruction set by defining sets of functionally equivalent instructions. It then encodes information in machine code by using the appropriate instructions from each set.

      • How could that be useful for steganography?
        Don't most compilers just pick one of the redundant instructions and use that throughout? If so you just have to look for an executable that alternates between redundant instructions, and then you know that data is in there. At that point you're no better off than if you used plain encryption (and encryption uses less bandwidth...).
        • by Anonymous Coward
          Any sort of steg is vulnerable to "you just have to look for [insert technique that you already know about]". The purpose of steganography is not to make data unfindable, it's just to obfuscate the fact that it's there in the first place. If you know it's there, and you know how to look, finding the data is easy. That's how the extraction programs work, after all.

          Mass detection of the presence of steg with unknown techniques usually relies on statistics over the "normal" types of files. When the LSB
          • Any sort of steg is vulnerable to "you just have to look for [insert technique that you already know about]". The purpose of steganography is not to make data unfindable, it's just to obfuscate the fact that it's there in the first place. If you know it's there, and you know how to look, finding the data is easy. That's how the extraction programs work, after all.

            Mass detection of the presence of steg with unknown techniques usually relies on statistics over the "normal" types of files. When the LSB of
    • by Hi_2k ( 567317 ) on Thursday August 12, 2004 @04:18PM (#9952846) Journal
      I was at a SANS [sans.org] conference a while back, and the instructor, Ed Skoudis [counterhack.net], explained it as replacing certain operations with equivalents to represent bits. For example, "add 0002h" would be 0, "sub FFFEh", technically equivalent, would be 1. The more replaceable operations a program has, the more it can store. Hydan also encrypts the data with Blowfish before storing it.
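A quick Python check of why those two operations are interchangeable, assuming a 16-bit register (the operand width is my assumption, chosen to match the four-digit hex constants). It only compares the resulting register value, not the processor flags.

```python
MASK16 = 0xFFFF  # 16-bit registers wrap around modulo 2**16


def add16(x, imm):
    """16-bit add with wraparound."""
    return (x + imm) & MASK16


def sub16(x, imm):
    """16-bit subtract with wraparound."""
    return (x - imm) & MASK16


def negate16(imm):
    """Two's-complement negation of a 16-bit immediate."""
    return (-imm) & MASK16


# The immediate FFFEh is the 16-bit two's complement of 0002h ...
negated = negate16(0x0002)

# ... so "add 0002h" and "sub FFFEh" agree for every possible register value.
all_equal = all(add16(x, 0x0002) == sub16(x, 0xFFFE) for x in range(0x10000))
```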
    • The data in executable bytestreams has a large amount of redundancy. Therefore there are lots of "wasted" bits which can be fiddled with to store data. The site's slashdotted but I assume this works by replacing certain instructions in the binary with other, equivalent instructions.

      There is a definite limit on the amount of data you can embed without increasing the file size, because a finite file can only have a finite amount of redundancy.

      Perhaps the author thought this was too obvious to mention. I c

      • *Perhaps the author thought this was too obvious to mention. I certainly didn't get the impression he was claiming the ability to embed arbitrary amounts of data in tiny executable files.*

        the author probably never made it sound like that, but the guy that submitted this to slashdot did and the editors didn't provide explanation or better wording either.

  • Right now (Score:4, Funny)

    by Tsiangkun ( 746511 ) on Thursday August 12, 2004 @04:06PM (#9952720) Homepage
    it looks like the information is being hidden by a slashdotted executable.
  • Surely some dufus company is going to claim it's god and invented this first, and the "examiners" at the PTO will have just rubber-stamped this without any research, as usual...

    So, when's the first lawsuit over this?

  • by advocate_one ( 662832 ) on Thursday August 12, 2004 @04:08PM (#9952733)
    especially if the OS goes off and double checks the executable is legit before executing it...
  • by guanno ( 597251 ) on Thursday August 12, 2004 @04:09PM (#9952747)
    If you embed a signature of the file into the file, this by definition changes the file's signature. At best you can append the signature. However if the file can be modified, so can its signature.

    If these folks have figured out a way of circumventing this innate paradox, I'm impressed and am dying to hear more about the technology/mathematics behind it! Can you say Nobel Prize nomination?
    • IIRC, the same way you used Hydan to inject the signature, you can remove the signature. At worst, you can extract the signature (at this point, it becomes a password or keyword...) to verify the integrity of the file. Very interesting program!
    • by jrockway ( 229604 ) * <jon-nospam@jrock.us> on Thursday August 12, 2004 @04:22PM (#9952890) Homepage Journal
      Unless you do it like this (an example is always easy to understand).

      Say you have an executable:

      1337PROGRAM

      Your signature checking routine then does this:

      1_3_3_7_P_R_O_G_R_A_M

      and computes the hash

      deadbabeca

      And then sends:

      1d3e3a7dPbRaObGeRcAaM

      To reverse, we extract the hash (deadbabeca) and the "original" executable.

      Then we compute the hash (of 1_3_3_7...) and check if it matches...

      In summary, we embedded a checksum, but we removed it before we checked it. Simple, really.
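The interleaving trick above, as a small Python sketch. The post's "deadbabeca" is a made-up hash, so toy_tag below substitutes a truncated SHA-256 hex digest; that choice and all function names are assumptions for illustration.

```python
import hashlib


def interleave(payload, tag):
    """Insert tag[i] after payload[i]; tag must be one char shorter than payload."""
    if len(tag) != len(payload) - 1:
        raise ValueError("tag must be exactly one character shorter than payload")
    out = []
    for i, ch in enumerate(payload):
        out.append(ch)
        if i < len(tag):
            out.append(tag[i])
    return "".join(out)


def split(combined):
    """Undo interleave: even positions are payload, odd positions are tag."""
    return combined[0::2], combined[1::2]


def toy_tag(payload):
    """Stand-in hash: SHA-256 hex digest cut to fit (payloads up to 65 chars)."""
    return hashlib.sha256(payload.encode()).hexdigest()[: len(payload) - 1]


def sign(payload):
    return interleave(payload, toy_tag(payload))


def verify(combined):
    payload, tag = split(combined)
    return toy_tag(payload) == tag
```

With the post's own values, interleave("1337PROGRAM", "deadbabeca") gives "1d3e3a7dPbRaObGeRcAaM". Note that this version grows the file, which is the objection raised in the replies.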
      • That's fine, except the software authors' claim is that the file size remains unchanged.
      • except the hash isn't part of the hash. if i wanted to modify it, knowing this method, all i have to do is modify the executable part, hash it again, and re-insert. at best, the checksum lets me know i got the copy intended by the sender, nothing more.

        even if the hash were part of the hash (come to think of it), having a method for generating such executables would still make tampering possible. at best, it'd make it a slow process (assuming it's not something you can generate in O(1) time.)
    • by Ioldanach ( 88584 ) on Thursday August 12, 2004 @04:24PM (#9952921)
      If you embed a signature of the file into the file, this by definition changes the file's signature. At best you can append the signature.
      1. Set the swappable instructions in the program to their bitwise equivalent of 0.
      2. Calculate a signature based on that number.
      3. Swap the instructions to encode that number.

      To decode.

      1. Find swappable instructions.
      2. Determine what bit setting they're at.
      3. Set their bit setting to 0.
      4. Recalc signature based on the new bit setting.
      5. Compare to the bit setting you just retrieved.
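The encode and decode steps above can be sketched in Python roughly like this. It is a toy model, not Hydan's implementation: a "program" is just a list of mnemonics, "add"/"sub" stand in for one swappable pair (operands are ignored), and the signature is a SHA-256 digest truncated to the number of swappable sites. All of those are assumptions.

```python
import hashlib

SWAP = {"add": 0, "sub": 1}   # hypothetical two-member equivalence set
FORM = {0: "add", 1: "sub"}


def normalize(prog):
    """Set every swappable instruction to its bit-0 form."""
    return [FORM[0] if op in SWAP else op for op in prog]


def read_bits(prog):
    """Read the bit currently encoded at each swappable site."""
    return [SWAP[op] for op in prog if op in SWAP]


def signature_bits(prog, nbits):
    """Hash the normalized program and return the low nbits of the digest.

    Only the first 256 bits carry hash material; any beyond that are 0.
    """
    digest = hashlib.sha256(" ".join(normalize(prog)).encode()).digest()
    value = int.from_bytes(digest, "big")
    return [(value >> i) & 1 for i in range(nbits)]


def sign(prog):
    """Swap instructions so the sites spell out the signature."""
    bits = iter(signature_bits(prog, len(read_bits(prog))))
    return [FORM[next(bits)] if op in SWAP else op for op in prog]


def verify(prog):
    """Compare the bits found in the program with a recomputed signature."""
    found = read_bits(prog)
    return found == signature_bits(prog, len(found))
```

Because the hash is taken over the normalized program, embedding the signature does not change the value being signed, which sidesteps the paradox raised in the parent post. It does nothing to stop someone from re-signing a modified file.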

      I would still recommend publishing a separate public key, however, and include an encrypted signature in the program. As you say, it can always be changed and re-encoded.

      On the other hand, this might be useful on a server, by encoding a public key and checker on a CD-R and checking all your programs periodically against the CD-R key. You could encode signatures in each program and be able to upgrade programs from a central encoding server without having to write a new cd each time.

    • If you embed a signature of the file into the file, this by definition changes the file's signature.

      It is typically assumed for these kinds of things that the signature itself is not part of the data being signed. However if the file can be modified, so can its signature.

      You could easily solve that by using X.509 certificates, issued by a trusted CA, similar to the Microsoft "signcode.exe" program for signing CAB files and EXEs. However, that would only prove the integrity of the binary. It's st

    • Nobel prize? (Score:2, Interesting)

      by wurp ( 51446 )
      Can you say college algebra?

      The only thing they have to solve is f(X+S) = S, where f is the algorithm for calculating the signature, X is the exe code, and S is the signature. Depending on f, it can either be completely trivial to calculate S or impossible.
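To make the f(X+S) = S condition concrete, here is a brute-force Python sketch. The toy f (the first byte of a SHA-256 digest) and the one-byte signature are assumptions picked so the search space is only 256 candidates; for a given X there may be no solution at all, which is the "impossible" end of the range.

```python
import hashlib


def f(data):
    """Toy signature function: first byte of SHA-256 (illustration only)."""
    return hashlib.sha256(data).digest()[:1]


def find_self_signature(x):
    """Search for a one-byte S with f(x + S) == S; return None if none exists."""
    for s in range(256):
        candidate = bytes([s])
        if f(x + candidate) == candidate:
            return candidate
    return None
```

With a full-length cryptographic hash the same search is hopeless in practice, while for an f that ignores the appended bytes it is trivial.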
    • by Anders ( 395 ) on Thursday August 12, 2004 @04:46PM (#9953117)

      [...] am dying to hear more about the technology/mathematics behind it! Can you say Nobel Prize nomination?

      There is no Nobel Prize for mathematics.

  • by CarrionBird ( 589738 ) on Thursday August 12, 2004 @04:10PM (#9952753) Journal
    It's more proof that you can't stop people from having secret conversations.

    At least not without a top down Orwellian society where all hardware and software is controlled.

    • I once wrote a small program that would work from the simple manner: Take the file, take the key, XOR them both.

      That is a very poor encryption algorithm, but I used keys that were ~150 kbytes. Unbreakable.

      We would carry a floppy with ~10 keys on it and every time we sent a file we would choose a different key and just send the filename of the key along with the mail.
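A minimal Python sketch of that XOR scheme, assuming the key is at least as long as the file (the function name and the use of os.urandom for the key are my assumptions, not the original program).

```python
import os


def xor_bytes(data, key):
    """XOR data with a key of at least the same length; the same call decrypts."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


# A ~150 kbyte key, as in the post, here drawn from the OS random source.
key = os.urandom(150 * 1024)
secret = b"meet me at the usual place"
ciphertext = xor_bytes(secret, key)
recovered = xor_bytes(ciphertext, key)
```

This is only as strong as a one-time pad when the key is truly random, at least as long as the message, and never reused for a second file.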

      Of course, with that kind of poor cryptography you need a strong key, so my algorithm to generate a key was initiating two different random se

    • At least not without a top down Orwellian society where all hardware and software is controlled.

      Isn't this basically what DRM aims to do?
  • by caluml ( 551744 ) <slashdotNO@SPAMspamgoeshere.calum.org> on Thursday August 12, 2004 @04:11PM (#9952772) Homepage
    Ever wanted to hide a message into an executable?

    Not really :)
    But I'd like to make that dog downstairs stop barking.

  • bologna (Score:2, Funny)

    by Nuttles ( 625038 )
    without changing file sizes... let me stick my pirated version of War and Piece in my Hello world application.

    sometimes you don't even have to rtfa to rip on a topic...

    Nuttles
    Christian and proud of it
    • Re:bologna (Score:3, Informative)

      by garcia ( 6573 ) *
      without changing file sizes... let me stick my pirated version of War and Piece in my Hello world application.

      According to the article that you didn't read it seems that the amount of text that you can embed without affecting the filesize is determined by the original file's contents.

      You wouldn't be able to fit War and Peace into most files but you could fit about 1.44KB of text into a 500k file or so (according to their examples).
    • without changing file sizes... let me stick my pirated version of War and Piece in my Hello world application.

      Have you seen the size of executable the latest Microsoft compilers produce for something as simple as "Hello World"? You almost could.
    • Re:bologna (Score:4, Funny)

      by itwerx ( 165526 ) on Thursday August 12, 2004 @04:25PM (#9952935) Homepage
      "War and Piece" = 13 chars - no problem!
      Now the full text of "War and Peace" might be a different story... (Literally! Chuckle/snort :)
    • let me stick my pirated version of War and Piece in my Hello world application.

      Well, if your Hello world application was written in the style of Master Programmer [ariel.com.au] from the old joke... you can easily fit the whole collected works by Fedor Dostoevsky.
  • by ivan256 ( 17499 ) * on Thursday August 12, 2004 @04:12PM (#9952782)
    Not only a dupe, but a link to the original story is listed on the referenced page.

    Wow [slashdot.org].

    • by callipygian-showsyst ( 631222 ) on Thursday August 12, 2004 @04:52PM (#9953206) Homepage
      You're just not playing the game! I'll let you in on it:

      A bunch of folks who got pissed off that their stories never got approved on /. got together on alt.syntax.tactical and devised a plan. What they're doing is finding OLD slashdot stories and resubmitting them.

      So far, it's been moderately successful with 4-5 dupes getting through each week. This story was particularly amusing because the article has a link to their /. mention! Good work to the folks at a.s.t!

      I suggest you start playing along too! It's fun to show how worthless the /. editors are.

  • Hydan... (Score:5, Funny)

    by Anonymous Coward on Thursday August 12, 2004 @04:13PM (#9952790)
    The message retrieval method should be called "Hydan Seek"
  • by Anonymous Coward on Thursday August 12, 2004 @04:13PM (#9952792)
    Hydan: Hiding Information in Program Binaries
    Rakan El-Khalil and Angelos D. Keromytis
    Department of Computer Science, Columbia University in the City of New York
    {rfe3,angelos}@cs.columbia.edu
    Abstract. We present a scheme to steganographically embed information in x86
    program binaries. We define sets of functionally-equivalent instructions, and use
    a key-derived selection process to encode information in machine code by using
    the appropriate instructions from each set. Such a scheme can be used to watermark
    (or fingerprint) code, sign executables, or simply create a covert communication
    channel. We experimentally measure the capacity of the covert channel by
    determining the distribution of equivalent instructions in several popular operating
    system distributions. Our analysis shows that we can embed only a limited
    amount of information in each executable (approximately 1/110 bit encoding rate),
    although this amount is sufficient for some of the potential applications mentioned.
    We conclude by discussing potential improvements to the capacity of the
    channel and other future work.
    1 Introduction
    Traditional information-hiding techniques encode ancillary information inside data such
    as still images, video, or audio. They typically do so in a way that an observer does not
    notice them, by using redundant bits in the medium. The definition of "redundancy"
    depends on the medium under consideration (cover medium). Because of their invasive
    nature, information-hiding systems are often easy to detect, although considerable work
    has gone into hiding any patterns [1]. In modern steganography, a secret key is used to
    both encrypt the information-to-be-encoded and select a subset of the redundant bits
    to be used for the encoding process. The goal is to make it difficult for an attacker to
    detect the presence of secret information. This is practical only if the cover medium has
    a large enough capacity that, even ignoring a significant number of redundant bits, we
    can still encode enough useful information.
    Aside from its use in secret communications, an information-hiding process [2] can
    be used for watermarking and fingerprinting, whereby information describing properties
    of the data (e.g., its source, the user that purchased it, access control information,
    etc.) is encoded in the data itself. The "secret" information is encoded in such a manner
    that removing it is intended to damage the data and render it unusable (e.g., introduce
    noise to an audio track), with various degrees of success.
    In this paper, we describe the application of information-hiding techniques to arbitrary
    program binaries. Using our system, named Hydan, we can embed information
    using functionally-equivalent instructions (i.e., i386 machine code instructions). To determine
    the available capacity, we analyze the binaries of several operating system distributions
    (OpenBSD 3.4, FreeBSD 4.4, NetBSD 1.6.1, Red Hat Linux 9, and Windows
    XP Professional). Our tests show that the available capacity, given the sets of equivalent
    instructions we currently use, is approximately 1/110 bits (i.e., we can encode 1 bit
    of information for every 110 bits of program code). Note that we make a distinction
    between the overall program size and the code size. The overall program size includes
    various data, relocation, and BSS sections, in addition to the code sections. Experimentally,
    we have found that the code sections take up 75% of the total size of executables,
    on average. For example, a 210KB statically linked executable contains about 158KB
    of code, in which we can embed 1.44KB (11,766 bits) of data.
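The capacity figures in the paragraph above can be reproduced with a few lines of Python. This is just the arithmetic, assuming 1KB = 1024 bytes and the 1/110 rate and 158KB code size quoted in the text; the function name is made up.

```python
BITS_OF_CODE_PER_HIDDEN_BIT = 110  # the 1/110 encoding rate quoted above


def capacity_bits(code_bytes):
    """Whole hidden bits that fit in code_bytes of machine code."""
    return (code_bytes * 8) // BITS_OF_CODE_PER_HIDDEN_BIT


code_bytes = 158 * 1024                # 158KB of code in a 210KB executable
hidden_bits = capacity_bits(code_bytes)
hidden_kb = hidden_bits / 8 / 1024     # convert bits to KB
```

Here hidden_bits works out to 11766 and hidden_kb to roughly 1.44, matching the numbers in the paper.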
    In comparison, other tools such as Outguess [1] are able to achieve a 1/17 bit encoding
    rate in images, and are thus better suited for covert communications, where data-rate
    is an important consideration. The 1/110 encoding rate achieved by the currently implemented
    version of Hydan is obtained when we only use instruction
  • by deedude ( 615666 ) on Thursday August 12, 2004 @04:15PM (#9952805)
    Interesting. Although I didn't get a chance to RTFA, hiding encrypted data in an executable does not seem all that practical to me. It may not change the filesize or functionality, but would it not also change other signature methods (like md5sums?). From my understanding, the main strength of steganography is the file with the encrypted data being indistinguishable from regular files. Since the difference can be detected with CRC or MD5, wouldn't that defeat the main purpose?
    • Since the difference can be detected with CRC or MD5, wouldn't that defeat the main purpose?

      The main purpose is to send secret data, hidden in something that doesn't seem to contain such data.

      If there's no "original" file to compare with, it'd be hard to detect the presence of the extra data. One could write a small application which seems innocent, but whose only real purpose is to be used as a container for covert messages.
    • I think the main purpose is to be able to encrypt things in places people normally don't look. Breaking encryption on a wireless access point broadcasting over the air is one thing. Finding the only executable file on a laptop that includes an encrypted piece of information you need when you don't even know that it is encrypted in an executable file is another thing.

      Coupled with messages in images, this will make it quite easy to move data around without anyone else knowing. There are too many places it
  • by theManInTheYellowHat ( 451261 ) on Thursday August 12, 2004 @04:15PM (#9952811)
    They should have put their message in the web servers executable so that when it gets slashdotted it could just shit itself and we could still get how it works.
  • Given that it embeds itself in the program without changing it, how would you recover the data while being sure to prevent false positives?
  • How it's done.. (Score:5, Informative)

    by wfberg ( 24378 ) on Thursday August 12, 2004 @04:20PM (#9952882)
    The gist of it is that there are many instructions in x86 that have the same result. You can replace these, and based on which instructions you encounter you can find a hidden message.

    So much for theory. Here's an example; let's say we have a couple of synonyms, like so
    car, automobile; Robert, Bob; crashed, trashed; beer, whisky.
    Let's say we have a little story like so;
    "Bob got in his car. He crashed it, because he had been drinking too much beer. His car is now a total loss."

    Let's say we want to send a secret binary message "0110". Cunningly, we substitute the first of each pair of synonyms if we want to encode a zero, and the second for a one. So the story is now

    "Robert got in his automobile. He trashed it, because he had been drinking too much beer. His car is now a total loss." (notice how not all key words changed).

    This is a bit harder with natural language, as many words aren't quite right to use in place of the other ("got in his automobile" just doesn't sound right), so it's actually easier to do for machine code.
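The synonym scheme can be sketched in Python like this. The pair table is taken from the post; the function names and the rule that keywords after the end of the message are left alone are assumptions for illustration.

```python
import re

# First word of each pair encodes 0, second encodes 1 (pairs from the post).
PAIRS = [("car", "automobile"), ("Robert", "Bob"),
         ("crashed", "trashed"), ("beer", "whisky")]
LOOKUP = {word: (pair, bit) for pair in PAIRS for bit, word in enumerate(pair)}
WORD = re.compile(r"[A-Za-z]+")


def embed(text, bits):
    """Rewrite each keyword, in order, to the synonym that encodes the next bit."""
    remaining = list(bits)

    def swap(match):
        word = match.group(0)
        if word in LOOKUP and remaining:
            pair, _ = LOOKUP[word]
            return pair[int(remaining.pop(0))]
        return word  # not a keyword, or the message is already fully embedded

    return WORD.sub(swap, text)


def extract(text, nbits):
    """Read back the first nbits keyword choices as a bit string."""
    bits = [str(LOOKUP[w][1]) for w in WORD.findall(text) if w in LOOKUP]
    return "".join(bits[:nbits])


story = ("Bob got in his car. He crashed it, because he had been "
         "drinking too much beer. His car is now a total loss.")
stego = embed(story, "0110")
```

The receiver needs the same pair table and the message length to run extract(stego, 4).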
  • Hashing problem: (Score:2, Insightful)

    by Tribbin ( 565963 )
    OK, you place the hash in the executable; the file is changed. Now the hash should be different...

    Problem.
  • Ever wanted to hide a message into an executable?

    No...

  • The nice part of steganography is that you don't know there is a hidden message.

    To make sure people can't detect any changes to a file, there should preferably be no reference material to compare the file with. Reference material like other unchanged executables.

    So this doesn't work unless you write a program for each message you want to hide.... Not? Ok. I'd think so too.

    So I'd rather take my digital camera, take a picture of a whatever and use that as an original.

    But I have to admit it's way c
  • A shareware x86 assembler some years back claimed that the author was able to tell if anyone used his assembler to distribute binaries in violation of his license. While it apparently didn't scare enough people into paying for his program (maybe they used MASM instead?), the program might be useful for busting any patents that come up around this technology.
    • I believe the program you speak of is a86. It was $50 to register it, which is why it was unpopular.

      It used the same method hydan uses. It used equivalent instructions that were "different" from the way the code was written. I'd used it a bit for myself, and noticed that was what it did when I opened the files later with debug.

      The documentation never really said how, it just said it "fingerprinted" the code.
  • How it can be done (Score:2, Interesting)

    by BinBoy ( 164798 )
    The site is slashdotted so I don't know if this is how it works, but...

    Some 8086 opcodes contain a bit that reverses the operation. For example, with the bit set in the instruction "mov bx,cx", bx would be copied to cx instead of cx to bx. By switching the registers AND setting the bit, you effectively reverse the operation twice, creating different machine code that does exactly the same thing.
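Here is a small Python sketch of the two encodings for a 16-bit register-to-register mov, using the standard x86 opcodes 89h (MOV r/m16, r16) and 8Bh (MOV r16, r/m16). The helper names are made up, and whether A86 picked between exactly these two forms is not something I can confirm.

```python
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}
NAMES = {number: name for name, number in REG16.items()}


def modrm(reg, rm):
    """ModRM byte for register-direct addressing (mod = 11b)."""
    return 0xC0 | (reg << 3) | rm


def encode_mov(dst, src, direction_bit):
    """Encode `mov dst, src` with the direction bit clear (89h) or set (8Bh)."""
    d, s = REG16[dst], REG16[src]
    if direction_bit == 0:
        return bytes([0x89, modrm(s, d)])   # reg field holds the source
    return bytes([0x8B, modrm(d, s)])       # reg field holds the destination


def decode_mov(code):
    """Return (dst, src, direction_bit) for either two-byte encoding."""
    opcode, byte = code
    reg, rm = (byte >> 3) & 7, byte & 7
    if opcode == 0x89:
        return NAMES[rm], NAMES[reg], 0
    if opcode == 0x8B:
        return NAMES[reg], NAMES[rm], 1
    raise ValueError("not a register-to-register mov in this toy model")
```

Both encode_mov("bx", "cx", 0) and encode_mov("bx", "cx", 1) decode to the same instruction, so the choice between bytes 89 CB and 8B D9 is free to carry one bit.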

    The A86 assembler used this bit to create a fingerprint that would make it easier to detect non-paying users.
  • Now that messages can be hidden in executable files, I feel a lot better about opening .exe files that are mailed to me!
  • Nice of them to include this definition of 'Astroturfing' for the non USAian audience:
    In American politics, the term 'astroturfing' is used to describe formal public relations projects which deliberately give the impression that they are spontaneous and populist reactions. The term is a play on the description of truly spontaneous or 'grassroots' efforts and the distinction between real grass and AstroTurf - the fake grass used in some indoor American football stadiums.
  • by Derek Pomery ( 2028 ) on Thursday August 12, 2004 @04:40PM (#9953074)
    If the program has been tampered with, the most obvious thing to tamper with would be the validation mechanism.
    I'm going to stick with a separate md5sum, thanks.
  • How it works (Score:3, Informative)

    by EduardoFonseca ( 703176 ) on Thursday August 12, 2004 @04:47PM (#9953134) Homepage
    Since a lot of people are asking, here it goes:

    - How it works
    --------------

    Overview: Hydan finds sets of equivalent instructions in the binary,
    and uses that redundancy to embed data. The larger the set of
    equivalent instructions, the more bits can be embedded. For example,
    if instructions {a, b, c, d} are all equivalent, then we can embed two
    bits of information when any of those instructions are encountered.

    Embedding: Hydan goes through the application sequentially, and
    whenever it finds an instruction that it has equivalents to, it
    substitutes in the instruction that represents the bit(s) of data
    hydan is currently embedding. A simple example: "add %eax, 50" is
    equivalent to "sub %eax, -50". So this set is {"add %reg, $imm", "sub
    %reg, $imm"}. Whenever an instruction of the form "add %reg, $imm" is
    encountered, hydan can embed one bit of the message. If the bit is 0,
    it leaves it as an add instruction. Else it substitutes it to "sub
    %reg, -$imm". (and vice versa)

    Decoding: When it is time to extract the embedded message, every
    "add %reg, $imm" is taken to mean bit 0, and every sub instruction
    encodes the bit 1, and the embedded message is reconstructed that way.
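As a rough illustration of the embedding and decoding steps just described (not Hydan's code), here is a Python toy model. Instructions are (op, reg, imm) tuples, only the add/sub set is handled, and the tiny interpreter exists just to show that behaviour is preserved; all names are assumptions.

```python
def embed(program, bits):
    """Rewrite add/sub instructions so each one encodes the next message bit."""
    remaining = list(bits)
    out = []
    for op, reg, imm in program:
        if op in ("add", "sub") and remaining:
            delta = imm if op == "add" else -imm   # net amount added to reg
            if remaining.pop(0) == 0:
                op, imm = "add", delta             # bit 0: add %reg, $imm
            else:
                op, imm = "sub", -delta            # bit 1: sub %reg, -$imm
        out.append((op, reg, imm))
    return out


def decode(program):
    """Every add is read as 0 and every sub as 1, in program order."""
    return [0 if op == "add" else 1
            for op, _, _ in program if op in ("add", "sub")]


def run(program, regs):
    """Tiny interpreter used to check that both programs behave the same."""
    regs = dict(regs)
    for op, reg, imm in program:
        if op == "mov":
            regs[reg] = imm
        elif op == "add":
            regs[reg] += imm
        elif op == "sub":
            regs[reg] -= imm
    return regs


original = [("mov", "eax", 7), ("add", "eax", 50),
            ("sub", "ebx", 3), ("add", "ebx", 1)]
stego = embed(original, [1, 0, 1])
```

Here decode(stego) returns the embedded bits and run gives identical register values for original and stego.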

    Encryption: Hydan first prompts the user for a passphrase before
    embedding or decoding the contents of the application. In the case of
    embedding, hydan prepends the length of the message to the message,
    encrypts that with blowfish in cbc mode, and embeds the result into
    the application. When decoding, hydan extracts all the possible bits
    from the application (since it does not know how long the message is
    a-priori; that information is encrypted). Hydan then decrypts the
    message properly since it is in CBC mode and need not know the total
    length first. The length is then used to truncate the message to
    size.
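The length-prefix step can be sketched in Python as below. The Blowfish encryption is left out entirely, and the 4-byte big-endian length field is an assumption; the README does not say how Hydan stores the length.

```python
import struct


def frame(message):
    """Prepend the message length (assumed 4-byte big-endian) to the message."""
    return struct.pack(">I", len(message)) + message


def unframe(extracted):
    """Read the length prefix, then truncate the extracted bytes to that size."""
    (length,) = struct.unpack(">I", extracted[:4])
    return extracted[4:4 + length]


# The decoder pulls out every embeddable bit, so junk trails the real message.
extracted = frame(b"attack at dawn") + b"\x17\x00leftover bits"
message = unframe(extracted)
```

In the real tool the framed message would be encrypted before embedding and decrypted before the length is read.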

    Instructions: For a complete list of the sets of equivalent
    instructions, please refer to hdn_insns.c.

    - Attacks
    ---------

    There are three classes of attacks that are applicable to hydan:
    overwriting, detection, and extraction. The overwriting attack refers
    to the ability to overwrite the message embedded in the application,
    whether its presence was detected or not. An attacker should also not
    be able to detect the presence of a message in the application, nor
    decrypt it.

    The overwriting attack: hydan currently has no means to protect
    against this type of attack. Since hydan embeds the message
    sequentially, starting from the top of the application, an attacker
    could re-run hydan with a bogus text and embed that on top of the
    original message. The intended recipient of the application would
    thus be unable to retrieve the original message. One way this could
    be solved is to add an error correcting code to the encoding of the
    message, and distribute the message throughout the application in a
    passphrase specific manner. This way only parts of the original
    message would be overwritten, and the original may still be
    reconstructed. Of course, there is nothing that can be done if the
    attacker insists on overwriting with a message size that is the
    maximum embeddable in the given application. However, the computation
    required to overwrite each application on a large scale might be
    prohibitive enough to discourage this as a routine behaviour, at an
    ISP for example.
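
One way the "passphrase specific" distribution suggested above might look, purely as a sketch: derive a site ordering from the passphrase and write message bits to sites in that order instead of top to bottom. The names `site_order`, `scatter` and `gather` are invented, the error correcting code is not shown, and `random.Random` is used only for brevity (it is not a cryptographically secure generator).

```python
import hashlib
import random


def site_order(passphrase, n_sites):
    """Deterministic permutation of site indices derived from the passphrase."""
    seed = int.from_bytes(hashlib.sha256(passphrase.encode("utf-8")).digest(), "big")
    order = list(range(n_sites))
    random.Random(seed).shuffle(order)
    return order


def scatter(bits, n_sites, passphrase, filler=0):
    """Place message bits at passphrase-chosen sites.

    `filler` stands in for sites that would simply be left untouched.
    """
    if len(bits) > n_sites:
        raise ValueError("message does not fit in the available sites")
    slots = [filler] * n_sites
    for bit, idx in zip(bits, site_order(passphrase, n_sites)):
        slots[idx] = bit
    return slots


def gather(slots, n_bits, passphrase):
    """Read the message bits back in the same passphrase-derived order."""
    order = site_order(passphrase, len(slots))
    return [slots[idx] for idx in order[:n_bits]]


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 1]
    slots = scatter(message, 32, "correct horse")
    print(slots)
    print(gather(slots, len(message), "correct horse"))
```

With this layout, a short overwrite written sequentially from the top would only hit the message bits that happen to sit in the first few sites, which is the situation an error correcting code could then repair.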

    Detection: Binaries produced by hydan should not exhibit obvious
    patterns. At the most superficial level, this is accomplished by not
    embedding any marker or other easily recognizable token. At best, the
    embedded data looks random (which is why it is Blowfish encrypted). At the
    assembly level however, the current version of hydan makes no attempt
    at mimicking the original distribution of instructions in the
    application, and is thus vulnerable to statistical analysis. Indeed,
    although all the instructions are equivalent, some may appear more
    frequent
  • Embed in Viruses? (Score:4, Interesting)

    by Embedded Geek ( 532893 ) on Thursday August 12, 2004 @04:49PM (#9953155) Homepage
    My initial reaction was that it'd be a silly thing to try this because of the risk of virus infected executables. I mean, who that's tech savvy enough to need steganography would feel comfortable downloading intentionally tampered .EXE's, even if they're not intending to run the dang things?

    But then I started thinking about how effectively viruses are distributed by non-techies who do click on the attachments in their EMAILs. Perhaps viruses or spyware could be used to "broadcast" a message this way to different cells in a covert organization (terrorists, organized crime, chess club members, whatever). All you'd need is an unprotected PC to act as a tethered goat and catch all those infections for later reading.

    For that matter, a sender could "neuter" a virus by disabling its reproductive code and then embed a message in it and send it through some anonymizer (either a formal anonymizer or using a shell account). When the recipient stores it in a quarantine directory, it would look just like an infected EMAIL that had been cleaned up by your antivirus program, not a covert message. Some variation of this using spyware infection would be even more effective as they tend (in my limited experience) to have even more variants than viruses - the obfuscated message would be more readily confused with normal variation. Instead of posting your tampered executables to some usenet forum, you would simply have the recipient visit a site running the spyware. New messages would be sent and old ones sterilized when the spyware reinstalls itself.

    Just my 20 mills.

    • I mean, who that's tech savvy enough to need steganography would feel comfortable downloading intentionally tampered .EXE's, even if they're not intending to run the dang things?

      I feel no anxiety about EXE files. Maybe it's because I run Linux.

  • by iamacat ( 583406 ) on Thursday August 12, 2004 @04:50PM (#9953168)
    This guy [eji.com] wrote his assembler to generate unusual forms of MOV instructions at least 10 years ago. In this way, he can find out if a program was generated using an unregistered version of A86.

    Any CPU that has an instruction to exchange two registers will have some redundancy, but for X86 even basic mov (as well as add, sub, cmp and so on) encodes two operands plus a flag that specifies which one is source and which one is destination. The significance is that both operands can be registers, but only one can be a memory reference.
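
To make the redundancy concrete, here is a small sketch of the two machine encodings of a 32-bit register-to-register mov: opcode 89 (MOV r/m32, r32) and opcode 8B (MOV r32, r/m32). The function names and the choice of which opcode stands for which bit are arbitrary assumptions; nothing here claims to reproduce what A86 or hydan actually emit.

```python
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def mov_reg_reg(dst, src, bit):
    """Encode `mov dst, src` (32-bit registers), choosing the form by `bit`."""
    d, s = REG32[dst], REG32[src]
    if bit == 0:
        # 89 /r: ModRM reg field holds the source, r/m field the destination
        return bytes([0x89, 0xC0 | (s << 3) | d])
    # 8B /r: ModRM reg field holds the destination, r/m field the source
    return bytes([0x8B, 0xC0 | (d << 3) | s])


def hidden_bit(encoding):
    """Recover the bit from the opcode byte of a reg-to-reg mov."""
    if encoding[0] == 0x89:
        return 0
    if encoding[0] == 0x8B:
        return 1
    raise ValueError("not one of the two mov forms handled here")


if __name__ == "__main__":
    for bit in (0, 1):
        enc = mov_reg_reg("eax", "ebx", bit)
        print(enc.hex(), "->", hidden_bit(enc))
```

Both byte sequences copy ebx into eax, so a disassembler shows the same instruction either way.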

    A much more impressive use would be a program that reads its own code as data to save the last few bytes, especially if it has a real purpose, like fitting a game into a fixed-size ROM.
  • Stenography (Score:2, Interesting)

    Note that as far as I remember, stenography by definition is supposed to make it impossible to prove that there is data hidden there - one step further than normal encryption. It's not so much about hiding the data as about being able to deny its existence.
    One reason for this is if you have encrypted data on your disk, then courts can demand the password for it. Stenography allows you to insist there is no hidden data.
    • One reason for this is if you have encrypted data on your disk, then courts can demand the password for it. Stenography allows you to insist there is no hidden data.

      No, stenography [wikipedia.org] allows the court to keep accurate protocols when they ask you for that hidden data ;-)

      Now steganography [wikipedia.org] on the other hand...

  • by jamonterrell ( 517500 ) on Thursday August 12, 2004 @04:55PM (#9953233)
    A new virus is quickly spreading across the internet. Experts say it started at Defcon with a demonstration of a program that allows users to add a secret text to an executable file without altering its filesize. Apparently the program also attached a message of its own... don't run programs demonstrated at defcon!
  • Now, that "block all executables" setting that I can't find or turn off in Outlook will prevent terrorists from exchanging secret messages embedded in trojan executables that are attached to emails purporting to be great pornography!

    It's not an annoyance; it's a *feature*!
  • by BlueBiker ( 690984 ) on Thursday August 12, 2004 @05:18PM (#9953432)
    This is a fascinating approach. One thing I didn't see mentioned at all in the documents is the possible change in performance characteristics by changing instructions which have the same effect but which have different pipeline, execution unit, or cache properties.

    Modern optimizing compilers spend an awful lot of effort generating efficient combinations of instructions which try to make the most out of CPUs having complicated rules. For example, add eax,eax and shl eax,1 might both produce the same desired effect but yield significantly different runtimes depending on the presence / absence of barrel shifters or the ability of particular instructions to pair in a given CPU.

    Naturally the above would only matter if the modified code is in an inner loop, but it could happen.
  • 'Copyright SCO' and 'Darl Rules!'

  • Steganography (Score:5, Informative)

    by SiliconEntity ( 448450 ) on Thursday August 12, 2004 @05:32PM (#9953530)
    In cryptography, steganography has a particular meaning. In the same way that the goal of encryption is to prevent the message from being read, the goal of steganography is to prevent the message from being detected. A successful steganographic embedding is one in which a third party would not be able to find out if it is there. If you gave him two files, one with an embedded message and the other unprocessed, he should not be able to tell them apart.

    For a method to truly be steganography, it's not enough just to embed some data into another. That's possible any time there's redundancy. The requirement is to make it so clever and/or subtle that there is no way to distinguish a processed file from an unprocessed one.

    I doubt that this new method passes the test. Generally, while there are many synonyms possible in code, both in single instructions and in short sequences of instructions, the statistics of how these are distributed in unprocessed files are probably not random. Chances are that one synonym is used more than another. If you embed random data in a straightforward way, you will then have equal usages of both alternatives. This is a highly unusual condition, and to someone in the know, files like these will be easily distinguished.

    Only if they have found a kind of synonym which already has purely random statistics, or where they are careful to precisely mimic the statistics of the original file as they add their data, can this truly be considered a form of steganography.
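
The statistical argument above can be sketched as a simple one-proportion test: compare the fraction of sub forms in a suspect file against the fraction normally seen in unprocessed binaries. The baseline value used below is invented for illustration, and `sub_fraction_z` is a hypothetical helper, not part of any existing steganalysis tool.

```python
import math


def sub_fraction_z(n_add, n_sub, baseline_sub_fraction):
    """z-score of the observed sub fraction against an expected baseline.

    A large absolute value means the add/sub mix is unlikely under the
    baseline, which is what embedding random bits would tend to cause.
    """
    n = n_add + n_sub
    if n == 0:
        raise ValueError("no add/sub instructions to test")
    p = baseline_sub_fraction
    if not 0.0 < p < 1.0:
        raise ValueError("baseline must be strictly between 0 and 1")
    observed = n_sub / n
    return (observed - p) / math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    baseline = 0.2  # hypothetical: assume 20% of add/sub sites are normally sub
    print("untouched-looking file:", sub_fraction_z(800, 200, baseline))
    print("50/50 file:            ", sub_fraction_z(500, 500, baseline))
```

A file whose sites split close to 50/50 stands far from such a baseline, which is the "highly unusual condition" the comment describes.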

    • the statistics of how these are distributed in unprocessed files are probably not random.

      Sounds to me like my home computer system needs to be compiled with anti-optimization to throw in some more of these synonyms randomly into my executables.

    • A successful steganographic embedding is one in which a third party would not be able to find out if it is there. If you gave him two files, one with an embedded message and the other unprocessed, he should not be able to tell them apart.

      Close, but not quite. He should not be able to tell which one contains the message. Important distinction, and what you meant I think.

      I doubt that this new method passes the test. Generally, while there are many synonyms possible in code, both in single instructions

    • I'm not sure I agree. While it might not be very _good_ steganography, it is still steganography -- it is an attempt to hide a message somewhere where it won't be discovered.
  • A86 did this... (Score:4, Interesting)

    by don.g ( 6394 ) <don@[ ].org.nz ['dis' in gap]> on Thursday August 12, 2004 @06:16PM (#9953894) Homepage
    The documentation for the shareware DOS assembler, A86, claimed that the set of opcodes it chose to emit for various instructions was unique (i.e. those exact choices weren't made by any other assembler). Therefore if you released software assembled with A86 without registering it, if the A86 author ever got hold of that software, he'd know you had used his assembler to produce it. So the steganography in this case encoded a one bit value: "I used A86".
  • It would be much more efficient to encode data into the empty blocks of alignment space inside an exe. The new file is different from the original, no matter where you put the data. The only thing that will be the same is the file size, so you can just as well use a simple method...

    KISS
