Technology

Are Buffer Overflow Sploits Intel's Fault? 269

Bruce Perens submitted a story he wrote for his website on overflows and who's fault they are. I'm pretty skeptical of almost every point raised in this story, but it's an interesting read. [Updated 21:13 by t] As Sea Monkey points out, Bruce has now taken down the article, with a brief note: "I've withdrawn this article after enough people convinced me that I didn't know what I was talking about. It happens sometimes. Thanks." What if everyone displayed such grace?
This discussion has been archived. No new comments can be posted.

Are Buffer Overflow Sploits Intel's Fault?

Comments Filter:
  • People could very well start using segmentation again without so many horrors. The problem before was that the segments had a size of 64kB, whereas the total RAM was measured in the megabytes. With the 386, the address space of the segments was the same as the address space of the entire machine. So using segments isn't terribly different from just using straight pages. You need 15MB on the stack? Not really a big problem. Therefore, there should be no worries about changing segments all the time, because all your data is in one segment.
  • I normally code in Java... but for Linux/UNIX coding you *have* to use C. Why? Because C doesn't require overhead (support files for VMs etc., like Java). C programs are standalone. C can be recompiled for any architecture. C is FAST. C is SMALL. C can use 100% of your CPU and memory without bothering about overhead. C can manipulate strings faster than any other language. Allocation of arrays and storage space in C is trivial. Looping through arrays is blazingly fast. C has curses-based screen manipulation. C can have inline assembly for those REALLY tight ops.
    In short - Performance, Portability, Control, Simplicity.
  • Please don't moderate up his post despite sounding good. It's almost entirely wrong.

    Doesn't matter how many times I recompile, bud. Sorry. That's kind of a moot point. Look at all the different distributions which use different compiler sets and different optimization parameters, and all of them end up getting bit by the same bug. A recompile, if you're lucky, will prevent some exploits from being effective, but that's because of bad exploit writing.

    There's also nothing to prevent the MacOS from executing a buffer overflow. You have relocatable code, just like everyone else. Almost all successful exploits rely on offsets, not absolute addresses (because those can change on a reboot, not even a recompile).

    Finally, internal datatypes and bare pointers are neither necessary nor sufficient for buffer overflows. For starters, there are languages and platforms without internal datatypes and with bare pointers that can't (assuming you don't trip a bug in the language itself) result in an exploitable buffer overflow because of other protection systems. Second, there is always the possibility to trip a bug in the compiler/interpreter which results in an executable buffer overflow in a language where it shouldn't be possible.

    I will agree that it will, almost entirely, go away if we used pointer-free languages, but unfortunately, in many cases (i.e. an OS kernel) that isn't particularly attractive to do.

  • Bruce Perens submitted a story he wrote for his website on overflows and who's fault they are.

    "Who's" is possessive. It should be spelled "whose"
  • by Raven667 ( 14867 ) on Saturday July 29, 2000 @10:15AM (#895027) Homepage

    I disagree with this; making it the developer's responsibility to write bounds-checking code every time they deal with input is why we are in this mess today. Not a week goes by without another buffer overflow sob story on Bugtraq. Asking software developers never to make any mistakes, ever, is not a realistic solution, and assigning blame isn't going to make the problem go away.

    There are a few possible solutions, none of them really easy.

    1. Change the libraries to always do bounds checking on all functions (a minimal sketch of this follows the list). This would break most current software and still wouldn't solve problems inherent in the language. It would still be possible to write insecure code if you decided to shoot yourself in the foot, or if the libraries were used in new and unusual ways.
    2. Change the language to one that is type safe. This probably won't realistically happen any time soon but Java is making some inroads in popularity. Other languages, especially the LISPy ones, will probably never be popular, but who knows? Still doesn't fix old software, or programmers who refuse to learn.
    3. Change the OS with something like StackGuard. Will break much existing software and OS implementations, and there are still many ways around it for creative programmers. This is just a patch that doesn't solve the inherent problems of writing insecure code.
    4. Change the hardware platform. Certain platforms make it much easier to create safe code, and protect you from some nastiness. This might work, and for legacy, current and future software but isn't an area I am much familiar with so I don't know all the implications. It should be pretty hard to get around hardware restrictions but backwards compatibility features might provide a way to run old, insecure, code in old insecure ways.
    5. Buffer overflows and other common security problems have been with us for over thirty years and still aren't in the "solved problems" bin. This is inexcusable. If people are going to rely on computers in their daily lives, the computers have to be reliable, and having the possibility of security compromise using 30-year-old techniques does not a reliable computer make.
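
    A minimal sketch of the library-level approach in item 1 (my illustration, not part of the original comment; the function name is made up): the copy routine is told the destination size and truncates rather than writing past it.

    #include <stddef.h>
    #include <string.h>

    /* Hypothetical bounds-checked replacement for strcpy: the caller passes
     * the real size of dst, and the copy is truncated to fit (and always
     * NUL-terminated).  Returns the number of bytes copied, excluding the NUL. */
    size_t checked_copy(char *dst, size_t dst_size, const char *src)
    {
        size_t n;

        if (dst_size == 0)
            return 0;
        n = strlen(src);
        if (n >= dst_size)
            n = dst_size - 1;          /* truncate instead of overflowing dst */
        memcpy(dst, src, n);
        dst[n] = '\0';
        return n;
    }

    As item 1 notes, this only helps where callers actually pass the correct size.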

  • Hrm, interesting. Do you have a *real* Deja link?

    At least gcc's features are open --- If people are worried about gcc accepting non-standards-conforming code, why not hack it into -pedantic or -ansi themselves and release a patch?
  • by jetson123 ( 13128 ) on Saturday July 29, 2000 @11:45AM (#895029)
    Oh, you found me out: I'm stupid and careless. And there are a lot more people like me around, at Microsoft, Sun, and even some (I hear) in the free software community.

    And to help half-wits like us still do something useful with software, it might be nice to give us tools we don't hurt ourselves with too much. You know, scissors instead of a Samurai sword. A Toyota Camry instead of a Formula 1 race car. 110V household current instead of 50KV "professional" power.

    The fact is, C/C++ is too powerful for me. Oh, I understand it just fine, and with enough time and effort, I can make something work semi-reliably in it. But what's the benefit I get for that self-flagellation? Should I spend all that extra time because finding the bugs "hurts so good"?

    The fact is: writing code in C/C++ is a lot of unnecessary work. The only reason why people put up with it is because everybody else does it, so it's the path of least resistance if you want to use the "standard" compiler on your platform, use a language other people are likely to understand, and use other people's C/C++ libraries. But make no mistake about it: C/C++ is successful these days in spite of its cumbersome design, not because of it.

  • No, you don't need to put in bounds checking manually. Bounds checking is a property of the *compiler*, not the *language*. There are many C interpreters (and even many C compilers) which will implement bounds checking.
  • Hello? Funny? Didn't you mean 'insightful', moderator?
  • Yes, it was what I meant. I always thought "type safety" was kind of a continuum. There's complete type unsafety (perhaps not unlike machine language); there's complete type safety (of a language unknown to me); and then there's everything in between. In order to do inheritance and the like in C, you have to use generic pointers, which means that your compiler cannot easily (though it is still possible) catch type errors. So in that respect, C++ is "type safer" than C is, though I agree it is definitely not completely safe.
  • And in such systems, you pay a lot of overhead if you try to generate code at runtime, because it always involves a system call. That may be fine for 1960's style COBOL and C/C++ code, but it is not acceptable for 2000's style languages and programming.
  • Buffer exploits can be solved by more liberal use of segments. You can have one segment for VM from 0-2G, which is Ex/Re, and another for VM from 2-4G, which is Re/Wr. The problem is that OS'es like Windows and Linux are very primitive, and just use four segments: flat code and data selectors for each privilege level. If this scheme were used, buffer overflow exploits would be literally impossible, but it requires advanced OS'es which have not been developed yet.

    It is also the fault of the language. For its first 10 years (when it did not have C), VMS literally never had a buffer overflow bug in any program. For example, the Internet worm exploited only Unix hosts, and couldn't do anything to VMS hosts. However, in the last 10 years, VMS has adopted more C programs, and the number of buffer overflow exploits has risen from zero to a similar level as on Unix. The string and array methodologies in C are incredibly fragile and primitive (as well as very low performance - you need to access data a byte at a time, which is VERY SLOW on modern architectures), and more advanced languages have much higher-performance and more secure methodologies for dealing with data.

  • C++ is type safe unless you use constructs which make it non-typesafe. As long as you don't use those constructs, like casting, it's a typesafe language.
  • You have a very narrow view of "bugs". Remember the old system() bugs? I don't know if Java has a direct analogue to system(), but it would still be possible either way. I'm talking about logic errors, not "programmatic" (for lack of a better term) errors. Every language or standard library that allows you some control (e.g. all current programming languages that I'm aware of) is going to give you the possibility to grant crackers root.
  • GCC is an aggressive 'embrace and extend' language. It puts Microsoft's Java to shame.

    This is misleading -- verging on FUD -- as GCC will happily build ANSI-compliant code (AFAIK). Furthermore, most of the extensions can be easily removed to port to other compilers. True, the Linux kernel makes heavy use of extensions and would be difficult to port to something other than GCC, but that's a *very* special case.

    Here are some of the extensions I find useful, so you can judge their evilness for yourself (a small sketch follows the list):

    • Varargs macros. (Makes it easy to wrap varargs functions (e.g., printf) in macros.)
    • Zero-length arrays. (Put one at the end of a struct, and growing the struct becomes trivial.)
    • Argument type checking for printf/scanf-style functions. (Compiler warns if you read a '%lf' into an 'int *'. Catching this at run time is a royal PITA.)
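
    Here is a hedged sketch of those three extensions in one small program (my illustration, not from the original comment); it builds with gcc, e.g. gcc demo.c:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* 1. Varargs macro (GNU-style named varargs). */
    #define log_msg(fmt, args...) fprintf(stderr, "log: " fmt "\n", ##args)

    /* 2. Zero-length array at the end of a struct (the "struct hack"). */
    struct packet {
        size_t len;
        char   data[0];          /* storage follows the header */
    };

    /* 3. printf-style argument checking for your own functions
     *    (declaration only, shown for the attribute; gcc warns on
     *    mismatched format arguments). */
    void report(const char *fmt, ...) __attribute__((format(printf, 1, 2)));

    int main(void)
    {
        struct packet *p = malloc(sizeof *p + 16);
        if (p == NULL)
            return 1;
        p->len = 6;
        memcpy(p->data, "hello", 6);   /* data[] lives right after len */
        log_msg("packet of %lu bytes", (unsigned long)p->len);
        free(p);
        return 0;
    }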

    Maybe *you* want your languages handed down from the mighty and revered standards committee. That's fine, but don't try to keep me from using neat, helpful features.

  • Old archives (prior to 2000, it seems) are currently nonexistent on Deja. Anyway, I don't see the problem with some GNU extensions and functions not being able to be disabled, maybe because I already have an understanding of what's standard and what's not. Oh well.
  • by Tom7 ( 102298 ) on Saturday July 29, 2000 @10:23AM (#895040) Homepage Journal

    '' Oh? I'm fairly sure machine code is /also/ "unsafe", and that's what your pretty source code ends up as. How do you prove that your oh-so-wonderful language is still safe when rendered into raw machine code? ''

    I'm sure you're thinking that you've got me beat, but in fact this is a great question!

    The old, less-satisfying answer is that compilers are less likely to have bugs than the programs they're given. This is probably true (I don't recall any exploits due to compiler bugs, though to be fair I do recall some Java VM exploits).

    The new exciting answer is: We use a type-safe subset of the target machine's assembly code.

    In our TILT [cornell.edu] compiler for instance, we take ML code and put it through a series of transformations. At each transformation or optimization we also translate the types (a static proof that the program can't crash) until we get to machine language. This catches a lot of compiler bugs, and helps propagate safety properties to the raw code.

    The result is that we have machine language which can pretty easily be checked for type-safety. This allows us to do some other cool things, like ship the proof along with the raw machine code, to be executed on someone else's machine. They don't have to trust us (just the proof), and it doesn't suffer sandboxing costs like java. Wow! Read about Proof Carrying Code [cmu.edu] .

    The second answer isn't really usable today, except in theory. The first is absolutely practical, though. (even if it fails due to compiler bugs, we'd still cut down on a high percentage of errors -- and we'd only need to fix bugs in one place).


  • What part of what I'm saying is hard to believe? Though we can translate Java or other safe languages to C, and the resulting C program is "safe", we aren't able to verify that about the C program unless we have the Java program around as well (or we avoid using some unsafe C features).

    You're certainly right about the compiler doing work for you, and really that's at the root of my point. Though it's possible to write safe C code, why not let the compiler verify (and indeed, prove) that your code is safe? (Clearly C programmers still make the kinds of mistakes we think are easy to not make...)
  • Arggh.
    If the Code Segment is not marked readable, you cannot read the code.
    You can't jump into the stack. You can jump to a "far return address" that is stored in the stack. You can jump to an address in the Code Segment which just happens to be coincident with something in the Stack Segment.

    With NO CONTEXT SWITCHES, there are 8000+ local code spaces, 8000+ global code spaces, 8000+ local data spaces, 8000+ global data spaces, 8000+ local stack spaces, 8000+ global stack spaces. All these code and data spaces. All independent. All in the same context. But the OS has to handle them. No OS wants to bother. They all effectively point code, data, and stack to the same memory with bounds set to all of memory.
    There are four protection levels, but only Multics that I am aware of has ever used more than two levels.
    The hardware design of the 80386 is targeted at a Multics-like OS. Current i386 code is using separate code, data, and stack spaces, with bounds checking on all references to memory. Unfortunately, code, data, and stack all point to the same memory, with the bounds set to all of memory.

    Speed: loading a segment register costs about as much as an integer multiply (the instruction, not the addressing shortcut).

    Unless Operating Systems, Language designers, and programmers are willing to give up the nice flat 32-bit address space and go back to the old 8086 DOS segments and offsets as a programming paradigm, things are not going to improve. The problem with the old DOS is that segments were used to break the 64k barrier in a flat address space rather than used as segments. With the 386 architecture, segments are no longer limited to 64k, and there is no correlation between segment number and physical address.
  • Actually, type safety has everything to do with buffer overflows. The problem is that the types available are either unsafe or unchecked.
    What is the type of a pointer to a 120-character buffer? If the effective type is "pointer to at most 64K bytes starting here", then it is unsafe.
    Strongly typed languages make it easier to write programs that do not have certain kinds of bugs, and harder to write programs that are buggy but seem to work OK.
    Will things get any better? No. There are too many buggy programs that sorta work well enough that will be stopped dead by anything that enforces safety. What I'd love to see is a good free production-quality Algol68 compiler. The 68 stands for the year 1968. Progress? nah. :(
  • The same is true for Modula-3, Oberon, Sather, Eiffel, and other languages. Yes, they don't have the support or user community that C has, but on technical grounds alone, they are fast, small, have no overhead, etc.
  • This is a very interesting post: I could agree with you 100% on the first line, and gradually go down to 0% by the last line.

    If I read you correctly, the burden to avoid buffer overflows should be put on the programmer, not on the hardware or on the machine. Then you start advocating that the compiler/library should take care of this! Isn't that like just laying the safety net a couple feet higher?

    As a general rule, I think dumb tasks should be left for the machine, the noble ones for the programmer. Checking whether a given input validation is a potential door for an exploit is the programmer's responsibility.

    Another thing that you seem to suggest is that quick and dirty test code should be the basis for production code, thus holes could migrate to the final product. I think the ideal solution is to discard test code altogether and start from scratch, but who does that, right? One other approach, which I actually use and believe many do, is to check for errors (at least return values) from the very beginning, so that those parts which will inevitably be cut and pasted into the production source are structured in a way that makes adding extra checks much easier (favouring heap memory, or encasing calls in try/catch blocks in C++, for example).

    Just my $0,02
  • See subject..;)
    Altho, I would have LOVED to see what he wrote!
    Anyone mirrored?
  • Buffer overflows are NOT just gets and scanf. Those are impossible to secure.
    It's been a long while since I've written C, but I remember being frustrated by library functions such as this that were "almost useful" but for some major flaw. Why do they even exist if they can't be secured?
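
    A minimal sketch of the point above (my illustration, not part of the original comment): gets() can never be told the destination size, while fgets() takes it as a parameter, which is why the former is unsalvageable.

    #include <stdio.h>

    int main(void)
    {
        char line[64];

        /* gets(line);  -- unusable: it writes past line[] whenever the input
         * is longer than 63 bytes, because it is never told the size. */

        if (fgets(line, sizeof line, stdin) != NULL)   /* reads at most 63 bytes */
            printf("read: %s", line);
        return 0;
    }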
  • C++ is "of course" not a good language? Why? You can do everything you can do in C in C++, and then some. If you don't like templates, don't use them. If you don't like operator overloading, don't use it. Unlike certain other languages *cough*Java*cough*, it doesn't force any particular paradigm on you.

    You can write object oriented code in any language you like. You can write procedural code in any language you like. Object oriented languages simply facilitate the use of object orientation by providing you with tools that take advantage of it. You can write OO BASIC, and you can write procedural Java (just make everything static and put it in your main class). The paradigm is not being forced on you. You simply ignore some of the features. But don't take my word for it, look at the tour of C++ in Stroustrup's The C++ Programming Language.

    Polymorphism, encapsulation, and inheritance are simply properties that fall out of the OO paradigm and are implemented to take advantage of its implications. Ultimately, the right tool for the job in any given case may or may not implement these.

    --locust

  • by nweaver ( 113078 ) on Saturday July 29, 2000 @08:31AM (#895051) Homepage
    Buffer overflows are the fault of the LANGUAGE. Important system utilities need to be written in bounds-checked languages. Some compilers, no matter the architecture, will write executable code on the stack: "trampolines". Unfortunately, this is common enough that the OS can't blindly turn off the executable bit on the stack pages. And non-executable stack pages don't stop all buffer overflow attacks; they just require a two-part attack: a heap buffer to write the code, and a stack buffer to overwrite the return address. The heap buffer doesn't necessarily even need to be overflowed; the attacker just needs to be able to deduce its address. And one can't set heap addresses to be non-executable, simply because there are MANY language environments which do create code at runtime, such as interpreters, JITs, etc etc etc.


    Nicholas C Weaver
    nweaver@cs.berkeley.edu
  • Not if you want to parse command lines, for example, or use many of the C library functions (e.g., printf). Of course, in your own, 100% new code, you can use STL and stuff like that, but when you're interfacing with existing code or trying to use (a few of) the language's features, you have to use the unsafe constructs at least a little. That's particularly true when it comes to user input, which is (naturally) where buffer overflows come from.
    --
    -jacob
  • Everything that you said about C applies to C++ (except for being forced to use it under UNIX!)

    A lot of (bad?) programmers do a lot of unnecessary work with their strings because they don't know whether they need to be copied or not, so they always copy. C++ offers string classes that make this irrelevant. Reference counting leading to copy-on-write ensures much greater simplicity and often better performance. A decent string class implemented using a bridge pattern will only be 4 bytes (same as a char*) - it will just consist of a pointer to its implementation, and so it can be passed around on the stack very quickly. Such classes also have the benefit of guaranteed '\0' termination, and they make embedded NULs much easier to handle.
  • First, it's not because of the CPU. Hell, the first well-known stack 'sploit was in the RTM worm, which worked for two flavors of Unix on a VAX cpu.

    It's because MacOS uses one big-ass shared memory space to run everything in that it's safe from being taken over by buffer overflows. Well, gee, if it's all unprotected, why is it so safe? Because while you can still crash a program with a buffer overflow, you can't predict the stack address. And the critical part of a stack overflow exploit is to get the program counter pointing to the exploit code on the stack.

    And even if you could, what would you do with it? There's no shell (at least not until OS X, but that's a completely different OS) to give commands, and no root privs to exploit (actually you are "root" at all times!)

    Intel is relatively low on the fault scale here. A bigger problem is the number of people running Linux distros with the same binaries in them. If you compile your own code, the stack addresses will be less predictable (though not completely unpredictable), and you'll be in the same boat as MacOS: without a predictable stack address, there's no way to run the 'sploit code!

    If we simply had more people compiling their own binaries, the problem would be reduced.

    But at heart, the fault is one of languages that let you stick things into memory without any sort of range checking. Get too much data or lose the null terminator from a C string and your stack is toast.

    And most of these problems happen inside of a library routine. But you can't blame the library routine when it has no way to know the size of the destination buffer. The best it can do is know where the frame pointer is and to not write past it.

    If C strings were more than just bare buffers with only a lone null to save you from oblivion, the library routines could be smart enough to save your ass. So I blame C and its strings as the primary problem causing buffer overflow exploits.

    Use a language with internally checked datatypes and no bare pointers like Java or Perl, and this type of exploit will go away.
  • If C strings were more than just bare buffers with only a lone null to save you from oblivion, the library routines could be smart enough to save your ass. So I blame C and its strings as the primary problem causing buffer overflow exploits.

    Use a language with internally checked datatypes and no bare pointers like Java or Perl, and this type of exploit will go away.


    Programmers writing SUID programs should also be capable of using pointers without creating buffer overflow exploits.

    Or do you think the problems with buffer overflows outweigh the potential gain from using pointers in the first place ??

    I strongly prefer the additional power of constructs in C that are provided by pointers. I do not think that a higher level language is likely to be any safer. Sure, the language may conceptually be without overflows, but the increased size of the compilers/interpreters makes those much more difficult to check, and still prone to overflow.
  • Unfortunately, the semantics of C do not allow easy general bounds checking, since arrays are semantically defined as pointer arithmetic, e.g. foo[i] is equivalent to *(foo + i). There is no actual array type in C!

    A C compiler can attempt to fudge things to try some bounds-checking-like operations, or hacks like StackGuard which detect a class of misoperations, or Purify which does the same thing in a different manner, but the language itself semantically doesn't allow bounds checking in general, due to how arrays and pointers are typed, without going through serious hacks like what Purify does. Of course, Purify ends up costing more than the bounds checking in a proper language would cost.


    Nicholas C Weaver
    nweaver@cs.berkeley.edu


  • An AC says:

    "duh, this one is really easy to solve. wrap delete with an inline function that checks if the pointer is NULL, deletes, then sets it to null.

    no more multi delete problems. (of course, you really should fix the bug that causes you to delete the pointer twice, but you can't really blame the language for your own ineptitude...) "

    I boggle, and respond:

    Wow, now THERE is a severely misinformed post.

    #1. I'm contending that we use more advanced languages because they make life easier for both good and bad programmers, not merely compensate for "ineptitude".

    #2. Your solution is 100% wrong.

    C * a = new C();
    C * b = a;
    delete a;
    delete b;

    'b' is a separate variable holding the same address, and it isn't set to NULL when 'a' is deleted. Unless you're implicitly suggesting that we move to memory handles (which is going to give you MUCH worse performance than modern languages which just don't let you write this kind of program), your solution doesn't solve anything at all.
  • So having separate address spaces for code and data might cause problems: self-modifying code still exists, etc.

    But the cool thing is -- you can just make the page tables regarding the code address spaces point at the same pages as the data (and vice versa), and do this at page granularity. So you can make an 8k buffer that is read/write/execute (but accessed with different linear addresses when used as code or data) for your genetic programming, but keep the rest of your program safe.

    VM tricks are so much fun. ^_^
  • by gad_zuki! ( 70830 ) on Saturday July 29, 2000 @12:39PM (#895107)
    What if everyone displayed such grace?

    Or what if everyone bothered to do some research before writing and self-promoting some inane rant.
  • When Intel came out with the 808[68], they wanted to make it as compatible with the 8085 as possible. It really was 16-bit. The data, code, and stack segments were 64K wide, with segment registers that gave you overlapping segments at a 16-byte granularity.

    If you used the segment registers, the result was basically a highly non-linear address space. In a lot of ways, it was an 8 bit processor with 16 bit registers and hardware bank switching (for those of you that remember bank switching).

    as a result, there were a few 'standard' memory models that programmers used:

    • Small address space: all segment registers the same; don't touch them. This gave you a flat 16-bit (64K) address space, turning the machine into a glorified 8085/Z80 -- almost completely source code (assembler!) compatible. It also gave a slight speed advantage, since all pointers and integers were 16 bits wide.
    • Intermediate address space: segment registers point to disjoint spaces. not too much difference but you get some breathing space since the code and data don't share the same (tiny!) 64K address space.. pointers are still 16 bits, but you now have to remember which segment you're talking to.
    • 'Large' address space: all pointers are 32 bits wide (including both segment registers and pointers within the segments). This gives you access to the full 1M address space (the 640K limit was because 384K was reserved for I/O space).

      SERIOUS performance hit. If you allow arrays >64K, then just about every array access requires you to calculate and load the segment register. Address math sucks because if you have two 32-bit addresses A and B, A != B does not necessarily mean that they don't point to the same memory, and *X++ can require some serious work to do the expected thing.

    The 80286 allowed people to break the 1M barrier without doing bank switching (EMS?), but it turned the segment register/pointer problem into a serious horror story. Unless you were seriously masochistic (or just plain desperate) you just made it look like an 8086 that ran a bit faster.

    When they came out with the '386 you now had segments of 4GB each. This was at a time when a 2GB RAM module could have been camouflaged as a desk and would have required a 15 kW power supply.

    Most programmers and OS designers just set all the segment registers the same (the '386 equivalent of the 'small memory model') and forget about them (I call this traumatic amnesia).

    So, yes: Intel has a Segment model that could be used to provide security, but few people are brave/stupid enough to risk the horror stories/ flashbacks that enabling it might entail.

    Intel: Just short of intelligent.

  • First of all, yes, C does have an array type. Arrays are not pointers, and pointers are not arrays. Arrays have a nasty habit of decaying on you, but the second you forget and treat them as pointers is the second you run into trouble (a small sketch of this follows at the end of this comment).

    From the view-point of dealing with stack smashes specifically, this can be helpful. Arrays, like other automatically allocated objects, are placed on the stack, whereas dynamically allocated "arrays" (the usual targets of pointers) live on the heap. Of course, getting rid of stack smashes isn't quite as interesting as getting rid of buffer overruns altogether (both stack smashes and "heap" smashes), but the difference can be useful.

    As I mentioned in another post, though, you can interpret C as much as you want. There's no reason for a C implementation to have a compiler, and even a human being can (in silly situations) be considered a compliant implementation. If you interpret the hell out of it, so that for every reference to an object you trace its history throughout the program, you can be fairly safe. It might take an hour just to go through a single instruction; "hello world" might take six years to run on the fastest Cray, but it will give you everything you ever wanted to know. Of course these are silly examples: usually C interpreters only slow things down by a few hundred times (or less) compared to compiled code. But you only have to interpret once and get an "everything is OK" before you can take that same code, turn around, and compile it.

    This is, of course, possible with other languages. There are political and legal obstacles with Java, I think. But for all complex languages (of which I'm saying C is not), making your own interpreter is too much trouble for such little reward.
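
    A small sketch of the array/pointer distinction described above (my illustration, not part of the original comment):

    #include <stdio.h>

    void takes_pointer(int *p)      /* by the time we get here, the array has decayed */
    {
        printf("sizeof p     = %lu\n", (unsigned long)sizeof p);     /* size of a pointer */
    }

    int main(void)
    {
        int a[10];

        printf("sizeof a     = %lu\n", (unsigned long)sizeof a);     /* 10 * sizeof(int) */
        printf("sizeof &a[0] = %lu\n", (unsigned long)sizeof &a[0]); /* size of a pointer */
        takes_pointer(a);           /* 'a' decays to &a[0]; the length is lost here */
        return 0;
    }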
  • Well, first of all, you seem to agree as far as C is concerned since you advocate wrapping all those unsafe C interfaces into better C++ interfaces.

    Yes, C++ is somewhat better than C because it does allow you to build abstractions that perform more checking and automatic resource management. But even if you do that C++ is fundamentally unsafe. Why?

    C++ still uses the C pointer model and adds a similar reference model. And C++ still uses manual memory management for dynamic allocation. You cannot, in general, address those problems by creating safe abstractions. If you try, you end up severely limiting language semantics, and as soon as you face any outside library, you have to convert to raw pointers anyway.

    And C++ still does not guarantee fault isolation among modules or any way of determining from the source code of a module whether that module is safe or not. That is, any piece of code you link with can cause arbitrary problems in any other piece of code, and you have no way of telling. Perhaps you think that's inevitable, but it is not. None of the other languages that I mentioned have that misfeature.

    Arguing that one should not bother fixing those problems because there are lots of other ways in which people can make mistakes is wrong. The problems C/C++ creates for programmers are easily avoided, without performance penalty or other drawbacks. A day lost trying to chase some avoidable pointer bug in a C/C++ program is a day that could have been spent on testing and fixing some conceptual security bug.

    I have been using C for 20 years and C++ since before its first public release (nearly 15 years?). I still use them a lot because that's what interfaces best with the software that's out there. At the time, they were reasonably good tradeoffs. But this is the year 2000, and tradeoffs that were good then are not good anymore.

  • Hey dudes! I'm kinda new to this slashdot thingy here, but could some sweet, loving guy please explain the buffer overflow to me. It seems like a pretty cool idea, is it like sorta like going out with too many guys or something? help me please!
  • Other potentially unsafe constructs in C/C++ are pointer dereferencing, calling a function pointer, array subscripting, and call-by-reference. Even for casts, neither a compiler nor a reader can tell whether the cast is safe or not.

    So, if you don't use pointers, address-of, array subscripting, call-by-reference, or any library functions that do, yes, then you can write type-safe C++ programs. Too bad that you also can't do much in that subset of C++.

    There is no problem with providing those unsafe constructs in a systems programming language. In fact, lots of languages do, just like C/C++. The problem with C/C++ is that the safe constructs and the unsafe constructs are indistinguishable, and that means that even the 99.9% of a program that can be written nicely with the safe constructs use the unsafe ones, and as a result are much more likely to crash.

  • by jetson123 ( 13128 ) on Saturday July 29, 2000 @11:16AM (#895128)
    Buffer overflow exploits don't necessarily require you to be able to run code from the stack segment. Often, it's sufficient just to change the return address (which necessarily is stored on the stack) to some useful routine, or to change a bunch of bits that are used for a security check (a small sketch of this follows at the end of this comment).

    The underlying problem is that C/C++/Objective-C do not have mechanisms to protect against these kinds of problems. In fact, it's impossible to write substantial programs in those languages that use only "safe" constructs. This is a peculiar and fundamental bug in the C-family language design.

    There are excellent alternatives around. Modula-3, Oberon, Ada, Sather, and Eiffel all have efficient, free, open source implementations around, they all provide access to unsafe features when needed, and one of them should satisfy anybody's programming needs. Java is an excellent applications and server programming language, although it has a bit more overhead and no access to low-level features.

    So, folks, get with the program and stop writing servers and other applications in C/C++.
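
    A minimal sketch of the kind of attack described above (my illustration, not part of the original comment; names are made up, and whether the flag is actually clobbered depends on the compiler's stack layout): an unchecked copy into a local buffer can spill into neighbouring stack data such as a security flag or the saved return address.

    #include <stdio.h>
    #include <string.h>

    int check_password(const char *input)
    {
        int authorized = 0;         /* lives on the stack near buf */
        char buf[16];

        strcpy(buf, input);         /* no bounds check: a long input writes past
                                       buf, potentially into 'authorized' or the
                                       saved return address */
        if (strcmp(buf, "secret") == 0)
            authorized = 1;

        return authorized;
    }

    int main(int argc, char **argv)
    {
        if (argc > 1 && check_password(argv[1]))
            puts("access granted");
        else
            puts("access denied");
        return 0;
    }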

  • by tytso ( 63275 ) on Saturday July 29, 2000 @08:38AM (#895130) Homepage

    This is a very old debate, and it's been raised on the kernel list several times. The problem is that it seems pretty clear that given a buffer overrun attack which can be exploitable without the stack-exec patch, it's possible to transform that attack into an exploit which will work with the stack-exec patch present.

    It may require more work to create the exploit, but it's the sort of thing which only one person needs to do and then share with 100,000 of his best friends on some cracker web site. Hence, such a patch only provides the illusion of security, and it adds crap to the kernel. (There's all sorts of kludges you have to put in there to make sure that trampoline code doesn't break, etc., etc.)

  • by bored ( 40072 ) on Saturday July 29, 2000 @08:38AM (#895134)
  • The problem isn't Intel's fault, because the arch has an execute bit in the segments. The original idea was that you put your code in a separate code segment from your stack and data segments. The real problem is OS designers who, for various reasons, decide that the x86 arch's segmentation should be ignored and set the code segments equal in size to the data segments and stack segments. It then becomes a simple matter to just jump into the data or stack segment and begin executing code.

    Of course, since most of the OS's don't properly use the protection mechanisms Intel has provided, I guess it becomes Intel's fault if they don't extend the arch to support a feature and potentially break downward compatibility with other OS's using the current paging system.
  • The problem is not the presence of unsafe constructs in C++, the problem is that the unsafe constructs are indistinguishable from the safe constructs.

    In C++, this might look like:


    char *p = unsafe::allocate(char,100);
    char c = unsafe::ref(p,10);
    float f = unsafe::castref(float,p,0);

    Of course, C++ would also need to eliminate the unsafe constructs it has outside namespace "unsafe", and in some cases add safe constructs to replace them. Then, you could limit the use of unsafe constructs to only the few places where you actually need them. That greatly reduces the probability of making errors. Languages that do this exist: Modula-3, Oberon, and others. There are no such languages in widespread use yet that look like C or C++, unfortunately.

  • by Tom7 ( 102298 ) on Saturday July 29, 2000 @08:40AM (#895136) Homepage Journal
    Blame the language! C and C++ continue to be inappropriate for security-critical work.

    Aside from speed-critical stuff like kernels and Quake 3, I don't see the need to write programs in C and C++ any more.

    Let's start using modern languages with type safety. They're easier to write programs in (because debugging is easier) and not that slow.

    I know that I'd gladly take the 2x speed hit on my security-critical apps (mail daemon, web server, ssh, etc.) to know that they cannot have this kind of bug in them, because they were written in a language like ML, Eiffel, Haskell, or even Java.

  • Can you name any significantly large piece of software written in [Modula-3 or Ada] that I could examine?

    For Modula-3, go to www.m3.org and follow the links (a whole operating system with various network services has been written in it and is still in use at U. Washington). For Ada, you'll have to talk to your local defense contractor; lots of real-time military systems are written in it, so it definitely gets the performance (Ada is a bit too pedestrian for my taste, so I don't follow it much). For Oberon, the entire Oberon operating system, drivers, compiler, web browser, and other things are written in it, and they are open source.

    you must be talking about run-time fault-tolerance and correction. When you need to do this sort of thing, then people most certainly break things down into modules separated along process lines. A database server or the TCP stack are both great examples of this.

    If you haven't noticed, relational database performance on UNIX sucks. It takes milliseconds to get any data out of the damned thing, which is why everybody is using stored procedures, which means they put code back into a single process. I have not seen any TCP/IP stack under UNIX systems that is "split up along process lines"; if the Linux or SystemV TCP/IP stack crashes, so does the whole kernel. Both of these examples make my point.

    Although it can be done to some degree, it would of course be incredibly difficult to create a class of smart pointers for each and every type of pointer use. Also, as you note, you would have to use those pointer types everywhere, which is another big strike against them. This kind of wrapping is not what I'm talking about.

    But it's the kind of wrapping I'm talking about, and it's the kind of wrapping that is necessary to make C++ a safe language.

    The claim was that you can have other pointer models and not sacrifice any speed. I'm having a hard time imagining this, as "the C pointer model" mirrors the hardware very closely. I'm very, very willing to be corrected.

    The problem with C/C++ pointers is not that they model the hardware very closely. The problem is that they aren't typed sufficiently finely: heap and stack allocated arrays, displaced arrays, references to stack variables, references to locations in data structures, heap allocated data structures, etc. are all represented by a single type, a "pointer". Other languages use different static types to represent those different constructs. And those constructs are different because they are associated with different object lifetimes and different opportunities for (optional) runtime error checking.

    Adding those extra static types to distinguish the different meanings of "pointers" has no runtime overhead at all. And if you (for some reason) need to convert among the different static types, a call to "unsafe::convert_..." would do the trick, again, with no overhead, and would indicate that something funny was going on.

    The way it is in C/C++ right now, the compiler just doesn't have the information to give meaningful compile time warnings or errors. It also doesn't have enough information to insert efficient runtime error checks if the programmer wants that for debugging. Ultimately, the lack of static type information is even counterproductive for generating efficient code on upcoming architectures.

  • Pointer arithmetic and lack of bounds checking are the ways in which programmers in languages other than C traditionally did casts. Consider the equivalent of (I have seen these in old Fortran and Pascal code):

    struct Cast { int x[1]; float y; };
    int float_bits_as_int(float f) {
        Cast c;
        c.y = f;
        return c.x[1];   /* indexes past x[] to read the bits of y */
    }

    Here is another example:

    int float_bits_as_int(float f) {
        float *p = new float;
        *p = f;
        delete p;
        int *ip = new int;   /* hopes the allocator hands back the block just freed */
        int v = *ip;
        delete ip;
        return v;
    }

    These "logic errors" are related to type errors: they allow the bits of an object of one type to be interpreted as the bits of an object of another type. A system that's type safe guarantees that that doesn't happen.

    In any case "type safety" doesn't just mean compile time type safety. Java has a lot of runtime type safety, where type errors are caught at runtime, not by the compiler. That's still fine for many purposes. C++ has neither.

  • I wouldn't blame the intel architecture, but..

    There are architectures (Gould/SEL-32/xx is one) that allow for and, in some cases, insist on strict divisions of code and data pages. The code sections are read only, and will generate a fault if an attempt to write to it occurs from a non-system level.

    The data sections are read/write, but you cannot branch there.

    It makes it a bit difficult to write self-modifying code, but not impossible if you really need to.
    ---
    Interested in the Colorado Lottery?
  • The article puts quotes around the words "code" and "data", and that is the problem. In i386, CODE segments can be either read/execute or execute-only. DATA segments can be either read/write or read-only - never execute.

    So what is an Intel-based, flat-mode program to do? It sets up two segments - one data, one code - pointing to the same memory. Goodbye, hardware security.

    Of course the VM doesn't protect against execution - that is the segmentation system's job. Linux (and anything else that assumes a 68k or VAX flat address space) just blows it off.

    Simple solution - bring back separate code and data. Excuse me, I and D. Just like the PDP-11 UNIX grew up on.
  • >Some compilers, no matter the architecture, will write executable code on the stack: "trampolines".

    I don't think this is "no matter the architecture". The need for trampolines is chip-architecture dependent, and any compiler writer who uses trampolines where they're not absolutely forced to should be shot.
  • >There is a new breed of C programmers, though. They don't assume Unicisms, PDPisms or TheirPlatformisms,

    Instead, they have their very own set of blind spots and biases and bad habits. Non-viable mutants are a "new breed" too, but that doesn't mean they're an improvement.
  • I was thinking that it would be unlikely to be a particularly x86 problem, because I know buffer overflows are also a problem on other CPUs and OSs even ones that are supposed to be secure.

    Thanks for that.

  • The underlying problem is that C/C++/Objective-C do not have mechanisms to protect against these kinds of problems. In fact, it's impossible to write substantial programs in those languages that use only "safe" constructs.

    I should note that the above use of the word "impossible" is misinform{ed,ing}, and bias{ed,ing}.

    Management of buffers is a low-level detail that only needs to be taken care of once. Hiding details such as this is something that a good C++ programmer can do with elegance.

    If you write a C++ application that uses intelligent string classes such as the STL's std::string, then it becomes very possible to write buffer-safe programs. In fact, it's kind of hard to use std::string in a way that does break. (Of course, you have to deal with the fact that all the syscalls use char[]'s rather than string objects, but most of those syscalls should also probably be wrapped in classes -- see the ACE lib. A small sketch follows at the end of this comment.)

    As mentioned in other posts, it's often just as easy to create security problems (and other data-integrity-destroying bugs) in other languages.

    If there is a "fundamental bug" that causes thesee problems, it's the low standard that we currently hold software to. Not even bringing in certain high-profile/buggy pieces of software into the picture, I can speak from personal experience on this one. Quality is a distant second to rapid deployment in many domains. Well-written and bug-free code is a subtle thing to enjoy, but one that I think will eventually come into greater popularity.

    So if we could all, as engineers, begin to refuse to produce sloppy code in the face of deadlines and such, I think this type of situation will eventually fix itself.

    Just my thirty pieces of silver's worth.
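
    A small sketch of the std::string point above (my illustration, not part of the original comment; the wrapper name and path are made up): the string grows to fit whatever is appended, and the raw char* only appears at the boundary to a C-style interface.

    #include <cstdio>
    #include <iostream>
    #include <string>

    // Hypothetical thin wrapper around a char*-based interface (here fopen),
    // so the rest of the program never handles raw buffers.
    FILE *open_log(const std::string &path)
    {
        return std::fopen(path.c_str(), "a");   // c_str() only at the boundary
    }

    int main()
    {
        std::string line;
        std::getline(std::cin, line);            // grows as needed; no fixed buffer

        std::string msg = "user said: " + line;  // concatenation cannot overflow
        std::cout << msg << '\n';

        if (FILE *f = open_log("/tmp/example.log")) {
            std::fputs(msg.c_str(), f);
            std::fclose(f);
        }
        return 0;
    }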

  • >You can't practically write OO code in languages that do not directly support it.

    Totally wrong. I and thousands of others have been doing this for years, and it seems quite "practical" to us. Languages that directly support OO are only marginally more convenient than non-OO languages such as C; the OO languages' advantages have more to do with standardization of notation than with actually enabling an OO code structure.
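
    A minimal sketch of what hand-rolled OO in C typically looks like (my illustration, not part of the original comment, loosely in the style of GTK+-like object systems; all names are made up): a struct of function pointers acts as the vtable, and "inheritance" is done by embedding the base struct first.

    #include <stdio.h>

    struct shape;

    struct shape_ops {
        double (*area)(const struct shape *self);   /* hand-rolled vtable */
    };

    struct shape {
        const struct shape_ops *ops;
    };

    struct circle {
        struct shape base;      /* base struct first, so a circle* is a shape* */
        double radius;
    };

    static double circle_area(const struct shape *self)
    {
        const struct circle *c = (const struct circle *)self;
        return 3.14159265358979 * c->radius * c->radius;
    }

    static const struct shape_ops circle_ops = { circle_area };

    int main(void)
    {
        struct circle c = { { &circle_ops }, 2.0 };
        struct shape *s = &c.base;

        printf("area = %f\n", s->ops->area(s));     /* "virtual" dispatch by hand */
        return 0;
    }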
  • by Animats ( 122034 ) on Saturday July 29, 2000 @06:20PM (#895178) Homepage
    One particular problem with C is a history of unsafe standard library functions, such as sprintf, strcat, and such, which just happily output into an array of char with no checking whatsoever. Those functions were deprecated back in the 1980s, and there are safe versions with output size limits, like snprintf (sketched below), but there is still far too much code that uses the old ones.

    They need to be deprecated more forcefully. All the unsafe functions should be pulled from the standard C library and moved to something like "deprecated_unsafe_library.h". All set-UID programs need to be purged of those functions. Now. Any manufacturer shipping a system with those functions in a security-critical program should be sued for gross negligence.
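
    A minimal before/after sketch of the point above (my illustration, not part of the original comment): the size-limited variants are told the destination size and cannot write past it.

    #include <stdio.h>

    void greet(const char *name)
    {
        char msg[32];

        /* Unsafe: sprintf writes however much the input demands:
         *     sprintf(msg, "hello, %s!", name);
         * Safer: snprintf is told the size of msg and truncates instead of
         * overflowing; the result is always NUL-terminated. */
        snprintf(msg, sizeof msg, "hello, %s!", name);
        puts(msg);
    }

    int main(void)
    {
        greet("a deliberately very long user-supplied name string");
        return 0;
    }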

  • I know that I'd gladly take the 2x speed hit on my security-critical apps (mail daemon, web server, ssh, etc.) to know that they cannot have this kind of bug in them, because they were written in a language like ML, Eiffel, Haskell, or even Java.

    You might be willing to take that hit, at least in theory, but it is rather doubtful that most people would. If I said to a manager that simply by switching languages we could double the speed of our product, at the expense of needing to code more carefully, he would leap at the chance. And I would back him up. I recall a chart in a course at school which plotted four areas. I don't recall what they all were, but I do remember that we are currently at the point where speed in all its incarnations is what matters: time-to-market, execution speed, response time. When the market begins to tighten, then things will necessarily change. There may even come a time when computers are so fast that they can do all the silly stuff like bounds checking properly with no noticeable slowdown. I rather doubt it; as processing power increases, so too will demand for that power.

    What we need are better-trained programmers. It's not very hard to ensure that arrays are the appropriate size, to check the length of input strings and to make sure that subscripts are within range. This can all be done quite easily in C, and still be much faster than one of the 'lame' languages (I use the term lame to mean a language without the ability to do low-level twiddling, not as a term of derision).

  • Now this is an interesting topic. Are pointersafe languages more or less safe from a hacking perspective?

    Normally, the easy answer is more safe, in a Pareto-optimal sense of the word. However, there is always the urge to go from the [presumably expensive] address-space-switching architecture to the [presumably cheaper] thread-switching architecture when you go pointer safe.

    You can do this because you trust the compiler, so separate applications will be unable to access each other's memory.

    Of course, if it turns out that the compiler was untrustworthy, then you have a real problem on your hands. Now, I don't mean that the compiler is compromised (a la the wonderful gcc/login hack), but rather that a bug in its overflow protection code enables malicious exploits.

    Since compilers are difficult to write, and typically not (?) audited for security, I'd be interested to hear if anyone has had any experiences along these lines.

  • Here's a pretty good description of how to write a buffer overflow: ftp://ftp.technotronic.com/rfc/phrack49-14.txt Jim
  • Some languages have the length of a buffer coded into the buffer and simply won't allow you to put more into a buffer than it has available (either that or they'll re-allocate for more space). With languages like that -- unless the implementation code is buggy, or you do direct system-call hacks that overwrite the internal structures -- it isn't possible to overrun buffers.
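
    A minimal sketch of such a length-carrying buffer (my illustration, not part of the original comment; allocation of the data block is omitted): because the buffer knows its own capacity, the append routine can refuse to write past the end instead of overflowing.

    #include <stddef.h>
    #include <string.h>

    struct buf {
        size_t cap;     /* bytes allocated for data */
        size_t len;     /* bytes currently in use */
        char  *data;
    };

    /* Returns 0 on success, -1 if the new data would not fit. */
    int buf_append(struct buf *b, const char *src, size_t n)
    {
        if (n > b->cap - b->len)
            return -1;                     /* refuse rather than overflow */
        memcpy(b->data + b->len, src, n);
        b->len += n;
        return 0;
    }

    A re-allocating variant, as the comment mentions, would instead grow data when n does not fit.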
  • Well, first of all, you seem to agree as far as C is concerned since you advocate wrapping all those unsafe C interfaces into better C++ interfaces.

    Yeah, pretty much so. Well-written C is attainable, of course, but the benefits of C++ are certainly worth switching.

    C++ still uses the C pointer model and adds a similar reference model. And C++ still uses manual memory management for dynamic allocation. You cannot, in general, address those problems by creating safe abstractions. If you try, you end up severely limiting language semantics, and as soon as you face any outside library, you have to convert to raw pointers anyway.

    Both the "C pointer model" and the manual memory management are easily wrappable. "Initialization is resource acquisition" is elegant and easy to stick to. You can also easily wrap the interfaces to the other libraries, as I said before. It can all be done very safely if you keep direct pointer operations to a minimum and thoroughly check them as you do. I feel that this extra bit of planning can be worth the power that is gained.

    Using string manipulation as a specific example, exactly what about std::string "severely limits language semantics"?

    And C++ still does not guarantee fault isolation among modules or any way of determining from the source code of a module whether that module is safe or not. That is, any piece of code you link with can cause arbitrary problems in any other piece of code, and you have no way of telling. Perhaps you think that's inevitable, but it is not. None of the other languages that I mentioned have that misfeature.

    Sure there is. See man 2 fork, or CORBA [omg.org]

    Arguing that one should not bother fixing those problems because there are lots of other ways in which people can make mistakes is wrong. The problems C/C++ creates for programmers are easily avoided, without performance penalty or other drawbacks. A day lost trying to chase some avoidable pointer bug in a C/C++ program is a day that could have been spent on testing and fixing some conceptual security bug.

    No, good programming practices are what prevents bugs. IMHO there isn't much difference between the various type-safe languages in bug prevention, but there is a huge difference between the various programming practices and quality standards. I have talked with plenty of folks who claim that there's "no performance penalty or other drawbacks" to not having the power of C/C++'s low-level control, but I haven't seen anyone back that up with proof. I would really enjoy for someone, anyone, to prove otherwise.

  • Intel's 386 protected mode has "code" segments and "data" segments. Writes to code cause a segfault. Branches into data cause a segfault. But if the OS points the code and data segments at the same area of RAM, the 386 doesn't care.
    XGNOME vs. KDE: the game! [8m.com]
  • The Tao of Buffers
    Escapes most but plagues many.
    Has Intel caused this?
  • Stroustrup once said something like "I find almost every use of the term 'C/C++' to be indicative of ignorance."

    Well that's his own fault, isn't it? Stroustrup could have called it anything, but he chose `C++' as the name.

    So he makes a language that has syntax strikingly similar to C with a name which was a homage to the old language (one part `C' and one part `a C operator'), then calls people who lump them together as similar languages "ignorant."

    Please!

    --

  • by QuoteMstr ( 55051 ) <dan.colascione@gmail.com> on Saturday July 29, 2000 @08:57AM (#895196)
    Like a system, a language can be as secure or insecure as you make it. One can write an extremely tight program in C++ while writing one in Perl or Java that leaves gaping security holes open.

    Educate the programmer on *why* things like sprintf, strcpy, etc. are Bad Things, don't force them to use a Bondage and Discipline-style language like RPG or Java that forces the programmer to do what the *language designer* thought to be The Right Thing.

    Instead of using a new language that probably wouldn't be suited to the task, why not write something like lint, but for security holes?

    Also...

    *PEOPLE SHOULD ENABLE ALL WARNINGS ON THEIR COMPILERS. WARNINGS CATCH MANY BUGS AND OTHER NASTY THINGS THAT WOULD OTHERWISE BE IGNORED.*
  • by mikpos ( 2397 ) on Saturday July 29, 2000 @08:57AM (#895197) Homepage
    You should be blaming the programmers from 10 years ago. The *only* C programs that I have seen that have one of:
    - gets()
    - scanf()
    - poor use of *scanf() or str*()
    are programs that have (a) been written 10 years ago; or (b) written by programmers who learned how to code in C 10 years ago. Yes, there are stupid technical school and high school teachers too, but normally, before anyone gets to a level where they can do something useful, they will have found a clueful reference to get rid of all that nonsense.

    There is a new breed of C programmers, though. They don't assume Unicisms, PDPisms or TheirPlatformisms, and their eyes actually burn a hole through their skull upon the mere sight of gets(), scanf(), or any of the other death traps.

    C is a good language. C++, of course, is not, though (hey, if you can get modded up for ignorant flamebait, why can't I?)
  • The former, of course, but just because some of the buffer overflows are fixed in the former doesn't mean it is invulnerable to attack. Better to fix the latter so as to keep both performance *and* security.
  • If I had mod points left, I'd call your post "Funny"!


    --

  • If you don't like C, fine. But I do like C and I use C. Again, fine. Different code for different folks. Where you're wrong is making a blanket statement as to why people use C. I won't deny that you could find people who actually believe that reason. I use C for entirely different reasons. But I won't tell you what the reasons are, because I'm not flaming you about specific reasons; I'm flaming you for generalizing totally inappropriately.

  • OK, I'm back from a day with family, offline. I withdrew my article at about 1 PM, after it became clear that there was a lot I hadn't understood when writing it that the kernel-list folks, etc., could tell me about. I think I'd like to write this one again when I understand it better.

    Meanwhile, does anyone remember the iAPX 432? It was a flop for several reasons: Ada flopped, its performance was bad, and it was a real departure from the architectures of the day. But it had some real innovations - every function ran in its own protected space, using message passing for communication, and your program could protect itself from itself. Is it time to revisit that sort of architecture?

    Thanks

    Bruce

  • For a great deal of technical data on how buffer overflows work, and how to stop them, read this paper [immunix.org]. While I appreciate the plug that Bruce gave me for StackGuard [immunix.org], it does seem that he has not researched this topic very well:
    • Make the stack non-executable: Yes, this works, and security-conscious people will use Solar Designer's Kernel Patch [openwall.com] to do that. It works great.
    • Make the data segment non-executable: This works a whole lot less well. Too many UNIX programs depend on being able to execute code in the data segment. This is UNIX's fault, not Intel's fault.
    • Use the MMU For Enforcement: Ancient Burroughs mainframes (the 6500 IIRC) actually stored each array in a separate segment. They also ran like a dog compared to modern RISC(y) architectures. We tried the MMU approach for StackGuard in 1997, and it imposed an 8000% overhead to do it that way. Read about it in this paper [immunix.org].

    Crispin
    -----
    Immunix: [immunix.org] Free Hardened Linux
    Chief Scientist, WireX [wirex.com]
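    For readers who haven't seen the canary idea behind StackGuard, here is a hand-rolled sketch of the check itself. It is illustrative only: the real thing is emitted by the compiler, which places the canary right next to the saved return address, and the value and layout below are invented.

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        void handle_request(const char *input)
        {
            unsigned long canary = 0xdeadbeefUL;   /* conceptually sits between buf and the return address */
            char buf[64];

            strncpy(buf, input, sizeof(buf) - 1);
            buf[sizeof(buf) - 1] = '\0';

            /* If an overflow ran past buf, the canary is no longer intact. */
            if (canary != 0xdeadbeefUL) {
                fprintf(stderr, "stack smashing detected\n");
                abort();
            }
        }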

  • C is a good language. C++, of course, is not, though (hey, if you can get modded up for ignorant flamebait, why can't I?)

    I'd rather not start a language flamewar, but I must defend C++.

    C++ is "of course" not a good language? Why? You can do everything you can do in C in C++, and then some. If you don't like templates, don't use them. If you don't like operator overloading, don't use it. Unlike certain other languages *cough*Java*cough*, it doesn't force any particular paradigm on you.

    That said, OO programming has many advantages over traditional C-style procedural programming, and using an OO language for it, furthermore, has many advantages over using hacked-up C to emulate it, a la GTK+.

    Yes, some of the features in C++ can be implemented in C, but the resulting implementations are *more* prone to bugs.

    Polymorphism can also be used for security --- if you want to add something in a procedural C program, you must update the code wherever that new thing appears or affects something else. If you want to do the same in OO-C++, by virtue of polymorphism, the new object can be used wherever the old one was.
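    A small, made-up C++ sketch of that last point: flush() below never changes when a new kind of Message is added, because dispatch goes through the virtual function.

        #include <iostream>
        #include <vector>

        struct Message {
            virtual ~Message() {}
            virtual void send() const = 0;
        };

        struct TextMessage : Message {
            void send() const { std::cout << "sending text\n"; }
        };

        /* Added later: existing code that handles Message* keeps working untouched. */
        struct EncryptedMessage : Message {
            void send() const { std::cout << "sending encrypted\n"; }
        };

        void flush(const std::vector<Message*>& queue)
        {
            for (std::vector<Message*>::const_iterator it = queue.begin(); it != queue.end(); ++it)
                (*it)->send();   /* calls the right send() for each concrete type */
        }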

  • Both the "C pointer model" and the manual memory management are easily wrappable.

    They are wrappable only in special cases. They are not wrappable in general.

    For example, you can't "wrap" call-by-reference or the address-of operator because smart pointers aren't really pointers--inheritance gets screwed up.

    Using string manipulation as a specific example, exactly what about std::string "severely limits language semantics"?

    You can make strings reasonably safe, but strings are probably the easiest case. The hard part is data structures that refer to each other.

    Sure there is [a way to achieve fault isolation in C++]. See man 2 fork, or CORBA

    That's not fault isolation in C++, it's fault isolation in the operating system. That kind of approach has a lot of overhead. That's why programs like Netscape or Emacs are multi-megabyte executables composed of dozens of libraries, rather than a bunch of small, communicating, separate processes.

    It also doesn't help a lot, since even running in separate processes, it's really easy for a process to corrupt (e.g., array index error) the data it then sends on to another process via CORBA; you still can't tell who did it in C++.

    Whether you believe me or not that fault isolation through processes can't be used in many cases, the fact is that people don't use it, and that alone is indication enough that we need a different mechanism.

    No, good programming practices are what prevents bugs. IMHO there isn't much difference between the various type-safe languages in bug prevention, but there is a huge difference between the various programming practices and quality standards.

    There isn't much difference among the various type safe languages in terms of bug prevention, but there is a huge difference between them and C/C++.

    I have talked with plenty of folks who claim that there's "no performance penalty or other drawbacks" to not having the power of C/C++'s low-level control, but I haven't seen anyone back that up with proof. I would really enjoy for someone, anyone, to prove otherwise.

    Go use Modula-3 or Ada or any of the other languages. If you want to, you can write code in them that is identical to C++, including all the unsafe constructs you want: C++ constructs translate one-to-one into those languages.

    The difference is that those languages have a well-defined, safe subset that people can use to write programs in.

    Of course, the compilers and tools for those languages are inferior to C++--because of their smaller user community. Many Modula-3 compilers don't do cross-module inlining, for example, so they will look worse on benchmarks. But that's a chicken-and-egg problem: because of the small user communities, the compilers don't get better, so everybody keeps suffering with C++.

  • GNU Sather and SmallEiffel compile to C, and there are versions of the DEC Modula-3 compiler that compile to C as well. Java, of course, also has implementations for many platforms.

    As I said myself, those languages have smaller user communities, and that means they have fewer libraries and tools for them. But you can call C and C++ code from them and they are quite usable, in particular on Linux.

    I don't recommend doing everything in them, but give them a try for some projects, in particular open source projects for Linux. That's the only way this chicken-and-egg problem of moving beyond C++ will get addressed.

  • All that does is change the exploit from a stack overflow, to a slightly different kind of stack overflow. The bandaid would work at first, but after a few months, all the kiddie exploit tools would work in a new way.

    Linus Torvalds (and other kernel designers) have posted exactly how to defeat a Solar Designer-patched kernel as an example. Read the kernel traffic [linuxcare.com] and then realise that this sort of thing is not a long-term success measure. The programmer must always be careful when designing a program, and no anal-retentive language or kernel patch can protect you from bad code.
    ---
  • As someone who sees attempts against his own system on IRC every day and sees newbies announce to the world in general "Hi, I'm running wu-ftpd on Redhat 6.0 and ARRGHH, what the hell just happened? Who's this guy on my box!?", I decided to investigate the feasibility of providing safe versions of commonly run services.

    Libsafe is quite good but can't catch everything and breaks quite a few programs if you set it up in ld.so.preload.

    The Stackguard compiler is definitely more robust and seems to work well during the course of my tests.

    I've prepared RPMs of BIND 8.2.2pl5 and Wu-FTPD 2.6.1 with StackGuard 2.0 Stout for RH Linux 6.2.
    As I've just prepared the RPMs this week on my Slackware 7.1 system, I don't know how well they perform, as they haven't received a great deal of testing (no bug reports so far, though).

    The RPMs are available at http://indigo.ie/~fowler/ELSL/ with more daemons to come soon and hopefully DEBs. Try them out and mail me with any problems (my email address can be read from the results of rpm -qpi file.rpm).

    Good luck and keep safe on the net
    Gnubie_ Efnet #Linux
  • _Many_ (if not most?) security attacks involve buffer overflows. You have to _work_ and _think_ to free yourself of buffer overflows in C/C++. In other languages, this protection comes for free.

    What about that huge chunk of interpreter, written in C or C++? Have you audited that, too?

    In a higher-level language, the simplest code can have side effects that might provide a security hole, so to reassure yourself you're going to effectively have to audit the behaviour of the interpretation of your program, not just the program itself. In C, at least, you know when you're making a function call, and you can be reasonably confident everything your program does is done explicitly rather than being hidden.

    I went to a rather informative lecture by a person whose business is selling security services and who also works on OpenBSD. His view was that C programs, calling a minimal set of libraries (excluding GUI libraries, amongst others), are the only things that should ever be suid root.

  • Don't blame the hardware, don't blame the language, blame the programmer. Relying on the hardware to fix bad programming style is like a parachutist relying on a safety net.

    Any input operation that overwrites memory it is not supposed to is bad programming style. Ideally the programmer does not know what hardware their code will run on; maybe it will be a flat-memory machine with no memory management hardware.

    In C, scanf("%s", foo) is nice and handy for little programs, but it is not production-level code. Production-level code should instead always use length-limited routines. Sure, it is a little harder, and maybe the first implementation has to be audited to remove the screw-ups. It is like checking the return values of printf and scanf: nobody does it in the test code, but it damn well better be done in the final code.

    Programmers need to limit themselves to limited input routines or at the start of the project development build a little library of limited input routines.

    What someone really needs to do is come up with a "no-overrun libc" that does not include any unlimited input functions and spits horrible messages whenever a standard input function is called with arguments allowing unlimited input.

    You link and run development code against this library and fix any place where it screams.

    I really am surprised big development houses don't do anything like this. But of course no one has time to do it right.
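    One compile-time variant of that idea, using GCC's poison pragma rather than a replacement library (the header name is invented; the pragma is a real GCC feature, though very old compilers may not have it):

        /* banned.h -- include this after <stdio.h> and <string.h>; any later
           use of the listed functions becomes a hard compile error. */
        #include <stdio.h>
        #include <string.h>

        #pragma GCC poison gets sprintf strcpy strcat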
  • Meanwhile, does anyone remember the iAPX 432? It was a flop for several reasons: Ada flopped, its performance was bad, and it was a real departure from the architectures of the day. But it had some real innovations - every function ran in its own protected space, using message passing for communication, and your program could protect itself from itself. Is it time to revisit that sort of architecture?

    Very possibly... before the iAPX432 came out (I still own the original architecture manual), there was the Burroughs x700 mainframe architecture. It had 48-bit words, each with a 3-bit tag... one of the tag values was reserved for executable code. It also used an indirect base+displacement pointer architecture which allowed hardware bounds checking.

    I did quite a lot of kernel hacking on that as we had the full source code. Unless you did something stupid with the process dispatcher, it was impossible to overwrite memory or hang the system.

    Elliott Organick wrote an excellent book on the architecture. I think with some modifications this would make an excellent (and fast!) Java machine...

  • On the contrary, blame the programmers.

    Yes, it is completely true that C and C++ have unchecked pointers and array bounds, which are frequently the source of buffer overflow issues. However, any serious C programmer *knows* these issues exist.

    The problem is that many programmers do not take the time to harden their programs against such attacks. They are often too focused on the program's real purpose to bother with security.

    I should also point out that there are vulnerabilities in programs written in Perl (I'm not trying to pick on Perl specifically BTW). Often a CGI script can be passed a syntactic boobytrap in a parameter. Programmers can be lazy and forget about good input validation the same way they forget about overflow issues.

    Another situation that programmers frequently miss is how the program behaves in low-memory conditions. It is all too easy to assume that a memory allocation (implicit or explicit) will succeed. This may not result in a vulnerability, but it points to the larger problem.

    The bottom line is that making programs resistant to attack requires attention to detail. Since this usually requires more effort on the part of programmers, we shouldn't be surprised that it is usually ignored.
  • In all seriousness, however, how DO you write OO Basic?

    Let's say you have three variables that define the properties of some object, and you have two operations (methods) on those three variables that are performed on that object. You then stipulate that no access will be made to those properties except through your two methods. Now, when you write the rest of the code, you think of these three properties and two methods as one thing.

    This is not as precise an answer as I would prefer to give. The problem is that it all has to do with what's going on in your head as the programmer.

    --locust
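    The same discipline, written out in C++ where the compiler enforces the access rule instead of the programmer's head (the account example is invented; the point is the pattern, not the language):

        class Account {
        public:
            void deposit(double amount)  { balance += amount; ++transactions; }
            void withdraw(double amount) { balance -= amount; ++transactions; }
        private:
            double balance;        /* the properties are reachable only */
            long   transactions;   /* through the two methods above     */
            int    currency;
        };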

  • by GGardner ( 97375 ) on Saturday July 29, 2000 @09:22AM (#895236)
    I can't believe how many posts say that languages with automagic bounds checking (Java, Perl, Lisp, etc.) have too much overhead. Instead, these posters say, we should all manually insert bounds checking code by ourselves, and somehow that will magically have less overhead, and be more secure than having the compiler do it.

    Ick. That's just the sort of mundane task I want a compiler for. As a programmer, I already have too much to worry about -- bounds checking is one simple task that I'd just as soon have the compiler do.

    In most cases, the bounds check can be hoisted out of loops, so there's almost no overhead. In a perfect world, I'd like to see a compiler that, when given a high enough warning level, warns that it can't hoist bounds checks.
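    A hypothetical illustration of the hoisting point, with trap() standing in for whatever the runtime does on a bounds violation:

        #include <stddef.h>
        #include <stdlib.h>

        static void trap(void) { abort(); }   /* stand-in for the runtime's bounds-violation handler */

        /* What a checking compiler conceptually emits: */
        long sum_checked(const int *a, size_t a_len, size_t n)
        {
            long sum = 0;
            for (size_t i = 0; i < n; i++) {
                if (i >= a_len) trap();        /* one check per iteration */
                sum += a[i];
            }
            return sum;
        }

        /* What it can turn that into, because i < n holds for the whole loop: */
        long sum_hoisted(const int *a, size_t a_len, size_t n)
        {
            if (n > a_len) trap();             /* single check before the loop */
            long sum = 0;
            for (size_t i = 0; i < n; i++)
                sum += a[i];
            return sum;
        }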

  • by Speare ( 84249 ) on Saturday July 29, 2000 @09:27AM (#895241) Homepage Journal

    Blame the developer!

    Sure, some operating systems or languages or chips hold the coder's hand and make some dangerous things impossible or difficult to do.

    It's still the programmer's fault for not knowing what the (void*) they're doing.

    This is the same argument as "C++ is slow!" It's only slow if you don't bother to learn what code a C++ compiler generates and end up using lots of mechanisms without realizing it. C++ implements its mechanisms as tightly as it can, but every mechanism you use takes some time to operate.

    Back to buffer overrun security: If you are gonna accept data from an untrusted source, why are you (1) putting it on the must-be-kept-inviolate stack, (2) not doing everything in your power to accept no more than n bytes that have been allocated?

    If the compiler docs specifically say "data in auto variables will never be put into an executable address space," and it does, then it's time to fix the compiler or docs. Likewise if the docs belie the behavior of a chip, time to fix the chip or docs.

    Don't blame a microprocessor for your mess. Don't blame a language for your mess.

    You have only yourself to blame.

  • Here, because someone was confused about this, are the safe functions you should be using.
    • int snprintf(char *str, size_t len, const char* format, ...)
    • char* strncat(char* s, const char* append, size_t count)
    • char* strncpy(char* dst, const char* src, size_t len)
    Note the len argument: for snprintf and strncpy it is the size of the destination area, and for strncat it is the maximum number of characters to append (the space remaining). With correct values of len, these functions will not write past the destination buffer, unlike sprintf, strcat, and strcpy, which impose no limit at all.

    Want some cred in the open source world? Grep for those in major projects and fix them.
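    A short, made-up usage sketch, including the two caveats that bite people even with the "safe" functions: strncpy does not promise a terminating NUL, and strncat's count is the space remaining, not the buffer size.

        #include <stdio.h>
        #include <string.h>

        int main(void)
        {
            char dst[32];

            snprintf(dst, sizeof(dst), "user=%s", "example");     /* truncates safely, always terminates */

            strncpy(dst, "possibly very long untrusted input", sizeof(dst) - 1);
            dst[sizeof(dst) - 1] = '\0';                          /* terminate by hand */

            strncat(dst, ".txt", sizeof(dst) - strlen(dst) - 1);  /* count = remaining space */

            puts(dst);
            return 0;
        }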

  • In fact, my first thought on this is that while it's pretty neat that Bruce Perens is capable of admitting that he was wrong, it is not at all cool that he yanked something that he published. Even if what he said was wrong, it's now part of the public discourse. The fact that we can take something down off of the web after it's been published is a bug, not a feature (I keep hoping that someday the WWW will mutate into something more like Xanadu...).

    Try this hypothetical: what if, instead of giving public speeches, politicians took to publishing their opinions in articles on the web? That way, if anything they say produces a bad reaction, they can just edit it away, and no one will be able to figure out what the complaints were about. Very convenient, eh?

    My take: If you publish an article, and then later recant, the thing to do is to add a link at the top pointing to your later thoughts on the subject.

  • Microsoft didn't release source code and encourage everyone else to make the same changes and/or suggest new ones.
  • by Greg Lindahl ( 37568 ) on Saturday July 29, 2000 @09:30AM (#895248) Homepage

    Almost every 386+ OS has not used segments the way Intel intended. So yes, they've had quite a few years (more than a decade) to add an execute bit, if they actually cared.

  • the Harvard architecture.

    There's a separate data bus, and a separate instruction bus. I don't think it is strictly required to have a separate data memory area and a separate instruction memory area, but I think it's usually implemented that way. There are a number of microcontrollers that use this architecture, storing the program in a ROM and accessing a RAM chip for scratchpad area.

  • and their eyes actually burn a hole through their skull upon the mere sight of gets(), scanf(), or any of the other death traps.

    IIRC, gcc issues warnings if gets is used. Personally, I would like to see it redefined in the standards to: 'print insults to stderr at runtime and segfault immediately'. That would help to identify offending code more quickly.

  • I see a great many posts throughout this discussion that say things like "blame the language", "real programmers wouldn't ever make a mistake like that - it's only schmucks who write programs with buffer overflows". I think these posts are completely missing the point.

    First, even if you had, say, a perfect language in which you couldn't possibly overflow any buffers (in theory), there still might be bugs in the compiler for that language which could be exploited. In the end, all programs are machine language. As long as it is possible for an executable to write to executable memory, it will be done. The only way of eliminating this problem is through hardware (and an OS that uses the hardware features).

    Second, even great programmers make mistakes. Have you ever written a bug free program of any reasonable length? Having a language that prohibits a totally unnecessary feature (e.g. writing beyond the end of an array) means that programmers cannot make that mistake (though it is still possible that the compiler might screw up the bounds checking...).

    Reality is such that C and C++ are the dominant programming languages, which means that buffer overflows are significantly more likely, which means that it is even more important for the operating system to police it, if you want good security.

    Good program design prohibits (or at least makes very difficult) that which shouldn't be done while making it easy to do the things you want done. It can (and should) be done in the language. It can (and should) be done in the OS.

  • Why?
    Because a lot of people are forced to program in C...

    I'm not a fan of C: the library has some horrid things (like routines without buffer-overrun checking), and the language is very low level. But when working with other people, sometimes C is a necessary evil.
    What to do? You can get some higher-level programming using BetterC [usc.edu], a C library that gives you Eiffel-like exception checking with a minimal efficiency penalty, and without leaving your favorite C compiler.
    It's my Nirvana; I don't use debuggers anymore...
  • Sure, you can blindly turn off the executable bit on stack pages. It simply requires that any program doing on-the-fly code generation or self-modifying code explicitly request an executable mapping for itself.

    It could also be done as a compile-time switch. This would prevent breaking random code that doesn't know that the rules have changed.

    A 'dumb' question -- given that I've never bothered to program Intel assembler (I looked at the 8086 model and got sick to my stomach!).
    The stack, data and code segments are logically separate, aren't they? Wouldn't it be possible to make them disjoint spaces so that you simply couldn't jump to the stack? -- kinda like security by obscurity.
    Stack and data may want to share code space, but why share the spaces for data and code, unless you're doing self-modifying code? If you need modifiable code space, you would make a call to tell malloc to give you some dual-mapped memory. Once you were done modifying it, you could unmap it from the data space.

    The backwards-compatible model would be: Any segment address would map to the same real address if it maps. It would not, however, necessarily map to all (or any) memory spaces.
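    A sketch of one way to do the "explicitly request modifiable code space" idea above, using the standard Unix mmap/mprotect calls (this flips protections on a single mapping rather than dual-mapping; the function names are invented and error handling is minimal):

        #include <stddef.h>
        #include <sys/mman.h>

        /* Get a writable buffer for generated code. */
        void *alloc_code_buffer(size_t len)
        {
            void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            return (p == MAP_FAILED) ? NULL : p;
        }

        /* Once the code is written, make it executable and drop the write permission. */
        int seal_code_buffer(void *p, size_t len)
        {
            return mprotect(p, len, PROT_READ | PROT_EXEC);
        }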

  • by Tony-A ( 29931 ) on Saturday July 29, 2000 @09:47AM (#895272)
    This is from an Intel 80386 Programmer's Reference manual, a bit dated, but should still be valid.
    There are 4 privilege levels. Does anything other than Multics use more than 2?
    Code, Stack, and Data exist in completely separable address spaces.
    A running process has access to some 8000+ local 32-bit address spaces and some 8000+ global 32-bit address spaces.
    A selector can specify a buffer to byte granularity.
    Basically, the problem is that in current systems, CS, SS, and DS all point to the same nice linear 32-bit address space. Away with segmentation. It's a bit more complicated than that, but in general, most code can be read and written, most data can be executable.
    The problem has been solved many times. I think some of the old Burroughs computers did some neat things. Unfortunately the good ones die because of a 5 or 10% performance lag, or they refuse to run certain bugs without complaint.

    Executable code on the stack??? Seems like a very bad idea.
    Null terminated strings. Nice trick, but miss a \0 or try to handle raw binary, and you have severe problems.

    Will anything get solved? Probably not. Any solution will break existing code, or, more realistically, discover that the existing code was already broken, only nothing knew or cared about the consequences.
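    On the null-terminated-strings point, a small sketch of the alternative: carry the length next to the bytes and clamp on append, instead of trusting a terminator (the struct and names are invented).

        #include <stddef.h>
        #include <string.h>

        struct buffer {
            char   data[256];
            size_t len;                         /* length travels with the bytes */
        };

        void buffer_append(struct buffer *b, const char *src, size_t n)
        {
            size_t room = sizeof(b->data) - b->len;
            if (n > room)
                n = room;                       /* clamp instead of overrunning */
            memcpy(b->data + b->len, src, n);   /* happy with embedded '\0' bytes in raw binary */
            b->len += n;
        }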
  • by Tom7 ( 102298 ) on Saturday July 29, 2000 @09:50AM (#895273) Homepage Journal

    Lest you be confused by the +1 funny on my post, let me say that I am not joking.

    2x slower is the most conservative estimate for the speed of modern safe languages against C code. (In practice I've seen much better. Does anyone trust benchmarks?) My point is, even if it is 2X slower, I'll gladly take it and sleep a little more soundly at night knowing that my linux box isn't being hacked due to 20 year-old issues. 99% of my box's CPU time is spent at Nice -19 trying to find big primes for the GIMPS project.

    Modern languages (take Java if OO is your thing, but there are more interesting languages around) have SOLVED this problem with buffer checking (or static proofs that checking isn't needed). Without having to worry about this type of common security hole, programmers can spend more time on things we REALLY need: documentation, maintainable code, asymptotic speed increases, and the other possible security holes (e.g., not escaping shell metacharacters in user input).

    See my thread on Functional Languages for what I think is a convincing argument about modern typed languages in general. I know my position is extreme, but that doesn't make it a joke.

    http://slashdot.org/comments.pl?sid=00/07/01/2321210&threshold=1&commentsort=3&mode=thread&cid=145
  • a non-executable stack does nothing; you just return either into your data segment or into libc. this has all been hashed out before on various mailing lists. all of these patches only disable a particular method of exploitation, but the overflow still exists to be exploited in some other way!

    this is security through obscurity, plain and simple.
  • by electricmonk ( 169355 ) on Saturday July 29, 2000 @11:29AM (#895281) Homepage
    is it like sorta like going out with too many guys or something?

    No, I think that's more akin to what a packet sniffer does. But close!

  • by adubey ( 82183 ) on Saturday July 29, 2000 @11:31AM (#895283)
    Like a system, a language can be as secure or insecure as you can make it. One can write an extremely tight program in C++ while writing one in Perl or Java that leaves gaping security holes open.

    This statement troubles me. C/C++ addicts who have little exposure to other languages have little knowledge of what they're missing.

    _Many_ (if not most?) security attacks involve buffer overflows. You have to _work_ and _think_ to free yourself of buffer overflows in C/C++. In other languages, this protection comes for free.

    Yes, it's possible to make a secure program in C/C++. But it's just a hell of a lot easier in bounds-checking languages.

    So there.

  • There are lots of ways in which you can create security holes. Many of them are shared between Java and C++. But buffer overruns are a huge risk in C/C++ because the language forces programmers to use unsafe constructs at every turn. And, an even bigger problem than the lack of security is just the lack of reliability of C/C++-based systems. C was a great design in the 1970's when it needed to run on 64k PDP-11's. In 2000, it's an anachronism. And the C/C++ family is not fixable without serious incompatible changes.

    In part, people like you, who view this as an issue of "being forced" by "bondage and discipline-style languages", are the problem. In fact, languages like Modula-3 don't "force" you to do anything. You can cast, convert, and crash as much as you want. But unlike C/C++, when you do, you use different constructs from those you use for normal programming. By default, Modula-3 checks, and that's by far the more useful case. The same is true for most other system programming languages: they provide both safe and unsafe constructs, but you can actually do something useful staying within the safe constructs. It is only C/C++ that forces programmers to use unsafe constructs and makes it impossible for compilers or lint-like tools to do checking.

    Languages like C/C++ only hang around because they have a user community. They are entrenched in the same way the Windows 3.1 architecture or other bad, outdated standards are entrenched. It's hard to break out of that because all the tools and APIs are created for C/C++, so that anything else is at a disadvantage and appears cumbersome in comparison.

    But realistically, if you want to write reliable, safe programs without expending a huge amount of time and effort on tracking down bugs, you need to dump C/C++ and switch to one of the many alternative, existing, high quality systems programming languages. The sooner we make the change for the industry as a whole, the better.

  • Why wouldn't the stack-exec patch do what it was supposed to do and prevent any buffer overrun attack?

    The only thing changed is the strategy of the overrun. The attack on a non-exec stack is accomplished by overrunning the buffer so that you set up the parameters for a desirable syscall (exec /bin/sh comes to mind) and adjust the return address on the stack to point to a syscall in the daemon's code (the best place is somewhere in libc; nearly everything will have that linked in). Now, the function returns and your syscall executes just fine in the code segment.

    It's not so much like putting a stronger door in as it is putting in a lock that turns the other way. A moment of confusion is all you will create. Soon, lock pickers will be wise to the trick and won't even be inconvenienced.

  • Where am I forced to use an array or raw string in C++? I can use STL containers and std::string, respectively. They are *much* safer.
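    A minimal sketch of what that looks like in practice (the names here are made up):

        #include <iostream>
        #include <string>
        #include <vector>

        int main()
        {
            std::string line;
            std::getline(std::cin, line);   /* grows as needed: no fixed buffer to overrun */

            std::vector<int> counts(10, 0);
            counts.at(3) = 42;              /* at() throws on a bad index instead of scribbling on memory */

            std::cout << line.size() << "\n";
            return 0;
        }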
