Is Code Verification Finally Good Enough? 87

Jonathan asks: "As someone who has been following the development of software verification technology, a recent trend has intrigued me. It seems that the formal methods people have finally come down off their high horse and are offering code verification as just another tool. This approach shows up in recent Java- and C#-based code verification tools that aren't aimed at 'proving correctness' so much as finding potential errors. Now it seems that such an approach is beginning to find its way into XP methods [pdf] as another verification tool to supplement unit tests. Given the current speed and effectiveness of tools like the Spec# verifier, is code verification via automated theorem proving finally going to make its way into the mainstream?"
  • For what? (Score:3, Interesting)

    by Control Group ( 105494 ) on Friday September 29, 2006 @01:02PM (#16247539) Homepage
    What does "good enough" mean? Good enough for it to be included as a general-use software development tool? Good enough to guarantee that released code is bug-free? Good enough to guarantee that released code has no glaring security-affecting errors?

    It makes a difference which "good enough" you're asking for.

    (The answers to those questions, to channel one Urban Chronotis, are yes, no, and maybe. Not necessarily in that order)
    • In my experience, the "good enough" that matters is "good enough to catch errors without forcing a developer to have to think." When it's click and drool, and yet still useful, it will be "good enough" for widespread adoption.
    • Before tools [wikipedia.org] can check the code (e.g., static analysis [wikipedia.org]) we need to know what constructs they need to flag. The formal verification people will claim that checking a program's source against its specification is the solution. However, that is a very expensive and time-consuming process. There is an ISO group looking at vulnerabilities in programming languages [aitcnet.org].
      • Do theorem provers prove that what the user told you they wanted made sense?
      • ... that what they wanted would solve the problem they were trying to solve?
      • ... that the problem they thought they were trying to solve was really the problem they *needed* to have solved?
      • ... that what they wanted when you started was still what they wanted when you were done?

      Uh... no.

      But they can probably be considered to be a newer and better version of lint, and that's a good thing.

      • Mod parent way up.

        Code-checking is a good thing, and provably correct code is especially good, though what's provable is still narrower than the range of behaviors of any interesting system I've seen.

        But the most difficult part of any complex system development project is getting valid requirements and then implementing a process of systematically and iteratively confirming that the delivered system is what the users need, not what some bunch of managers or analysts asked for on their behalf. This is
  • I think it's legitimate to expect verification tools to make their way into mainstream software development, but not as any sort of "catch all".

    Developers will have to analyze the specific tools and decide if it is a good supplement to current verification efforts or if it can replace some steps. I think though once it worms its way into mainstream (if only light) use, the tools will continue to improve and give us more options, and become a more integral part of verification.

    I know from work I've done, th
  • Mankind's ability to write software is far in advance of mankind's ability to figure out what it does or does not do.

    If Microsoft ships 100 million copies of Windows Vista to 100 million people on the public Internet, where the people have mutually incompatible agendas, then at least some of the people will end up unhappy.

  • I dunno, as more tools to verify code become available, businesses will hire cheaper and sloppier programmers and just force all the code through the verifiers and hope for the best.

    These aren't tools to make the best code better, they are tools to make the worst code okay.
    • by chroot_james ( 833654 ) on Friday September 29, 2006 @01:49PM (#16248335) Homepage
      they already do! I came across this the other day:
      private getThis() { return this; }
      • Re: (Score:3, Funny)

        by kirun ( 658684 )
        Seems like sensible future-proofing to me. Sure, the variable this is called this right now, but who knows what tomorrow will bring?
        • by dgatwood ( 11270 )

          It got funnier the more times I read it. The first time, I read it as a private accessor function that returned a private class member. That alone would be funny, since the only reason to make such a function would be to allow read-only exposure of a private member to non-class code, and by making it private, that eliminated the usefulness altogether. Then, I realized precisely which member was being returned. When I realized that it wasn't returning an explicit class member, but rather, a pointer to th

          • by lokedhs ( 672255 )
            The method could be overridden in a subclass. That would make it useful.

            Or perhaps the .getThis() method was specified by an interface as a way of retrieving some functionality, but the functionality was implemented by the class itself. This design is used in table cell renderers in Java, for example.

      • Tell me that's a joke, and no one actually wrote that in a serious program.

        Or at least, if they did, it's in a language that just happens to look like Java, but where "this" isn't a reference to the object the private method is in. Because that makes my brain hurt.
      • I've never seen anything quite that bad, but I do see this a lot in older code:

        private getThis() { return get_value(foo,bar); }

        The explanation for that is often that someone will create a function to go fetch a single piece of data, getHeight(). Then the next programmer comes along, and adds the ability to getWidth(). The next guy starts to add getDepth(), but then thinks that this is a bad path to go down, so writes a new function called get_value() that serves the more general purpose of fetching data

        • by mindriot ( 96208 )

          This brings up a somewhat interesting point. I can see the point of using stuff like getFoo() just to access an attribute foo. I think it has its purpose (read-only access), but I can also see that it is rather ugly. Further, in C++ code you sometimes see the use of friend just to provide full access to certain attributes and methods otherwise kept hidden or read-only (through getFoo()) to certain privileged classes and such.

          I think sometimes it might be more useful to provide proper access control f

      • that wouldn't compile anyhow, so it's not surprising that we don't know what it is for.
    • by neutralstone ( 121350 ) on Friday September 29, 2006 @06:18PM (#16252873)
      Regardless of the poor decisions of some recruiters, I've found this to be true: the most competent and disciplined engineers tend to use as many checking techniques as possible (e.g., regression tests, unit tests, multiple static analysis tools, etc.). They use these techniques not as a substitute for a disciplined approach to design or to implementation (that would be foolish), but as a sanity check on a finished product. They use multiple checking techniques because they know they're human, and humans -- even the most skilled, most experienced, most disciplined ones -- make mistakes, and they want to do everything they reasonably can to catch mistakes before a product ships.

      So to that end, model checkers (and other forms of checking) are Good Things(TM). The possibility that some will misconstrue checking facilities as a substitute for competent and disciplined engineers is a separate issue, and it's not the fault of the checking scheme. The *organizational* problem of poor recruiting cannot be solved by getting rid of model-checking systems, and if your recruiters think that talent can be replaced by a "verification" system, you've got bigger problems than merely the shoddy designs that will come down the pike.

      (Please note that I prefer to use the term "check" rather than "verify", because the latter term, when used to describe a program, too easily connotes the notion that a "verified" program is free of any bugs (which is, needless to say, always a dubious claim). "Check" on the other hand usually connotes the idea that there is a specific bug (or set of specific bugs) that is tested for. Of course, TFA mentions they don't want to imply that (verification == no bugs). But IMHO too many people make that mistake.)
      • I've found the complete opposite to be true. Rarely do I find tons of checking tools. Mostly it is design review the arch, implementation, and code. Perform unit testing. The good engineers do better unit testing, thinking about the interactions, wake up at night realizing something is screwed up, and fix their problems proactively.

        Everyone 'wants' to use tools but the reality is the overhead is pretty significant so they are avoided. Only in large companies do I see this methodology and even that is f
  • Finding errors /= Proving correctness. Part of proving a program, imho, should include making sure the program only does what it claims to do. No more. (Say goodbye to tag-a-long malware!) Of course, this is a pretty idealistic outlook. We could all just program in ML, instead.
  • Not possible (Score:2, Insightful)

    by Intron ( 870560 )
    The problem with any verification tool is that it can't know what you expect the program to do unless you tell it. Why should this be any more correct than the program itself? In other words, if you can write a description of what the program should do that is more accurate than the program, then that description should replace the source code.
    • Re: (Score:1, Funny)

      by Anonymous Coward
      Exactly right. How can a program know what another program is *supposed* to do unless the first program has available to it an exact description of what the program it is checking is supposed to do? And if that exact description is available, there should be an automatic code generator which turns the description into code.

      In other words, writing the exact description of what you want should be your last step in development, after which a code generator creates the program. This turns coding into the pr
      • Re: (Score:2, Funny)

        by l33td00d42 ( 873726 )
        Alice: "Try to get this concept through your thick skull: the software can do whatever I design it to do"
        (pause)
        user: "Can you design it to tell you my requirements?"
        boy, that's easy:

        print "This software is required to print this statement.";
    • Re:Not possible (Score:4, Interesting)

      by rblum ( 211213 ) on Friday September 29, 2006 @05:16PM (#16251879)
      That - in another discipline - is called "double bookkeeping". You specify things twice, independently. Once in the actual source code, once in the description.

      That way, you'll know when the two disagree. It doesn't prove that it does what you *intend* to do, but it proves that spec and source say the same thing. If they're written by different people, you can now have a fairly high degree of confidence.

      Does it make your program bug-free? Only if you can write a bug-free spec. But it's a useful concept to reduce the likelihood of problems.
      • There are some programming languages (Ada, SPARK and Eiffel spring to mind) which have verification built in (to some degree). In case you wonder how that works, here is my favorite Ada example:

        type Day_Of_Month is range 1 .. 31;

        Neither Java nor C# allows that kind of control over integer types. Mind you, there is A# (Ada for CLI).

        Martin
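
        Java has no built-in range types, but the same guarantee can be approximated at runtime. A minimal sketch (the DayOfMonth class is hypothetical, not from any standard library):

        ```java
        // Hypothetical sketch: Java can only approximate Ada's
        // "type Day_Of_Month is range 1 .. 31" with a runtime check,
        // where Ada can reject many out-of-range values statically.
        final class DayOfMonth {
            private final int value;

            DayOfMonth(int value) {
                if (value < 1 || value > 31) {
                    throw new IllegalArgumentException("day out of range: " + value);
                }
                this.value = value;
            }

            int value() { return value; }
        }
        ```

        The difference in verification cost is the point: Ada's compiler discharges many of these checks at compile time, while the Java version only fails when the bad value actually flows through.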
        • by bodan ( 619290 )

          There are some programming languages (Ada, SPARK and Eiffel spring to mind) which have verification built in (to some degree). In case you wonder how that works, here is my favorite Ada example: type Day_Of_Month is range 1 .. 31; Neither Java nor C# allows that kind of control over integer types. Mind you, there is A# (Ada for CLI).

          Well, they don't quite allow this control on integer types. But you don't need to use integer types, you can use an enum in Java (with methods for operations), and I _th

          • Well, you are right - you need a Date class. It is all about refinement. As long as you don't know the month, you can't check the date further than 1..31. It's kind of a first step.

            Once you know the month inside a Date class you can do more checks. But you won't need to check for the date being < 1 or > 31 any more - these cases have already been handled.

            Martin

    • Re:Not possible (Score:4, Informative)

      by Coryoth ( 254751 ) on Friday September 29, 2006 @05:40PM (#16252249) Homepage Journal
      The problem with any verification tool is that it can't know what you expect the program to do unless you tell it. Why should this be any more correct than the program itself? In other words, if you can write a description of what the program should do that is more accurate than the program, then that description should replace the source code.

      It is often easier to describe what a program has to do than it is to describe exactly how to do it. I can describe what a square root finding algorithm has to do: given x it has to find y such that y*y = x, without coming close to writing something that actually finds the square root. All the prover then has to do is ensure that the code instructions really do produce a result that satisfies the stated constraint - which is potentially entirely possible. Writing contracts for code can be as easy (or easier) than writing unit tests for code. If you write a contract instead of a unit test then that potentially gives the prover all it needs to verify that the code will work for all possible test cases instead of your predefined unit test cases. If the prover can't manage that then just feed in some test data and if the contract fails you've found an error (that is, at worst you've got a unit test already). Working this way can be remarkably efficient at weeding out errors - you verify code when it works, and essentially do unit testing for the code that can't be verified. Perhaps if you actually tried it you'd see.
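
      The square-root contract can be sketched in a few lines (names and tolerance are my own, not from any particular tool): stating the postcondition y*y = x is far simpler than implementing the algorithm that achieves it.

      ```java
      final class SqrtContract {
          // The contract: for x >= 0, the result y must satisfy y*y ≈ x.
          // Stating this is much easier than computing a square root.
          static boolean postcondition(double x, double y) {
              return Math.abs(y * y - x) < 1e-9 * Math.max(1.0, x);
          }

          // One possible implementation (Newton's method). A prover's job
          // is to show every result satisfies the postcondition; failing
          // that, feeding in test data gives you a unit test for free.
          static double sqrt(double x) {
              if (x < 0) throw new IllegalArgumentException("precondition: x >= 0");
              double y = x > 1 ? x : 1.0;
              for (int i = 0; i < 100; i++) {
                  y = 0.5 * (y + x / y);
              }
              return y;
          }
      }
      ```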
    • by Peaker ( 72084 )

      The problem with any verification tool is that it can't know what you expect the program to do unless you tell it. Why should this be any more correct than the program itself?

      Many aspects of the program can be described, and not all of them must be proven.

      A program can be roughly divided into its requirements specification and the implementation. The requirements leave a lot of "freedom" to the implementation, and the choices in this area are not important to prove.

      As a simple example, a "sort" algorithm has

  • by Anonymous Coward
    Since when did this move from the realm of undecidable problems into solvable with a Turing Machine?
    • by Fahrenheit 450 ( 765492 ) on Friday September 29, 2006 @02:44PM (#16249249)
      Well, it's quite possible to do things like restrict the problem to a decidable subset of the original problem and only let your solution work on those subsets. Or you can work on problems that are undecidable in general, but in most typical cases are not only decidable, but efficiently solvable. For example, most type systems work in one of these two ways.

      You see, you should have learned before you graduated that just because a problem is hard (or impossible) in general, that doesn't mean the average case is hard at all.
    • Since when did this move from the realm of undecidable problems into solvable with a Turing Machine?

      About as long as we've been using finite state machines as approximations to a Turing Machine.

  • No :) (Score:1, Troll)

    The thing you have to remember about methodologies is they are all bullshit. In my experience, the people who cling to methodologies are generally not clue-full.

    The paper you linked to -- I just finished reading it. The "methodology" they provide has more comedic value than anything else. The steps in his methodology are counter-intuitive and therefore stupid. He wants us to iterate over each requirement, re-writing a function as the most simple program which passes the tests? What possible reason fo

    • The thing you have to remember about methodologies is they are all bullshit.

      Not every methodology sucks all the time. A methodology is just a named collection of practices.

      In my experience, the people who cling to methodologies are generally not clue-full.

      Humans inherently crave religion. Convince someone that something works and they're likely to cling to it to the bitter end, be it Code Verification, atheism, voting Republican or whatever.

    • Re:No :) (Score:5, Insightful)

      by Control Group ( 105494 ) on Friday September 29, 2006 @02:52PM (#16249393) Homepage
      The steps in his methodology are counter-intuitive and therefore stupid

      I agree with your general statement, but this isn't good evidence. Plenty of things are counterintuitive, yet correct and useful.

      I present exhibit A: the whole of statistics. It is counterintuitive that after the tenth straight "heads" result, the eleventh flip still has even odds of coming up heads - despite the fact that the chance of eleven straight heads is only 1 in 2048. It is counterintuitive that you should always switch doors in the Monty Hall problem.

      Or, if you're scientifically minded, I give you relativity. It is counterintuitive that no matter how fast you're moving relative to a light source, the speed of the light from your point of view will never change.

      It is counterintuitive that gravity causes hot air balloons to rise, or that the way to escape Earth orbit is to accelerate along it, not away from Earth.

      Counterintuitive doesn't necessarily mean that something's incorrect.
      • by Profound ( 50789 )
        >> It is counterintuitive that gravity causes hot air balloons to rise

        That's buoyancy.
  • by Animats ( 122034 ) on Friday September 29, 2006 @02:50PM (#16249347) Homepage

    The main reason program verification didn't catch on was that it was hopeless for C and C++. The semantics of those languages were so messy that formalizing them was nearly impossible.

    Java and C#, however, are good enough. (So were Pascal, Modula, and Ada.)

    Here's the manual for the Pascal-F verifier [animats.com], a system written by a team I headed back in the early 1980s. This was a proprietary system done internally for Ford Motor Company. Take a look at the example real time engine control program beginning on page 155. It was painfully slow back then; it took 45 minutes of VAX 11/780 time (1 MIPS) to verify that program from a cold start. Today, it would take about a second.

    What's being proved in that example? First, that there are no subscripts out of range or arithmetic overflows. Second, that all loops terminate. (Yes, you can prove that for most useful programs; the halting problem applies only to pathological programs.) Third, that the following constraints hold:

    • fuelpumpon implies (tickssincespark < (1000*ms)); if fuel pump is on, spark must occur within 1 sec.
    • (enginespeed < rpm(1)) implies (not fuelpumpon); fuel pump must be disabled if the engine is not rotating
    • cylssincespark <= 1; a spark must be issued for each cylinder pulse

    Useful stuff, the conditions needed to keep the engine running.

    This is "design by contract" with teeth. Each function is checked to ensure that it always satisfies its exit conditions if its entry conditions are satisfied by the caller, and that the function doesn't overflow, subscript out of range, or fail to terminate. Each call is checked to ensure that its entry conditions are always satisfied. The end result is a guarantee that those properties hold for the whole program.

    This is a very valuable check. It ensures that caller and callee agree on how to call each function. That's the cause of a huge number of software bugs - the caller made some incorrect assumption about the function being called, or the function didn't check for something it needed to check. Both of those can be statically machine-checked.

    It's not easy to get a program through formal verification with a verifier like that one. The verifier does almost all the work on easy sections of code, but where correctness depends on anything non-trivial, you have to work with the theorem prover to get the proofs through. This isn't easy. The DEC Java checker and Microsoft's Spec# checker aren't as hard-line.
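
    The fuel-pump constraint above can be sketched as a runtime check - a loose illustration of the entry/exit-condition discipline, with assertions standing in for the static proof a verifier would produce (all names and the tick scale are hypothetical, not from the Pascal-F system):

    ```java
    // Hypothetical sketch of "fuelpumpon implies (tickssincespark < 1000*ms)":
    // a verifier proves this invariant holds statically; here we merely
    // enforce and observe it at runtime.
    final class EngineMonitor {
        static final int MS = 1;          // ticks per millisecond (assumed)
        boolean fuelPumpOn;
        int ticksSinceSpark;

        // The exit condition / invariant a verifier would discharge.
        boolean invariantHolds() {
            return !fuelPumpOn || ticksSinceSpark < 1000 * MS;
        }

        void tick() {
            ticksSinceSpark++;
            // Enforce the constraint: shut the pump off if no spark
            // has occurred within one second.
            if (ticksSinceSpark >= 1000 * MS) {
                fuelPumpOn = false;
            }
        }

        void spark() {
            ticksSinceSpark = 0;
        }
    }
    ```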

    • Well, at least Ada still is: http://www.praxis-his.com/sparkada [praxis-his.com], and very successful as well.

      And Ada isn't dead, it just got another rejuvenation in the form of Ada 2005: http://www.adaic.com/standards/ada05.html [adaic.com].

      So the use of the past tense was not appropriate.

      Martin
    • by Peaker ( 72084 )

      The main reason program verification didn't catch on was that it was hopeless for C and C++.

      Java and C#, however, are good enough. (So were Pascal, Modula, and Ada.)

      What is the semantic difference between C and Pascal?

      As far as I can remember, the only real semantic difference was Pascal's lack of casting.
      Is this difference really the breaking point of C's provability?

      I would expect pointer arithmetic to be the source of problems, but if I recall correctly, Pascal's pointers have arithmetic as well.

      • by Animats ( 122034 ) on Saturday September 30, 2006 @12:34PM (#16258733) Homepage

        What is the semantic difference between C and Pascal?

        The big problems with C/C++ from a formal methods standpoint:

        • void* (You lose all the type information)
        • Too much casting. (Again, you lose all the type information)
        • Array size information isn't carried along with arrays or passed through function calls. (Biggest design mistake in C).
        • The "array is a pointer to the first element" convention. (Is this an array or a single object?)
        • Pointer arithmetic semantics are ill defined. (Long story.)
        • C++ iterators don't have an explicit tie to the collection, so detecting iterator invalidation statically is hard.
        • Hard to detect aliasing. (A consequence of the free use of pointers.)
        • Too much "undefined behavior" that isn't prohibited.

        I would expect pointer arithmetic to be the source of problems, but if I recall correctly, Pascal's pointers have arithmetic as well.

        No, they don't.

        Pointer arithmetic really is obsolete. Compilers have been generating code for subscripting that's as good or better than pointer arithmetic code for well over a decade now.


        • I would expect pointer arithmetic to be the source of problems, but if I recall correctly, Pascal's pointers have arithmetic as well.

          No, they don't.


          That depends entirely on what you mean by "pascal". The pascal language as spec'd doesn't. But nobody uses that, so this is not a very useful definition to adhere to.

          Most users of "pascal" use/used Borland Turbo Pascal or Delphi. In those languages you *can* do pointer arithmetic. Sure you have to jump through more hoops and casts than with C (it can be considered
      • Others answered the question, but I can add practical experience here. For a while I tried splint [1] and after a while it became clear that the greatest problem was the lack of array bounds.

        I tried hard, really hard, to get splint's bounds checking to work - but in the end it did not work out.

        The project has been stalled since 2004 [2], and for me it is now clear that the "Secure Programming Lint" was a hopeless endeavor.

        As for casting: Ada has about 3 different ways of type conversion and is still provable.
        • by Peaker ( 72084 )
          PS: I have programmed in Ada, C, C++, Java and Pascal and my favorite language is now Ada.

          I have programmed in Assembly, Python, C, C++, Pascal, some Lisp and a tiny bit of Java, Haskell and Ada.

          I dislike Ada because I believe there are far too many primitives that should be libraries (semaphores and other concurrency features).

          I now love the other edge of the spectrum: Python. The "eww whitespace" barrier seems to block/filter a lot of closed-minded programmers, but don't let it prevent you from learning
          • But Ada does not expose the use of Semaphores - The task model of Ada is based around Rendezvous.

            There are some attempts of Rendezvous based threading for other languages but it proves difficult since you need some inter-task parameter passing mechanism. Remember that a Rendezvous is a bit like a function/procedure call only the parameters are passed from one task to another.

            And parameter passing is a language feature and quite difficult to implement as a library. I know because I tried using Obser
  • by dargaud ( 518470 ) <[ten.duagradg] [ta] [2todhsals]> on Friday September 29, 2006 @03:16PM (#16249755) Homepage
    "Beware of bugs in the above code; I have only proved it correct, not tried it." —Donald E. Knuth.
  • logic bug (Score:3, Funny)

    by hey ( 83763 ) on Saturday September 30, 2006 @04:54AM (#16256721) Journal
    How could a program ever find this bug:

    if (vote == "democrat")
    {
            republican++;
    }
    • by vidnet ( 580068 )
      Simple, it would see that "democrat" was a literal and thus it would be highly unlikely that 'vote' would happen to have that same pointer value, and it would suggest if(strcmp(vote,"democrat")==0) instead.

      (haha)
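      • The same pitfall exists in Java, where == on strings compares references, not contents - exactly the kind of thing a checker can flag mechanically. A minimal sketch (VoteCheck is a made-up name):

        ```java
        final class VoteCheck {
            // Buggy: == compares object identity. String interning can make
            // this "work" for literals sometimes, which makes the bug
            // harder to find - a static checker would flag it anyway.
            static boolean isDemocratBuggy(String vote) {
                return vote == "democrat";
            }

            // What the checker would suggest instead: value equality.
            static boolean isDemocrat(String vote) {
                return "democrat".equals(vote);
            }
        }
        ```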
    • by Brown ( 36659 )
      It's not a bug, it's a feature!
  • by icepick72 ( 834363 ) on Saturday September 30, 2006 @04:02PM (#16260317)
    Verification is speeding along because of managed languages. Verification for managed languages like Java and C# is easier to implement. Not so much for C/C++, where pointers are raw. (Note: C# allows unmanaged code, but by default it's managed, and I assume most developers work in C# in managed mode, except where performance or interoperability is required.)
  • ``Is Code Verification Finally Good Enough?''

    Please verify that

    (define (foo n)
            (cond
                    ((= n 1) 1)
                    ((even? n) (foo (/ n 2)))
                    (else (foo (+ (* n 3) 1)))))

    returns 1 for any positive integer n.
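
    That's the 3n+1 (Collatz) iteration, whose termination for all positive n is an open problem - no verifier can prove it. What a tool *can* do is check it empirically for small n; a rough Java translation of the Scheme above:

    ```java
    final class Collatz {
        // Returns 1 if the iteration reaches 1. Whether it always does,
        // for every positive n, is the open 3n+1 problem - checking
        // small cases is evidence, not proof.
        static long foo(long n) {
            if (n < 1) throw new IllegalArgumentException("n must be positive");
            while (n != 1) {
                n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
            }
            return 1;
        }
    }
    ```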
    • Anyone can come up with a list of algorithms for which the answer to all inputs is unknown, of which the 3n+1 problem is one (so far). Even though we can't prove that the algorithm halts we can still prove that it's well typed, which in practical terms is much more important because it prevents buffer overflows and any number of other programming errors. In addition, we can also prove that the algorithm actually does solve the 3n+1 problem and not some similar but slightly different one. For instance, if yo
      • ``Even though we can't prove that the algorithm halts we can still prove that it's well typed, which in practical terms is much more important because it prevents buffer overflows and any number of other programming errors.''

        For preventing buffer overflows, you will have to prove that some value being used as an array index is an integer in the range from 0 to the size of the buffer minus 1. In this case, the problem is to prove that the result is an integer in the range 1 to 1. Most type systems aren't exp
        • For preventing buffer overflows, you will have to prove that some value being used as an array index is an integer in the range from 0 to the size of the buffer minus 1. In this case, the problem is to prove that the result is an integer in the range 1 to 1.

          Generally, proofs for this are expressed as loop invariants inside of functions, and as predicates on the entry and exits of functions. If a function takes an array index as a parameter, there is a predicate that 0<=i<n, and it is the responsibi
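
          The entry-predicate idea can be sketched as a runtime check (hypothetical code, not from any verifier): the callee states 0 <= i < a.length, and a verifier would prove each call site satisfies it instead of testing at runtime.

          ```java
          final class Bounds {
              // Entry condition: 0 <= i < a.length. A verifier discharges
              // this statically at every call site; here we merely check
              // it at runtime, as a contract with teeth.
              static int get(int[] a, int i) {
                  if (i < 0 || i >= a.length) {
                      throw new IllegalArgumentException("precondition violated: i=" + i);
                  }
                  return a[i];
              }
          }
          ```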
          • Ok, I have a program that, if it halts, causes a buffer overflow.
            Otherwise, it doesn't.

            Can your type system prove that there is no buffer overflow?
  • Formal Verification has become a staple of hardware design. It basically comes in two forms: equivalence checking (used a LOT to compare two similar designs) and model checking (which is slowly making inroads; used to compare a design against a set of rules, another model, etc).

    I have often wondered why more formal techniques aren't deployed in the software world.

    As has been mentioned here, the thing about formal is that you will often "stump it" and so you may need to "help it" in complex, real world desi
