Replacing Humans with Software Inspectors

An anonymous reader writes "What if you were able to perform a portion of your code reviews automatically? In this first article of the new series 'Automation for the People', development automation expert Paul Duvall begins with a look at automated inspectors like CheckStyle, JavaNCSS, and CPD. The piece examines how these tools enhance the development process and when you should use them." From the article: "Every time a team member commits modifications to a version control repository, the code has changed. But how did it change? Was the modified code the victim of a copy-and-paste job? Did the complexity increase? The only way to know is to run a software inspector at every check-in. Moreover, receiving feedback on each of the risks discussed thus far on a continuous basis is one sure-fire way to keep a code base's health in check automatically!"
This discussion has been archived. No new comments can be posted.

  • by tcopeland ( 32225 ) * <tom@@@thomasleecopeland...com> on Friday August 04, 2006 @01:49PM (#15848074) Homepage
    If you've got a reasonably recent JVM installed you can run CPD via Java Web Start here [sourceforge.net].

    There are some examples of CPD output there too - like the duplicate code chunks that it found in the JDK source code and in the Apache httpd source code.

    For much more on CPD, see chapter 5 [pmdapplied.com]!
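
For anyone wondering what "finding duplicate code chunks" amounts to, here is a rough sketch of the general idea. It is not CPD's actual algorithm: it simply slides a fixed-size window over whitespace-normalized lines and reports any window it has seen before. The class name, method name, and window size are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DuplicateChunkSketch {

    /**
     * Returns pairs of 0-based line indexes {first, later} where the same
     * run of `window` consecutive lines starts in both places.
     * Lines are compared after trimming and collapsing whitespace.
     */
    public static List<int[]> findDuplicates(List<String> lines, int window) {
        List<String> normalized = new ArrayList<>();
        for (String line : lines) {
            normalized.add(line.trim().replaceAll("\\s+", " "));
        }

        Map<String, Integer> firstSeen = new HashMap<>();
        List<int[]> duplicates = new ArrayList<>();
        for (int start = 0; start + window <= normalized.size(); start++) {
            String key = String.join("\n", normalized.subList(start, start + window));
            if (key.trim().isEmpty()) {
                continue; // ignore windows made up entirely of blank lines
            }
            Integer earlier = firstSeen.putIfAbsent(key, start);
            if (earlier != null) {
                duplicates.add(new int[] {earlier, start});
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
                "int a = 1;", "a++;", "print(a);",
                "",
                "int  a = 1;", "a++;", "print(a);");
        for (int[] pair : findDuplicates(source, 3)) {
            System.out.println("Lines " + (pair[0] + 1) + " and " + (pair[1] + 1)
                    + " start the same 3-line chunk");
        }
    }
}
```

A line-based comparison like this misses copies where identifiers were renamed, and it can report overlapping windows, which is one reason real tools work on tokens instead.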
  • by Anonymous Coward
    from the article:

    ...way to keep a code base's health in check...

    Uhhhh, maybe you should try an English language cliche parser instead?
  • by just_another_sean ( 919159 ) on Friday August 04, 2006 @01:51PM (#15848091) Journal
    I would say that, like any other code generator, wizard, druid, whatever you want to call it, this has a great potential to help with programming. As long as developers using it don't get lazy about checking their own code.

    As a tool to point out obvious code flaws, catch conflicting code in large projects, etc. this is great. But I've found over the years that if you don't understand how to do something manually and aren't able to second-guess a tool when it makes a mistake then these types of things can end up being more trouble than they're worth.

    Just my $.02

    • I view these things kind of like the 'grammar checker' in Word. Yeah, kind of useful once in a while maybe, but its not going to help you if you _dont_ know what you are doing, and its probably not going to be of any use if you _do_ know what you are doing.
      • by bunions ( 970377 ) on Friday August 04, 2006 @01:58PM (#15848139)
        So which are you - the guy who didn't know he was supposed to put an apostrophe in "don't" or the guy who did? Seems like either way, a grammar checker might have helped.

        My point isn't that you made a typo and that you should feel bad, it's that everyone makes mistakes, including people who know what they're doing. That is why tools like this are useful.
      • by phantomfive ( 622387 ) on Friday August 04, 2006 @02:01PM (#15848159) Journal
        Personally I don't like the Microsoft grammar checker, but I've checked a lot of papers from ESL students, and the ones who use the grammar checker write far more clearly than those who don't. So it is certainly useful for something. I am not sure how grammar checkers affect their English in the long run, though. Perhaps it makes them lazy so they never learn their English? I can see how this tool would also make programmers lazy, thinking their code is good enough (and managers agreeing) when it really is garbage.
      • I would view this in the same light as I view integrated development environments. Developers should be forbidden from using them until they have significant experience developing / debugging by hand. If someone learns to develop on an IDE prior to learning the language, all one is doing is poking buttons, not learning why what you just did works. Same should go for things like this. Until you understand what it is doing, why it's making the decisions that it makes, and when to use it, it should be forb
        • also, people should have to start with assembly before they're allowed to use higher-level languages, because if you just start with Perl, all you're doing is using tools and you don't really understand why things work.
        • Yes, exactly. My initial comment stems from me starting my adventures in programming with Visual Basic and VBA. These are handy tools for prototyping an application quickly but when they do something wrong or just aren't capable of something and you don't know anything beyond the IDE itself then you're stuck. I quickly learned not to trust wizards and started to expand my knowledge by learning and using the Windows API from within VB.

          But I really never felt like I understood what I was doing until I started writ
          • Hell, I'd go one step further, and say I wasn't a competent programmer until I started writing C / C++ for Linux instead of Java. Now I understand what's going on at a deeper level. I actually don't mind using a well-designed IDE, after I learned the language (Eclipse with Java, NEdit for C & C++).
            • You are only more competent in that you know how to do things you shouldn't do manually anyway.
              • I'd like to clarify based on some of the somewhat sarcastic "yeah real programmers code in assembly with cat!" replies.

                I'm not saying tools, IDEs and even M$ style code generators are a bad thing. But it is important to understand them under the hood. They are each a great asset when they work correctly or completely fill your needs. When they fail or can't do that last 5% because the tool maker didn't anticipate your particular need then your only recourse is to open the hood and fiddle with the bits. I cou
        • I disagree. If you want to learn how to program, use an IDE. Setting up makefiles and using gcc cli arguments gets in the way of actually learning the concepts and syntax.
    • Unfortunately I imagine tools like this will make the managers at the top of the food chain expect more productivity and significantly less human checking. Automation can only take us so far, and there need to be experts who examine these things manually. Like the parent says, over-relying on a tool like this could be very harmful to the overall project.
    • I totally agree. I like to compile my projects by hand sometimes just to make sure I'm not reliant on my C compiler.
    • I've heard that automation is highly prevalent in the IT industry and that it's taking even more jobs than offshoring. I kinda laughed out loud at the idea that it's taking a lot of coder jobs, but now I see this foolish trend actually exists.

      Automation can only go so far. This is a rubber band phenomenon that will snap back into proper size when automation turns out to fail.

      Failure is a guaranteed and major part of any effort to convert analog to digital. Programming is highly analog, believe it or not - i
      • Inevitably these errors pile up and it'll come down to a human - or team of humans - to dig into the guts of the code and find out what's really wrong. If all this automation actually happens, who will get the job experience to fix all the bugs that got by the automated software checkers?

        The people who end up being the ones to fix all the bugs that got by the automated software checkers?
      • This is why pure machine language code will always be faster - even if more horribly difficult - than any high level code. Having a machine - programmed by fallible humans - checking fallible code - is just another path for potential failure.

        Umm, that's really rather off what most people observe in reality. While in the limit the directly-written machine code (MC for short) might be better than anything out of a compiler, the level of guru-dom required to do this on modern architectures is really ridiculou

      • Some informal studies have shown that software engineering is about 20% clerical (typing, etc) and about 80% intellectual. This suggests that the limit of computer automation of software engineering is about 20% until computers are able to think. Our jobs are quite safe at present.
  • ...has done a couple of short podcasts [stelligent.com] on continuous integration and whatnot too.
  • How about FindBugs? (Score:5, Informative)

    by iapetus ( 24050 ) on Friday August 04, 2006 @01:55PM (#15848114) Homepage
    These tools are very limited in their scope - FindBugs [sourceforge.net] is a very useful and powerful tool for locating bugs or potential bugs in Java code, and I've used it to find some potentially serious bugs in large, relatively mature pieces of code before now. Using it to help find potential failures in newly modified pieces of code seems like a good idea.
    • Yeah, I've been very impressed with some of the stuff that FindBugs finds: loads of localised sillinesses, of course, but also some deep high-level stuff like problems with threading, security, or OO design. On that showing, maybe analysing at bytecode rather than source level is the way to go.

      Not that source-level analysis is useless, of course; there's a lot of low-level style that it can check. But IME, it can be more trouble than it's worth, because it doesn't understand that there are times when it

  • The only way (Score:5, Insightful)

    by phantomfive ( 622387 ) on Friday August 04, 2006 @01:55PM (#15848116) Journal
    The only way to know is to run a software inspector at every check-in.

    Or you can hire people who are good and who you trust. A real human will do a better job of 'code reviewing' every time, and if you hire good people, then you don't need to worry about what they commit. An occasional check should be enough to make sure you haven't accidentally hired a loser. (Also human code reviews, if done correctly, are great because they help everyone become better programmers).

    This is easy to do in small companies, but somewhat harder in big companies. Still, if I were CEO, and my managers started using this tool, I would get worried and start thinking seriously about how to change the company culture. You don't want your company to become like Qualcomm, or Novell, where bureaucracy rules the day.

    • Re:The only way (Score:3, Insightful)

      by tcopeland ( 32225 ) *
      > Still, if I were CEO, and my managers started using this tool,
      > I would get worried and start thinking seriously about
      > how to change the company culture.

      I'm not sure that using static analysis tools is a sign of a bad corporate culture. I think they're just one more safety net you can use to find code problems. This is especially true for something like CPD, which finds duplicated code anywhere in the codebase - checking for this sort of thing manually is pretty difficult.

      That said, I agree t
      • Re:The only way (Score:5, Informative)

        by Bender0x7D1 ( 536254 ) on Friday August 04, 2006 @03:09PM (#15848594)
        I agree with your first point. I used to work at Motorola as a Senior Software Engineer and during my time there we integrated a lint tool into our process. It didn't replace the formal inspection process, but before the inspection moderator would sign-off on the inspection, the developer had to show a clean lint report.

        Now, lint tools aren't always right, so there were many places where we had to add comments in the code to get the tool to ignore the next lines. The important thing about the tool is it is "double checking" that you meant to do it that way. If you do, you add the ignore comment, and get to discuss it in an inspection. In this way it enhanced the inspection process instead of detracting from it.

        Fortunately, I worked in an area where the quality of the code was considered important by the developers and management and if code wasn't ready, it didn't make it into the build. Simple as that. Of course, if you were going to miss the targeted build, management wanted to know why, but most of them would listen to you. (They might also ask you to work on the weekend to get it done...)

        Now, replacing the mundane, manual inspections with tools is just stupid. Yes, in some places it can be done, but for the most part it is a horrible idea. Humans are better than software at inspecting code. Tools may be faster, but humans are better. Humans can catch mistakes like passing an incorrect variable or returning the wrong value. They can also examine any requirements or design documentation and determine if you are doing the correct thing. If nothing else, at least they are familiar with the overall application (or should be). Maybe you are making a window too big. It looks fine on your machine but there is a requirement that it works at 1024x768. You don't notice since you have a big monitor. The tool can't notice it since it doesn't read requirements documents. The requirement may not exist outside your memory of a short chat with the customer.

        The reality is that software inspections SAVE time. No one believes it. Or, if they do, they forget about it because of "crunch time". Sorry. Unless you are coding trivial or simple applications, skipping inspections doesn't pay off. You can argue all you want that you can get around it, or there are better ways, but I don't believe you. I have seen the data from Michael Fagan's study at IBM, and inspections work. Motorola actually published the results of their switch to the Fagan Methodology and found that development time was reduced, fewer bugs were introduced and that more features could be added. After that, they made it company-wide policy to use the Fagan Method.

        So why does this method work? Well, Fagan conducted over 11,000 inspections when he was at IBM to develop this methodology. It took a few years to conduct them all and analyze the results, but he found a great way to reduce bugs, cost and development time. So, unless you have the formal data to back up your claims (anecdotal evidence doesn't count), I'll keep claiming that inspections are better. Proper inspections take preparation, focus and effort, but they make you better off.
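
The "add a comment so the tool ignores the next line" workflow described earlier in this comment can be sketched roughly as follows. This is a toy, not how any particular lint tool works: the marker text `lint-ignore-next-line`, the class name, and the single rule (an assignment inside an `if` condition) are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SuppressibleLintSketch {

    // Hypothetical marker; real lint tools each define their own syntax.
    static final String IGNORE_MARKER = "lint-ignore-next-line";

    // Toy rule: a lone '=' inside an if-condition, e.g. "if (x = 5)".
    // The character classes exclude ==, !=, <= and >=.
    static final Pattern ASSIGNMENT_IN_IF =
            Pattern.compile("\\bif\\s*\\(.*[^=!<>]=[^=].*\\)");

    /** Returns the 1-based numbers of lines that break the rule and are not suppressed. */
    public static List<Integer> check(List<String> lines) {
        List<Integer> findings = new ArrayList<>();
        boolean suppressNext = false;
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.contains(IGNORE_MARKER)) {
                suppressNext = true; // applies to the following line only
                continue;
            }
            if (!suppressNext && ASSIGNMENT_IN_IF.matcher(line).find()) {
                findings.add(i + 1);
            }
            suppressNext = false;
        }
        return findings;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
                "if (count = 0) {",
                "}",
                "// lint-ignore-next-line: assignment is intentional",
                "if (ok = tryLock()) {",
                "}",
                "if (count == 0) {");
        System.out.println("Flagged lines: " + check(source));
    }
}
```

The useful property is that every suppression is visible in the source, so a human inspector can ask whether the "ignore" was really justified.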
        • The point of using a tool like CheckStyle is that the automated test takes care of the "simple" stuff like formatting and high cyclomatic complexity and naming standards and confusing names and calling super in finalize and a dozen other things, so that when you bug the guy in the next cube to break his concentration and look at your code, he can check the requirements and the screen size and all that, and know that there aren't dumb mistakes that even the computer can catch.

          It's simple division of labor: c
          • when you bug the guy in the next cube to break his concentration and look at your code

            This is not an inspection. It is a desk check. A proper inspection, by Fagan standards, requires 1 hour of preparation for 100 lines of code. If they aren't noticing or looking at the "simple" stuff, then they are going to miss "real" mistakes. Also, you don't hold an inspection after you finish coding your stuff, you hold it after you have gone over the checklist created for "inspection ready". This may include th
        • The requirement may not exist outside your memory of a short chat with the customer.


          If the requirement isn't captured somewhere, it's not a requirement... it's a suggestion. Other than that, I agree completely with the rest of your post.
      • I'm not sure that using static analysis tools is a sign of a bad corporate culture.

        Considering TFA states "The only way to know is to run a software inspector at every check-in", this is a sign of using it as a crutch. That and the fact that the GP states that they had to add comments to get their tool to ignore "valid" pieces of code, that's an even bigger issue.

        And just exactly what are you checking with the "code-inspector"? Are you being a "grammar nazi" and checking for the ever important whitespac

    • Re:The only way (Score:5, Insightful)

      by bunions ( 970377 ) on Friday August 04, 2006 @02:42PM (#15848410)
      "Or you can hire people who are good and who you trust. "

      Or you could do both!

      Automated QA tools are cheap insurance against mistakes, and I'm surprised by the resistance to them I see in these comments. No, of course no one likes out-of-control bureaucracy, but that's not an argument against using automated tools to check your code.
    • I completely agree. I used to work for a small tech company that was writing custom software. A typical test run would take a few hours of manually checking each item. My boss requested that I look into some automated tools to shorten the process, and after trying out several different kinds I recommended that we avoid them and stick with manual checks. There were just too many interface changes to each build to keep an automated test up to date.

      So of course they spent thousands to buy a test package any

    • If I was CEO and my developers were not using these tools, I'd be looking to hire some high quality developers.

      Fact: Code reviews improve code quality.
      Fact: 80% of the cost of bespoke software occurs during maintenance.
      Fact: No software developer is perfect.

      Maybe you're arrogant enough to think you don't need these tools. I say that you do need them, your teams need them, and that you should be doing proper code reviews as well.

      One of my software engineers has just updated the automated code build (which ru
    • I actually have the opposite experience: the better the programmer, the more likely they are to make mistakes that static analysis will find, because they're thinking about the more subtle issues, and don't notice that they've messed up a variable name. Most of the time, they write first versions of code which is well-designed, does things efficiently, and doesn't compile due to a couple of trivial typos, which they fix quickly by looking at the compiler output (which is, of course, a form of static analysi
    • Competent people make simple programming mistakes once in a while too.
    • and if you hire good people, then you don't need to worry about what they commit. An occasional check should be enough to make sure you haven't accidentally hired a loser.

      This is still a commonly held belief, but it's just not true.

      Allow me to quote from a 2004 OOPSLA paper: "we have found that even well tested code written by experts contains a surprising number of obvious bugs." (Link to entire paper is here [sourceforge.net].)

      Even very good programmers make mistakes sometimes, and some of them are simple enough to

    • Which is why you're not a CEO.

      Real humans make mistakes. The best developers may be less likely to make big mistakes and introduce logic bugs, but even they will commit simple errors from time to time. So there is always a need for review, be it human or automaton.

      It is not easy to find "the best" developers, or to distinguish them from above average developers (and average isn't very good). It was recently shown that "the best" people in their field consistently rate themselves lower than the set of

  • by bozzy ( 992580 )
    ...welcome my new automated, software-inspecting overlords.
  • "What if you were able to perform a portion of your code reviews automatically?"

    You mean like "gcc project.c"? Without that, I'd have to be in marketing :).
  • In Soviet Russia, something inspects something, and it's really funny!
  • "Was the modified code the victim of a copy-and-paste job?"

    I want the code to be copy/paste every time, if it works and is maintainable, rather than sparkling new code. Why would I want some robot enforcing some need to reinvent the wheel every time I need to roll?

    If this thing is going to be smart, it would look at code to replace with code elsewhere in the repository. I'm tired of doing that myself, and not copy/pasting enough. By which I mean factoring common code into its own scope, then pasting a refer
    • Generally you should move that code into a common function and share it, rather than letting it exist as something that's copy pasted.

      However, there are plenty of times that copying and pasting chunks of code makes good sense.
    • I often think of programming as a form of compression.

      Instead of writing long lines of if-then-else we try to write something shorter that can do the same thing, faster and with hopefully less time and effort spent (writing and maintaining). A squeezing from different directions (with different priorities).

      I think copy and paste coding is fine, if you suspect that the "duplicates" are likely to need to become different later, and especially if you're not sure in what way they are going to be different, just
      • gcc is the first round of code review. It tells me whether my code is good enough to execute - what it does when it executes is another question :). But the next round is "tar -zcf project.c.v1.tgz project.c; tar -zcf project.x.v1.tgz project; ls -l project.?.v1.tgz", then comparing the ratios of the series of compressed source to executable files. The closer the compressed source size is to its compressed executable size, the more efficient is the code. And the more inefficient is the process of other peopl
  • Though it is Java based, they left out quite a bit.

    Instead of depending on style checkers for formatting, just get a good programming editor
    and configure it for the style. This may not take care of all of it, but it can help.

    I'm a big believer in lint and PCLint, which also can be used for style checking. I don't know how good JLint [artho.com] is.

    The piece skipped out on automated testing -- ie. Purify.

    They missed BoundsChecker.
    • > I don't know how good JLint is.

      It's pretty good, although I had problems getting it to compile with recent Fedora releases. I think I ended up compiling it in a RH9 VMware partition or something. But once it's working, it's cool - it does some good dataflow analysis stuff and is nice and fast.
  • by dpbsmith ( 263124 ) on Friday August 04, 2006 @02:31PM (#15848334) Homepage
    We've experienced those brief and tempestuous infatuations with flowcharts, Warnier-Ott diagrams, top-down programming, structured programming, Jackson structured programming, source code control, the waterfall model, Royce's Final Model, the spiral model, the sashimi model, object-oriented programming, CASE tools, Rational Unified Process, SEI's Capability Maturity Model for Software, SEI's CMMI, feature points, function points, agile methodologies, and Extreme Programming, and... well... they were just trips to the moon on gossamer wings.

    But this. This is different. Totally different. It's the real thing this time.

    • I don't think anybody's claiming this is a silver bullet.

      What this does represent is an opportunity for incremental improvement.

      Or just keep doing the same stupid thing that doesn't work very well. Your choice.
    • by Anonymous Coward
      Person A: Hey, did you hear about the new version of Eclipse that just came out? It's got this nice feature where...

      dpbsmith: Oh, good, the silver bullet at last. We've experienced those brief and tempestuous infatuations with flowcharts, Warnier-Ott diagrams, top-down programming, structured programming, Jackson structured programming, source code control, the waterfall model, Royce's Final Model, the spiral model, the sashimi model, object-oriented programming, CASE tools, Rational Unified Process, SEI'
  • Be smart (Score:4, Insightful)

    by Mock ( 29603 ) on Friday August 04, 2006 @02:44PM (#15848427)
    The thing to remember about automated code inspectors is: be smart about it.

    Don't trust the code inspector to enforce a policy (except maybe coding style).

    There's a lot of boilerplate code that goes into a program, and a code duplication monitor will cause all sorts of headaches in this area.

    The same problem exists with comment checkers. If code is written properly, there is usually very little need to comment inside most methods. The name and javadoc will give more than enough description of what the method will do (you DO use javadoc, don't you?)
    It's only as the method's complexity increases to a point where it's not blatantly obvious what's going on inside that you need comments at that level.

    I've had too many managers force me to comment like this:

    // Iterate over all files
    for(Iterator iter = files.iterator(); iter.hasNext(); /* nothing to do here */)
    { // Get the next file
            File file = (File)iter.next(); // If the file has a modification date later than now
            if(file.lastModified() > new Date().getTime())
            { // Throw an exception stating that the file is modified in the future
                    throw new ModifiedInFutureException(file.toString() + " has a modification date in the future");
            }
    }


    Ok it looks much uglier after running through Slashdot's dumb filter, but you get the idea.

    The above code in reality needs no comments whatsoever, except perhaps a single line at the top saying "Disallow modification dates in the future", but a bad policy caused ALL code checked in to comply with silly regulations, resulting in countless hours lost.

    In truth, the date check code itself should have been implemented as a policy class to be added to the verify method, but I digress.
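
For comparison, a minimal sketch of the same check carrying only the single comment suggested above. `ModifiedInFutureException` is not shown in the post, so it is defined here as a hypothetical unchecked exception, and passing the current time in as a parameter is an assumption made so the check can be exercised without touching the clock. The policy-class refactoring is not attempted.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

public class FutureDateCheckSketch {

    /** Hypothetical exception type; the original post does not show its definition. */
    public static class ModifiedInFutureException extends RuntimeException {
        public ModifiedInFutureException(String message) {
            super(message);
        }
    }

    // Disallow modification dates in the future.
    public static void verify(List<File> files, long nowMillis) {
        for (File file : files) {
            if (file.lastModified() > nowMillis) {
                throw new ModifiedInFutureException(
                        file + " has a modification date in the future");
            }
        }
    }

    public static void main(String[] args) {
        List<File> files = Arrays.asList(new File("."));
        verify(files, System.currentTimeMillis());
        System.out.println("No files modified in the future");
    }
}
```

With the enhanced for loop and a descriptive exception name, the code says everything the line-by-line comments were repeating.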
    • I've posted before [slashdot.org] about comments. And I'd heartily agree that the sort of comments you show are worse than useless -- they actively make the code harder to follow!

      In that situation, I'd ask why all those comments are needed. If the answer isn't "To make the code easier to understand", then I'd refuse. And if that is the answer, then I'd point out that no half-way competent developer would have the remotest trouble getting all that straight from the code, so none of that is achieving the state aim. If

  • I haven't read the article, in true /. style, but I'm going to go ahead and assume it's about using code checkers to check code is valid. I use W3's markup [w3.org] and css [w3.org] validation services for this purpose. If I get a problem (or even if I don't), as a first step, I run the code through their validation services.
  • Article is right on (Score:3, Informative)

    by bokmann ( 323771 ) on Friday August 04, 2006 @04:21PM (#15848984) Homepage
    Slashdot's title of 'replacing humans' aside, this article is right on. We do peer reviews of our code, but before we do, we make sure it meets our checkstyle conventions, and fix at least some of the egregious things flagged by PMD, Findbugs, and a couple of the other reports we can get out of our Maven build system without any real effort.

    This makes our reviews more productive, because people don't get caught up in discussions over curly braces, finding copied code, issues with constructor and exception idioms, etc.

    The slashdot crowd is going to envision pointy-haired bosses basing performance reviews on this kind of stuff, which is a legitimate fear and shouldn't happen. Used in the hands of real software engineers though, this is similar in spirit to the woodworker's adage "measure twice, cut once". Loosely applied, "You know what you are doing, but be your own safety net".

    -db
  • The trouble with most of these tools is that they're aimed at local coding style issues, not global problems.

    Typical global problems that are potentially machine-checkable before execution are:

    • Object re-entry: Object A calls object B, which calls object A again, entering object A with the object not in its stable state. This is a constant problem with callback-oriented GUI systems. Microsoft Research has addressed this in their "Spec#" [microsoft.com] effort, which is worth a look.
    • Unlocked access: This is more of a
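
To make the object re-entry item concrete, here is a small sketch of the failure and a run-time guard against it. Note the hedge: this catches the problem during execution, whereas the comment above is about checking it before execution, which this sketch does not attempt. The `Account` class and its members are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ReentrySketch {

    /** A toy object that notifies listeners while it is still mid-update. */
    public static class Account {
        private final List<Runnable> listeners = new ArrayList<>();
        private int balance;
        private boolean updating; // true while the object is not in a stable state

        public void addListener(Runnable listener) {
            listeners.add(listener);
        }

        public void deposit(int amount) {
            if (updating) {
                throw new IllegalStateException("re-entered Account during an update");
            }
            updating = true;
            try {
                balance += amount;
                for (Runnable listener : listeners) {
                    listener.run(); // callback runs before the update has finished
                }
            } finally {
                updating = false;
            }
        }

        public int balance() {
            return balance;
        }
    }

    public static void main(String[] args) {
        Account account = new Account();
        // The listener plays the role of "object B" calling back into "object A".
        account.addListener(() -> account.deposit(1));
        try {
            account.deposit(10);
        } catch (IllegalStateException e) {
            System.out.println("Caught: " + e.getMessage());
        }
    }
}
```

A static checker would aim to prove that no call path from inside `deposit` can reach `deposit` again, so the guard never needs to fire.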
    • That is incorrect. Some of these tools: e.g. Findbugs, pmd and lint4j are not about code style at all. Instead they look for real antipatterns and situations that usually indicate a bug. For example these tools will identify potential deadlock problems, obviously wrong uses of the synchronized keyword, draw your attention to null dereferences (npe at run time), streams that you forgot to close properly (i.e. in a finally block), etc. Some of these things can be quite hard to find in code and will slip throu
  • Quibbling over formatting is silly. White space formatting doesn't affect functionality of the code at all. (Unless you're using python and that's a whole different discussion) What should really happen is my editor should display the source the way I'm used to seeing it. I should be able to configure the view style and the actual in-file formatting remains unchanged. My cube mate across the way sees the source in HIS style. He's happy. I'm happy. This doesn't seem hard to me.

    -=Robo=-
    • yes, but when 1 guy adds 16 spaces to the front of a line, and some other guy adds 11, and some other person likes 3, it can play merry hell with editors, and be painful to read.
      Now add tabs, carriage returns, etc. and it becomes painful.

      Plus, not having a standard way to format code makes maintenance difficult, and can foobar automated documentation.

    • Quibbling over formatting is silly. White space formatting doesn't affect functionality of the code at all.

      fyi, FindBugs [sourceforge.net] doesn't look at the source code at all, only at the compiled bytecode.

    • Quibbling over formatting is silly. White space formatting doesn't affect functionality of the code at all. (Unless you're using python and that's a whole different discussion) What should really happen is my editor should display the source the way I'm used to seeing it. I should be able to configure the view style and the actual in-file formatting remains unchanged. My cube mate across the way sees the source in HIS style. He's happy. I'm happy. This doesn't seem hard to me.

      It might be harder than you

  • http://packages.debian.org/stable/admin/vrms [debian.org]

    Virtual Richard M. Stallman

    The vrms program will analyze the set of currently-installed packages on a Debian GNU/Linux system, and report all of the packages from the non-free tree which are currently installed.

    Future versions of vrms will include an option to also display text from the public writings of RMS and others that explain why use of each of the installed non-free packages might cause moral issues for some in the Free Software community. This functionali
