Education

Cheating Detector from Georgia Tech

brightboy writes "According to this Yahoo! News article, Georgia Tech has developed and implemented a "cheating detector"; that is, a program which compares students' coding assignments to each other and detects exact matches. This was used for two undergraduate classes: "Introduction to Computing" (required for any student in the College of Computing) and "Object Oriented Programming" (required for Computer Science majors)." Cuz remember programmers: in the real world you are fired if you consult with a co-worker ;)
  • by alen ( 225700 ) on Wednesday January 16, 2002 @02:39PM (#2849761)
    Your Hello World program is exactly the same as Johnny's. You fail. You're kicked out of school. Good bye.
    • Re:You're caught (Score:5, Insightful)

      by invenustus ( 56481 ) on Wednesday January 16, 2002 @03:08PM (#2850063)
      You all laugh at that joke, but in my friend's Operating Systems class this fall, the "cheat script" flagged half the students, very few of whom were actually cheating. My friend's group didn't hand in one part of the assignment, and the script detected similarities between the nonexistent file and the whitespace in other groups' code. Duh. And of course, instead of first LOOKING at the similarities, the professors went ahead and accused my friend of cheating, and told him he had to come to an "appeal session" THAT SAME DAY.

      Students shouldn't cheat, but professors shouldn't toss around those accusations lightly either.
      • Re:You're caught (Score:5, Insightful)

        by CmdrPinkTaco ( 63423 ) <emericle@ch[ ]erware.com ['ubb' in gap]> on Wednesday January 16, 2002 @05:31PM (#2851145) Homepage
        I personally have had experience being accused of cheating in a computer science class. At the uni that I attended, the prof for the Programming Languages class was tenured and had already (mentally) retired. His concern for the class was minimal at best. An example of this was the time that another student and I went into his office to talk to him because he was late to class and we had some questions on an assignment that was due the following class. When we poked our heads in his office he was in his chair - asleep. If that isn't enough - he completely forgot to show up for the final exam.

        I don't claim to be a model student by any means, but in this class of 16 I had the highest grade in the class and had done every assignment to the best of my abilities. It came as quite a shock to me when I got my grades and noticed that there was an F for my Programming Languages class.

        I promptly called the professor and he said that this was an issue that was best dealt with in a face to face manner - so I went to his office and he claimed that three other students in the class and I had cheated. He pulled up the source code and showed the very striking similarities. I explained to him that some of the assigned problems were outside the scope of the class and that he had offered no help, and I told him that I had worked with these students to reach a solution. We did not copy any prior works, and all worked together to complete a tough assignment. I admitted that we shared code, only because we had shared ideas and had all come to the solution together.

        To make an already long story short (I'm forcing myself to leave out details), it ended up getting appealed and overturned, and the professor is now on probation and only teaching 100 level courses.

        The moral - cheating and sharing of ideas are different concepts and should be handled separately. I don't agree with programs that flag cheating based on similarities in code because sharing of ideas is typically encouraged in a university setting as long as they are obtained legitimately - such a program cannot sufficiently distinguish the two.
  • by Anonymous Coward
    You're fired for copying someone else's work and passing it off as your own, particularly if it's a competing product.

    If your employer has any integrity at all, that is.
  • How the hell are all those lonely CS majors supposed to get in good with the Education majors now?
  • by sben ( 71467 ) on Wednesday January 16, 2002 @02:41PM (#2849786)
    CmdrTaco:
    "Cuz remember programmers: in the real world you are fired if you consult with a co-worker ;)"

    Yes, but one of the goals of a CS department should be to produce programmers who are capable of doing work themselves. Would you want to work with (or supervise) a slacker who couldn't code his way out of a paper bag, but who graduated anyway because he cut-and-pasted the work of his (harder-working) classmates?
    • No, but they have this little process called 'firing'.

      Anyhow, the employment market will vet real workers from pretend ones. I think the motivation in this case, from the university's perspective, is to not have their degree and academic rep devalued. Obviously, the better the grads, the more money and brains go into the school ... so I can't really blame them for what they're doing. I think their heart is in the right place, even if this solution does strike me as somewhat dangerous.
    • by feed_me_cereal ( 452042 ) on Wednesday January 16, 2002 @03:06PM (#2850043)
      Good point. There's another reason why CmdrTaco's comment represents flawed thinking: I've been in several group projects in school in which we had to collaborate, and it certainly did not consist of copying answers off one another. I don't think that cheating can ever be construed as "consulting with a co-worker." In most of my CS classes at Ohio State, my professors encourage us to work together toward understanding the problems, but to actually turn in our own solutions.
    • Maybe someone should write similar software for slashcode... it might prevent them from posting similar stories:

      http://slashdot.org/article.pl?sid=01/05/09/198259&mode=thread [slashdot.org]

  • Erm. (Score:5, Interesting)

    by Dr. Sp0ng ( 24354 ) <{moc.liamg} {ta} {gnopsm}> on Wednesday January 16, 2002 @02:41PM (#2849788) Homepage
    This is new? They used something like this when I was at the University of Maryland a few years ago. And it did more than just check for exact matches, it compared parse trees and so on to check for similar program structure (any matches were, of course, double-checked by a human before ringing the cheating bell). It caught quite a few people I knew.
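As a rough illustration of what comparing parse trees can mean, here is a minimal sketch in Python. It is not the Maryland tool, whose internals are not described here. It uses Python's standard `ast` module on Python sources (the coursework in question was presumably in other languages), and `shape` is a hypothetical helper name.

```python
import ast


def shape(source: str) -> list[str]:
    """Return the sequence of AST node type names, ignoring identifiers."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


# Same algorithm with every name changed.
a = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
b = "def add_up(values):\n    acc = 0\n    for v in values:\n        acc += v\n    return acc\n"
# Same result, different structure.
c = "def total(xs):\n    return sum(xs)\n"

print(shape(a) == shape(b))
print(shape(a) == shape(c))
```

Renaming every variable leaves the node sequence untouched, which is why a structural match still needs the human double-check mentioned above: short assignments tend to share a shape for innocent reasons.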
    • A better approach (Score:5, Informative)

      by BluesMoon ( 100100 ) on Wednesday January 16, 2002 @02:59PM (#2849984) Homepage

      Ok, check for exact match: diff source1.c source2.c
      great, I just wrote a program to check for exact matches in source code, and it took me three seconds. Maybe I should apply for a patent for my ingenious approach (maybe I'd get it!!)

      At my organisation, (in India) we've been developing something like this for quite some time for our internal tests.

      While most of the work isn't (and probably won't be) publicly released, we can look at a systematic approach to building a better detector.

      1. run indent on all source files to standardise white space usage:
        indent -i8 -kr
      2. Remove excessive white space within statements (students tend to add extra white space; ^I stands for a literal tab character):
        sed -e 's,\([^ ^I]\)[ ^I]\+,\1 ,g'
      3. while you're at it, remove blank lines too:
        sed -e '/^[ ^I]*$/d'
      4. Remove repeated lines, or lines that match
        i=i;
      5. Run diff/cmp on the files and check the percentage difference

      You may also want to first strip all #include <> statements (not #include ""), and run the code through the C preprocessor first to take care of #define, and conditional compilation

      There's more obviously that I'm not sharing with you. These are the basics that anyone could figure out in a few minutes - not years.
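The basic steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the detector described in this post: `normalize` and `similarity` are hypothetical names, whitespace collapsing stands in for `indent` and `sed`, and `difflib`'s ratio stands in for the diff-based percentage.

```python
import difflib
import re


def normalize(source: str) -> list[str]:
    """Collapse whitespace, then drop blank, no-op and repeated lines."""
    lines = []
    for raw in source.splitlines():
        line = re.sub(r"\s+", " ", raw).strip()      # steps 1-2: uniform spacing
        if not line:                                 # step 3: blank lines
            continue
        if re.fullmatch(r"(\w+) ?= ?\1;", line):     # step 4: no-ops like i=i;
            continue
        if lines and lines[-1] == line:              # step 4: repeated lines
            continue
        lines.append(line)
    return lines


def similarity(a: str, b: str) -> float:
    """Return a 0..1 score for two sources after normalization (step 5)."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


original = "int main()\n{\n    int i = 0;\n\n    return i;\n}\n"
padded = "int  main()\n{\n\tint i   =   0;\n\ti=i;\n\treturn i;\n}\n"
print(similarity(original, padded))
```

Comments, `#include` lines and preprocessor expansion are left out of the sketch; they would need a real C toolchain rather than regular expressions.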

      • by Tyrall ( 191862 ) on Wednesday January 16, 2002 @03:21PM (#2850155) Homepage
        Actually, the white space is what tends to get the cheaters caught.

        If there are 6 extra spaces at the end of a few lines and there are exactly the same extra spaces on the same lines (variable names aside), then there's an extremely good chance it's the same code, or a cut and paste of that section at the very least.

        In addition, you'd want to strip comments in your above example.
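A minimal Python sketch of that trailing-whitespace check, assuming nothing about any real detector: `trailing_ws` is a hypothetical helper that records which lines end in stray spaces or tabs, so two submissions can be compared before any normalization throws that evidence away.

```python
def trailing_ws(source: str) -> list[tuple[int, int]]:
    """Return (line number, count) for each line ending in spaces or tabs."""
    marks = []
    for number, line in enumerate(source.splitlines(), start=1):
        extra = len(line) - len(line.rstrip(" \t"))
        if extra:
            marks.append((number, extra))
    return marks


# Two submissions with different variable names but the same stray spaces.
first = "int count = 0;" + " " * 6 + "\nfor (;;) {\n    count++;" + " " * 2 + "\n}\n"
second = "int total = 0;" + " " * 6 + "\nfor (;;) {\n    total++;" + " " * 2 + "\n}\n"

print(trailing_ws(first))
print(trailing_ws(first) == trailing_ws(second))
```

An editor that strips trailing whitespace on save would erase this fingerprint, so a match is suggestive and a mismatch proves nothing.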
  • by SnowDog_2112 ( 23900 ) on Wednesday January 16, 2002 @02:42PM (#2849796) Homepage
    Remember, folks -- you may not get fired for consulting with a fellow programmer, but if you never learn how to do anything but copy & paste other people's code, you've lost out on a LOT of problem solving skills.

    There's a difference, a huge difference, between collaborating and cheating.

    In the real world, you _would_ get fired for taking credit for someone else's work, trying to pass it off as your own. Heck, you'd probably also violate a bunch of licenses, too :).
  • by Unknown Bovine Group ( 462144 ) on Wednesday January 16, 2002 @02:42PM (#2849801) Homepage
    Maybe Slashdot could use this technology to stop reposts.

  • The article talks about a program that has been around since 1993 and merely detects exact duplicates of code...
    Not really a big deal... Our school uses some University's program and database which not only detects fragment duplication, but also permutations of the code (such as changing variables, white space, etc.). Not sure which University though....
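One common way to detect fragment duplication, sketched in Python purely as an assumption about how such a tool might work (the program mentioned here is not identified, so this is not its algorithm): split each submission into overlapping runs of k tokens and measure how many runs the two share. `shingles` and `overlap` are hypothetical names.

```python
def shingles(tokens: list[str], k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of all runs of k consecutive tokens."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def overlap(a: str, b: str, k: int = 3) -> float:
    """Jaccard overlap of k-token fragments; whitespace layout is ignored."""
    sa, sb = shingles(a.split(), k), shingles(b.split(), k)
    if not sa and not sb:
        return 0.0  # two empty submissions share nothing worth flagging
    return len(sa & sb) / len(sa | sb)


one = "x = 1 ; y = 2 ; print ( x + y )"
same_but_spaced = "x = 1 ;\n   y = 2 ;\n   print ( x + y )"
other = "while True : pass"

print(overlap(one, same_but_spaced))
print(overlap(one, other))
```

Catching renamed variables would additionally require replacing identifiers with a placeholder before shingling, which this sketch does not do.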
  • Don't get me wrong, I understand that cheating on a test is wrong. I'm concerned that this sort of thing may help promote the "wasn't built here" syndrome (I believe it's called something else.)

    I'm just hoping that this is balanced with a few lessons on reusing and sharing code, for practical purposes.
  • How exact? (Score:2, Interesting)

    by Stiletto ( 12066 )

    It better check for exact duplicates only, down to the variable names. Many undergraduate CS assignments are programs so basic that there are really only a few ways to implement them. It would suck to be a student who from scratch used the same algorithm as another student, and have them both flagged as cheaters.
    • Re:How exact? (Score:5, Informative)

      by Junta ( 36770 ) on Wednesday January 16, 2002 @02:59PM (#2849980)
      You would be surprised how well this works in practice, even in intro classes. When I was a freshman taking intro to cs, they used one of these programs and got few false positives. If it matched exactly, down to the variable names, then it would be completely pointless. The one that my college was using back then matched regardless of variable/function names, or any source formatting. Essentially, it examined the overall algorithm and run-time execution paths to determine if there was likely cheating.
      Besides, even if the system turns up a high match between two programs falsely, it is ultimately a human who gets to review the case and make the call, after (presumably) discussing the matter with the student before actually doing anything that would leave a mark on the record.

      And as an answer to the knee-jerk reaction of "that's not how it works in the real world!" I tend to agree, but not completely. As an instructor of mine once said you have to learn to dribble before you can play with other people in a team in Basketball, and as such one needs to develop his or her own personal programming skills independently before he or she may work effectively in teams.

      Of course, some could argue that learning in teams would be more effective and perhaps more useful, but the point is there needs to be a mix of team and independent projects. Without independent projects at all, it is difficult to be sure that everyone is competent to pull their own weight, and part of the role of Universities in the world of business is to certify that a graduate possesses a good skillset, and without both team and individual assignments, this is impossible.

      Of course, as is the case with everything, this doesn't stop cheating. If one collaborates with someone completely unrelated to the class, it can't catch that, but then again, there aren't that many people inclined to work their butt off at no benefit to them just to help some other person get a good grade.... Of course, I have seen the case where a guy goes way out of his way to help a pretty girl, but that is another story entirely...
    • Re:How exact? (Score:4, Interesting)

      by Brownstar ( 139242 ) on Wednesday January 16, 2002 @03:03PM (#2850021)
      I actually had that happen to me. I was taking an assembly course where the teacher wanted us to reverse the order of values in a list.

      He gave us a long complicated piece of C code to do this, but instead I just used a stack (we didn't "learn" about those in class until a few weeks later). Well, it just so happened 1 other student felt like writing the 11 line stack implementation, rather than the 100+ line one the teacher recommended. The teacher then said we cheated.

      Fortunately we were both able to explain how our code worked.
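For readers who have not seen the trick, the stack approach amounts to the following, shown here as a Python sketch rather than the assembly from the course; `reverse_with_stack` is a hypothetical name.

```python
def reverse_with_stack(values: list[int]) -> list[int]:
    """Push every value, then pop them all: last in, first out."""
    stack = []
    for value in values:
        stack.append(value)                  # push
    reversed_values = []
    while stack:
        reversed_values.append(stack.pop())  # pop returns the newest item first
    return reversed_values


print(reverse_with_stack([0, 1, 2, 3, 4, 5]))
```

With so few moving parts, two students arriving at near-identical code independently is exactly what one would expect.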
      • Here's 7 lines of C to reverse an array. The assembly would be more or less identical. I don't feel like dredging up my memories of 8086 assembler... it would probably end up screwing up my Perl for the next hour or so :-)

        int list[] = {0,1,2,3,4,5};
        int i,j,len=sizeof(list)/sizeof(int);
        for (i=0; i < len/2; i++) {
            j = list[i];
            list[i] = list[len - i - 1];
            list[len - i - 1] = j;
        }

        Reversing a linked list would be marginally longer, but a doubly linked list would be just as short or shorter than this. Only a real novice would take 100 lines of C to do it. BTW, how could you possibly learn assembly before learning what a stack is?
    • Re:How exact? (Score:3, Interesting)

      by gmhowell ( 26755 )
      Not necessarily. I was at a college [goucher.edu] that I won't name where I was on the Academic Honor Board. Essentially, suspected cheats were brought before the board to decide on guilt/innocence and give punishment.

      Computer cases were the most common (4 of the 5 cases I sat in on). One day, we had three cases, with different defendants in each one. All programs, from about 15 students were essentially identical. What were the differences? Capitalization of variable names, and indenting style. That's it. So, while they were not 'exact' copies, they were close enough in my mind to merit guilt.

      They were fairly trivial programs. I think a total of maybe 150 lines of code or so. Can't remember if it was some form of basic, or C (I really think it was the former). There were a few ways to do the problem (I think it sorted words or something). But the striking thing is that the variables were typical CS100 nonsense names (variablefoo, variablebar, but NOT simply 'i' for iterator or 'x') of four-five characters in length, differing only in that some students had all uppercase, and others all lowercase.

      Now, I suppose that if the instructor had said 'use these variable names' there is a defense. But that was never mentioned.

      I think the ultimate answer was that almost everyone admitted that they did some amount of copying, and all got zeroes on the assignment. I can't remember if any failed the class (and no, nobody was tossed from school).

      But this is the interesting thing: Each of the three cases was about the same instructor, with the same program. But they were brought as three cases. We were presented the hard copy evidence for all three cases at the beginning of the morning. During a break after the first case, I flipped through the other evidence packs. I saw that the copying was very, VERY similar in all three cases. In fact, there were more similarities between Program A in case 1 and Program B in case 2 than between Program A in case 1 and Program B in case 1. To my mind, it was clear that the cheating was much broader than indicated. However, I was ignored. Our power was only as petit jury, judge, and executioner. We had no room to act as grand jury. (In addition, this was my first real world experience with a judicial system unable to understand technical issues. I was a chem major. Roommate was a CS major. I was the only hard-science guy on the board. The others were various history/business majors.)

      Anyway, the point is: exact copies are probably always cheating. But near copies are also sometimes cheating.
  • Reuse? (Score:2, Redundant)

    I thought the point of OOP was reuse ???

    ;-)

  • by gergi ( 220700 ) on Wednesday January 16, 2002 @02:44PM (#2849824)
    A few years ago, when I was a 2nd or 3rd year at Virginia Tech [vt.edu], some professor implemented a cheating detector into the automated grader for a class called Intro to C++.
    Prior to that year, VT had an average of 75 cheating violations for the WHOLE university (25000+ students). For that one class, on one assignment, 150 students were found cheating by the cheating detector... out of the 500 or so students in the class.

    Funny as hell
    • Carnegie Mellon has been doing this for years. Not only does it compare your source against other students' sources, but the CS department has solutions for every student in the past 10 years.

      From what I understand the method used involved comparing source and generated assembly code for similarities.

      And while I'm on my soapbox, this is another article posting a supposed "new and newsworthy" technology to Slashdot that's really not so new. Check your facts, and find out if this is really a "first", why don't you?
    • Yep, I was there that year. It was Dr. Walker who implemented the system. Walker is a fast-moving but excellent programming teacher. That one class did more for the quality and structure of my programming than anything else I have done either before or since. (I'm now leading a team of programmers for the DOD doing an enterprise-level Java application, btw.)
  • by KingAdrock ( 115014 ) on Wednesday January 16, 2002 @02:45PM (#2849830) Journal
    I always found that it wasn't easy to cheat. If I copied and pasted somebody's code, I had to go back through and change it all around so that I couldn't be caught cheating. This often proved to be more difficult than actually doing the project myself would have been.

  • I use code I find on the internet all the time, first thing I do when I start a new routine as a matter of fact. I search for preexisting code and use/modify that accordingly. Of course only freely available code but there are tons of sources for that. How would this program detect that? I would trust some code I got from the net over something another person in class wrote, but then again I know enough to be able to see if the code is good or not and to modify it if I need to.
  • Mod me down for this, but consulting with a co-worker at a job and obtaining code from a fellow student is NOT the same thing. The purpose of going to school is to learn and therefore they want your work, not your friend's. At a job, they just want the work to get done, they don't care how you do it.

  • So, when the instructor foolishly gives me an assignment that has a verbatim solution later in the book, and both me and Joe Student who I don't know and never talk to (or worse, my best buddy in the class) both turn in that verbatim solution, obviously we've cheated, right?

    That's sarcasm for those of you unfamiliar with the stuff.

  • This is not news. (Score:2, Informative)

    by bharath ( 140269 )
    There are many programs out there for exactly the same purpose. For example, moss [berkeley.edu] at berkeley lets you do this over the net.
  • Cuz remember programmers: in the real world you are fired if you consult with a co-worker ;)

    Well, actually, if you take work from another co-worker and pass it off as your own, you'll be fired and prosecuted.

    Don't Plagiarise - it's the law. (And judging from the snide comment, probably the reason that CmdrTaco never finished college.)

    • by JatTDB ( 29747 ) on Wednesday January 16, 2002 @04:16PM (#2850574)
      Nah, you just have to be really clever about it, such that the original programmer gets fired, and then the code makes the company billions in the video game industry, and you become a senior executive vice president of the company, and the original programmer is reduced to an arcade manager, and then he tries to hack into your systems, and then your mainframe decides to digitize him, and he helps a small group of rebels free the system.

      Ok, maybe I've watched Tron a few too many times...
  • the real world (Score:5, Insightful)

    by bcrowell ( 177657 ) on Wednesday January 16, 2002 @02:46PM (#2849844) Homepage
    Cuz remember programmers: in the real world you are fired if you consult with a co-worker ;)
    In the real world, you are fired if you steal code from someone else without their permission, pretend it's your own, and incorporate it into the app you're writing for your company. In the real world, people give credit where credit is due.

    In a lot of the stories I hear about students plagiarizing each other's code, it is done without the other student's permission. Many systems have files readable by other students by default, and students don't bother to read-protect their files. Students will take printouts out of the trash. And of course, it's always convenient for students to claim they didn't know the other person copied their work.

    It's better for students if professors have an accurate way of detecting cheating. The worst thing is if the method is inaccurate, and innocent students get accused. This method sounds accurate.

    • you are fired if you steal code from someone else without their permission, pretend it's your own, and incorporate it into the app you're writing

      So it's OK to plagiarize on an exam, as long as you get permission and attribute the original author!

  • This isn't anything new (well, the thing said '93). The University of Toronto has used this for some time. I remember often after assignments were handed in, people would be called to see the TAs or the prof because their assignments were 'too similar.' :)

    The software that U of T uses was developed somewhere else, I thought it was MIT, but I could be wrong.

    They didn't actually tell us that they were using this stuff, I found out after I graduated from reading it in the newspaper. So it's probably in widespread use, it's just not something CS departments brag about (I guess catching cheaters is fun).

    • And to make it clear, this is not diff. I mean duh, how stupid is that?

      The software that U of T used was apparently quite intelligent, and could tell if two algorithms/problem solutions were too similar, not just the text of the stuff that was handed in.

  • by Ionizor ( 175949 ) on Wednesday January 16, 2002 @02:47PM (#2849853) Homepage
    • Cuz remember programmers: in the real world you are fired if you consult with a co-worker ;)


    So how exactly does consulting with a fellow student (or co-worker) result in both parties having identical code? I have to say that this is the most ignorant comment I've seen attached to a slashdot story ever.

    At my University they have the same code policy but they encourage you to work with others! Under no circumstances are you to copy their code line by line but you can certainly ask for their help or use a module or two. The only condition to all of this is that you credit them on the cover sheet of your assignment.

    Sorry for the flame but I saw that comment and it made me quite irate.

  • When I was a TA, if I saw two assignments that looked suspicious, I'd hold them up side by side and cross my eyes to get the stereogram effect. If it was a bad cheating job, there would be an almost perfect match and my eyes would be able to focus on them, with the differences jumping out at me.

    Of course, with freshman assignments, they tend to be pretty damn similar even without cheating (write C code to implement bubble-sort using the pseudo-code in the book as a guide). And usually, the students that were cheating would fail the tests, so there was little need to do anything special.

    Why someone would pay tens of thousands of dollars to learn nothing and end up with a job that pays well but that they'll get fired from within a few weeks...
  • Okay, like others have mentioned, think of diff:

    * determines exact matches

    umm, sure, diff does that

    * written in 1993

    I checked man on my machine and got this date: "22sep1993"

    Heh.

    -ted
  • by Infonaut ( 96956 ) <infonaut@gmail.com> on Wednesday January 16, 2002 @02:48PM (#2849858) Homepage Journal
    to stop cheating, will GT bust them for plagiarism? ;-)
  • I wonder if they will start using this on the resumes of their coaches from now on?
  • A Sad Fact (Score:2, Interesting)

    by CTalkobt ( 81900 )
    but there is a lot of cheating in undergraduate courses.

    I was one of the better students in my comp-sci classes and so other students looked to me for help etc. I would routinely point them to my own finished assignments as an example of how to do something or provide listings in which we would discuss the assignment and how to do things.

    This worked well until I got called before the teacher in regards to two students having taken my listings and typed them in (with practically no modification whatsoever). I explained the truth - that I provided it for purposes of instruction not stealing and managed to escape. The other students were forced to retake the course.

    After this incident I kept my eyes wider open and noticed more students "copying"...

    It happens. Whether this program is really needed or not I think is more an indication of how well the teacher stresses the students on final exams and such.
  • Heck, even in the open source world, I can't copy someone else's program verbatim and claim I wrote it.

    Even if two people work together on a project, as long as they write their code separately, the code will be significantly different enough that it shouldn't be recognized as cheating.

    Probably what this will catch is the last minute "Quick, let me copy your program" right before it's due. And this DOES happen, and I find nothing "right" about that at all. That IS cheating, plain and simple, and should be stopped. In a class of 30 students, the instructor (or TA's) will probably be able to notice similarities. In larger classes, it's easy for these things to slip by, especially if the grading process is split amongst multiple TA's.

    -Restil
  • Cheating (Score:5, Interesting)

    by Carnage4Life ( 106069 ) on Wednesday January 16, 2002 @02:50PM (#2849886) Homepage Journal
    CmdrTaco says:
    Cuz remember programmers: in the real world you are fired if you consult with a co-worker ;)

    As someone who TAed classes at GA Tech, I take a lot of offense at this comment. There is a difference between working as a team on project-based classes once you've shown you understand the basics of programming (GA Tech has a good number of these, including classes where we got to hack the Linux kernel and another where we got to deliver a product to a customer) and wholesale copying of other people's work in entry-level classes where you are supposed to be learning to program on your own.

    Beginning programmers need to learn how to program, find information from MAN pages & API docs, and come up with solutions on their own before being introduced into team based environments. If not they never learn how to be self sufficient or even if they are cut out for programming at all.

    It is true that in the real world no man is an island, but on the flip side, how many people have worked with co-workers who are completely clueless about how to perform their jobs but hold degrees or certifications that imply they should be knowledgeable about programming? These are the kind of people who hid behind the work of others in team-based projects and submitted others' work on individual projects.
    • TOO MUCH Cheating (Score:5, Insightful)

      by Anonymous Coward on Wednesday January 16, 2002 @03:35PM (#2850263)
      Pardon me for posting anonymously [slashdot.org], but I've got to let a little venom loose.

      I was a TA at a prestigious, well-known computer science program. The professors there always unfurled elaborate anti-cheating policies. Cheating of any kind whatsoever would be brought before the Dean, where you are going to be subject to a wide range of punishments, including possible expulsion from the University. They purported to be using a script very similar to the one being described here. Yet many of my classmates cut and pasted their way through all the entry level classes while I labored away at every assignment. How did they get away with it?

      That one issue -- that people who did no original work whatsoever got scores at least as high as mine -- has been dismissed as "a fact of life" by friends and family, and I tend to not think about it too much. Why? Because I'm the one getting the education, not them, and in 5 years when college GPAs don't matter a fraction as much as intelligence, experience, and work ability, I'll get sweet (or is it l33t?) revenge.

      But in the meantime, I thought, let's become a TA and be on the other side of the fence for a change. Let me do my part to bring all these cheaters to justice.

      And you know what? The reason they all got away with it was not because the previous TAs slacked off, but because the professors, when push came to shove, just didn't care. They lied about using the script.

      When I brought identical assignments to their attention, they didn't pounce, but gave me options such as taking off some points or letting it go.

      As it turns out, we have a very forgiving Dean, and any cheaters brought to his attention will get no more than a slap on the wrist. For that, professors get to do a lot of paperwork, cast themselves as the bad cop in making the case, and get a poor reputation with students who are used to the status quo of a cheat-friendly environment. They don't want to do any of that, so they put on the pretense of being tough on cheating and hope it all goes away.

      Slashdot is mostly a young crowd, and the young are naive like I was, so let me break some bubbles: maybe at GA Tech profs let ethics take precedence over apathy, but not everywhere.

      And to all you cheaters out there: yes, you're off the hook for now, but wait until we're co-workers.

      Can someone please mod this out of the anonymous doldrums? Thanks.
  • My high school CS teacher once found a few identical programs. He printed out the source code onto some transparencies, then lined them up, one on top of the other, on the overhead projector. The only blurred spot was the comment with the students' names.

  • by TokyoJimu ( 21045 )
    When I was taking programming classes in the mid '80s at UCLA [ucla.edu], they had a rather clever cheating detection program. It didn't look at the source (Pascal or C) code, but rather at the produced assembler code to see if students were copying others' algorithms.

    So you might obfuscate your copied code by moving it around, changing variable names, etc. but it would still catch you.
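The same idea can be tried out in Python, with the big caveat that Python bytecode is only a stand-in for the compiler-generated assembler described above, and that this is in no way the UCLA program. `opcodes` is a hypothetical helper built on the standard `dis` module; it keeps the instruction names and throws away operands and variable names.

```python
import dis


def opcodes(func) -> list[str]:
    """Return the opcode names of a function, dropping operands and names."""
    return [ins.opname for ins in dis.get_instructions(func)]


def total(xs):
    s = 0
    for x in xs:
        s += x
    return s


def add_up(values):  # same algorithm, every name changed
    acc = 0
    for v in values:
        acc += v
    return acc


def shortcut(xs):  # same result, different algorithm
    return sum(xs)


print(opcodes(total) == opcodes(add_up))
print(opcodes(total) == opcodes(shortcut))
```

Renaming has no effect on the instruction sequence. Reordering independent statements would change it, so a real tool presumably did something smarter than a straight sequence comparison.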
  • by kramer ( 19951 ) on Wednesday January 16, 2002 @02:51PM (#2849894) Homepage
    Some more info on the cheating detector from a Georgia Tech Alum of the CS program.

    1. The cheating detector is not new. It's been in place for years. When I took intro programming in 1994 they mentioned it, and it wasn't new then.

    2. Everybody at Tech knows about it. They tell you about this script the first day of class. Nobody here should be surprised they were caught. The fact that they were caught only shows them to be some of the stupidest people at Tech.

    3. It catches people every term. Usual numbers are in the below-5% range. The fact that it caught someone isn't news. The fact that it caught 10% of a class is news.

    4. These classes are cake. There is no reason anyone should need to cheat to pass these classes. They are the most basic concepts of programming.
  • For all of you who posted "gee, they invented diff again": it's a little more involved than just "diff". I'm sure other schools have similar cheat-detecting programs as well. Also, why Yahoo decided to pick up on this now and pass it off as news is beyond my comprehension. Maybe they had nothing better to pass off as news. In my entire 4 years at GA Tech, I only heard about this program once and it's not a big deal. "There's nothing here to see people. Please move on with your lives."
  • Would you be so stupid as to copy it exactly? I mean, there are 20 different ways to do the same thing in just about any language. How stupid would you have to be to copy someone else's code without changing variable names and statements to be slightly different?
  • by rblancarte ( 213492 ) on Wednesday January 16, 2002 @02:52PM (#2849922) Homepage
    CS programs at schools are not out to end collaboration among students. They are aiming to produce students who know how to program. In the real world, YES, you can just copy the code directly from someone else, but what does that teach you? Nothing. Well, how to copy/paste.

    I mean, we talked about this today in class: if a guy gets a degree and makes it out of school riding the coattails of others, his degree is worthless. Once he is out in the real world, he also drags down everyone else who has the same degree from the same school, because employers will think - Guys from school x don't know jack.

    CS departments are not evil, but they are trying to uphold the principles of school. Don't misinterpret actions such as these as some sort of action to "keep people down".

    -RonB
  • One of my CS profs created a program to do something similar for himself. It would take two programs and compare them and give a similarity score between 0.0 and 1.0. Seeing anything up to 0.6 in intro courses was considered normal, since the assignments were easier, but much above that and things got suspicious fast. Of course, any red flags were hand checked. Seeing as this is the prof that taught the compiler courses, I don't think there were many false positives. :-)
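
    A similarity score in the 0.0 to 1.0 range, as described above, can be sketched in a few lines. This is only an illustration and not the professor's program: it assumes Python's standard difflib as a stand-in for whatever comparison he actually used, the function names are made up for the example, and the 0.6 cutoff is simply the figure quoted in the comment.

```python
import difflib

SUSPICIOUS = 0.6  # cutoff quoted in the comment; a real tool would tune this


def similarity(source_a: str, source_b: str) -> float:
    """Return a score in [0.0, 1.0]; 1.0 means the stripped lines are identical."""
    # Compare stripped, non-blank lines so re-indentation and extra blank
    # lines alone do not lower the score.
    lines_a = [ln.strip() for ln in source_a.splitlines() if ln.strip()]
    lines_b = [ln.strip() for ln in source_b.splitlines() if ln.strip()]
    return difflib.SequenceMatcher(None, lines_a, lines_b).ratio()


def needs_hand_check(source_a: str, source_b: str,
                     threshold: float = SUSPICIOUS) -> bool:
    """Flag a pair for manual review; the score alone proves nothing."""
    return similarity(source_a, source_b) > threshold
```

    A plain line comparison like this is fooled by renamed variables, which is why the flagged pairs still need the hand check the comment mentions.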

    It caught a few guys that I know. When confronted they tried to say that they didn't cheat. So the prof does the only sensible thing that a CS prof should do when dealing with cheating intro students: Single out a common line of code in their programs and ask them what it did. Hint: How many of you knew the ternary operator in your first forays into C? :-) Having a URL to an identical file from an algorithm archive helped too.
  • by Thellan ( 187645 ) on Wednesday January 16, 2002 @02:56PM (#2849950)
    I just graduated from GaTech in December and I was a Teaching Assistant for the Intro to Computing class for 2 and a half years at Tech. The students are told on the first day of class that cheating is not allowed and that if you are caught you will be punished. They are told about the program, and whether they believe it or not is their problem.

    The students are told it is ok to discuss the homeworks and project with each other and that it is ok to discuss the concepts. However it is NOT ok to copy each other's code.

    The program does not just compare the text of each student's homework which is what some people seem to think it does. The program gets rid of variable names, function names and things like that because a person cheating can simply change those. It compares the style of the code and it is not given common code to look at. The only code checked is the code from problems that generally generate unique solutions.
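
    The idea of throwing away names before comparing can be sketched as follows. This is a guess at the general technique, not the Tech program (whose source is not public): it assumes the submissions are Python and uses the standard tokenize module, and the placeholder strings ID, NUM and STR are invented for the example.

```python
import io
import keyword
import tokenize


def normalized_tokens(source: str) -> list:
    """Token stream with identifiers, literals, comments and blank lines erased."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER):
            continue  # comments and blank lines carry no structure
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")  # every variable/function name looks the same
        elif tok.type == tokenize.NUMBER:
            out.append("NUM")
        elif tok.type == tokenize.STRING:
            out.append("STR")
        elif tok.type in (tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT):
            # Keep block structure, but not the exact amount of whitespace.
            out.append(tokenize.tok_name[tok.type])
        else:
            out.append(tok.string)  # keywords and operators stay as written
    return out
```

    Two submissions that differ only in names, comments and indentation width produce the same list, so any comparison run on these lists ignores exactly the changes a lazy copier makes.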

    In the time I spent there I know of over a hundred cheating cases caught by the program. In some of those cases, if you had given me the 2 pieces of code I never would have said the people were cheating, but when asked the students confessed. I have never heard of someone being falsely accused. Most of the time when the 2 cheaters are asked separately they admit to it.

    Once again, Tech does not have any problem with people helping each other understand concepts like the way pointers or a vector works or the differences between stacks and queues. What they have a problem with is when each student does not do his own work on an individual homework.

    Even though some of the problems may seem not worth it, like writing your own version of strcpy, it is still necessary so that students understand how the library functions work even if they will never be writing library functions in their life.
  • I wonder how this system compares with a program developed at Berkeley called Moss to serve the same purpose. Moss is free and available as a web service. It is really pretty neat. For those of you advocating the use of 'diff': Moss is quite a bit more complicated than diff. It will match up lines of common code and also compare the choice of token names within the program. Learn more here: http://www.cs.berkeley.edu/~aiken/moss.html
  • by jmaslak ( 39422 ) on Wednesday January 16, 2002 @02:57PM (#2849965)
    The responses here, at least the ones along the lines of "But collaboration is allowed in the real world" sicken me. I would fire (and HAVE fired) programmers who couldn't program simple stuff on their own. The collaboration in industry is not anywhere near the level of syntax and elementary algorithm design.

    A University degree is supposed to signify that you demonstrated knowledge in certain areas.

    Cheating is not demonstrating knowledge.

    Undergraduate level programming assignments do not require even consultation with other students, IMHO. They are too simple. If you can't code an undergraduate programming project without extensive "consulting", then you can't program. Period.

    I am sickened by the number of people with CS degrees only because of "teamwork" and "consulting". I would guess, from my experience, 95% of people with CS degrees can't write a sort routine. Widespread use of these kinds of programs might fix some of this. As would harsher grading. In the real world, you don't get partial credit for a program that only dumps core or doesn't meet any of the design objectives. (in my opinion, any program which doesn't properly run a set of tests, provided to the students in the project instructions, should receive an "F" grade)

    No wonder the software industry is such a mess. I've seen CS *GRADUATE* students who couldn't use malloc(). Note that I did not say "who use malloc() wrong" - no, these students could not even figure out how to call malloc() or explain what it did. There's something strange happening (I call it cheating) when someone can graduate with a CS degree yet never use dynamic memory allocation knowingly...
    • I would guess, from my experience, 95% of people with CS degrees can't write a sort routine.

      I'm not a CS grad [rather, Maths], but I would not consider this a problem. What is a problem is the 95% of CS grads who don't know how to find the sort routine in the standard libraries of the language they are using.
  • school != real world (Score:2, Interesting)

    by Noonian ( 226 )
    First, the standard disclaimers: my comments are my own and should not be taken as necessarily representative of the GA Tech administration.

    There's a much better and more accurate article on the topic at the AJC [accessatlanta.com]. Take the AP version with a grain of salt.

    The fact that GA Tech uses software to detect possible cheating should not come as a surprise to anyone. Such systems have been in use at many schools across the country for many different disciplines besides CS. Nor should anyone be disturbed by the use of such systems: their purpose is to detect possible cheating, which according to the AJC article was clearly verboten to the students in the class.

    In the real world, a completely different set of rules may exist, but the fact remains that if your boss tells you he wants you to do something on your own, then you'd damn well better do it on your own. When a teacher instructs a student to perform a task on his own, he so instructs not to make life more difficult for the student, but to ensure that the student is capable of independently executing the skills necessary for the completion of the assignment. When that student eventually enters the real world, he has demonstrated the ability to perform the skills to be expected of him in the real world, so when he then has the ability to collaborate with his peers, he can actually contribute to the group's performance. A student who has always relied on others to get by will offer minimal assistance to a group and will typically act as a hindrance.

    So sure, in the real world you won't be fired for collaborating with your peers, but you will be if you can't get anything done without collaborating with your peers.
  • by Apreche ( 239272 )
    we've had this at RIT for a long time. The teachers coded it. It's really really tricky. Basically they have the attitude of, if you can get by the cheating program, then you know what you're doing and deserve the grade you get.
  • This has been implemented several ways by several people. The best known is MOSS [berkeley.edu]. I've implemented a script myself to do this and it was fairly successful, catching 5 or 6 pairs from a class of 75. Not all of the cheaters probably, but it was the worst of them.

    The hard part is turning up the "sensitivity," so you get not just exact copies, but also people who have taken parts of a program or made some trivial modifications.

    The problem is that it's hard to find info about these systems for the very good reason that this is one instance where security by obscurity makes sense. If students know how the systems work, they can re-implement them and check to see if they'll be caught.

    Greg

  • In the old sense, meaning people who believe that elegance is important and simplicity is a factor of elegance, and give them identical problems and environments to work in. Syntactically their solutions will most likely be very different (one will comment, the other not, one prefers braces indented one way, the other another way, etc.) but for many, many problems, particularly those posed in academia, their solutions will often be extremely similar, the algorithms possibly identical. How many ways are there to efficiently write a bubble sort, for example, or a node walker? There are lots of sucky solutions but may be very few elegant 'hackerish' ones, in my experience.

    So what happens when the cheating detector spits their names out?

    After 20 years of professional programming, I've recently gone back to university to get my BS in Computer Science (yes, there is a wall, Virginia.) The first time this happens to me I'm calling my cousin, the lawyer. At the least I expect to have the rest of my tuition paid for. At best I want to see some lazy prof fired. I refuse to put up with this B.S.
  • More Info (Score:5, Interesting)

    by pmcneill ( 146350 ) on Wednesday January 16, 2002 @03:01PM (#2849998)
    Here's some more info, from the perspective of a former TA (once for one of the classes in question). First, everyone at GaTech is required to take the first CS class, not just CS majors (== people in the CoC). Second, GaTech doesn't restrict collaboration in all classes. The first tier of classes are strictly individual so everyone has to be in front of the computer. In the second tier, CS2130 - Languages and Translation explicitly allows collaboration as long as people turn in their own code. Going further, later classes involve heavy amounts of group work.

    With regards to the cheater-detector program (called 'cheatfinder'), it's significantly more complicated than diff(1). It involves checking the structure of the code (ignoring variable names, indentation, and whatnot). Admittedly, I've never seen the source for it (very few people have), but it's been around since at least 1997. The output of the program is a single number indicating the probability that two people collaborated on an assignment. The threshold is typically set fairly high (0.90+), so false-positives are less likely. 187 students, the number caught this time around, is definitely the highest I've heard of, but it's definitely not the first time we've hit a large number -- just the first time it made the cover of the local newspaper.

    Interestingly, many students (including myself before becoming a TA) think (well, thought now) cheatfinder is just something the profs made up to scare students.
    • by statusbar ( 314703 ) <jeffk@statusbar.com> on Wednesday January 16, 2002 @04:01PM (#2850459) Homepage Journal
      Hmmm...

      Makes me think of the possibility of a new product! anti-cheatfinder!

      Parse the original code and then re-organize the parse tree to be different yet functionally equivalent. Output the new code based on the modified parse tree. No need to change the variable names since cheatfinder ignores them anyways.

      Then, sell the product to cheaters around the world! Hopefully they will go get jobs at microsoft!

      Jeff
  • Last semester in my Operating Systems class at Rutgers University, a large portion of the class got caught "cheating" on the first assignment by software like this. Each person got a chance to plead their case to the TAs and Professors before they were given a failing grade, because this was the first time this software was used on such a large scale (>200 students).

    It turns out that about half of the people flagged really weren't cheating -- they just all happened to independently come up with a different working implementation from the one the Professors originally intended, one the Professors hadn't even thought of themselves.

    All that ended up coming of this was that the Professors apologized on the class newsgroup -- I think they still check the code using the same program.

  • what about last year's students? or the entire internet? there's a ton of code out there that is freely available.

    I also know that I would be pissed and question the credibility of the professor that uses this on my code. Because that is exactly what the professor is doing: assuming that everyone is cheating and running them through the detector... maybe we should frisk and do drug testing on all the students, and hook them up to polygraphs during tests too, just to be sure they don't have weapons, cheating aids, or are taking controlled substances or are cheating on the tests as we know they all are.

    Any professor that would use this or even allow it to be used on his students has ZERO respect from me as that is the amount of respect he is showing his students.
  • diff don't do it (Score:5, Informative)

    by kippy ( 416183 ) on Wednesday January 16, 2002 @03:03PM (#2850019)
    First off, diff doesn't work if the kids are smart enough to change their variable names and add spaces here and there.

    I was a grader for the C++ and data structures class back when I was in school, and I saw my share of cheating. One instance that stands out is when a bunch of kids had variables called "dude" and "funtime". Problem was, they had enough differences elsewhere in their code that an automated diff wouldn't have worked. For a while, I was going to write some fancy perl that would look for certain cheating patterns that I was seeing, but then I got lazy.

    One deeper way to check for cheating is to pass code through the front end of a compiler and check what comes out. If there are too many similarities, they will stand out even if kids change parameter names and the like.

    Finally, consider this: checking for cheaters in a class isn't just doing a diff of two files. For every student in the class, you have to check his code against everyone else's. This is an O(n^2) problem. My class had around 350 people in it, so that's 122500 checks to do. If it is anything more complex than a diff (multiple files, compiler front-end, fancy perl parsing) this can take a mad amount of computing.
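
    The arithmetic above is easy to check. The 122500 figure is n^2 for n = 350, which counts every ordered pair and each student against himself; if each pair of distinct students is compared only once, the count is n(n-1)/2. The helper names below are made up for this sketch.

```python
from itertools import combinations
from math import comb


def pair_counts(n: int) -> tuple:
    """Return (n squared, number of unordered pairs of distinct students)."""
    ordered_with_self = n * n   # the figure quoted in the comment
    unordered = comb(n, 2)      # n * (n - 1) / 2, each pair compared once
    return ordered_with_self, unordered


def all_pairs(students):
    """Yield every pair of distinct students exactly once."""
    return combinations(students, 2)


print(pair_counts(350))  # (122500, 61075)
```

    Either way the growth is quadratic, so doubling the class size roughly quadruples the work.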
  • This is old (Score:3, Informative)

    by drix ( 4602 ) on Wednesday January 16, 2002 @03:09PM (#2850068) Homepage
    We have had this in the UC Berkeley computer science department for some time now. IIRC it's been quite effective; when it was first unveiled it nailed many, many students for cheating (I think). The verdict amongst students is, if you're good enough to defeat the cheating detector then doing the assignment on your own should be no problem anyways.
  • by Chris Y Taylor ( 455585 ) on Wednesday January 16, 2002 @03:13PM (#2850093) Homepage
    "Today I am going to give you two examinations, one in trigonometry and one in honesty. I hope you will pass them both, but if you must fail one, let it be trigonometry, for there are many good [people] in this world today who cannot pass an examination in trigonometry, but there are no good [people] in the world who cannot pass an examination in honesty."
    - Madison Sarratt (1891-1978), dean, Vanderbilt University.
  • Head TA Elaborates (Score:5, Informative)

    by Arkhan ( 240130 ) on Wednesday January 16, 2002 @03:28PM (#2850221)
    As a former Head TA for one of the classes in question (CS 1502 - Intro to Computing), I'll try to elaborate and answer common questions.

    No, I have no current affiliation with Georgia Tech.

    Yes, the cheatfinder really, really, honest-to-God exists. We used it every quarter that I was associated with the class and caught _lots_ of people. You'd be stunned how many people thought we were just making it up to scare them into not cheating.

    Yes, it actually works. It examines mostly source code, although some versions of it were twiddled to look at "in-between" assembler to help catch those who just change variable names and such. It scans for patterns in the logical constructs of code blocks, even if they've been rearranged or altered in other "cosmetic" ways. It also looks for exact matches in text (like the "commas in same places" mentioned by Kurt in the article), but this is misleading -- it does a whole lot more than that.

    Yes, depending on how you run it, it can generate a boatload of false positives, but it contains several tweakable threshold levels that let you control how "suspicious" a pair-match has to be before it gets flagged, and these thresholds are made looser for simple programs where there's really only one way to do it.

    No, no action is *ever* taken based on the output of the cheatfinder directly. It merely alerts the TA who's responsible for cheatfinder that quarter and he/she then manually reads the source code to see if it looks like a case of cheating. If so, it gets sent on to the professor for a final verification (and possible discussion with the student if it is a borderline case), before being forwarded to Kurt for examination and possible disciplinary action.

    Finally, yes, it's an old and very "evolved" codebase. You wouldn't want to be the one to maintain it, but on the other hand, it has been tweaked to the point where you'd be really surprised at the sort of clever cheating it can detect. (i.e. it works a lot better than diffing the source code ;)

    Anyway, figured I should throw in my $0.02 on this one, since I used to run that class.

    If anybody has any specific questions, please post to this comment and I'll reply. (Questions from current Tech students asking how to "get around" the cheatfinder will be happily ignored, of course. ;)
  • by rkischuk ( 463111 ) on Wednesday January 16, 2002 @03:30PM (#2850234)
    As a former TA for one of these classes who nearly ended up working on the cheat finder software for a quarter, let me add some additional fuel for the fire.

    1. These are not just "programmers" in the traditional Computer Science major sense. The first class is required for almost all students at Georgia Tech. It started off just for Computer Science and Computer Engineering, then expanded to all engineering majors (civil, mechanical, etc). Now, even management majors (Georgia Tech's version of Communications, Basketweaving, or whatever the weak major that many athletes did at your school) have to take the class. The language used to be a locally developed pseudocode language (affectionately known as Russcal). Right or wrong, many of these students consider the class to be an unnecessary hurdle on their way to a degree, and to a technologically illiterate management major, programming does not come easy, nor are they inclined to learn their ethical obligations as a "programmer" - they just want out of the class.

    2. Contrary to many snide remarks, the algorithm is, in fact, quite sophisticated. It is not fooled by extra white space, variable name changes, or simple rearranging. As a TA, I saw even simple algorithms done a slightly different way by every single student. Chances are that a student who will resort to cheating doesn't know enough to rearrange the code beyond the recognition of the cheat-finder and still have it be correct, and a student who does know enough would probably spend as much time dressing it up as it would take them to write the thing in the first place.

    3. Once two submissions are flagged as possible copies, they are first reviewed by a student TA. If the TA believes that they are in fact copied, it is escalated to the class manager (GT staff), and then to the dean if need be.

    It's not a perfect system, but the cheat-finder does a good job of crunching the role of a human down to a minimum, and leaves room for people to make a subjective judgement. It's pretty good, so cut the sarcasm back a bit - it's unwarranted.
  • by cfulmer ( 3166 ) on Wednesday January 16, 2002 @03:41PM (#2850308) Journal
    So, when I was the head grader for the first hard CS course at CMU, I wrote a similar program -- it doesn't need to be very complicated because cheaters are, by definition, lazy. Change some variable names, comments, whitespace, move some code around and maybe break a function into two smaller ones, and that's it. My code just counted the numbers of braces, 'if' statements, parentheses, equal signs, etc., producing a set of numbers for each program. Then, sort them and pick out the ones that are really close to each other. Those get picked out for hand checks.

    As for why: everybody was supposed to do their own work -- this was not one of the courses where people were supposed to collaborate on their programming assignments (those courses came later.) Some students went overboard on the restrictions -- there was never anything wrong with discussing the assignments (that's part of the learning process as well), but everybody needed to do their own work to prove that they understood the material.

    Now that I'm out in the "real world," this makes sense -- I can tell the people who cheated and slacked their way through school, because they don't last long without understanding what they're doing.

  • by Courageous ( 228506 ) on Wednesday January 16, 2002 @03:43PM (#2850319)

    You may not be fired for consulting a coworker, but if you take a coworker's work and then claim you did it yourself, you'd certainly better cover your bases.

    C//
  • by BWS ( 104239 ) <swang@cs.dal.ca> on Wednesday January 16, 2002 @03:51PM (#2850379)
    I was TA/Marker for 2 classes at my Univ and my experience with cheating suggests that programs like these are necessary... here are some of the better examples.

    1. for an intro to java course... I had a student that photocopied another's assignment and then used white-out to remove the other's name and write his own down
    2. for an intro to java course... they had to implement a doubly linked list structure... I had 4 students copy off another student... they changed all the variables by adding an extra letter to the end... so like next became nextS... (they used Z, S, and B FYI respectively)
    3. for a data structures course... I had two students whose code was the same except that they replaced the numbers in the variables by the written version... so like counter1 became counterOne and counter2 became counterTwo
    4. my 2nd funniest example (well, the first one is really good) is a student who tried to pass in another's assignment and changed the comment type... he replaced all the /* */ comments by // comments..
    5. my best example was once when I was TAing a course and during my tutorial one of the students actually offered me $40 to do an assignment for them... (they didn't know I was the TA)...

    in general... they are the extreme... there is a lot of general cheating going on and I think something like this is a good idea... to catch "the smarter cheaters"

  • by Embedded Geek ( 532893 ) on Wednesday January 16, 2002 @04:00PM (#2850451) Homepage
    I've taught numerous courses and once figured it wouldn't be too tough to build a detector like this. Inevitably, someone who cheated would follow a very basic procedure:
    • Copy the original code.
    • Change every variable name (even if to a less sensible name - HalfCircleWidth instead of Radius).
    • Rephrase most comments, but in the most transparent manner (e.g. "increment the counter" becomes "the counter is incremented").
    • Grab one or two lines of code near the top and rewrite them in the most awkward manner possible. Presumably, this is to prove to themselves that they're more clever than the teacher and that they could've actually done the assignment if they'd bothered.
    Inevitably, it was the trivial stuff (indentation, comment structure) that set off my alarms. Then, I'd give them a moment of truth and sit them down to try to explain how "their" code works. If they didn't, I'd kick their tails out. If I was teaching a seminar at someone's workplace, I might or might not inform their management. Since all these penalties were spelled out in my syllabus, I never lost any sleep (in fact, putting them in my syllabus tends to ensure no one tries it).

    As to the difference between "consulting" with another and "cheating", I've found that the "explain your own code" test is a pretty good yardstick. If I spend 2-3 hours preparing to teach a lecture, I have no sympathy with someone who doesn't spend enough time to do the assigned work but instead cheats.

  • by wberry ( 549228 ) on Wednesday January 16, 2002 @04:09PM (#2850523) Homepage

    I took Intro to Computing in the Spring of 1996. It was cake for me because I was a Computer Science major and I dig this stuff. But a lot of non-CS people dreaded that class above all others, especially Management, International Affairs, and Architecture majors, but also some engineering people, such as Aerospace and Industrial Engineering.

    (And can you really blame them? How many civil engineers really need to know how to sort numbers in O(N log N) time? Or insert into a linked list for that matter? They write hacked-up FORTRAN if they write anything at all.)

    Kurt Eiselt came to the first lecture and gave us a scare speech about Cheatfinder. Knowing that it looks for similarities between two students' works, I was worried constantly about my homework answers. A typical problem was to write an inorder binary search tree traversal routine in pseudocode. Honestly, how many different ways are there to do this? And there are 500 people in all sections of the class?

    Fortunately, I was never flagged, but I have heard a few stories (which may not be true, you know how that goes) of people who were flagged, and were only vindicated after losing student jobs and failing classes.

    I don't think an automated cheat detection system is applicable to small problem sets like binary search, stacks, and Mergesort. For the later classes, say Sophomore level, I have no problem with it though.

    Besides, many Greek orders and clubs on campus have extensive "word" banks--archives of previous homeworks and tests, with solutions, from previous class offerings. Are they going to check against all previous students' work too?

  • by trenton ( 53581 ) <`moc.liamg' `ta' `lnotnert'> on Wednesday January 16, 2002 @04:22PM (#2850612) Homepage
    That argument never holds up. When does your organization have all developers coding the same solution, simultaneously? Not everyone working on part of the problem. But everyone solving the same, complete problem in parallel. Never.

    So, we've now proven that in the real world, you can't very well "cheat" off a coworker because they're doing something different. You could reuse code, but that doesn't count either. You can ask for their input, but you can't pass their work off as yours. Try that and see how long you last (probably about as long as those cheater students).

  • by mikera ( 98932 ) on Wednesday January 16, 2002 @04:26PM (#2850636) Homepage Journal
    ....who thought he would cheat by copying someone else's code.

    But he was pretty paranoid about getting caught, and realised that a verbatim copy wasn't enough, you'd have to change the variable names, comments etc.

    So he did some research, wrote himself a little parser that read in the source code and built a parse tree of the program. He then wrote another function that spat out all the code again but with different spacing, block ordering and some simple variable renaming (e.g. x,y,z->a,b,c)

    To make sure the structure of the code didn't give him away, he wrote a few code transformations, e.g. if a then b else c became if !a then c else b. The order of non-conflicting assignments was swapped, and mathematical expressions were re-arranged (sometimes actually optimising the original code in the process!).

    Still wasn't good enough, the comments needed changing and the structure of the code looked the same. So he linked in a thesaurus and NLP/AUG engine to change the words in a meaning-preserving manner. Same principle could be applied to the more complex variable and function names, so buildTree became makeStructure etc.

    Finally, to put the icing on the cake he modified the program so it could output the code in a couple of different functional languages. Made the plagiarism almost impossible to spot.

    Best programmer I ever met.
  • by Dominic_Mazzoni ( 125164 ) on Wednesday January 16, 2002 @05:07PM (#2850961) Homepage
    The best formal cheating policy I've seen was from Professor Steven Rudich at CMU:

    CMU 15251 Course Document and Cheating Policy [cmu.edu]

    His policy encourages collaboration and specifically forbids cheating. It itemizes various types of cheating, for example copying from another student, letting another student copy you, and looking at someone else's files online (even if they forgot to set their file permissions).

    Furthermore, he requires all of the students in his class to sign a statement saying that they have read and understand the cheating policy. Not only does that discourage some students from cheating, but it also makes it much easier for him to get students into serious trouble with the school when they are caught.

    In addition to the course document, here's more or less what he had to say on the first day of class: (I apologize for paraphrasing; this is how I remember it) "Nobody plans to cheat. You all must be very smart, or you wouldn't be here. You think you're going to try hard and do well in this class. But later in the semester you'll get busy with other classes and activities, and all of a sudden an assignment will be due in one day and you haven't started. Or you'll be taking a test and realize that you forgot to study an important equation. Or you'll work hard on an assignment and almost completely get it working, but get stuck on one subroutine. Even though you never planned on cheating, all of a sudden you'll find yourself in a circumstance like that and it will seem tempting."

    (BTW, I shouldn't have to say this, but Prof. Rudich's cheating policy is copyrighted. If you're a teacher or T.A., don't copy his cheating policy without his permission. That would be just as dishonest as cheating!!! If you want to use it, contact him and I'm sure he'd be delighted to let you use it, as long as you give him credit.)

  • by MattJ ( 14813 ) on Wednesday January 16, 2002 @05:37PM (#2851200) Homepage
    As you may have read, bestselling historian Stephen Ambrose was recently caught having lifted sentences and even passages from other sources, and passing them off as his own writing in his books. (While he mentioned the source books in footnotes/endnotes, he did not put the cribbed text in quotes.) At least four different Ambrose books have now been shown to have the same pattern of lifted, unattributed passages.

    These instances only came to light because an author of a lifted passage noticed it while reading Ambrose's book. Subsequent episodes came about because other authors started looking, and now some people are checking out new likely sources; this works because Ambrose only lifted passages from books that he admired and heavily footnoted (at least, so far as we know!).

    Perhaps Ambrose was really just lazy, as he was fairly open about crediting others for the ideas (he "just" failed to credit them for the words, too). There are many cases of sneakier plagiarism than that, both in academia and in journalism.

    So, class, the programming problem for today is, given the text of two books, spit out the most likely candidates for lifted passages, based on length and similarity of words. You get a B if you can do this for exact, verbatim matches, an A if you can do it with individual word substitution, and an A+ if you can recognize re-ordered clauses. The end users for this tool would be 1) authors everywhere who want to protect their own writing, and 2) journalists looking for juicy plagiarism scandals.
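
    A minimal sketch of the "B" tier only (exact, verbatim matches), assuming plain-text input and a hypothetical minimum run length of n words. The names words and lifted_passages are made up for illustration. It indexes every n-word window of one text, looks up each window of the other, and extends each hit as far as the two texts keep agreeing. Word substitution and re-ordered clauses would need fuzzier matching than this.

```python
import re
from collections import defaultdict


def words(text):
    """Lowercase the text and split it into word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def lifted_passages(text_a, text_b, n=5):
    """Return (length_in_words, passage) pairs shared verbatim by both texts.

    Only runs of at least n consecutive identical words are reported,
    longest first. The threshold n is an assumed tuning knob.
    """
    a, b = words(text_a), words(text_b)

    # Index every n-word window of b by its starting position.
    index = defaultdict(list)
    for j in range(len(b) - n + 1):
        index[tuple(b[j:j + n])].append(j)

    found = set()
    for i in range(len(a) - n + 1):
        for j in index.get(tuple(a[i:i + n]), ()):
            # Skip hits that merely continue a run that started earlier.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend the match as far as the two texts keep agreeing.
            k = n
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            found.add((k, " ".join(a[i:i + k])))

    return sorted(found, reverse=True)


if __name__ == "__main__":
    book_a = "He wrote that the quick brown fox jumps over the lazy dog every day."
    book_b = "It is said the quick brown fox jumps over the lazy dog, again and again."
    for length, passage in lifted_passages(book_a, book_b):
        print(length, passage)
```

    Sorting by length puts the most suspicious candidates first, which matches the "based on length" part of the problem. Memory grows with the size of the second book, so two full-length books would probably want hashed windows instead of tuples.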
  • by brumby ( 93242 ) on Wednesday January 16, 2002 @06:13PM (#2851380)
    When a friend was taking a prac class a few years ago, he had the following happen.

    A student came up and said "I've written all of the assignment, but the compiler is broken." My friend looked at the error output from the toy teaching language compiler.

    "Unknown keyword 'From:' in line 1 'From: student2@cs.university.edu'"

    "Unknown keyword 'Subject:' in line 2 'Subject: Assignment 2 answers'"

    ...and so on.

    The student tried to insist that it was all his own work.

  • Real world (Score:3, Funny)

    by p3d0 ( 42270 ) on Wednesday January 16, 2002 @06:48PM (#2851525)
    Cuz remember programmers: in the real world you are fired if you consult with a co-worker ;)
    And remember, the primary purpose of school is to simulate the real world.
  • by StevenMaurer ( 115071 ) on Wednesday January 16, 2002 @07:09PM (#2851662) Homepage

    Was in a class where the instructor asked us to write a program to perform an ASCII sort of a file (kind of like 'sort' actually). I specifically asked if we could use libraries, and he said yes. Of course most of the students were using Pascal...

    You can probably guess what I did. My program featured the prominent use of "qsort()" out of the C library. Even though I had learned about callbacks with the thing, he really didn't like it. Made me go back and reimplement it so that there was an actual "sort" being performed in my code. Ugh.

    Now I'm a Principal Engineer.

