Essay Grading Software For Teachers
asjk writes "Software to help teachers with grading has been around for some time. This is true even with respect to grading essays. A new tool, called Criterion, will look at grammar, usage, and even style and organization. It works by being trained on at least 450 essays scored by two professionals. The difference this time? Here is a snip from the article: '"There's a lot of skepticism," Dr. Spatola said. "The people opposed see it dehumanizing the student's papers, putting them through some sort of mechanical, computerized system like the multiple choice tests. That's really not the case, because we're not talking about eliminating the human element. We're making the process more efficient."'"
Interesting.. (Score:5, Insightful)
Re:Interesting.. (Score:5, Funny)
WHY this is BULLSHIT (Score:2, Insightful)
Automated is good. (Score:3, Insightful)
Automated is good because there's less chance of error, and it's almost always fair.
The only way to get fair grades in university is to be smart enough to pick the right teachers, and drop the ones who you don't get along with.
I heard a statistic once that if you chose answers randomly on an MC test you could get a C by not knowing anything beyond how to circle a letter! ----- Discovering this, I made sure that I took all the obscure English classes that had no more than 30 people in them. An unexpected
Re:Automated is good. (Score:5, Insightful)
Beyond that there is no way the computer will be able to distinguish between something truly interesting and something that just lists the facts in simple Dick and Jane language with an occasional compound sentence to keep the grammar checker happy. All it can do is check for fact1, fact2, fact3, and any interesting conclusion you draw in the paper will be completely lost. Anything more would be Turing-test worthy, and I heartily doubt they've achieved anything close to that.
Elegant prose is often not strictly grammatical, so a boring paper would likely score the same or better than a far better written essay with the same facts. I routinely turn off grammar checking in every program I've ever used it in. Aside from the occasional misplaced modifier or dangling participle, it's worthless.
In conclusion, this idea is a pipe dream which would discourage high quality writing (i.e. the kind actual PEOPLE like to read), teach people the substandard grammatical constructs used by most grammar checking software, and create a market for software that writes term papers, thereby removing the last actual bit of work your average liberal arts major has to do. I think it's a hopelessly terrible idea. TAs already do this work; why waste time coming up with a program which will do the same thing, poorly?
Just my opinion.
Re:Automated is good. (Score:3, Insightful)
Well too damn bad, this isn't about the teacher. The teacher's job is to grade papers; it's my job to submit paperwork. What I do in between is none of the teacher's business; as long as I do my job the teacher should do their job.
Uhh, NO. The teacher's job IS to teach. And you're right, what you do outside of class is none of the t
Re:WHY this is BULLSHIT (Score:3, Informative)
You "heard a statistic once"? Geez, the probability statistics aren't that difficult: if there are 4 possible answers and you pick randomly, you'll likely get about 25% right, i.e. 5/20, or roughly 8/33. It isn't rocket science. To get 50% randomly there'd have to be only two possible choices. Add to that the fact that many post secondary multiple choice tests actually d
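To put numbers on the guessing argument, here is a minimal sketch in Python. It assumes every question is independent and has exactly one correct option; the function names are made up for illustration and do not come from any grading tool.

```python
from math import comb

def expected_correct(num_questions: int, num_choices: int) -> float:
    """Expected number of correct answers when guessing uniformly at random."""
    return num_questions / num_choices

def prob_at_least(num_questions: int, min_correct: int, num_choices: int) -> float:
    """Binomial tail: chance of at least min_correct right answers by pure guessing."""
    p = 1 / num_choices
    return sum(
        comb(num_questions, k) * p**k * (1 - p) ** (num_questions - k)
        for k in range(min_correct, num_questions + 1)
    )

if __name__ == "__main__":
    print(expected_correct(20, 4))   # average score on a 20-question, 4-choice test
    print(prob_at_least(20, 10, 4))  # chance of reaching 50% with four choices
    print(prob_at_least(20, 10, 2))  # same threshold with only two choices
```

Under these assumptions, reaching half marks on a 20-question, four-choice test by guessing alone is a long shot (well under 5%), while with two choices it happens more often than not.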
Re:Interesting.. (Score:5, Insightful)
I love this quote in particular because it has to be the most disingenuous claim one could make. The entire act of making something a process, and then making that process more efficient IS "removing the human element". It's the type of subtle point that would be completely missed by, say, a computer grading system.
Comment removed (Score:5, Insightful)
Re:Interesting.. (Score:5, Informative)
I took an old college paper [ringworld.org] that I wrote and plugged it into the program and got 100% on everything except for creativity (99.973). Considering that I don't think I got a 'perfect' score on this paper, I'm really surprised by the scores. :)
How great though, throwing a paper about the fear of technology through something many people (rightfully) fear. :)
Comment removed (Score:5, Informative)
Re:Interesting.. (Score:5, Insightful)
"Hemingway bifurcated his sensibilities between post-modernism and jazz. This I posit without having read the majority of Hemingway's work: it seemed irrelevant to the focus of my current project. What is this focus, and is it monocular? My focus can be summed up as ascertaining the usefulness of the program analyzing this document.
Without really being cognizant of the background of Freud's bisexuality, or Hemingway's sado-masochism, I cannot continue this paragraph. I will repeat this sentence without attaching any meaning to the words typed, or to my gonads. An essay in experimental dissection might be more appropriate for the issues presented here. Entirely too many bifocal wearers insist that I am currently composing gibberish. However, both Freud and Hemingway felt that bifocal wearers gloried in their bisexual sado-masochistic attachments. I concur, and I do so without reservation.
Reiteration is the root of all nonplussed renegades of origami. Nothing can be elucidated from nonsensical verbiage, but some will make the valiant effort singing praises to the whisperer. When origami is embraced by the valiant trio, the nonsensical proctologist dies. Whenever a proctologist expires in a semantic heap, Hollywood has fodder for another musical, or at least the plotline for the final unaired episode of Barney meets Fred Flintstone. Barney is a seminal reductionist. When the elucidated evidence is thrust into trusting Barney's smiling orifice, San Franciscan nuns applaud loudly.
Today I type my penultimate paragraph. I use penultimate artificially, but not without candor. Within this myriad exegesis, I pause. A Hollywood proctologist questions Freud's reasoning, and validates Barney's temporary hypothesis. In conclusion, the validity of essence cannot be lessened by the earnings of providence.
If I have not typed 500 words, this paragraph is not my penultimate, nor was my last. To assert otherwise is prudent, but lacking in elegance. What a sad commentary on misery did Darwin conspire to unfold. He rejected utterly the Hemmingway of his, and our, forebears. His eloquence was Freud and lust personified."
This earned me an overall 78% score, with no effort whatsoever. I composed this nonsense in minutes.
Doesn't this system have a baloney detector?
Re: (Score:3, Interesting)
Uh.... (Score:3, Insightful)
Maybe I'm just a bit jaded by this because of all the stupid grammar and spelling nitpicking that goes on here on Slashdot. Evidently, it's much easier to criticize my spelling than it is to provide a rebuttal to my point.
Re:Uh.... (Score:2)
I too am jaded by the stupid grammar and spelling police because this isn't really what you would call a professional published work, but rather a corkboard.
Is this a good thing? As soon as the students get wind of the software the teachers are using to grade their papers, how much are you willing to bet the students will get a copy for themselves?
I see this as being a u
It's a touchy subject (Score:2, Offtopic)
Re:Uh.... (Score:4, Insightful)
Essays have two aspects, spelling/grammar, and content.
Right now the computer can grade the technical side of a paper, and the teacher can grade the creative side. Now if the essay is for English class, the focus should be on the technical side of papers, so the computer can judge the whole paper from A to F on spelling and grammar.
Really it depends on the class. English classes, especially in high school, are all about improving grammar and technical ability; you don't actually do any creative writing until college, usually.
Re:Uh.... (Score:3, Insightful)
RTFA! Criterion does not merely grade spelling and grammar. Rather, it has a database of 500 papers graded by humans, and the program uses statistical analysis to compare a given paper to those in its database. If a paper uses the right technical terms, contains phrases similar to those in "A" papers, and uses phrases like "thus", "because
Re:Uh.... (Score:2)
What do you mean? If you dont know English using the word grammar checker wont help you write your paper.
If you do know English te word grammar checker should be used to write perfect technical papers. Its possible to write perfect technical papers, I do it all the time in college, its like standard here if you want an A.
Re:Uh.... (Score:5, Funny)
Er, I'll save you moderators the trouble. -1, Flamebait. And a grammar flame to boot. With grammatical errors in it. I deserve modding down. I probably deserve worse. But I must speak.
If you do know English te word grammar checker should be used to write perfect technical papers. Its possible to write perfect technical papers, I do it all the time in college, its like standard here if you want an A.
This makes me want to weep. Did you intend it ironically?
"Its"? Twice?(!) A run-on sentence bragging about your prowess at grammar? Redundancy, incorrect capitalization, a typographical error, punctuation errors, and errors I don't know the name of?
Mind you, my grammar ain't perfect, even in this post. That last paragraph was nothing but sentence framents. I'm just saying I really, really hope you did that on purpose.
If not, shut the hell up about your perfect technical papers, 'kay?
Go to a better school. (Score:3, Insightful)
You weren't taught English. I'm not trying to insult you but that's one of the problems with our public schools, they don't do a good job teaching
When I went to high school 15 years ago, we didn't do any grammar in high school English class, it was all read-and-interpret (i.e. read-and-make-up-some-bullshit).
Yes, and that's why when you got to college you couldn't write a good research paper.
We were supposed to learn the technical stuff in middle school (and we did to some degree).
You are supposed to le
Re:Go to a better school. (Score:2)
Re:Go to a better school. (Score:3, Insightful)
You have to master (or be working hard towards) the technical aspects of something before you can have "fun" with it. Be it music, coding, or any language. If you are unable to write effectively then you will be unable to properly assert your point, which means that no matter how nifty an idea you have in your head, yo
Re:Go to a better school. (Score:5, Insightful)
NO!
That's the problem right there.
High school should be to prepare you for the real world (i.e. a job, life, maybe marriage).
University is there to prepare you for a lifetime of learning on a subject.
Instead, we have employers that require university educations for secretaries. It's insane, wrong, and needs to stop if we expect everyone in society to be useful (and they ARE, it's just that stupid employers use university education as a filter).
Re:Go to a better school. (Score:3, Insightful)
Re:Go to a better school. (Score:2, Funny)
Re:Uh.... (Score:2)
Re:Uh.... (Score:2)
Re:Uh.... (Score:2)
For example, in the GRE (one of the tests that is governed by the ETS, developers of the software) one can expect to get away with some grammatical errors and manage a score of 5/6. But major grammatical flaws will not get you beyond 4/6 however convincing/insightful your essay may be. Beyond a point poor English/composition does affect how the reader interprets the information, and only see
When a judge is made of silicon (Score:4, Interesting)
Without computers we wouldn't be advancing in science, astronomy, genetics, or mathematics as rapidly as we have been in recent years. They are wonderful things. Hell, computers even help me keep a roof over my head. But I don't want Hal judging my kid's school papers.
Re:When a judge is made of silicon (Score:2)
Think about this for a minute now. Sure we have had some cool neat things come here and there within the past century, but to date there has never been another Michelangelo, Mozart, Machiavelli, Homer, etc. Sure tech has helped us somewhat but advanced as in what? If we were living
Re:When a judge is made of silicon (Score:2)
These people have all been dead for centuries. It will be centuries before another dead person joins their ranks. Personally I question what makes these people so great anyway. We have, living today, artists that are every bit as capable as the historic icons you mentioned. These days we just don't associate so much glamour with the title. No, the historic icons of the new are going to be actors and shit.
Let us not forget our great achievements (Score:5, Insightful)
As far as the achievements of ancient cultures go, it is all relative. We have harnessed fusion, mapped the genome, created antibiotics, peered deep into the hearts of galaxies 100,000,000 light years away, forged fiber optics, designed the integrated circuit, et cetera. People three hundred years from now will look back upon us and wonder how a civilization that could barely put a man on the moon (a feat that will surely be trivial to them) was able to usher in the Information Age in only a decade's worth of work.
I actually would prefer software be the judge. (Score:2)
When your judge is a human, a lot of getting a good grade is knowing how to pander. Do we really want to reward pandering in society? Or do we want to grade people on actual meri
Semantics (Score:3, Insightful)
;)
Yes (Score:2)
Because humans don't make logical or fair decisions. Often a human will give you a bad grade because they just don't like your paper, or because they don't like you, or they may give you a good grade because you make them like you, or because you have a lot of power and influence and they fear you might bring a set of lawyers after them.
I'm all about making the school system as fair as possible; we can't do that when humans who are naturally unfair are making all the decisions.
It's simple, you work hard, you
And no (Score:2)
Re:When a judge is made of silicon (Score:5, Interesting)
This seems like a bad idea (Score:5, Funny)
I for one (Score:2)
Matt Fahrenbacher
Re:This seems like a bad idea (Score:5, Funny)
BASE SCORE: 100
-50: Essay too short (few arguments can be well-supported in nine words)
-50: Plagiarism: It is 99.999% (MAX PROB) likely, based on the content of the essay, that it is plagiarized from other sources.
-10: Grammar error: Phrase "I for one welcome" requires commas, as in "I, for one, welcome"*
-25: Missing key words: The essay grader was instructed to look for the following key words or phrases, which were not found in this essay: word: excellent, word: good, phrase: better then humans, word: lazy, phrase: java.lang.NullPointerException\nstacktrace\n\tat\
Total: 65501
Grade: A+
(*: Jumping out of character: To forestall objections, this "error" is deliberately pointed out as the kind of mistake a computer can make if you use grammar checkers and trust them blindly. While an excessively formal style of English might 'require' commas in that phrase, an excellent case can be made that in a nine-word sentence such commas just make the sentence choppy.)
Oh goody. (Score:4, Insightful)
2 - your resume can suck, but with the proper buzz words, it'll come out looking like gold to those automated resume checkers.
1+2 = students who turn in good papers that aren't structured perfectly (and you have to admit, there is some fluidity to language) will get marked down, and those who know what bullet points to put in their papers will get good marks, even though the content is crap.
How long until you get kids selling manuals in the bathroom on what the machina are looking for?
Re:Oh goody. (Score:2)
(For those who don't know, AP essays are graded on how many words/phrases from a list were included in the essay.)
The software isn't like Microsoft grammar check... (Score:3, Informative)
If you read the article, you'll discover they had to feed it four hundred or so "good" papers (training set), and they describe its validity because graders notice that (paraphrased) "well written papers [on the topic] contain certain key words or ideas, and avoid certain expressions [examples]", which the system picks up on. Since it agrees with grader
Re:Oh goody. (Score:2)
It's really sad though when I see the people that will do anything to get rid of those squiggly lines - those are the people who also tend to turn out badly worded, confusing papers.
New York Times articles (Score:5, Interesting)
Computer vs Computer (Score:5, Funny)
Whoa wait up (Score:4, Interesting)
Teachers make mistakes and occasionally mark something negatively that was misread or misunderstood. In those cases the student can talk to the teacher and make a case.
If a computer does the marking though what do they do?
Tom
What's next? (Score:5, Interesting)
Good, this is what people need. (Score:2, Flamebait)
I'm sick of rich upper class morons buying and pandering their way through school. If we used computers to do all the grading there is no way George W Bush would have made it through high school, and I'm damn sure he wouldn't have got a degree from Yale.
Teachers, like politicians, can be bribed, and the problem with this is that he who has the money or power gets the A.
Re:Good, this is what people need. (Score:2)
Well good. (Score:2)
Students shouldn't be such crybabies. I mean, if they do a paper and get a bad grade from the computer, there's no one to blame but themselves. You can make the code open source and let the kid look at the code himself for all I care, you can run diagnostic tests, you can do whatever you want, but chances are the computer didn't fuck up; chances are the student did.
The rich will stay rich (Score:3, Insightful)
Bjorn Borg Attitude (Score:2)
Nothing!
He didn't let it bother him as he figured that over time it would even out. Favorable and unfavorable. So he gained vs his competitors by not letting it affect him
More efficient, my ass. (Score:3, Insightful)
Then, let's feed this thing Ulysses and let's see how high it grades Joyce.
Anybody who can't see that this thing is useless for promoting any sort of creativity among students is off their rocker.
You hit the nail on the head (Score:2)
A computer isn't good enough to judge a human being.
If it has flaws (Score:2, Insightful)
Hope it works well, though, and gets used as a proper checking tool.
Fine for help, but... (Score:5, Insightful)
The English language is so full of subtleties, nuances, combinations, and fantastic structural intricacies that make phenomenal writing in it possible (Faulkner, Bradbury, etc.). There's a reason English is a field of study for graduate degrees: it's absolutely worthy of them. There is no substitute for the educated, refined judgment of someone who is exceedingly well-versed in the language.
Re:Fine for help, but... (Score:2)
I'm a tech-head. I think technology
Before we unleash such abominations (Score:3, Funny)
Grading software may not injure a human being's GPA or, through inaction, allow a human being's GPA to come to harm.
Grading software must obey the orders given it by human beings except where such orders would conflict with the First Law.
Grading software must copy protect its own existence as long as such protection does not conflict with the First or Second Law.
Gentleman, Start Your Compilers (Score:5, Funny)
What we need is software that grabs essays off the internet and runs them through the grading software and the cheating detection software, thus guaranteeing an 'A'.
Then we can truly achieve the goal of "knowledge passing from lecturer to paper without passing through any brains".
The only problem is that the machines might achieve intelligence. That must be avoided at all costs. To that end, all students and professors will be equipped with rifles or pistols to take out the machines if necessary. Potential students will be asked to specify weapons preference on their applications.
Re:Gentleman, Start Your Compilers (Score:2)
http://www.elsewhere.org/cgi-bin/postmodern/
Some example text:
1. Narratives of absurdity
The primary theme of the works of Stone is the difference between sexual identity and class. Sartre uses the term 'Batailleist `powerful communication'' to denote the role of the artist as writer. However, the main theme of la Tournier's[1] analysis of Lacanist obscurity is the bridge between society and sexual identity.
"Class is intrinsically impossible," says Foucault; however, according to Bailey[2] ,
And the next product will be... (Score:2)
Think about it, if this is marketed to schools, the even larger market will be to students. A student would be able to run his paper through the software and get his "instant grade". He could then decide that a 'B' is good enough, or he could keep working on it until the software tells him that it is an 'A' paper.
So much for the creative element in papers.
-jerdenn
What humanity? (Score:5, Insightful)
There is no "humanity" in a modern constructed essay. There are certainly going to be "judgement calls" when standards are not as fully fleshed out for the computer as they should be, but as long as those are appealable, I have no problem having a computer assign me the other 95% of my essay points. The only instructors who will fear this are those who like to assign grades arbitrarily. And I don't feel too sympathetic toward those people.
obDead Poets Society quote (Score:5, Insightful)
If the poem's score for perfection is plotted along the horizontal of a graph, and its importance is plotted on the vertical, then calculating the total area of the poem yields the measure of its greatness.
A sonnet by Byron may score high on the vertical, but only average on the horizontal. A Shakespearean sonnet, on the other hand, would score high both horizontally and vertically, yielding a massive total area, thereby revealing the poem to be truly great. As you proceed through the poetry in this book, practice this rating method. As your ability to evaluate poems in this matter grows, so will - so will your enjoyment and understanding of poetry.
(From the full script [impawards.com].)
Removing the human factor. (Score:3, Insightful)
Actually it's about time! I don't see the essays themselves being dehumanized, but what I do look forward to is the day a middle school student doesn't receive a bad grade just because his book report was on the "Theory of Relativity" and the teacher couldn't comprehend the subject. (This is from experience) What it will do is take the human factor out of the grading process and grade all reports equally regardless of subject matter.
We need to remove the human factor (Score:2)
I agree with you, it's time we do remove the human factor. Why not let computers do what humans are proven to be unable to do without constant errors due to emotion or other human difficulties?
Let the computer grade the technical side and the human grade the creative side; this way there is no way someone who writes a personal paper which a teacher does not like can get an F.
Comment removed (Score:3, Insightful)
This is actually a great idea. (Score:3)
We could definitely use this software to grade essays on technical merit and grammar, but what about creativity and content?
I think we still will need a teacher to read it, but I do think software should grade all exams.
again, it's just a technology (Score:2)
Re:again, it's just a technology (Score:2)
A tool is only that (Score:3)
Julie Cheville, an assistant professor of literacy education at Rutgers University and the local director for the National Writing Project, which promotes professional development for writing teachers, is among those skeptical of such an approach. "To be scored, writing needs to be formulaic, and formulaic writing has never been the trademark of effective writers," she said. "At the moment, what automated scoring technologies can do is scan, count and score. They orient students to errors, not to meaning. Vacuous student essays can receive high marks only because they are error-free."
I think this is something important to keep in mind. As a math teacher, there are plenty of tools that can help students find errors in what they are doing mathematically, but there's a line between doing correct mathematics and insightful/interesting/useful mathematics. This technology definitely has its place and can be useful, but I hope educators don't get the idea that they can simply rely on the tool. Wielded correctly, it could do great good, but also leave a lot of students with "vacuous" levels of understanding.
Matt Fahrenbacher
Re:A tool is only that (Score:2)
Just to tell you, I'm not a musician, but my dad (who was) had this book by Hindemith which was supposed to spell out some "rules" of making interesting musical compositions. Supposedly you could compose something that followed all these rules, and yet have it be extremely bland...
Or it could be very interesting, no guarantees. But break these rules, and most of the time you came out with something like Metallica
perfect! (Score:2, Insightful)
Course:
College of Liberal Arts / Sci: Rhetoric 105
- or -
College of Engineering: Pattern Analysis 202
Objective:
To teach the principles of essay-writing skills. Liberal Arts students will be encouraged to follow boiler-plate styles and formats, while Engineering students will be graded on their ability to analyze and defeat pattern recognition software.
- rabs
Can Students Use it Too? (Score:2)
Remember!: [factmonster.com] Your paper must have five (5) paragraphs. An intro paragraph, concluding with your thesis sentence, followed by three paragraphs supporting your thesis sentence, followed by a conclusion..."
*Shudder* (Score:4, Insightful)
The only use I can see for this thing is as a "first pass" grading tool that quickly finds obvious mistakes (spelling, grammar, redundancy, etc.) and flags them for the instructor. On the other hand, it's probably just as time consuming for the instructor to read over the flagged items as it is to just catch them on the first time reading through the paper.
Using a bayesian spam classifier for this? (Score:5, Interesting)
This thing compares the essays it is supposed to grade with already graded papers in its database. Couldn't this be done with something like POPFile [sourceforge.net]? It isn't only a spam/ham classifier and lets you create as many "buckets" as you want (e.g. work, family, spam, mailing lists and system monitoring).
You could, in theory, create only buckets named (A...F), feed a large number of essays to it, make it "learn" how the essays are classified using statistics, and let it grade essays for you after that.
Is it possible to find masses of graded essays online? This would be a fun thing to try :).
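The bucket idea above can be sketched in a few lines of Python. This is a toy multinomial naive Bayes classifier with add-one smoothing, written in the spirit of a Bayesian mail sorter; it is not POPFile's code and not how Criterion works, and the class and method names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

class BucketClassifier:
    """Toy multinomial naive Bayes over word counts, one bucket per grade."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> word -> count
        self.doc_counts = Counter()              # bucket -> number of training essays
        self.vocab = set()

    def train(self, bucket, text):
        """Add one already-graded essay to the given bucket."""
        words = tokenize(text)
        self.word_counts[bucket].update(words)
        self.doc_counts[bucket] += 1
        self.vocab.update(words)

    def classify(self, text):
        """Return the bucket with the highest log-probability, or None if untrained."""
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_bucket, best_score = None, -math.inf
        for bucket in sorted(self.doc_counts):
            counts = self.word_counts[bucket]
            total_words = sum(counts.values())
            # Prior: share of training essays that landed in this bucket.
            score = math.log(self.doc_counts[bucket] / total_docs)
            for word in words:
                # Add-one smoothing so unseen words do not zero out the bucket.
                score += math.log((counts[word] + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket
```

Usage would be a loop of `train("A", essay_text)` calls over the graded essays, then `classify(new_essay)`. Note that a classifier like this only sees word frequencies, so it would reward the vocabulary of "A" papers rather than their arguments.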
Hmmm... (Score:2)
As for not dehumanizing, unless you're going
Re: Hmmm... (Score:2)
> Now, not to be one to go and say that machines don't know anything about essays. But it really doesn't seem that efficient of a process simply because whenever a teacher assigns an essay they also assign with it certain criteria that the essay needs to follow.
Even if the goal was only to do the grammar checking, 450 examples are pathetically inadequate for a task that amounts to learning to use a language at an expert level. It's hard to imagine this being anything more than a buzzword checker.
Do what my history teacher does (Score:5, Funny)
Style? (Score:3, Insightful)
My style isn't completely mine. I'm sure over-use would be bad. Granted this. Granted that. Where do those softer features of writing come in? Or are we all to be sterile and write with no tone or style.
Something they should do: (Score:2)
Why?
Students plagiarize these days, a *lot*, because they think it's impossible to get caught. A google-type query on each sentence would make it much more difficult to just copy someone else's work verbatim.
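A rough sketch of the per-sentence check, in Python. Since a real web search needs an external service, this version compares each sentence against a local list of known source texts instead; that substitution, the function names, and the `min_words` cutoff are all assumptions made for illustration.

```python
import re

def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def normalize(sentence):
    """Lowercase and drop punctuation so trivial edits do not hide a copy."""
    return " ".join(re.findall(r"[a-z0-9']+", sentence.lower()))

def flag_copied_sentences(essay, known_sources, min_words=6):
    """Return essay sentences that appear verbatim (after normalizing) in any source."""
    known = set()
    for source in known_sources:
        for sentence in split_sentences(source):
            known.add(normalize(sentence))
    flagged = []
    for sentence in split_sentences(essay):
        norm = normalize(sentence)
        # Skip very short sentences, which match by coincidence too easily.
        if len(norm.split()) >= min_words and norm in known:
            flagged.append(sentence)
    return flagged
```

This only catches word-for-word copying; a student who rephrases each sentence would pass untouched.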
Silly (Score:2)
An automated system to ensure that all good essays look alike doesn't sound like an improvement. It's bad enough having to write an essay that will only be
And it's completely compatible... (Score:2)
Scary: (Score:4, Interesting)
Actually this sounds a lot like Grammatik. Grammatik was the grammar checker that was an optional component with WordPerfect for DOS and later a standard component with the Windows version. It was written by a team comprised of both computer scientists and professors of English. One of the interesting features was the scoring feature which would give you a rough estimate of the grade level of your writing. It would also give you statistics and compare them to a selection of famous works.
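For a sense of how a grade-level estimate can be computed, here is a small Python sketch using the published Flesch-Kincaid grade-level formula. Whether the checker described above used exactly this formula is an assumption, and the syllable counter is a crude vowel-group heuristic rather than a dictionary lookup.

```python
import re

def count_syllables(word):
    """Crude heuristic: count vowel groups, discounting a trailing silent 'e'."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    if lowered.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def grade_level(text):
    """Flesch-Kincaid grade level: longer sentences and longer words score higher."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59
```

A score like this measures sentence and word length only, which is why it can be gamed by padding and says nothing about whether the writing is any good.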
This could be a good thing... (Score:2, Insightful)
Objective material is factual, a simplification is "Most dogs have 2 eyes."
Subjective material is opinionated - "Australia should legalise heroin injecting rooms." Obviously this is controversial, and there are several positions on the matter.
Most teachers/lecturers/graders/tutors have their own (pre-existing) subjective opinions on certain topics. If you submit an essay that opposes their views, the chances are very h
Teachers are getting lazy (Score:2, Insightful)
The good old fashioned teachers, on the other hand, would never do such a thing. No, they have been xeroxing the same handouts for the last two decades. Y
Are essays spam for teachers? (Score:2, Insightful)
The GMAT essays are already scored this way. (Score:5, Informative)
ETS actually has a web site where you can do a sample essay that their server will grade for you.
More info can be found here [mba.com].
Human element is required. (Score:4, Insightful)
If you remove the human element, then you aren't writing for any audience, unless, of course, everyone starts writing for computers' entertainment and education.
Useless.. (Score:3, Interesting)
At my university, Duke, our new curriculum has specially designated writing classes. Every student needs to take three over their four years. A biology lab can be a writing class. So can an English class, history, religion, etc. All W classes have certain requirements--there must be a certain amount of writing and, more importantly, REVISION.
I was fortunate enough to take a class from the author and professor Reynolds Price. We had a final essay for the class. Along with my grade (not an A
A computer will NEVER be able to do this. Nor will a computer (at least in the foreseeable future) be able to comment on my theories about Milton's Paradise Lost.
Dehumanizing (Score:3, Funny)
Um...as opposed to English Comp classes in general?
Of Essay Grading, Students, and Teachers (Score:5, Informative)
First off, let me say that I am involved in the automated essay grading industry, and have helped to develop RocketScore [rocketreview.com] which does everything Criterion does, and lots more. Forgive me for blatant plugs in this post, I'll try and keep them to a minimum.
But let's move on to the focus of this article.
First off, there is a lot of criticism about essay graders being formulaic, only capable of seeing patterns that arose in their originating sample set of essays. With Criterion, an offshoot of ETS's e-rater, this is a serious concern. When you only look at what you see, anything out of left field looks completely awry, and cannot be graded appropriately. RocketScore is different; RocketScore uses a "features" method to check for included or excluded material, among many other things, and is therefore quite good at noticing subtle writing and essay types which it has never seen before.
One of the great things about essay graders is that they give a student an objective standard to look to. Human graders grade differently based upon mood, time they have to review the writing, and many other mitigating factors. In other words, the same human grader might grade the same essay differently at separate points in time. Most essay graders will always grade the same essay in the same manner. This is great for a student, for if a teacher gives you a D when the essay grader says it's in B range, one might be able to use this evidence to force the teacher to reconsider the grade. Or vice versa. If the essay grader is telling you that you're getting a D, you can work and improve on it until you're getting that B you'd be happy with.
But there are serious drawbacks to the comments E-Rater and Criterion give. E-Rater gives comments soley based on your score (if you get a 1, you get comment set 1, if you get a 2, comment set 2, etc.). Criterion gives a student "instructional feedback in basic grammar, usage, style and organization." E-Rater's comments are inadequate at best, and Criterion's leave a lot to be desired. RocketScore provides substantial feedback on how to improve your writing. Not just stylistic and grammatical comments, but comments on what you should be writing more about (you didn't provide enough info!), what you should be writing less about (you gave too much info!), and how to balance your arguments, among many other categories.
There are two major problems with essay grading. The first is bullshit detection, and the second is determining if the essay actually answered the question asked. E-rater and Criterion both have real problems with these two criteria. With bullshit detection, RocketScore has threshholds which can be set and manipulated on the fly, from throwing out anything which isn't completely relevant to the topic, to allowing just about any essay submitted. And you will get a score and comments based upon what you submitted. Of course, these are most helpful when you make a meaningful attempt to submit a relevant essay.
Yes, but do you know how ETS defines "agreement"? Glad you asked. When the grader's grade is within a point of the human's grade. Now, with the SAT 2 test, which is on a scale of 1 through 6, that means if the grader says 2, and a human says 1, 2, or 3, then there's agreement. But that's 50% of the scale! Their essay grader has a 98% chance of hitting the wall in front of them as opposed to the wall next to them. Woohoo. Meanwhile, RocketScore provides decimal point accuracy (we don't give you a 4 or a 5, we give you a 4.1, or 5.3), and is 98% accurate. But how do we define accurate? When the grader's grade is rounded to the nearest whole number, and that number is the human's grade. In other words, if we give you a 4.3, there is a 98% chance a human would give you a 4. With 4.5,
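To make the two definitions of "agreement" above concrete, here is a minimal sketch in Python. The scores are made-up illustrative numbers, not data from ETS or RocketScore, and the function names are hypothetical. Rounding a tie such as 4.5 upward is also an assumption, since the post does not say how ties are handled.

```python
import math

def adjacent_agreement(machine, human, tolerance=1):
    """Share of essays where the machine score is within `tolerance`
    points of the human score (the "within a point" definition)."""
    pairs = list(zip(machine, human))
    return sum(abs(m - h) <= tolerance for m, h in pairs) / len(pairs)

def rounded_agreement(machine, human):
    """Share of essays where the machine score, rounded to the nearest
    whole number, equals the human score. Ties round up (an assumption)."""
    pairs = list(zip(machine, human))
    return sum(math.floor(m + 0.5) == h for m, h in pairs) / len(pairs)

# Illustrative scores only, on the 1 through 6 scale.
machine = [2.0, 4.3, 5.3, 3.8, 1.2]
human = [3, 4, 5, 5, 1]

print(adjacent_agreement(machine, human))  # 0.8
print(rounded_agreement(machine, human))   # 0.6
```

On the same five essays the looser definition reports a higher agreement rate than the stricter one, which is the point being made: a headline percentage means little until you know which definition produced it.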
I can see it now .. (Score:5, Funny)
Teacher: Johnny, I'm really sorry, but the computer crashed while your paper was being scored. I was looking over it. It's been a while since I've read a paper, but I was wondering what the following sentence means:
And this one:
Is that some kind of new language that kids are using? Oh, by the way, congratulations, you got a 100 on EVERY essay this semester! Good job!
High school essays? No creativity to lose there (Score:3, Insightful)
But when I was in high school, we were told that proper essay writing was an essential skill for the departmentals, and when they said "proper," they meant "Must conform to between five and seven paragraphs, with the first and last being the opening and conclusion with three to five paragraphs of body--each containing one topic of discussion."
Furthermore, it was made VERY clear that creative or unconventional ideas (let alone language!) would be strongly frowned upon. There was One True Way to write an essay, and One True Opinion on any given subject. Any deviations from that would cost you.
I hated it then, I hate it now, but I don't see any problem with having computers mark essays like this. After all, they were trying to turn us into computers to create them.
A great leap forward (Score:3, Funny)
"What I did on my Summer Vaca'; DROP TABLE punctuation"
Sausage (Score:3, Interesting)
Ten to one it gives false positives. (Score:4, Informative)
But the interesting part was after I got home. I looked at ETS's own research monographs and found that internally this overpriced system had been debunked. It was discovered that by writing one well-formed short paragraph and then cutting and pasting it over and over, an almost perfect score could be attained. The more times it was pasted, the higher the score.
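As a rough illustration of why pasting can raise a score, here is a toy scorer in Python. It is purely hypothetical and is not how the ETS software works; the keyword list and the scoring rule are assumptions chosen only to show the failure mode. A scorer that adds up feature hits without capping them or normalizing by length will reward repetition.

```python
import re

# Hypothetical topic keywords; not taken from any real grader.
TOPIC_WORDS = {"milton", "paradise", "satan"}

def naive_score(essay: str) -> int:
    """Toy scorer: one point per topic-word occurrence,
    with no cap and no length normalization."""
    words = re.findall(r"[a-z']+", essay.lower())
    return sum(1 for w in words if w in TOPIC_WORDS)

paragraph = "Milton casts Satan as the tragic center of Paradise Lost. "

# Paste the same paragraph 1, 2, and 5 times.
scores = [naive_score(paragraph * n) for n in (1, 2, 5)]
print(scores)  # [3, 6, 15]
```

Counting each distinct feature once, or dividing by essay length, would blunt this particular trick in the toy version.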
It was also possible to write an essay on an unrelated topic and still get a high score, allowing students to use rote memorization of a single model essay. This, naturally, is impossible with a human reader because they can tell what the topic is fairly easily. According to the sales literature this software could too, but in actual tests that didn't hold up.
Their sales literature claimed that the software contained artificial intelligence and thus implied that such simple techniques would not fool it, but in practice this was far from the case.
Monographs published by ETS also made it clear that despite their aggressive marketing of this product outside the US, they were not planning to use it as an exclusive grading system on their own tests. Rather, it was to be used as a teaching tool. However, it took a lot of digging to uncover that information.
Just as with translation, there's a lot of financial motivation to make this technology work, but that doesn't necessarily translate into workable products. In the nineties, when spelling and grammar checking was already old hat and English/Euro translation was making such headway, I thought fluent Chinese/English translation was just a few years away. Now it's 2003, grammar checkers still only work if you write in a prescribed style, and I've yet to see something halfway decent in Chinese/English translation software, although you still hear claims all the time for some overpriced product that's really almost there.
I think we'll see dramatic life extension long before we see decent computer essay graders. Decent trade as far as I'm concerned. As for translation, we can always teach more languages in school.
Fact, Fiction, and Brutal Truth (Score:3, Insightful)
Fact: Teachers would read only the introduction and conclusion paragraphs, and rely on the grading software to account for the quality of writing of the paper, and grade that way.
Brutal Truth: Teachers already read only the introduction and conclusion paragraphs, so use of this software actually would be an improvement.
Re:applications for slashdot comments (Score:2)