data:image/s3,"s3://crabby-images/d8d7c/d8d7cc860019ede8505436e348ac190f750f07c3" alt="Education Education"
ACM Programming Contest Results Revised 68
Goonie writes: "After the controversy over the results of the ACM programming competition, it appears that the results are being revised. Check out the new standings. The winner hasn't changed, but some teams have moved up the rankings." Actually, from the look of the page at 4:15GMT this morning, "final rankings are under review." Let's hope the fairest decisions prevail, and that all involved are gracious.
Cool... (Score:2)
The rules sounded somewhat weird, but it's probably no weirder than the programming contest [msoworld.com] I've been doing lately (its monthly, so hold your horses). (and MazeMan can admit he makes mistakes.
---
pb Reply or e-mail; don't vaguely moderate [152.7.41.11].
Disappointed (Score:2)
All our submissions had to be hand submitted by the judges which greatly reduced the speed of which we could get back feedback (you could have submitted a program an hour into the contest and chances are you wouldn't have known if it worked or not).
Coding C in Pico (Score:3)
I don't see how anyone could have made enough sense of it for there to be a controversy.
Damn good programmers, them.
What Languages? (Score:1)
programming::contest? (Score:2)
i program in many a language
i program so i can use my computer
in different.ways
programming is an art
Re:What Languages? (Score:2)
C, C++, Pascal, and Java are the only acceptable languages.
What is this? (Score:1)
Thanks!
Uni melb team (Score:1)
Ive taken the year off uni (so im not eligable to join in the acm), and working as a coder. My last job was to write an LDAP enabled pop-server, over the weekend and during that time my girlfried (working as a waitress) earned more than me. god the IT industry is in a crap state. or maybe i just struck out with a bad job.
heh. i dont know why im posting here. but on what turns out to be like $8 per hour, i cant afford professional help.
Re:Disappointed (Score:2)
I would like to add, however, that this contest is important to motivate people to take an interest in teaching group software development under a time constraint. What I have learned from being an ACM coder for the past few years greatly helped me on the job working with other coders, and people who had an idea but just didn't know how to code it up.
Sharing a single computer was better influence over good programing practices (for speedier debuging if not having the occasional team-mate reading over your shoulder) than any software development course I've had.
fairness (Score:1)
No! I hope that the new results are terrible, and that everyone is pissed of. and that the resulting chaos destroys the ACM and everything it stands for!
rating a programmer (Score:1)
- does it get the job done
- does it get the job done every time
- does it get the job done every time, and quickly
- does it get the job done every time, quickly, and doesnt look like something larry wall would write
so yeah, thats one way. they might have more stringent rules then me, but hey
Re:Cool... (Score:1)
Couldn't make sense of problem F? (Score:2)
Now, this does not mean that it is easy to do, or that I remember the algorithm I once learned for finding the shortest path from a to b on a graph, but I don't see why anyone with a bit of computer science knowledge wouldn't be able to understand the problem.
Of course, as I said, I missed the controversy, so that may not have been what the hubub was about.
-Matt
Re:Couldn't make sense of problem F? (Score:2)
What is possibly the best (yes, yes, there is always that worst case where the path goes through all n nodes) solution uses a breadth first search using a queue with parent points to find your path back to your starting node.
A depth first search is often not the best.
Yes!!! (Score:1)
I had prob F coded in 30min or so and got it wrong. We spent the rest of the time trying to figure out why!
This *might* have us solving one more problem, but it prolly won't change the relative standings.
Even if we would have gotten that one solved in 30min, I don't think we would have solved another one.
Isn't this the first time the standing have ever been changed???
Re:What Languages? (Score:1)
(I seem to remember that ANSII does not allow for embeded Assembly code, which would have made for some slick bit shifting and fast output)
Official changes posted by April 24th (Score:2)
The teams recently received letters from the contest directors, who say that they have finally decided to "personally re-examine all submissions of Problem F from the archives" and that "the results of the re-examination will be posted in the standings by April 24th."
So I'm not expecting to see any changes until April 24th, which is next Monday.
If they make any changes, teams will be bumped up to a higher position, but no team will have its position lowered. Best of all, they have changed their official policy so that in the future, teams will have access to their programs after the contest is over, and also there will be a standard procedure for regrading after the contest is over, in case this happens again.
I'm particularly happy since there's a reasonable chance that my team's score will improve, but I think everyone should be glad that they're making an effort to keep this a fair contest. I highly recommend this contest to all interested college students. If you haven't already, check out the problem set [baylor.edu] from this year.
Re:What is this? (Score:5)
Take a look at problem F. (Here are the contest problems [baylor.edu] in PDF if you're interested.) In a nutshell, you're given a complete directed graph, and you need to return the average length of all shortest paths between all pairs of nodes. The problem explicitly stated that you will only be given graphs in which there exists a path from every node to every other.
This is not a hard problem to work out, but anyone who has had a formal course in computer science ought to recognize that the Floyd-Warshall all-pairs shortest path algorithm is designed to solve exactly this problem. Then all you have to do is add up all of the elements of the matrix and divide by n * (n-1).
Except that the judges made a mistake, and tested our input using a graph that was not connected - in other words, there were nodes that could not reach other nodes via a directed path. This would not be a big deal, except that the problem explicitly stated that this would not occur. (Input validation is never a part of this contest.) Furthermore, without further explanation it is unclear how these nonexistent paths should affect the average. It turns out that the judges' solution was not counting these paths, and averaging only the paths that existed. Some teams did this by accident, and others (including Waterloo) figured it out only after submitting multiple runs and incurring large penalties. My team was one of the many that did not figure out the judges' mistake, so we did not get credit for the problem, even though our solution was certainly correct as the problem was worded. If we had received credit we would have had four problems correct, possibly putting us in the top ten. Of course, if we had received credit right away, we might not have wasted so much time figuring out what was wrong with our solution and we could have solved another problem in that time. Of course, many other teams were in a similar situation, so I have no idea what the final ranking would have been, but clearly it would have been different.
Re:programming::contest? (Score:2)
* Sometimes they say that if your program runs in say >5 minutes it is considered non-functional, so a very naive approach that is technically valid is often not accepted.
Re:Couldn't make sense of problem F? (Score:1)
Djikstra's - O(V lg V + E)
Bellman-Ford - O(VE)
IIRC, the problem with the problem (hehe) was that they said the graphs would be connected, and one of the test examples was not.
You would think... (Score:2)
Microsoft OLE DB Provider for ODBC Drivers error '80004005'
[Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified
/past/icpc2000/finals/RosterPublicFull.asp, line 11
It was fun anyways (Score:2)
Re:Couldn't make sense of problem F? (Score:3)
Heh. You're thinking of the longest path problem, which is NP-complete. Finding the shortest path between a pair of nodes is easy - use a simple breadth-first search for an unweighted graph, or Dijkstra's algortihm for a weighted one.
To find the shortest path between all pairs of nodes, you can use the Floyd-Warsha ll algorithm [brunel.ac.uk], which runs in O(n^3) time. It can be done faster, but in this particular case we were guaranteed n is no more than 100, so this was quite reasonable.
Re:It was fun anyways (Score:1)
Can't beat that IBM desktoy either:)
It's the ACM, they just *can't* be wrong...blech (Score:1)
I know, minor little rant, but it just goes to show that the ACM *never* is wrong. This still ticks me off.
Re:What Languages? (Score:4)
The team from MIT spoke only in 13th century latin. The team from Oklahoma State communicated solely in cattle brands.
The team from NYU in Ebonics. The team from Dakota State University spoke only in a very slow drawl.
The team from Georgia Tech kept referring to functions as crackers, and arrays as yankees. The team from UCLA constantly threw in random "likes" like, like this, like.
The team from University of Winnipeg felt it necessary to say aye after each sentence. The team from University of Southern Florida wouldn't shut up about Elian, and often develoved into a Spanglish.
The team from Brigham Young University kept referring to the moral crumbling of the ACM contest. The team from Texas Christian University always commented their code in the form:
Entrant's opinion (Score:5)
If it hasn't already been explained, problem F went as follows: find the average shortest path length between all pairs of nodes in a strongly connected directed graph. The judges appear to have put a non-strongly connected graph in the testing - one for which not all pairs of nodes have a connecting path. It was then pure luck whether you had chanced upon the same method for computing the average as the judges, so while some teams solved the problem in the first few minutes of the contest (notably St. Petersburg state U.
Because it was by far the easiest of the problems, many teams began with this question. By the end of the five hours, many perfectly clever teams had not solved anything, due to the stuff-ups on the part of the judges.
Now, making errors is one thing, and it certainly screwed up the contest for everyone, but in some sense errors are forgivable - what was really appalling was the way in which the situation was handled.
A protest regarding the question was submitted (by my team) within a minute or so of the end of the contest - there was no official channel for appeals, so we were just told to write it down on a piece of paper and give it one of the judges. We never received a response of any sort.
After the contest most of the teams were aware of the problem - one team had submitted a program designed to go into an infinite loop if the input graph was not strongly connected, thus working out what error the judges had made. In any case, we took our printouts to verify our solution, and later presented our arguments to the tournament director, Bill Poucher.
I have never met someone so unprofessional, with no concept of accountability or responsibility, and little even of courtesy. I am not sure what his background is, but if Mr. Poucher were the director any public company, he would have been on the street - or in court - in seconds. He refused to listen to any of the requests from the teams or coaches, telling them instead their memories of the contest perhaps weren't clear. Nothing was done and the prizes were awarded, with Bill Poucher announcing at the ceremony that "every problem was solved by at least one team" - so that there weren't any problems.
After the contest, in the face of protests from regional directors and coaches from around the world, with proofs that their solutions had been correct, the contest organisers continued to deny there had been any problem. It took quite a while for them to agree to remark the solutions.
Now, I have no especial gripe at the judges that an error was made, because people make mistakes - it's just an indication that they need better validation on the testing programs. I'm also not especially fussed that my team didn't come first - because in some sense, we were one of the LEAST affected teams. Just look at the results of the American universities. I mean, it's nice to have a laugh at America getting its ass kicked once in a while
What I do have a problem with is the contempt the organisation showed to the tens of thousands of man-hours of work that go into the contest on the part of the contestants. No channel for appeals, no official responses to protests, no means to elect new directors, no accountability, not even a report saying "we made a mistake; we plan to do this to fix it". There was a similar problem in our regional contest this year, and the regional director (Raewyn Boersen) fessed up, and sent around an email describing the problems, the changes she would put in place to the validation of the competition, and the transparency with which future contests would be conducted.
My question for slashdot is, how do you get an organisation like this to to act like it's accountable to the people who enter?
In any case, I'd still recommend entering the contest to any CS students - its an amazing experience, however it goes. I've never learnt as much CS in such a short time as in the week's training leading up to the finals. Ask yourself - if you had to, how fast could you code up a 3D voronoi tesselation? Or a fast constraint solver? Can you find a bug in someone else's uncommented code under time pressure, or look at a problem and say how long it will take to code, and what the best order algoritm is? These are the kind of things you learn.
cheers all,
John FitzGerald, University of Melbourne,
Australia
I suppose (Score:1)
Re:Cool... (Score:1)
Re:What Languages? (Score:1)
Re:Couldn't make sense of problem F? (Score:2)
Re:Entrant's opinion (Score:1)
> of the problem - one team had submitted a
> program designed to go into an infinite loop
> if the input graph was not strongly connected,
> thus working out what error the judges had
> made. In any case, we took our printouts to
But how would they know that it *had* gone into an infinite loop? Isn't that the essence of the halting problem?
-Andy
and the winner is.... (Score:1)
Microsoft OLE DB Provider for ODBC Drivers error '80004005'
[Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified
/past/icpc2000/finals/RosterPublicFull.asp, line 11
Uh... (Score:1)
Re:Entrant's opinion (Score:1)
Another error usable for signalling back data could be runtime error.. like division by zero, etc. I imagine this could be a bit faster
Regards,
Talence
Re:I suppose (Score:1)
I suppose, but you rate code based on efficiency?
Running time to produce a correct answer.
Re:You would think... (Score:2)
I'm sure hosts of a world famous programming competition would want to waste time and resources to write its own database software and reinvent. Maybe it's a foreign concept to some of you guys that sometimes the most practical, most time-saving solutions is the best one, even if it's not "the right thing".
Re:What Languages? (Score:1)
Get it right eh?
Re:It's the ACM, they just *can't* be wrong...blec (Score:2)
Anomalous: inconsistent with or deviating from what is usual, normal, or expected
1. Get Tenure, 2. Eat brain (Score:2)
They rise to a certain level of recognition, probably higher then they feel (deep down) they deserve. They resolve this insecurity about their competency by believing that they are actually the 'right hand of God', and look down on the unwashed masses with utter contempt, from their Ivory Tower.
'How DARE this student (spoken through clenched teeth) question MY process and MY organization of MY contest!? Without ME he is nothing.'
Typical case of recto-cranial inversion, easily cured with a clue-stick beating and a few months at a REAL JOB.
Re:Entrant's opinion (Score:2)
But how would they know that it *had* gone into an infinite loop? Isn't that the essence of the halting problem? ;-)
Simple. They knew because the first step of the loop took 1 second, and each subsequent step n+1 took half the time of n -- so in 2 seconds the program had completed and infinite loop.
Re:Entrant's opinion (Score:1)
The following year we failed to complete a problem at the Regionals because the mechanics for submitting solutions (on floppy disk) were broken. We discovered the problem and corrected it at the last moment, submitting our solution just as the final bell rang. The judge refused to accept the solution, much to our amazement and that of the nearby teams.
When we protested, he coldly told us "Forget it. You guys don't deserve a break." None of the other judges would overrule him, since they hadn't been present at the submission desk.
Turned out he was the coach for one of the other teams, and had a grudge against us because we were graduate students. (Indeed, the competition was limited to undergraduates the next year.)
That incident kept us out of the Finals. I only wish I remembered his name, so I could besmirch him here on Slashdot
-- Dr. Pain
Re:It's the ACM, they just *can't* be wrong...blec (Score:1)
Re:Entrant's opinion (Score:3)
In the last 5 minutes of the contest, we printed out all of our code to all of the problems that we worked on, so we could have something to go from. We gave the code to our coach, who coded it up and likewise never found a test case that broke it. So he wrote an email message to Bill Poucher.
A few days later, we got a response. Here's a copy-paste of what we received
The reason that 16-bit integers would break any potential solution is that for the case given where we have a 100-node loop is that the sum of distances could easily go beyond the 32767 mark. I guess the teams that did get it either found another algorithm to use or they explicitly used long integers. Pardon me, but ever since 32-bit operating systems came around, I have pretty much assumed that int == 32-bits.
Since then, I have yet to hear of a "general statement", except for this Slashdot story.
In my opinion, many of the results of the competition are invalid because of this problem. Had a correct response come back to us in the first few submissions, we would have gotten at least one, if not two more, correct. We wouldn't have changed the top-most standings, but it's possible that there are other teams that would have.
Nothing new except the response (Score:1)
Re:Entrant's opinion (Score:1)
That is a foolish assumption to make. What happens when you work on a small embedded 8- or 16-bit processor? What happens when you work on a 64-bit system?
The C language spec makes no guarantee for the exact range of an int. It is only specified relative to the size of short and long. Sorry, but if you're going to be pedantic, you should be pedantic eveywhere and not assume type sizes that aren't specified exactly.
Overall negative experiences with ACM (Score:1)
I had a lot more fun at the IEEE contests, run by none other than Mr. Transmeta, Dave Taylor.
- Amit
ACM Programming Contest: A Joke (Score:1)
I competed once in an ACM programming contest (the regional one this year), and have to agree that it's a sketchy affair run by people who really don't know what they're doing.
For example, it was fairly obvious that the problem writers had no real grasp of Java, even though its use was allowed. One problem in the contest involved taking the reciprocal of arbitrarily long integers that are a power of 2 or 5. My group was the only one to finish this problem: all the other groups thought it was one of the most difficult. Considering that Java has its java.lang.BigInteger class, solving the problem was trivial, and a no-brainer.
Re:Uh... (Score:1)
Re:Coding C in Pico (Score:1)
Re:You would think... (Score:2)
Re:Entrant's opinion (Score:1)
I was at the contest. Keep in mind that B.Poucher had a lot on his mind. You were upset that the problem had been misjudged and you had to go and wait for the awards banquet.
Poucher had to ensure that all of the contest floor, including large areas that you did not see, were completely cleared in one hour. He had to make sure that the dinner, the awards, the party, the transportation, and etc. were ready to go.
He was under a lot of stress. Remember that he was responsible for every contest-related thing that you did in Orlando. While you were relaxing in your room, he was planning, setting up, tearing down, etc. While you were writing code, he was writing checks. The time put into the ICPC by the contestants is pretty trivial compared to the time put in by the ACM people and the IBM/host volunteers.
Keep in mind that Poucher did make sure that the problem was re-examined. He just couldn't do it immediately after the contest. It would be a mistake to suggest that only the contestants and coaches thought that Problem F should be re-judged. Remember that all of those mean, evil judges are volunteers. Getting a large group of volunteers to agree to do something as yucky as rejudging a bunch of submissions by hand is like herding cats.
Remember to walk a mile in the other guy's shoes.
Peter Doege
(Hey I'm not anonymous!)
Re:university of central florida. (Score:1)
Booyah
Re:Overall negative experiences with ACM (Score:1)
As a student at UT Austin in the early 90's, I participated as staff on SPC93 and NPC94. By that time, Dave was at Id, but he would still show up for the days of the actual contest. There's no comparison between ACM and IEEE as far as excitement and competition go. The programs at IEEE are judged based on their performance against the other programs. No judges with human emotions of pride or prejudice come into the picture. Teams could actually help each other out if need be. There was open communication of all problems between the staff and contestants. It was a two-day contest, so teams had plenty of time to code a rather massive work. I never heard from a single team, winner or loser, who was disappointed with the outcome of the contest, the organization, or the communication channel with the staff. I'm just sorry to say that after NPC94, there was not enough student interest or outside funding to keep the project going. NPC is greatly missed.
The words "Ivory Tower" have been used in other posts, and I can't think of any that better describe the ACM contest. My experience has been that in the realm of Computer Science, there are those who care only about the abstract, theoretical, and don't care so much about whether it works as HOW it works. Those are computer scientists. Then you have the people who have the attitude of "Get the job done... THEN get it done better." Those are engineers. *sigh* I recall being amazed to hear in '95 that Dr. Djykstra owned no PC of his own, and in fact had not actually USED one in two to three years. I've always been very hands-on, and I feel if you don't continue to use something, you forget it. If as a computer scientist, you don't USE a computer, IMO, you're really just a specialized mathematician.
I'm not saying the ACM contest isn't worthwhile. In fact, it's great fun. I definitely wouldn't hold it up as a model of 'real-world' software development, well, maybe software development at Microsoft
other things need revision too (Score:1)
found that they need to revise their code as well (see below for the error message).
a) Maybe _they_ should hire a medalist...
b) Obviously, they should have used Linux.
I got:
Microsoft OLE DB Provider for ODBC Drivers error '80004005'
[Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified
/past/icpc2000/Finals/RosterPublicFull.asp, line 11
It's at http://acm.baylor.edu/past/icpc2000/Finals/
and then click on teams.
Re:Entrant's opinion (Score:2)
That is a foolish assumption to make.
No it isn't. If you're programming for portability - maybe. Knowing your target platform is absolutely imperative. The goal here wasn't to program for x different platforms - it was to show that you have what it takes to code up algorithms to solve some of the toughest theoretical CS problems, under the constraint of time.
What happens when you work on a small embedded 8- or 16-bit processor?
Especially on this type of work, you want to know exactly what the compiler is going to do - you don't want to throw in a bunch of junk code to make your code portable. Its called optimization . Anywayz, they weren't compiling on a small embedded processor, they were compiling on 32-bit x86 processors running NT where its assumed that the compiler being used by the contestants creates target code exactly the same as that used by the judges. If not - this should be mentioned.
What happens when you work on a 64-bit system?
Well, since you asked, in this case, the algorithm would still work. The fact remains that they weren't compiling on a 64-bit platform, tho - see above.
The C language spec makes no guarantee for the exact range of an int.
This isn't what's at issue here. Maybe in some other thread, your comment has relevance, however, nobody in this thread lacks knowledge of what is and isn't guaranteed by the C language spec. Nobody really cares what you know about the C language spec, either.
I won't make the uninformed assumption that you've never been to a programming competition or that you don't know what its like to sit for hours trying to fix something that doesn't end up being a problem with your code ... but it would certainly seem that way. You're pounding issues that are highly irrelevant.
as always - my $0.02
Jobe
Re:Entrant's opinion (Score:1)
That is a foolish assumption to make. What happens when you work on a small embedded 8- or 16-bit processor? What happens when you work on a 64-bit system?
The C language spec makes no guarantee for the exact range of an int. It is only specified relative to the size of short and long. Sorry, but if you're going to be pedantic, you should be pedantic eveywhere and not assume type sizes that aren't specified exactly.
That's a pretty harsh statement. I mean, speaking of foolish assumptions, aren't you assuming that they were using C? Most of what I've read implies they were using Delphi/Object Pascal.
Re:You would think... (Score:2)
I don't know what webpage you're on, but when I accessed it, it looked fine to me. Spending X amount of time and effort to get a server that fails 1% of the time sure beats spending 10X amount of time and effort to get a server that fails 0% of the time. Especially since it's not mission-critical.
Frac
Bill Poucher can't be that bad (Score:1)
I was webmaster and technical assistance for the 1999 Southeast Regional Contest [auburn.edu]. Bill Poucher came to the contest and was quite available. From my interaction with him (albeit on the side of the people administering the competition), he seemed quite reasonable and quite nice.
This is not to say that the contest ran smoothly. A more detailed account [auburn.edu] of what the tech staff went through is available.
There were some complaints, but they were handled politely, and by the end of the contest, everyone was tired but happy.
Now, back to the international contest. I don't know what happened. I can guess that the judges got hit with a salvo of complaints and joined forces to repel the assualt
I said it before, and I'll repeat it. From my experience, Mr. Poucher is a decent and courteous individual. Managing a contest like this frays your nerves, and when people start bombarding you with challenges and accusations, it is easy to lose patience with them. There are also established procedures for dealing with appeals, and circumventing them by directly approaching the judges isn't likely to influence them positively.
Who am I?
Why am here?
Where is the chocolate?
Clueless director (Score:2)
That's a pretty clueless statement coming from the director of a programming competition. Borland C++ in fact ships with two command-line compilers:
First rule of programming: Choose the right tool for the job. Apparently, this fellow skipped that class.
Contests are great, and Dr. Poucher's a great prof (Score:2)
I have had two classes from Dr. Poucher (one theory and one software development) and I never knew him to be anything but fair to students.
Ask yourself this, though, what would you have done in his situation? You knew that three people, whom you trusted, had checked the problem data. You knew that the problem had been solved on the data already. You've not encountered such a problem in ten years of running such a contest. You knew that every contestant at the contest was way keyed up and running on adrenaline.
I agree about contests though. They are a lot of fun and you do learn a lot. This is true in general, IMHO. Texas has statewide academic contests in high schools -- participation in those takes away all the "test anxiety" many otherwise excellent students face. Programming contest is the same way. It gives you a great confidence, whether you win a contest or not, to just sit down and start slamming out code.
Once again, I am sorry that you had to meet Dr. Poucher under such adverse circumstances!
Zorn
strange solution (Score:1)
If "the matrix" is an adjacentcy matrix, just consider an edge between any two nodes. For a graph that would still be connected without the edge, the average shortest distance between nodes for the graph with the edge is bounded above by that for the graph without the edge, but, as the length of that edge goes to infinity, your sum of matrix elements diverges.
Is there something more about graph, such as a space in which it can be embedded?
I hope that this was not this guy's [other] issue with the competition.
Re:Coding C in Pico (Score:2)
And exactly how is that different from real world systems development? Sounds like a pretty realistic contest to me...
sPh
btw (Score:2)
--
Re:1. Get Tenure, 2. Eat brain (Score:1)
Ya. I is almust ase eezee to till hoo hus a moostirs tu...
Re:Entrant's opinion (Score:1)
It's nice that the Contest people are trying to make amends for the troubles they had in judging Problem F ("Page Hopping"), but the problem isn't so easily solved. The contest organizers can take into account whether or not a team had really solved Problem F or not, but they cannot take into account (nor did they say that they would) any of a dozen other factors that result from a problem like this (lost time, psychological factors, etc...). These factors make it nearly impossible to predict what the final standings would have been in a contest without mishap, and so I think that the effort on the part of the ACM contest judges may be misplaced.
Anyways, the guarantee given about no team's ranking decreasing is much like the compression algorithm that never increases the size of the input...
As it stands, I think that it is interesting that of the top six teams (those that got six or more questions), only Melbourne and Waterloo would have performed differently had the contest been run in an alternate universe without judging errors. A smuggled copy of the standings partway during the contest reveal that of these top six, only Melbourne and Waterloo did not get Problem F on the first try, although our team did eventually deduce the nature of the error with Problem F and submit a correct solution a little over halfway into the contest.
I am certain that both teams suffered heavily due to the error in terms of lost time and added stress, but I feel that the Melbourne team (or perhaps just the one member) is unnecessarily bitter about it (perhaps due, in part, to a conversation the team had with Dr. Poucher afterwards, where he assured the team that, to the best of his knowledge, there were no problems with question F). For example, the Waterloo team (which was justifiable angry halfway during the contest when we discovered the nature of the error with Problem F) decided to get on with the business of doing the other problems. Afterwards, we decided to refrain from speculating on what we could have done with an extra hour or so of time during the contest, and we are still very happy about the entire contest and the things we learned on the way.
I have less sympathy for people that gripe about spending all their time on problem F during the contest and getting nowhere, when there were 7 other problems that they could have tried their hand at. In particular, Problem E ("Internet Bandwidth") was a textbook network-flow problem, which didn't even require knowledge of any network flow theory. (Dynamic programming could have solved the problem within the time limits, although I didn't find anyone who did it this way.)
Anyways, my point is that the problem of "fixing" the results is not as simple as rejudging F and giving an extra question to everyone who didn't get it during the contest, but did have a correct solution.
Donny
Re:Entrant's opinion (Score:1)
only Melbourne and Waterloo did not get Problem F on the first try,
Tsinghua tried F and didn't get it, even in the rejudging. So they weren't directly affected in the same way as Waterloo and Melbourne. Nevertheless, they would have seen the scoreboard and may have made tactical decisions based on the perception that something was wrong with the data. For example, they may have had some trivial bug and 1/2 hour to spare at the end of the contest. Had a large fraction of the teams solved F, they would have had the savvy to spend some more time looking for a simple bug rather than perhaps working on a long shot.
Bad judging affects everybody. Due to luck or skill, it affects some more than others. Tactical decisions are made in real time based on the judgements and there is no way to undo the effects. I encourage the ACM judging team to be more open about the process, to admit their mistakes, and to do a thorough and public post-mortem on this failure in their QA process.
In my opinion this would be time and effort better spent than fiddling with the rankings in an attempt to please everyone.
One thing that is done in the ECNA regionals is to write an "input checker" routine. Imagine that the contest question is recast as "print 'yes' if the input conforms to the given specification; otherwise print 'no'." I don't see how this year's F data could have passed such a test.
Gordon Cormack
Coach, Waterloo ACM Team
Re:Contests are great, and Dr. Poucher's a great p (Score:1)
They haven't had a judging problem that survived the end of the contest in 10 years.
We have first-hand experience with two judging problems on previous finals. In 1995, we were awarded a balloon only to have it taken away much later. In 1997 we had a problem judged incorrect for some reason (probably due to ambiguity in the output). We moved on to another question and an hour or two later a "correct submission" message popped up for that question.
Judging problems at the finals are rare but do occur. Judging problems at the regionals are more common, because they are run locally and those in charge may lack experience.
In any event, judging problems do occur and dealing with them is part of contest preparation. However, it is important not to overestimate the chance of error - even in the worst-run regional, a judgement of "incorrect" is much more likely an error in the submission than an error in judging.
Gordon Cormack
Coach, Waterloo ACM team