I'd be curious to know, among Slashdot readers, whether the essay below rings true. Are you a programmer? A libertarian?
I'm finally making good on my promise to post my "wild speculations" about Computer Science IQ and Libertarian inclination.
First let me give some background on CS IQ. I have taught at least 5,000 students how to program, which has given me a strong set of hunches about what goes on in their heads. But the most useful source of information came from my work as Chief Reader for the Advanced Placement Exam in Computer Science.
AP programs allow high school students to take college-level courses at their high schools and take a test that allows them to receive placement and usually credit for their work. As with all AP exams, the AP/CS Exam is divided into two parts: multiple-choice and free-response. In the free-response section, students hand-write solutions to problems. This has always been considered an integral part of the AP program because of the (at least perceived) limitations of multiple-choice tests. The AP/CS exam had 50 multiple-choice and 5 free-response questions. The free-response questions were all of the form, “Write a piece of code that does the following.”
Obviously, the hand-written solutions need to be graded by real people. Every year about 60 CS teachers (called “readers”) get together for 6 days to grade 10,000 exams. As Chief Reader, I was responsible for choosing the 60 teachers, managing their efforts for those 6 days, and setting the ultimate distribution of AP grades. In 1988 I made AP history by giving the all-time worst set of AP grades ever given out (I failed almost half of them). As a result, ETS approved a request they had never approved before. They gave me a diskette (actually 2) with the raw scores for all 10,000 candidates so that I could “study” it. My undergraduate degree is in math with a statistics specialization, so I’m the kind of person who likes to play with data.
One of the things I looked at was the set of correlations between various multiple-choice questions. A high correlation between 2 test items indicates that candidates performed similarly on those items (i.e., those who got one right tended to get the other right and those who got one wrong tended to get the other wrong). I expected to find either virtually no correlations, because there was little repetition on the test, or clusters of correlations. If you were to test people on math, for example, you might find that arithmetic questions correlated highly with arithmetic questions, algebra questions correlated highly with algebra questions, geometry questions with geometry questions, and so on. I expected a similar pattern based on various programming constructs/skills.
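To make the statistic concrete, here is a minimal Python sketch of an item-to-item correlation, assuming each answer is scored 1 (right) or 0 (wrong). The names `item_correlation` and `item_column` and the tiny `responses` table are hypothetical illustrations; they are not the actual exam data or the analysis that was run on it.

```python
from math import sqrt


def item_correlation(xs, ys):
    """Pearson correlation between two equal-length lists of 0/1 item scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined when every candidate scored the same
        # on an item; this sketch simply reports 0 in that case.
        return 0.0
    return cov / sqrt(var_x * var_y)


def item_column(responses, j):
    """Scores of every candidate on question j."""
    return [row[j] for row in responses]


# Hypothetical data: one row per candidate, one column per question.
responses = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
]

q0, q1, q2 = (item_column(responses, j) for j in range(3))
print(round(item_correlation(q0, q1), 2))  # questions 0 and 1 tend to move together
print(round(item_correlation(q0, q2), 2))  # questions 0 and 2 show a weak negative relationship
```

A value near 1 means candidates who got one item right tended to get the other right; a value near 0 means performance on one item says little about performance on the other.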
What I found was highly puzzling. Five multiple-choice questions were each correlated with over a dozen other questions, and I found virtually no other correlations at all. But there was no pattern to the correlations for these five. One had more correlations than any other, and I nicknamed it the “grandaddy.” Let me describe it as an example. It was highly correlated with 25 other questions, yet the topic that it tested had nothing to do with the topics covered by these other questions.
When I looked at correlations between multiple-choice and free-response, I became even more puzzled. There was definitely repetition between the two halves of the test. For example, we had a free-response question about a technique called recursion and we also had several recursion multiple-choice questions. So were the multiple-choice recursion questions the most highly correlated with the recursion free-response question? Nope. The grandaddy was! Even though the grandaddy had nothing to do with the topics being tested in ANY of the five free-response items, it was the #1 correlated question for 4 of the 5 and was #2 for the fifth. Furthermore, the group of five questions mentioned above were in all cases among the top 6 correlated multiple-choice questions for each of the free-response items, and usually they were the top five.
Either I stumbled upon some kind of statistical fluke, or there was something special about these 5 multiple-choice questions. Flukes like this are highly unlikely with a pool of 10,000. Also, as I studied other aspects of the data, I was surprised to find these same five questions appear in the answers to two completely unrelated questions I pursued. I won’t bore you with the details, but suffice it to say that I found even more evidence that these questions were more “central” than the others.
My theory is that these 5 are CS IQ questions (particularly the grandaddy). I presented my data to CS faculty and students at Stanford, and they seemed to agree with my conclusion. They also gave me some interesting feedback about the five questions themselves. Everyone who looked at them agreed that they “felt” like the kind of questions that would distinguish a computer scientist. One faculty member described them as “the intersection of logic and programming.” A more apt description given by another faculty member who had taught intro courses himself was that each question required a model of computation, and in his experience, this was the prime distinction he had seen between those who could program and those who could not. It was also obvious from the questions that logic and recursion are highly related to CS IQ.
Let me say a bit more about what I mean by a model of computation. Programmers are able to “play computer” in their head (sometimes requiring the aid of a scrap of paper). In other words, we have a model of exactly what the computer does when it executes each statement. For any given program, we have a mental picture of the state the computer is in when execution begins, and we can simulate how that state changes as each statement executes. This is rather abstract, so let me try to explain by giving a specific example.
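As a rough illustration of what tracking program state looks like, the Python sketch below records the state of a small loop after each iteration, which is the bookkeeping a programmer does mentally or on a scrap of paper. The function `trace_sum_to` and its variable names are hypothetical, chosen only for this illustration.

```python
def trace_sum_to(n):
    """Sum the integers 1..n, recording the program state along the way."""
    trace = []
    total = 0
    i = 1
    trace.append({"i": i, "total": total})  # state before the loop starts
    while i <= n:
        total += i
        i += 1
        trace.append({"i": i, "total": total})  # state after each iteration
    return total, trace


total, trace = trace_sum_to(3)
for step, state in enumerate(trace):
    print(step, state)
```

Someone with a working model of computation can predict every line of that trace before running the code: `i` starts at 1 and ends at 4, and `total` passes through 0, 1, 3, and 6.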
Let me tell a story that is typical of those I heard from the TAs who worked for me at the computing center. A student comes up to the TA and says that his program isn’t working. The numbers it prints out are all wrong. The first number is twice what it should be, the second is four times what it should be, and the others are even more screwed up. The student says, “Maybe I should divide this first number by 2 and the second by 4. That would help, right?” No, it wouldn’t, the TA explains. The problem is not in the printing routine. The problem is with the calculating routine. Modifying the printing routine will produce a program with TWO problems rather than one. But the student doesn’t understand this (I claim because he isn’t reasoning about what state his program should be in as it executes various parts of the program). The student goes away to work on it. He comes back half an hour later and says he’s closer, but the numbers are still wrong. The TA looks at it and seems puzzled by the fact that the first two numbers are right but the others don’t match. “Oh,” the student explains, “I added those 2 lines of code you suggested to divide the first number by 2 and the second by 4.” The TA points out that he didn’t suggest the lines of code, but the student just shrugs his shoulders and says, “Whatever.” The TA endeavors to get the student to think about what change is necessary, but the student obviously doesn’t get it. The TA has a long line of similarly confused students, so he suggests that the student go sit down and think through his calculating procedure and exactly what it’s supposed to be doing. Half an hour later the student is back again. “While I was looking over the calculating procedure, a friend of mine who is a CS major came by and said my loop was all screwed up. I fixed it the way he suggested, but the numbers are still wrong. The first number is half what it’s supposed to be and the second is one-fourth what it’s supposed to be, but the others are okay.” The TA considers for a moment whether he should bring up the student on an honor code charge for receiving inappropriate help, but decides that it isn’t worth it (especially since that line of similarly confused students is now twice what it was an hour ago). He asks the student whether he still has those lines of code in the printing routine that divide by 2 and 4 before printing. “Oh yeah,” the student exclaims, “those lines you said I should put in. That must be the problem.” The TA once more politely points out that he didn’t suggest the two lines of code, but the student again shrugs and says, “Whatever. Thanks, dude!”
The student in my hypothetical story displays the classic mistake of treating symptoms rather than solving problems. The student knows the program doesn’t work, so he tries to find a way to make it appear to work a little better. As in my example, without a proper model of computation, such fixes are likely to make the program worse rather than better. How can the student fix his program if he can’t reason in his head about what it is supposed to do versus what it is actually doing? He can’t. But many people (I dare say most people) simply do not think of their program the way a programmer does. As a result, it is impossible for a programmer to explain to such a person how to find the problem in their code. I’m convinced after years of patiently trying to explain this to novices that most are just not used to thinking this way, while a small group of other students seem to think this way automatically, without me having to explain it to them.
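The story can be reproduced in a few lines of Python. This is only a sketch under stated assumptions: the original program is not shown, so the specific bug (a scale factor that doubles on every pass through the loop) and every name below are hypothetical stand-ins that happen to produce the same symptoms.

```python
def calculate_buggy(n):
    """Calculating routine with a loop bug (hypothetical). Intended output: 10, 20, 30, ..."""
    results = []
    scale = 1
    for k in range(1, n + 1):
        scale *= 2  # the real bug: scale should stay at 1
        results.append(10 * k * scale)
    return results


def calculate_fixed(n):
    """Calculating routine after the loop is repaired."""
    return [10 * k for k in range(1, n + 1)]


def print_with_patch(values):
    """Printing routine with the student's two extra lines: a symptom patch."""
    patched = list(values)
    if len(patched) > 0:
        patched[0] //= 2  # "divide the first number by 2"
    if len(patched) > 1:
        patched[1] //= 4  # "divide the second number by 4"
    return patched


print(calculate_buggy(4))                    # first is 2x, second is 4x, the rest are worse
print(print_with_patch(calculate_buggy(4)))  # first two look right, the rest are still wrong
print(print_with_patch(calculate_fixed(4)))  # loop fixed, but the patch now breaks the first two
print(calculate_fixed(4))                    # both problems removed
```

The patch in the printing routine never touches the cause, so once the loop is repaired the patch itself becomes the second bug.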
Let me try to start relating this to libertarian philosophy. Just as programmers have a model of computation, libertarians have what I call a model of interaction. Just as a programmer can “play computer” by simulating how specific lines of code will change program state, a libertarian can “play society” by simulating how specific actions will change societal state. The libertarian model of interaction cuts across economic, political, cultural, and social issues. For just about any given law, for example, a libertarian can tell you exactly how such a law will affect society (minimum wage laws create unemployment by setting a lower bound on entry-level wages, drug prohibition artificially inflates drug prices which leads to violent turf wars, etc.). As another example, for any given social goal, a libertarian will be able to tell you the problems generated by having government try to achieve that goal and will tell you how such a goal can be achieved in a libertarian society.
I believe this is qualitatively different from other predictive models because of the breadth of the model and the focus on transitions (both of which are also true of programming). On newsgroups I often see questions like: If we were in situation A and government took action X, what would happen? If we were in situation B and a corporation took action Y, what would happen? If we were in situation C and an individual took action Z, what would happen? Libertarians almost always quickly answer by saying, “I’ll tell you exactly what would happen.” And, surprisingly, the libertarians tend to give the same answer in most cases. I think most people find this odd about libertarians. They understand how an economist might be able to predict the effect of a certain law on the economy or how a social scientist might be able to predict how drug legalization might affect the ghettos, but they don’t understand how somebody could predict all of these things, especially someone who has no formal training. Libertarians, on the other hand, don’t seem to understand how someone could fail to have such a model of interaction (it would almost be like having a Supreme Court judge who had never thought about Roe vs. Wade – ha ha). The nonlibertarians have no comprehensive model of interaction, and as a result, they can’t communicate in a meaningful way with those who do. Their attention is always focused on misleading superficial problems rather than on the underlying causes of such problems.
When I observe how most people approach politics, it reminds me of the way my hypothetical student approached his program. A person notices that some people are making $1 and $2 an hour and are having difficulty managing financially on such a sum. This seems bad and they want to fix it. But they have no model of interaction that would allow them to reason about what might cause such a result. So they decide to pass a minimum wage law so the problem will go away. And it does (apparently). There aren’t any poor people making $1 and $2 an hour anymore. But there are suddenly lots of unemployed people who have to live off welfare (a new problem). Does the person make the connection and realize that they caused this problem? Not without a model of interaction. So instead they say we have to fix the unemployment problem. And then we have to fix the new problems generated by the fix to the unemployment problem. And then we have to fix the new problems generated by the new fixes. And so on.
If you suggest that eliminating minimum wage laws and the government interference that made those people so poor in the first place would be a better solution, they look at you incredulously and say you must be crazy. This is just like the situation with my TA and the student who had added 2 lines of code to make the numbers print out correctly (“Are you crazy? Why would I delete those lines of code when the numbers would then print out incorrectly?” Because the problem is elsewhere, and that’s the problem you should be addressing, but that’s difficult to explain to someone who doesn’t have a model of how his program works). Seriously, I think the credibility gap that existed between my TAs and the students who sought their help is similar to the credibility gap between libertarians and nonlibertarians. And I also suspect that the gap will continue to exist unless and until those other people learn to think in terms of a comprehensive model of interaction.
As usual, I’ve talked more than I should. I’m not sure that I’ve made my point very well, but I think it would require a great deal more time for me to make this more comprehensible. I suspect that the programmers who read this message will understand me, but the others might not. Anyway, I think I’ll leave it mostly at that, but add a few related comments.
Don Knuth, who wrote the CS equivalent of The Bible, says that the thing that most distinguishes computer scientists is their ability to “jump levels of abstraction.” I mentioned that programmers can “play computer,” but what good is that when you are working on a 100,000-line program? It would take so long to simulate the thousands of instructions and the vast amount of data that you’d never get anywhere. But programmers get around this by using abstraction. A programmer can reason about the top-level execution of a program, for example (a macro-view, if you will). But when necessary, he can focus in on a program module, or a single subprogram, or a single loop, or a single line of code (more and more of a micro-view). A programmer can even, when necessary, reason about how that line of code will be translated into machine code and even what changes are likely to happen to the physical hardware involved. A programmer understands a program at all of these levels of abstraction. It is essential that he can jump quickly between levels, and relate information at one level to information at another level, if he is to be able to eliminate problems in his code. I think libertarians also exhibit this behavior. A libertarian can comfortably tell you how governments interact with each other, how governments interact with corporations, how corporations interact with each other, how corporations treat individuals, and how individuals interact with each other. It would be impossible to have a model of interaction without these levels of abstraction and without being able to jump between levels when necessary (e.g., saying, “If government A passes law X, that is likely to pressure government B to also pass law X, which causes the corporations controlled by government B to take action Y, which causes individuals working for those corporations to take actions Z and W.”).
Another link between libertarianism and programming is that the principles of good programming are closely related to libertarian ideals. We call it “top-down programming,” but anyone who has studied structured programming knows that “central planning” is quite different. A well-structured program will have high-level modules that are loosely coupled (i.e., as independent as possible). This means that at the highest level, a program should minimize tasks so that it performs only those tasks that are essential. In other words, “that program is best that programs least.” This is the principle of decentralized government. As another example, the structured programming concept of information hiding is really the libertarian belief in privacy. Information hiding says that the internal details of a subprogram should be independent from other subprograms (in fact, the goal is to have them INVISIBLE to other subprograms). This is like saying that the private choices made by one individual that affect only that individual should not be influenced by other individuals (and would ideally be kept entirely confidential).
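For readers who have not met information hiding, here is a minimal Python sketch of the programming idea only. All names (`ListTally`, `IntTally`, `count_vowels`) are hypothetical, and note the assumption that Python marks internals by convention (a leading underscore) rather than enforcing invisibility the way some languages do.

```python
class ListTally:
    """Keeps a count by storing one entry per event (an internal choice)."""

    def __init__(self):
        self._events = []  # leading underscore: internal by Python convention

    def record(self):
        self._events.append(None)

    def total(self):
        return len(self._events)


class IntTally:
    """Same public interface, completely different internals."""

    def __init__(self):
        self._count = 0

    def record(self):
        self._count += 1

    def total(self):
        return self._count


def count_vowels(text, tally):
    """Client code: relies only on record() and total(), never on the internals."""
    for ch in text.lower():
        if ch in "aeiou":
            tally.record()
    return tally.total()


print(count_vowels("Information hiding", ListTally()))
print(count_vowels("Information hiding", IntTally()))
```

Because `count_vowels` never looks inside the tally object, either implementation can be swapped in without changing the caller, which is what loose coupling buys in practice.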
I mentioned the importance of logic to CS IQ. I believe it is equally important to libertarian philosophy. From my observation, libertarians tend to think that all political questions can be answered with an almost mathematical certitude. There is no such thing as “a friendly disagreement” in mathematics. If two mathematicians disagree, then one is mistaken. Similarly, if two libertarians disagree, each asserts that the other is either operating from a false assumption or has a flaw in his logic. I think nonlibertarians are really turned off by this, particularly because it comes across as obnoxious and egotistical. But libertarians seem to thrive on it. The community has a kind of intellectual-warrior ethos.