schmiddy - Slashdot User

Comment Re:Supernovas (Score 1) 442

by schmiddy on Friday November 18, 2011 @09:22PM (#38105290) Attached to: OPERA Group Repeats Faster-Than-Light Neutrino Results

Comment Related anecdote: software on space shuttle (Score 1) 243

by schmiddy on Monday October 31, 2011 @09:10AM (#37893630) Attached to: The Weight of an e-Book

Comment Re:I believe a citation needed is in order here (Score 2) 315

by schmiddy on Wednesday October 05, 2011 @11:35PM (#37622016) Attached to: After Six Days of Outages, BofA Claims It Hasn't Been Hacked

Comment Re:watch out for Intro to Databases class... (Score 2) 89

by schmiddy on Tuesday October 04, 2011 @08:42PM (#37606938) Attached to: Deadline Approaches For Registration In Stanford's Free CS Classes

Sigh, you've missed the entire point of the "Primary Keyvil" articles (Part 1, Part 2, Part 3), and many similar ones. Let's go through your drivel point by point.

"Student ID" is an acceptable primary key - you will be able to tell if two rows are duplicates based on this alone.

Wrong, wrong, wrong. A surrogate key, like "student ID", actually is an acceptable "primary key" for a table, but only if you have a real way to tell apart your users, something based on an understanding of an answer to the question "what defines a unique student, and how am I going to verify that?".

It's not automatically generated by the database, which is the primary keyvil syndrome.

VERY WRONG! From the database's point of view, and the "primary keyvil" syndrome, it doesn't matter if you fill in the "student ID" using, say, a database function called SYS_GUID(), or whether you generate this GUID on the client side. Read Part 1 of the Primary Keyvil syndrome articles for another example. But let's take our example of a table of students and run with it. You, the database application developer and schema designer, have created a table of students where the only unique key is a "student ID". Let's pretend you're smart, and you only assign new student IDs to new students coming through the gate on admission day. So far, so good, right?
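To make the "it doesn't matter who generates the ID" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is an assumption for illustration: the table names, the columns, the phone number, and the use of uuid4 as a stand-in for something like SYS_GUID(). It only shows that both ID strategies happily accept the same person twice when the surrogate key is the only unique key.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")

# Hypothetical schemas: in both tables the surrogate key is the ONLY unique key.
conn.execute(
    "CREATE TABLE students_db_id ("
    "student_id INTEGER PRIMARY KEY, full_name TEXT, home_phone TEXT)"
)
conn.execute(
    "CREATE TABLE students_client_id ("
    "student_id TEXT PRIMARY KEY, full_name TEXT, home_phone TEXT)"
)


def enroll_db_generated(name, phone):
    """Let the database assign the surrogate key."""
    cur = conn.execute(
        "INSERT INTO students_db_id (full_name, home_phone) VALUES (?, ?)",
        (name, phone),
    )
    return cur.lastrowid


def enroll_client_generated(name, phone):
    """Generate the surrogate key on the client side (a random UUID)."""
    student_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO students_client_id VALUES (?, ?, ?)",
        (student_id, name, phone),
    )
    return student_id


# The same real-world person is entered twice under each strategy.
db_ids = [enroll_db_generated("Joe Smith", "555-0100") for _ in range(2)]
client_ids = [enroll_client_generated("Joe Smith", "555-0100") for _ in range(2)]

db_rows = conn.execute("SELECT COUNT(*) FROM students_db_id").fetchone()[0]
client_rows = conn.execute("SELECT COUNT(*) FROM students_client_id").fetchone()[0]
```

Both counts come out as two rows for one human being. The IDs are perfectly unique; the students are not.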

Well, you're sitting in your office when a freshman comes in and says "Hey, I lost my ID. How do I get a new one?" Now you're in a tough spot. You could say "what's your student ID number?", and if the student knows it, then you print off a new student ID for him, since you know who he is based on his ID number, right? Uh oh, you've just opened a door to students impersonating other students. But let's ignore that problem for now... what do you do if the poor kid doesn't know his ID number? Well, you ask him... his name? Right? What if it's "Joe Smith", and you have fifty of those in your giant state school? Uh, I guess you ask him his name, and his street address, right? That's got to be unique, right? Or maybe his current SSN, those can never change, right? And how do you prove that the student in front of you is really who he says he is?

The frantic grasping around in the above paragraph is why you need to have a good answer to the question "what distinguishes a unique student?" before you go designing a table like this. There are several ways to answer this question: in practice, you might enforce a unique constraint on (full name + home phone number), or maybe just a unique key on SSN if you're daring. But either way, relying solely on some arbitrary identifier like "Student ID" with no actual anchor in reality opens all sorts of paths to trouble. (Incidentally, the Social Security Administration has the same problem; they've just thought through, and lived through, the consequences. They have elaborate, formal answers to the question "how do we distinguish unique people, regardless of SSN?", for scenarios like assigning new SSNs, changing SSNs, replacing lost Social Security cards for people who don't remember their SSN, etc.)
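Here is a rough sketch of the "surrogate key plus a real-world unique constraint" idea, again with Python's sqlite3. The schema, the function names enroll and find_student, and the sample data are all hypothetical, and (full name + home phone) is just one of the candidate answers mentioned above, not a recommendation for a real registrar.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: student_id is still the primary key, but the table
# also declares what makes a student unique in the real world.
conn.execute(
    """
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,
        full_name  TEXT NOT NULL,
        home_phone TEXT NOT NULL,
        UNIQUE (full_name, home_phone)
    )
    """
)


def enroll(name, phone):
    """Return the new student_id, or None if this student already exists."""
    try:
        cur = conn.execute(
            "INSERT INTO students (full_name, home_phone) VALUES (?, ?)",
            (name, phone),
        )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a second row for the same person.
        return None


def find_student(name, phone):
    """Look up a student_id from the natural key, e.g. after a lost card."""
    row = conn.execute(
        "SELECT student_id FROM students WHERE full_name = ? AND home_phone = ?",
        (name, phone),
    ).fetchone()
    return row[0] if row else None


first = enroll("Joe Smith", "555-0100")
duplicate = enroll("Joe Smith", "555-0100")  # same person again: rejected
other_joe = enroll("Joe Smith", "555-0199")  # a different Joe Smith: accepted
recovered = find_student("Joe Smith", "555-0100")
```

The constraint does two jobs here: it stops the same student from being issued a second ID, and it gives the office a way to find the right row when the kid has forgotten his number. It does nothing about proving the person at the desk really is that Joe Smith; that part is still a process problem, not a schema problem.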

Another major problem I didn't even touch on is how your model would prevent a user from getting two student IDs, either intentionally or accidentally. If you haven't answered these fundamental questions, you will have a database full of garbage. Kind of like the No Fly List.

It's as unique as it would be to include the student's DNA in number form as the primary key.

Privacy concerns aside, DNA would actually be a totally reasonable way to distinguish unique students -- student comes in to your office, you take a cheek swab, and issue him his replacement ID card. (Hrm, I guess this is ignoring the issue of genetic clones..)

Tangentially related article which it sounds like you need to read, in addition to getting a basic understanding of "surrogate keys": Falsehoods programmers believe about names.

It's splitting hairs, but it's what professors tend to do anyway.

I wish professors would do their jobs, and split hairs about issues like this. Then we'd have fewer cocksure fools on Slashdot. Sigh, one can dream.

Comment watch out for Intro to Databases class... (Score 1) 89

by schmiddy on Tuesday October 04, 2011 @12:40PM (#37600700) Attached to: Deadline Approaches For Registration In Stanford's Free CS Classes

First, let me say that I really appreciate the work Stanford put into these online classes, especially the "free for everyone" aspect. They've done a great job pioneering free online classes _done well_, with lecture videos recorded well plus lecture notes plus banks of review questions plus exams. Really a great package overall.

I'm slowly going through the Machine Learning class, and the course is great. The instructor does a great job of easing the student into an otherwise math-heavy topic with graphing and hand-plotting, "Intuition", and simple examples.

However, I want to discourage anyone from investing a bunch of time in the "Introduction to Databases" course. Here's a slightly-edited explanation I sent to a friend, to whom I had at first recommended the course, before I had a chance to go through some of the videos (just a background note: I've worked with RDBMSs for several years as an application developer, plus occasionally as a DBA, plus some work on an OSS RDBMS):

After watching two or three of that class's videos I've decided to give up on it. The course seems to have a needless emphasis on XML data storage, which turns out to be basically useless for real-world big data problems. Plus, either the Professor's presentation is unacceptably sloppy or she just doesn't know what she's talking about: lecture video #2 (or #3, I forget) was particularly bad, with imprecise terminology thrown around (row vs tuple) plus highly questionable database design being presented matter-of-factly (table of students flagrantly violates what's known as "Primary Keyvil"). She dove straight into the use of NULLs in this example table, presenting them as perfectly acceptable -- which would be OK for an "intro to MySQL"-type class, but not for a real course on the background of relational theory and RDBMSs (see "SQL and Relational Theory" and its treatment of NULLs).
Not to dissuade you from taking it of course, there is probably some useful information in there.
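For anyone wondering why NULLs in a students table deserve more than a shrug, here is a small sketch of two well-known surprises, run against SQLite through Python's sqlite3. The schema and data are made up for illustration and are not taken from the lecture; other databases differ in some details, so treat this as SQLite's behavior only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: home_phone is nullable but part of a UNIQUE constraint.
conn.execute(
    "CREATE TABLE students ("
    "student_id INTEGER PRIMARY KEY, full_name TEXT, home_phone TEXT, "
    "UNIQUE (full_name, home_phone))"
)

# Both inserts succeed: SQLite treats NULLs as distinct for UNIQUE purposes,
# so the constraint does not stop the duplicate.
conn.execute("INSERT INTO students (full_name, home_phone) VALUES ('Joe Smith', NULL)")
conn.execute("INSERT INTO students (full_name, home_phone) VALUES ('Joe Smith', NULL)")

# NULL = NULL evaluates to NULL (unknown), not true; Python sees it as None.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# So a naive equality filter matches no rows, even though two rows have NULL.
matched = conn.execute(
    "SELECT COUNT(*) FROM students WHERE home_phone = NULL"
).fetchone()[0]
total = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
```

Two rows go in, the uniqueness check never fires, and the obvious query to find them returns nothing. That is the kind of thing I would expect a course on relational theory to at least flag.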

Seeing the professor present her table of students as a simple cut-and-dried example, with an explanation that "student ID" was an acceptable primary key, and no other unique keys on the table, really gives me a poor opinion of the professor's real-world subject matter knowledge.

Comment Re:Best Buy Loves Selling Snake Oil (Score 1) 664

by schmiddy on Tuesday July 05, 2011 @11:06AM (#36661358) Attached to: Retailer Calls Rivals' Bluff On "HDMI Scam"

Comment Re:I thought JAVA was supposed to be crossplatform (Score 3, Funny) 451

by schmiddy on Thursday October 21, 2010 @12:58PM (#33975460) Attached to: Apple Deprecates Their JVM

Comment Common superhero theme.. (Score 1) 419

by schmiddy on Tuesday October 12, 2010 @12:24PM (#33871522) Attached to: Study Finds Most Would Become Supervillians If Given Powers

Comment Re:32 bit signed integer strikes again (Score 1) 270

by schmiddy on Saturday October 09, 2010 @10:37AM (#33845090) Attached to: US Monitoring Database Reaches Limit, Quits Tracking Felons and Parolees

Comment Re:Communicate first? (Score 1) 662

by schmiddy on Tuesday October 05, 2010 @04:45PM (#33799508) Attached to: Can We Travel To That Exciting New Exoplanet?

Comment Re:One step closer? (Score 1) 286

by schmiddy on Tuesday October 05, 2010 @01:32PM (#33796462) Attached to: Skype Officially Available For Android

Comment Insomnia (Score 1) 625

by schmiddy on Friday October 01, 2010 @12:20PM (#33760476) Attached to: Senate Votes To Turn Down Volume On TV Commercials

Comment USPS = Socialism (Score 1) 569

by schmiddy on Thursday September 30, 2010 @09:25PM (#33754978) Attached to: White House Pressuring Registrars To Block Sites

Comment Re:Chill out man (Score 1) 440

by schmiddy on Wednesday September 22, 2010 @03:22PM (#33667118) Attached to: Are Desktop Firewalls Overkill?

Comment Re:So.. (Score 1) 344

by schmiddy on Tuesday September 21, 2010 @11:10AM (#33650468) Attached to: PostgreSQL 9.0 Released
