
Computer Date Glitch May Limit Next Shuttle Launch

Posted by ScuttleMonkey
from the 1999-called-they-want-their-date-bugs-back dept.
n3hat writes "Reuters reports that the next Space Shuttle mission may have to be deferred if it gets too close to the New Year because the onboard computers do not handle the changing of the date in the same way as the ground computers. From the article: '"The shuttle computers were never envisioned to fly through a year-end changeover," space shuttle program manager Wayne Hale told a briefing. The problem, according to Hale, is that the shuttle's computers do not reset to day one, as ground-based systems that support shuttle navigation do. Instead, after December 31, the 365th day of the year, shuttle computers figure January 1 is just day 366.'"

  • How Many Times? (Score:2, Interesting)

    by Anonymous Coward
    How many times is this going to bite us in the ass? Ada solves all these sorts of problems, and soooo many of my tax dollars went into its creation? I understand that the space shuttle is a limited platform, but why aren't any of the lessons learned in Ada being applied?
  • wtf? (Score:5, Funny)

    by PhrostyMcByte (589271) <phrosty@gmail.com> on Monday November 06, 2006 @11:35PM (#16747175) Homepage
    Is there a reason these aren't built on standard parts and operating systems? If they ran their shuttles on something like Debian stable it would be a rock solid platform and probably end up saving them lots of money. Or am I missing something here.
    • Re:wtf? (Score:5, Funny)

      by jbrader (697703) <stillnotpynchon@gmail.com> on Monday November 06, 2006 @11:38PM (#16747211)
      Is there a reason these aren't built on standard parts and operating systems?

      It was built by the government.

      • Re: (Score:3, Insightful)

        by Agelmar (205181) *
        Actually, if you want to be correct, it was built for the Government. There's a difference - rather than building a piece of crap using underpaid (government) labor, we paid top dollar so that it could get subcontracted out multiple levels, while still winding up with the same crap.
    • Re:wtf? (Score:5, Insightful)

      by schnikies79 (788746) on Monday November 06, 2006 @11:41PM (#16747247)
      your idea of rock solid and their idea are a little different. they have probably one of the most bug-free pieces of software in existence. it's tailored to do what it needs to do, nothing more, nothing less and it does it perfectly.
      • Re:wtf? (Score:4, Insightful)

        by Anonymous Coward on Monday November 06, 2006 @11:43PM (#16747269)
        Bug free except for the rollover to a new year...
        • by dhasenan (758719)
          Which suggests a workaround: Set the computers to June rather than December.

          There are issues with this, of course--mainly, if lunar gravity is significant, then calculating it requires knowing the Moon's current position, which can be calculated based on the date (presumably). In which case you need to find an equivalent lunar period that doesn't fall on New Year's.

          Still, it should be possible to update the code to handle the date issue, without much trouble.
          • Re: (Score:3, Interesting)

            by Firehed (942385)
            Or just let the thing call Jan 1 '07 'day 366, 2006' and have the boys in the funny suits wear watches.

            You'd think someone at NASA could just do the five-second fix and be done with it though. It's, what, three lines of code?

            Big hint, NASA:
            while ($day >= 366)
            {
              $year++;
              $day -= 365;
            }
            • Re: (Score:2, Insightful)

              by lnjasdpppun (625899)
              And leap years? And anything else neither you nor I have thought of?

              I suspect they don't do a 5-second fix up because it's a space shuttle and they do far more testing and documentation for their code than any other project in existence.
            • by Carthag (643047)
              that would only work for 2007! Preferably do something like $year = 2006 + $day div 365; $day = $day mod 365;
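The leap-year and multi-year objections raised in the replies can be sketched as follows. This is only an illustration of the date arithmetic, not anything resembling flight software: the function name `normalize_day_of_year` is hypothetical, and it assumes a 1-based day-of-year counter that simply keeps counting past the end of the year, as the article describes.

```python
import calendar
from typing import Tuple


def normalize_day_of_year(year: int, day: int) -> Tuple[int, int]:
    """Fold an overflowing 1-based day-of-year into the following year(s).

    Hypothetical helper: handles leap years and counters that have
    run past more than one year boundary.
    """
    if day < 1:
        raise ValueError("day must be >= 1")
    while True:
        days_in_year = 366 if calendar.isleap(year) else 365
        if day <= days_in_year:
            return year, day
        day -= days_in_year
        year += 1


print(normalize_day_of_year(2006, 366))  # (2007, 1): 2006 has 365 days
print(normalize_day_of_year(2008, 366))  # (2008, 366): 2008 is a leap year
print(normalize_day_of_year(2006, 731))  # (2008, 1): two full years consumed
```

Even this small version needs a year input that the day counter alone does not carry, which is part of why the "three lines of code" estimate is optimistic.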
            • Re: (Score:3, Informative)

              by camperdave (969942)
              That might work if the shuttle computers ran C. They run HAL/S. Also, in how many places does this code fragment need to be installed? How does the timing of the loop affect the rest of the code? Remember, the shuttle has multiple computers. If the timing is too different between them, they can shut down the mission.

              It would be far safer to just delay the launch a few days.
              • by Firehed (942385)
                How so? In space or on the ground, the clock's still going to suffer from the bug.
            • Re:wtf? (Score:5, Insightful)

              by Splab (574204) on Tuesday November 07, 2006 @05:20AM (#16749309)
              You know, people like you give programmers a bad rep. You just dive in for the fix without knowing the cause - and on top of that add a few bugs that are even harder to iron out if you happen to be the only person knowing that code segment.
            • by multipartmixed (163409) on Tuesday November 07, 2006 @10:49AM (#16751195) Homepage
              So, what if, oh, say, the CO2 scrubbers need to work differently depending on how many days the mission has been running? So, they keep track of the first day number, and the current day number. The amount of CO2 scrubbing then is varied based on elapsed days.

              ^^and here's the key -- it's something you don't know about^^

              Now, you make your little 5-second fix, and send seven astronauts into space.

              New Year's Eve rolls around, and suddenly the mission started on day 360 and it's now day 1. Holy crap, says the scrubber, we have to scrub as though it's a 359-day mission, instead of a lousy 6.

              Scrubbers go into overtime, and break. (Or, scrubber math is done in eight bits, and they think the shuttle's still on the ground and not ready to launch for another ~100 days due to integer roll-over. Or any other set of unforeseen possibilities.)

              Next, astronauts die of CO2 poisoning because the scrubber subsystem has been compromised.

              Great fix, mister five-second-coder.
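The hypothetical scrubber scenario above comes down to one subtraction. A minimal sketch, assuming (as the comment does) a consumer that computes elapsed days as current day minus launch day; both helper names are invented for illustration and do not describe any real shuttle subsystem.

```python
def elapsed_days_naive(start_day: int, current_day: int) -> int:
    """Elapsed mission days from two day-of-year readings (hypothetical)."""
    return current_day - start_day


def elapsed_days_8bit(start_day: int, current_day: int) -> int:
    """Same subtraction, truncated to an unsigned 8-bit result (hypothetical)."""
    return (current_day - start_day) & 0xFF


launch_day = 360

# Counter keeps counting past 365: the subtraction still works.
print(elapsed_days_naive(launch_day, 366))  # 6

# Counter is "fixed" to reset to 1: the same subtraction goes negative.
print(elapsed_days_naive(launch_day, 1))    # -359

# With 8-bit arithmetic the negative value wraps to a plausible-looking number.
print(elapsed_days_8bit(launch_day, 1))     # 153
```

The point is not that such code exists on the shuttle, only that a monotonic counter is something downstream code can silently depend on.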
          • Re: (Score:2, Interesting)

            by Digicrat (973598)
            Even if it is a simple bug fix, if they just identified it now, it definitely would not be "flight ready" by the end of the year.

            Any code to be used on a spacecraft must be thoroughly tested and certified (by the developers and a separate team of testers) before it can be uploaded to the craft. That process can easily take several months to complete. This is the procedure for unmanned spacecraft, I'm sure the procedure for the Space Shuttle is even more stringent.

            My guess is they're writing a solution now
        • by joe90 (48497)
          If you RTFA, it seems to have been a design decision, rather than a bug per se. Just because something isn't present in an application, it doesn't necessarily mean it's a bug.

          IIRC, software design for the space shuttle is somewhat detailed, so I don't think something like the date functionality is anything but a deliberate design decision.
          • by wwwillem (253720) on Tuesday November 07, 2006 @04:12AM (#16748951) Homepage
            This is not a bug ....

            Imagine you are a member of the shuttle design team and you can make a choice (for the next 20 years) to either know for sure that you're with the kids at home on X-mas and New Year .... or you can suggest a software feature that could result in your New Year's Eve being spoiled down the road because you have to spend days in a dumb control room. Hey, what would you do??

            And I still remember, when I was a kid, that we had that Apollo flight during X-mas. I think it was the one that would for the first time go behind the moon. Someone in the control room that year made it into an important enough person on the Shuttle program so that this WOULD NEVER HAPPEN AGAIN. :-)

        • It's not a bug; the software spec apparently did not call for such a rollover. Feature drift would be tinkering with the code to allow a rollover and thereby introduce a flaw.
          • by Lehk228 (705449)
            a better software design would have been to simply use int days_since_epoch for your day, then have some nice display formatting code whenever meatware needs to interpret a date. that way any problems converting logical dates into meatspace-compliant dates would only confuse the people rather than crash (no pun intended) anything important
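The days_since_epoch idea can be sketched like this. The epoch chosen here (1970-01-01) and both function names are assumptions for illustration only; the comment does not name an epoch.

```python
from datetime import date, timedelta

# Assumed epoch for illustration; any fixed reference date would do.
EPOCH = date(1970, 1, 1)


def to_days_since_epoch(d: date) -> int:
    """Internal representation: a single integer that only ever increases."""
    return (d - EPOCH).days


def format_for_humans(days_since_epoch: int) -> str:
    """Display-only conversion to a year and 1-based day-of-year."""
    d = EPOCH + timedelta(days=days_since_epoch)
    return f"{d.year} day {d.timetuple().tm_yday:03d}"


new_years_eve = to_days_since_epoch(date(2006, 12, 31))
print(format_for_humans(new_years_eve))      # 2006 day 365
print(format_for_humans(new_years_eve + 1))  # 2007 day 001
```

The internal counter goes up by exactly one across the year boundary; only the formatting layer knows that a new year started.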
      • Uhm. I think the whole point of this article is that it doesn't do it perfectly.

        That said, I'll agree that NASA's software is certainly a heck of a lot more stable than Debian. After all, this is rocket science.
        • You (and many other posters) are assuming it's a bug in the shuttle code. But maybe it's supposed to treat year end rollover that way. The shuttle code is written to a spec and if the spec says that's how it should behave then either the spec is wrong or the ground based systems are wrong.
      • they have probably one of the most bug-free pieces of software in existence.

        Ahh, so this must just be an unexpected feature...

      • your idea of rock solid and their idea are a little different. they have probably one of the most bug-free pieces of software in existence. it's tailored to do what it needs to do, nothing more, nothing less and it does it perfectly.

        Actually, that's not entirely true. The space shuttle's main computer is called the GPC -- General Purpose Computer. Among other things it controls avionics and a whole bunch of other systems -- it's not really a dedicated-purpose computer as one might think.

    • Re: (Score:3, Informative)

      by SuperBanana (662181)

      If they ran their shuttles on something like Debian stable it would be a rock solid platform and probably end up saving them lots of money. Or am I missing something here.

      Yes, you are. Primarily the fact that Debian Stable isn't even Carrier Grade, and certainly not qualified for life support. I.e., "doesn't crash in the midst of re-entry."

      Keep in mind that the shuttle computers are highly redundant (I believe there are three main computers, and three backups of each component?), monitor a HUGE number

      • by Mr Z (6791)

        When I heard it described to me some time ago, they used 3 copies of the primary system that operate via a voting system. If all three fail due to a shared glitch, a 4th machine, independently implemented to the same specs takes over. The idea of the 3 primaries is that a hardware failure can be addressed through votes, but an incorrect implementation cannot. The 4th machine is a hedge against incorrect implementation. If the specifications were wrong, you're just simply hosed and no amount of redundan
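The voting scheme described in the comment above can be sketched in a few lines. This follows the comment's description (several copies of one implementation, majority wins), not any documented shuttle design, and `majority_vote` is a hypothetical name.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(outputs: Sequence[int]) -> Optional[int]:
    """Return the value a strict majority of redundant units agree on.

    Returns None when there is no strict majority (or no outputs at all).
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None


# One unit has a hardware fault: the other two outvote it.
print(majority_vote([42, 42, 41]))  # 42

# No two units agree: the vote cannot decide.
print(majority_vote([1, 2, 3]))     # None

# All three share the same software bug: the vote happily returns
# the wrong answer, which is why an independent implementation helps.
print(majority_vote([7, 7, 7]))     # 7
```

The last case is the one the comment is getting at: voting catches disagreement, not a mistake that every copy makes identically.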

    • by sholden (12227)
      You're missing that they were designed in the 70s - a little before Debian existed: http://spaceflight.nasa.gov/shuttle/reference/shutref/orbiter/avionics/dps/gpc.html [nasa.gov]
    • Re:wtf? (Score:5, Interesting)

      by tverbeek (457094) * on Monday November 06, 2006 @11:49PM (#16747337) Homepage
      Is there a reason these aren't built on standard parts and operating systems? If they ran their shuttles on something like Debian stable it would be a rock solid platform and probably end up saving them lots of money. Or am I missing something here.
      Yeah, you're missing something. Such as the fact that the Shuttle was designed a quarter century ago, when Debian was so far from a stable release that Ian hadn't gotten the hang of long division yet, and still believed that Deb had cooties. "Standard parts" meant 8-bit CPUs with 64KB address spaces, and "standard operating systems" included CP/M, 4BSD, VMS, and an upstart known as PC-DOS. I can't really blame them for building something in-house instead.
      • Re:wtf? (Score:4, Interesting)

        by Anonymous Coward on Tuesday November 07, 2006 @12:15AM (#16747579)
        You neglect the fact that military/gov't programming languages at this time included HAL/S, Jovial, NELIAC, &c. (Yes I know that Jovial is still in use for the Navy's ITS 8/16bit microcontrollers). I've used XPL (the language that the HAL/S compiler was written in) & HAL/S; it was basically the predecessor to Ada/SPARK's 'provability' & 'stability'.
        Here is a decent source of HAL/S examples:
        http://www.hq.nasa.gov/office/pao/History/computers/Appendix-II.html [nasa.gov]

        Now, look at the procedure called 'read_accel', about 1/4 down the page.
        Midway through, there is a ton of gunk. That's the HAL/S maths for you: the program is allowed to use three lines to express mathematical code, to 'mimic' math in code. Now, this is the '70s; it's little wonder that they weren't worried about the date switch so much as making sure that:
        1) The compiler produced code that could be checked & double checked to be '100%' failure proof and at least be resilient to problems.
        2) It had to deal with the beast of being machine independent & easily understandable to the PL/I & FORTRAN programmers of the day.
        3) It also had to make sure that tasks were scheduled properly & run when specific interrupts happened. This is the norm for Ada/SPARK now, but HAL/S was pretty much the pioneer here within the aerospace field.
      • by Inoshiro (71693) on Tuesday November 07, 2006 @03:51AM (#16748815) Homepage
        "Yeah, you're missing something. Such as the fact that the Shuttle was designed a quarter century ago, "

        I can't believe this was moderated as +5, Insightful.

        The shuttle was designed WELL OVER a quarter century ago. A quarter century ago, they had done so much design and testing that they were able to have the maiden flight (STS-1, Columbia launched in April 1981). Shuttle design and specification requirements analysis began in October of 1968. VMS, CP/M, PC-DOS, and 4BSD did not exist when the Shuttle was designed.

        You must be thinking of Multics, which was the closest thing to a modern operating system that existed in the 1960s.

        Seriously, you have no idea how old the Shuttle design is. I have no idea why they keep using it after the great work done 20 years ago by Richard Feynman [wikipedia.org] who showed that the failure rate of NASA's shuttle design was about 1 in 100 flights. For the record, we've sent up 200 missions and had 2 shuttles blown up. The Space Shuttle is a piece of garbage, and NASA has wasted billions exploring low Earth orbit, rather than do something more useful.
        • by Overzeetop (214511) on Tuesday November 07, 2006 @09:14AM (#16750355) Journal
          Actually, the estimated failure rate for the shuttle program was 1 in 35, though the shuttles themselves may have been designed to withstand 100 launch/landing cycles*. This was a bit of an issue when the 25th mission resulted in a failure (since most of the population does not understand statistics).
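The statistics point can be made concrete with a short calculation. This assumes each flight fails independently with the same probability, which is a simplification not stated in the comment; the function name is hypothetical.

```python
def prob_at_least_one_failure(per_flight_failure: float, flights: int) -> float:
    """Chance of one or more failures in `flights` independent flights."""
    return 1.0 - (1.0 - per_flight_failure) ** flights


# With the quoted 1-in-35 estimate, over the first 25 missions:
p = prob_at_least_one_failure(1 / 35, 25)
print(f"P(at least one failure in 25 flights) = {p:.2f}")

# Expected number of failures over the same span.
print(f"Expected failures = {25 / 35:.2f}")
```

Under those assumptions the chance of at least one failure by the 25th flight comes out slightly better than even, so a loss at that point is consistent with a 1-in-35 estimate rather than evidence against it.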

          And, for the record, there have been 117 launches, according to wiki, which I will take as accurate enough for this discussion (far less than 200).

          *yes, IWAAE (I was an aerospace engineer) working for NASA, and was involved with shuttle payloads and structural reliability analyses.
      • Re: (Score:3, Funny)

        by Pseudonym (62607)
        Such as the fact that the Shuttle was designed a quarter century ago, when Debian was so far from a stable release [...]

        Right. It only seems like Debian stable releases are a quarter century apart.

    • Re:wtf? (Score:5, Insightful)

      by Schraegstrichpunkt (931443) on Tuesday November 07, 2006 @12:00AM (#16747433) Homepage
      Is there a reason these aren't built on standard parts and operating systems?

      Standard parts don't like being bombarded with radiation. Standard operating systems aren't fault-tolerant.

    • Re:wtf? (Score:5, Interesting)

      by E-Lad (1262) on Tuesday November 07, 2006 @12:25AM (#16747671) Homepage
      You're indeed missing something here.

      While I'm not thoroughly educated on this particular subject, I would say that there's a pretty good chance that the flight computers on the shuttles are based on technology that's at least 15 years old (all shuttles underwent a "glass cockpit" update in the mid-late 90s). You don't see NASA cutting a purchase order to cdwg.com when the newest AMD or Intel offering is announced and stuffing that into the shuttles. This stuff is designed, planned, coded for and integrated over a number of years and is very static. No changes. If there have to be changes, they're done under quality control methods so strict that, yes, Duke Nukem 3D might see the light of day first.

      And that's just the hardware part.

      On the software side, I'd say you're probably looking at stuff written in any assortment of "classic" languages such as ADA, COBOL, or worse. Due to the metric f*k ton of sensors, mechanical servos, data inputs, and other such esoteric (and dated) hardware on the shuttles that the software must control, query, parse and monitor, the software is pretty darn married to the platform it runs on.

      So, before blurting "D0odz, just instahl leenux n yr shuttlz (deeban stble rox wif glox!)" Give it some deeper thought. There's likely a darn good reason why things are the way they are (bugs notwithstanding) when it comes to large flying contraptions that are designed to safely get 7 people 300 miles up, keep them there for a week (or two) and get them home. Sometimes simple things (to you and me) such as a year roll-over are outside the scope when it comes to designing systems to do what the shuttle does.
      • Re: (Score:2, Insightful)

        by Xipher (868293)
        On top of that, it's a realtime system. None of this "get it done when I want to"; it's "get it done by this deadline, or people DIE!"
      • Ok, I agree with everything you said, but seriously COBOL? Try FORTRAN. ADA's a good guess though.
      • The shuttle software development process is actually famous. A textbook case of good software engineering. At least a decade ago, it spent the most per line of code of any government software contract. The development documents take up several shelves of large binders. And it was noted for being almost entirely bug free. I doubt it's a hodgepodge of different languages, though it might be I suppose. ...It amazes me that with all their planning they didn't think about a year change.
      • What does the date have to do with controlling a "large flying contraptions that are designed to safely get 7 people 300 miles up, keep them there for a week (or two) and get them home"? Flight control, navigation, etc are all done in real time, not on a calendar basis. The only thing I can think of is maintaining a log, in which case the date doesn't really matter. The log file can be cleaned up on the ground afterwards.
    • by voidptr (609)
      Or am I missing something here.

      Yes.
  • Well.... (Score:4, Insightful)

    by SuperBanana (662181) on Monday November 06, 2006 @11:37PM (#16747197)
    ...I guess it -is- rocket science.

    *ducks and runs for cover*

    Seriously though- they never "envisioned" a mission occurring over the end-of-year? Let me guess: a defense (space) contractor designed the systems.

  • Uhm...and? (Score:2, Interesting)

    Pardon my ignorance, but is this really serious enough that it should actually cause a delay? I mean, if it's simply a matter of figuring out what the date is, I'm sure that the astronauts and engineers involved in the project know at LEAST basic mathematics, and can determine that if it's, say, Day 367 on the shuttle computer, then 367-365 = 2, AKA January 2nd, 2007.

    I'd say the article missed something; the whole concept sounds far too ridiculous to stand on its own.

    • Re: (Score:3, Informative)

      by Broken scope (973885)
      Actually a lot of data depends on that date. Navigation and other systems probably rely on the date being correct to give an accurate reading or reliable function.
    • Re: (Score:2, Informative)

      by Zordak (123132)
      The article is sparse on details, but it sounds more like a problem where the date on the shuttle's computer does not match the date on the ground system's computer. That can be a problem.
    • Re: (Score:3, Insightful)

      by plaxion (98397)
      It probably has to do with the mismatch between systems, not a lack of the engineers' or astronauts' ability to count on their piggies and toes. Their current configuration doesn't have a middleware layer that accounts for any possible differences. In other words, while the shuttle continues on thinking it's the 366th day, the ground control systems might get confused (e.g. "Hey, there's no such thing as a 366th day") and their programs may crash (no pun intended) as a result.
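The mismatch described above can be illustrated with a strict ground-side check. This is a guess at the kind of validation that could trip, not a description of any real ground system; `ground_accepts` is a hypothetical name.

```python
import calendar


def ground_accepts(year: int, day_of_year: int) -> bool:
    """Hypothetical strict validator: the day must exist in that year."""
    last_day = 366 if calendar.isleap(year) else 365
    return 1 <= day_of_year <= last_day


# What the onboard counter would report on January 1, 2007,
# according to the article: still year 2006, day 366.
print(ground_accepts(2006, 366))  # False: 2006 has no day 366

# What the ground systems would expect for the same instant.
print(ground_accepts(2007, 1))    # True
```

Both sides are internally consistent; they only disagree about how to label the same moment, which is the kind of problem a translation layer between them would have to absorb.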
    • An even better question is, why does it need to know the calendar date at all? I have very intimate knowledge of 4 different spacecraft software designs (two of them at least on the order of the shuttle flight software), and none of them calculate the gregorian date directly, only one knows anything at all about calendar dates, and that is used for a VERY trivial purpose that wouldn't be affected by this sort of bug. Julian Date, or some variation thereof, is the usual time reference on spacecraft, and tha
  • So we've been flying the space shuttle for almost 30 years and this was never viewed as a problem before? And we're trusting NASA to send astronauts to the moon and beyond over the next 20 years?
    • They had a system with some limitations that worked back in the 70's and was developed for lots and lots of money. They have a tried-and-true technology. The article says that their fleet is due to retire in 2010 and they were looking for a way to change this back in 2003, so my guess is probably that they're just waiting for the next generation of shuttle instead of retrofitting a system that already works within some bounds, which would cost both money and time. If they have to delay launches by a couple
  • Bites me (Score:2, Interesting)

    The shuttle computers were never envisioned to fly through a year-end changeover

    Sorry to sound so skeptical....but am I the only one who is worried about the capability of missiles (and other defence systems) to handle war through a year-end changeover?

  • Yay! (Score:3, Funny)

    by eric.t.f.bat (102290) on Monday November 06, 2006 @11:56PM (#16747381)
    Three cheers for the Y0.001K problem!
  • by T-Ranger (10520) <jeffw@@@chebucto...ns...ca> on Tuesday November 07, 2006 @12:03AM (#16747467) Homepage
    The shuttle runs on three modified IBM 360 systems. We're pushing 35, almost 40 year old systems here.

    Do you know how many eligible 35 year old computer bachelors there are out there? I'll tell you: none. Of course the shuttle computers can't get a date.
  • I used to write this kind of glitch into my kludgy Commodore BASIC Vic20 programs when I was a kid. I never thought I was good enough to be a professional though. Seems like maybe there's a chance for me after this? I reckon the shuttle could run with BASIC if sensible line numbers were used. And GOSUBs for that professional touch.
  • I read it quickly and thought it said, "The shuttle computers were never envisioned to fly through a year-end hangover".

    I couldn't figure out for the life of me why they'd let mission critical crew drink bubbly in space... or why the computer would give a damn.
  • by SchnauzerGuy (647948) on Tuesday November 07, 2006 @12:09AM (#16747541)
    As a professional software developer, I have heard on countless occasions about how the Space Shuttle software development process [fastcompany.com] is so incredible, and how all other developers should try to live up to their high standards.

    Granted, the work they do is very impressive and the process is very exacting. But come on...they haven't been able to fix a simple year rollover event in 30 years?!?

    From the Fast Company article:

    Consider these stats : the last three versions of the program -- each 420,000 lines long -- had just one error each. The last 11 versions of this software had a total of 17 errors.

    I would say that requiring a reboot every year on December 31 is a pretty huge error. In this case, it is forcing NASA to launch earlier than they otherwise would wish. And this isn't the first time this type of problem has caused problems. The New Scientist has a similar article [newscientistspace.com] that goes into more detail:

    This is not the first time that the shuttle programme has been faced with the year-end rollover problem. On a Hubble servicing mission in 1999, the year of the overblown Y2K computer scare, the shuttle landed on 27 December (see Fuel fault delays space repair). To make sure the shuttle got back on the ground before 31 December, mission managers decided to drop one of the four planned spacewalks.
    • by Nataku564 (668188)
      Have you by chance considered that fixing it may cause more problems than it actually solves? Not all software behavior that seems strange is a bug. If it's an understood part of the design, and you can work around it easily, then so be it. I would much rather fly on that, than knowing that the engineers had just added in shiny cool year-end date rolloff functionality that may or may not work, as opposed to the current software that has worked near flawlessly for decades. Especially with dates. They may
      • I'm not sure what you mean by causing more problems. In these two articles, we have two clear examples of how the poor design/poor programming has caused some significant problems:
        1. Pushing NASA to launch earlier than they would otherwise desire - the software is causing the problem. It might be an inconvenience or scheduling issue right now, but what if the schedule slips for other reasons (weather, typical Shuttle problem, idiot boater in the safety exclusion area)? How many millions of dollars wi
    • Perhaps the shuttle system specification states that the date should be treated that way. It might be the ground-based systems that are wrong. It might be that neither system is wrong and it is the specifications for each system that are inconsistent.
    • ...which is more than many software development processes would reveal. Chances are that this known restriction is on a check-list which every shuttle mission has to be checked against, and the list would exist precisely because the software development and verification process is so solid and conservative.

      As a professional software developer, I have heard on countless occasions about how the Space Shuttle software development process is so incredible, and how all other developers should try to live up

    • by achurch (201270) on Tuesday November 07, 2006 @04:18AM (#16748993) Homepage

      I would say that requiring a reboot every year on December 31 is a pretty huge error.

      I wouldn't. When you're designing something like Shuttle software that has to work absolutely flawlessly 100% of the time, you don't put in any frills. And on something that is only ever in space for 10-15 consecutive days at most, year-end handling is most certainly a frill. (If you are a professional software developer, it ought to be obvious just how many things could break by adding a feature like that. If the original design calls for a monotonically increasing day number, for example, there's very likely to be some code that relies on that, so you have to go through the entire system, checking everything that even touches the day counter to ensure it can handle a reset from 365 to 1--and then check everything that uses those routines, and so on and so on.)

      I suspect this is routine to NASA, and the reporter just blew it out of proportion. After all, Windows can handle end-of-year rollover, so if the Shuttle can't then it's broken, right?

  • That's the kind of lame bug we used to joke about in Soviet equipment design.
  • If it ain't broke (Score:4, Interesting)

    by GreggBz (777373) on Tuesday November 07, 2006 @12:12AM (#16747571) Homepage
    So, they made the software so it does not kill anyone. Who needs fancy features like precise yearly timing?

    Seriously, though, it's worked fine. The software has not killed anyone. They can either fix it and modify a very critical system on an enormously complex vehicle, or they can move the launch date around a few days, which they seem to do for every launch anyway. B is probably safer and more predictable.

  • by Jozer99 (693146) on Tuesday November 07, 2006 @12:20AM (#16747633)
    The problem seems obvious. If the shuttle computer is allowed to think it is the 366th day of the year, it will obviously turn evil and try to destroy the earth using the vast orbiting nuclear arsenal, while we sit helpless on the surface. We can't allow this to happen.
  • Date/Time Formats (Score:5, Informative)

    by Detritus (11846) on Tuesday November 07, 2006 @12:33AM (#16747731) Homepage
    The problem is that NASA, and other space agencies, standardized on a date/time format composed of day-of-year (1..366) and time-of-day (UTC). This goes back to the 1960s. In ASCII, the clock looks like "310 04:35.27.642". This date/time format is embedded in a huge amount of hardware, software and standards documents. It's also used for things like countdown clocks and MET (mission elapsed time) clocks.

    The end-of-year rollover depends on the leap year and leap second (if any), and has traditionally been a source of problems.
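A minimal parser for the clock string quoted above makes the limitation visible. The field layout is taken from the single example in the comment ("310 04:35.27.642"); accepting either "." or ":" between minutes and seconds is an assumption, and `DoyTime` and `parse_doy_time` are hypothetical names.

```python
import re
from typing import NamedTuple


class DoyTime(NamedTuple):
    day_of_year: int
    hour: int
    minute: int
    second: int
    millisecond: int


# DDD HH:MM.SS.mmm, following the example string in the comment.
_PATTERN = re.compile(r"^(\d{3}) (\d{2}):(\d{2})[.:](\d{2})\.(\d{3})$")


def parse_doy_time(text: str) -> DoyTime:
    """Parse a day-of-year clock string into its numeric fields."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a day-of-year timestamp: {text!r}")
    fields = DoyTime(*(int(group) for group in match.groups()))
    if not 1 <= fields.day_of_year <= 366:
        raise ValueError("day of year out of range")
    # second == 60 is allowed so that a leap second can be represented.
    if fields.hour > 23 or fields.minute > 59 or fields.second > 60:
        raise ValueError("time of day out of range")
    return fields


print(parse_doy_time("310 04:35.27.642"))
```

Note that the string carries no year at all, so two timestamps on either side of December 31 cannot be ordered without extra context.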

  • by tlhIngan (30335) <(ten.frow) (ta) (todhsals)> on Tuesday November 07, 2006 @12:35AM (#16747755)
    Could it simply be that the date is a hard concept? You've got months with uneven number of days in them, including one month that can have an extra day added to it based on a somewhat complex concept (every 4 years, except if it's divisible by 100, UNLESS that year also happens to be divisible by 400). Calculating how many days there are between now and some future date, without using magic numbers? Heck, even software in the 90's couldn't get it right that there was a Feb 29, 2000.
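The leap-year rule spelled out above fits in one expression. A small sketch with hypothetical function names, written out by hand rather than calling a library so that the three conditions are visible.

```python
def is_leap_year(year: int) -> bool:
    """Gregorian rule: every 4th year, except centuries, unless divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def days_in_february(year: int) -> int:
    """Length of February under the rule above."""
    return 29 if is_leap_year(year) else 28


print(is_leap_year(2000))      # True: divisible by 400
print(is_leap_year(1900))      # False: century not divisible by 400
print(is_leap_year(2004))      # True: ordinary leap year
print(days_in_february(2006))  # 28
```

The year 2000 case is the one that tripped up software that implemented only the first two conditions.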

    Every date math equation I've seen has all sorts of weird magic numbers in them where it isn't clear how those numbers were obtained. This may work just fine in day to day computations, but oddball bugs in date calculations can lead to some very weird errors. Look at the C library sometime for the date functions. It's quite impressive.

    Perhaps when the shuttles were designed, the inability to schedule across the new year was acceptable to avoid introducing odd bugs in the program to keep the software provably correct. Ground systems, which can be repaired in the middle of a mission easily, can be a little less bug-free, since a miscalculation won't cause the Earth to suddenly veer off course.
    • by Shados (741919)
      Indeed. The date system is all made to periodically fix itself to compensate for little details, like the fact that the Earth doesn't spin on its axis a whole number of times in the time it takes to go around the Sun, and a bazillion other details. It's just not worth it. Make 10 freaking months, 3 weeks of 10 days (or something), and that's one year. 1 day has 10 hours of 100 minutes each with 100 seconds. Use round numbers. So for now winter will be on month 5, in a few years winter will be on mont
    • by isomeme (177414)
      That's why you never let date/time components get past the outermost layer of I/O. Internally, everything should be either a Julian Day (or modified JD) for date-resolution values, or an epoch second or millisecond for values of those resolutions.

      I'm astonished that anyone ever builds time processing and storage systems any other way.
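      A small sketch of that approach, assuming a "YYYY-DDD HH:MM:SS" text format at the I/O edge (the format and function names are my own, not any NASA standard): parse to epoch seconds on the way in, do all arithmetic on the integer, and format only on the way out.

```python
from datetime import date, datetime, timezone


def parse_doy(text: str) -> int:
    """Parse 'YYYY-DDD HH:MM:SS' (taken as UTC) into whole epoch seconds."""
    dt = datetime.strptime(text, "%Y-%j %H:%M:%S").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


def format_doy(epoch: int) -> str:
    """Format epoch seconds back into 'YYYY-DDD HH:MM:SS' (UTC)."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime("%Y-%j %H:%M:%S")


def julian_day_number(d: date) -> int:
    """Julian Day Number for date-resolution values (the day that starts at noon UTC)."""
    return d.toordinal() + 1721425


if __name__ == "__main__":
    # Internally, crossing the year boundary is just "add one second".
    last_second = parse_doy("2006-365 23:59:59")
    print(format_doy(last_second + 1))
    print(julian_day_number(date(2000, 1, 1)))
```

      The year-end rollover only shows up in `format_doy`; nothing in the middle ever sees a day-of-year.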
  • Hold up, everybody (Score:5, Insightful)

    by Alizarin Erythrosin (457981) on Tuesday November 07, 2006 @12:52AM (#16747891)
    I work with military navigation software, and that is sorta remotely applicable to this. Here's my thoughts:

    You people with your "WTF NASA SUXORS THIS IS EASY FIX!!!11!!1!one!!" need to stop and think for a second. This is a space application that carries HUMAN BEINGS! Think about how hard it will be to get this "easy fix" qualified, proven, documented, etc. It's not an easy task. A formal qualification test on the systems I work on (military land- and air-, but not space-based navigation software) can take months, and require all sorts of tests and documentation. Anything that isn't formally tested (i.e. run in a van, on a plane, etc) must be shown to not fail in any way; all exceptions handled, no bad data can cause an undesirable state, etc. I would hate to see the type of scrutiny that the Shuttle software goes through (although I could probably call somebody in our Space division across the street and find out).

    Second, I don't know exact specifics, but based on the information provided, I think this "glitch" will have to do with the date/time difference between ground stations and the Shuttle computers. Things like message time stamping between the Earth and the Shuttle, etc, will be wrong, and things could be garbled or just dropped altogether. The navigation systems themselves should not be terribly impacted since the date will just roll to the next day. Inertial instrument samples will continue to flow in and be correctly time stamped, be it the 366th or 400th or 500th day.
  • I caused quite an uproar with a post to The DailyWTF [thedailywtf.com] where I proposed that dates like “September 31, 2005” could be considered the same as “October 1, 2005”. The responses are varied and some of them insightful. Worth a read if this stuff interests you.

  • by Venik (915777)
    In other news: a computer glitch may elect a President.
  • by LakeSolon (699033) on Tuesday November 07, 2006 @02:31AM (#16748479) Homepage
    I originally found it hard to believe the shuttle hasn't been in orbit over new year's before.

    http://en.wikipedia.org/wiki/List_of_space_shuttle_missions [wikipedia.org]

    The closest I could find was STS-103, the HST servicing mission in '99. Launched December 19th and lasted 7d 23h.
  • http://www.fastcompany.com/magazine/06/writestuff.html is a really good read on how the shuttle software is actually made. It's the most reliable software in the world with the most exacting design process.

    How many other groups can deliver a half million lines of code with only 1 error? (And no, not this issue.) As far as this being an error or bug, it really isn't. It's a known design restriction on a system that just works. Do you really want to go redesign a large chunk and possibly introduce life-threatening bugs, or work within the known design window for the system?
  • by JetScootr (319545) on Tuesday November 07, 2006 @07:08AM (#16749747) Journal
    TFA carefully does NOT say that anything actually will fail, but that something might fail. Thank you, Fallon: your link (http://www.fastcompany.com/magazine/06/writestuff.html) is a good explanation. (However, the "on-board shuttle group" is actually called the "on-board systems group").
    It's like this: A clock rollover (such as at midnight or the last day of the month or year) always sets something back to zero. That resetting is a risk: Is there something somewhere that doesn't take the rollover into account? It may be an obvious bug, or not so obvious - what if the problem is dynamic? For example, what if system A sends some data and rolls over, and system B rolls over and receives the data? Then it looks like stale data, but isn't. How do you test for dynamic conditions like this?
    Dodging this bullet is far, far cheaper than testing for it.
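    A toy version of the "looks like stale data, but isn't" case, purely as an illustration. The freshness check, the 5-second limit and all names here are assumptions of mine, not anything from the flight software.

```python
SECONDS_PER_DAY = 86_400


def naive_age(sent_doy: int, sent_sec: float, now_doy: int, now_sec: float) -> float:
    """Message age in seconds, computed as if day-of-year never wraps."""
    return (now_doy - sent_doy) * SECONDS_PER_DAY + (now_sec - sent_sec)


def rollover_aware_age(sent_doy: int, sent_sec: float, now_doy: int, now_sec: float,
                       days_in_sender_year: int = 365) -> float:
    """Same, but treat a negative age as one year-end wrap on the receiver's side."""
    age = naive_age(sent_doy, sent_sec, now_doy, now_sec)
    if age < 0:
        age += days_in_sender_year * SECONDS_PER_DAY
    return age


def looks_stale(age: float, max_age: float = 5.0) -> bool:
    """Hypothetical freshness check: reject negative or too-old timestamps."""
    return age < 0 or age > max_age


if __name__ == "__main__":
    # Sent half a second before midnight on day 365, received half a second after rollover.
    sent, now = (365, 86_399.5), (1, 0.5)
    print(looks_stale(naive_age(*sent, *now)))
    print(looks_stale(rollover_aware_age(*sent, *now)))
```

    Writing the fix is the easy part. Proving that every consumer of every timestamp does the equivalent, under every ordering of send, receive and rollover, is the expensive part.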
    The only time I know of that a shuttle flight software bug affected a flight was uh...STS 2 or 3 or thereabouts. The shuttle often flies an updated load on one or two of its computers before the load is installed on all of them. On this mission, a new load on one GPC dumped (crashed) at T -9 seconds or so, causing everything to shut down automatically. The shuttle launched a day or two later, after the new load was rolled back.
    Funny thing was, the same bug had occurred in the training simulators before launch, but was written off as a lack of fidelity of the simulator itself, not a bug in the flight software.
    After that, the astronauts really began to appreciate running the real GPCs with the real flight software in the simulators.
    PS: Although I work at NASA, this message is my own expression, and not that of NASA or my employer. I am a programmer only, not anyone with any kind of authority or insight except for my experiences here.
  • by tacokill (531275) on Tuesday November 07, 2006 @01:19PM (#16753507)
    Hundreds of comments and not a single one mentions that NASA is a CMMI Level 5 organization. For those that don't know (and apparently, that's a lot of you), CMMI, aka Capability Maturity Model Integration, is a software ENGINEERING methodology for developing processes and technologies around IT systems. It is a very in-depth methodology for developing software and comes about as close to "engineering" as you can get in software development.

    Here is a list of participants in this program. [cmu.edu]

    And here [cmu.edu] is a general overview of what CMMI is.

    And just to put it into perspective, when I was last working with CMMI, there were only 3 companies certified at level 5: NASA, Motorola, and another one I can't remember. I am sure that has changed but nonetheless, it's a big deal and shows a serious effort to do things in a controlled, measurable, testable way.

    I only bring this up to counter the ridiculous "solutions" that some have proposed on this site.

    "I can fix that in 3 lines of code".

    Well, great. That might work at YOUR company. But please don't do that at NASA. Despite what many think here, NASA is a top-notch software development house. And I would expect nothing less given what is at stake.

