Comment Re:This isn't news (Score 1) 190
It wasn't a sensor bit error that it failed to guard against. The control values I referred to are those in RAM, used by the software. The RAM apparently wasn't parity protected, and a bit-flip in the right word could cause uncontrolled acceleration. It wasn't the only thing that could cause havoc; there were race conditions and stack overflows in the code, apparently, and those were more likely the sources of actual, observed UA.
This lengthy article at EE Times digs into some of the details. The main quote, though, is on page 3:
Memory corruption as little as one bit flip can cause a task to die. This can happen by hardware single-event upsets -- i.e., bit flip -- or via one of the many software bugs, such as buffer overflows and race conditions, we identified in the code.
There are tens of millions of combinations of untested task death, any of which could happen in any possible vehicle/software state. Too many to test them all. But vehicle tests we have done in 2005 and 2008 Camrys show that even just the death of Task X by itself can cause loss of throttle control by the driver -- even as combustion continues to power the engine. In a nutshell, the fail safes Toyota did install have gaps in them and are inadequate to detect all of the ways UA can occur via software.
I don't think that article pointed out this other detail:
Although the investigation focused almost entirely on software, there is at least one HW factor: Toyota claimed the 2005 Camry's main CPU had error detecting and correcting (EDAC) RAM. It didn't. EDAC, or at least parity RAM, is relatively easy and low-cost insurance for safety-critical systems.
This particular set of problems at Toyota was very interesting to us at work. I'm just now starting to work with our safety critical team that sells hardened controllers into the automotive market. They include all sorts of hardware failsafes, including ECC, lockstep execution between parallel cores, etc.