
Resources / Techniques (Score: 1)

First, you want to get some very experienced engineers who have done this type of thing before. Try ones with a background in either avionics or medical devices, since both are life-critical / mission-critical arenas.

Second, you may want to look at companies which make fail-safe systems, as these usually require special-purpose hardware. HP has a computer line called NonStop which may be worth looking into (no, I don't own any HP stock :)).

In terms of techniques:

1. NEVER, NEVER, NEVER, NEVER -- NEVER execute a loop waiting for some event to happen that does not have a bailout mechanism, even if it's just counting a variable up to (or down from) a few million or so (however long you've determined would be the maximum wait interval). If a piece of hardware breaks or a sibling thread crashes, you'll be out to lunch.

2. Try to use a real-time system that is used on fail-safe systems commercially.

3. Don't use Windows. No matter how defect-free / error-free you make your system, it won't matter, because Windows will have more than enough defects and flaws to make your system fail in weird and mysterious ways.

4. Use a journaling file system like ext3 or reiserfs.

5. Keep a recent copy of your operational state / data somewhere safe, like in non-volatile memory. If your system has to restart itself, this data will help you become operational again much faster.

6. Use a watchdog timer. Basically, this is a piece of hardware that your code has to "feed" on a periodic, repeated basis. If your code gets hung up in an infinite loop somewhere and stops feeding it, the watchdog timer will assert the reset line and start things up again. That's where your "warm" data from point 5 comes into play.

7. As many here have mentioned, try to partition your system in such a way that you can stay away from C++ as much as possible.

8. As some here have mentioned, real-time Java or a commercial garbage collector library could help a lot in avoiding pesky memory leaks.

9. Assume you will mess up the first time. It's a much more realistic assumption than assuming you'll get it right the first time. Hey, most of us didn't even get our first KISS right the first time, and what you are looking at is a lot more complicated than that :)). So, schedule enough time for a second attempt (call the first one an R&D program), and record enough about your design decisions and rationale that you can understand where you went wrong and do better the second time around.

10. You've gotten a lot of good comments from a lot of very intelligent and experienced people here. Read them over carefully.

Good luck,
dennis
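To make point 1 concrete, here is a minimal sketch of a wait loop with a bailout, in Python just for readability. The names `wait_for` and `WaitTimeout` are made up for this illustration, and the maximum wait is whatever you have determined for your own system. On a real target you would likely do the same thing with a hardware timer or an RTOS timeout.

```python
import time


class WaitTimeout(Exception):
    """Raised when the awaited event does not occur within the allowed interval."""


def wait_for(condition, max_wait_s, poll_interval_s=0.01):
    """Poll condition() until it returns True, or bail out after max_wait_s seconds.

    Hypothetical helper: 'condition' is any zero-argument callable that
    reports whether the event has happened yet.
    """
    # time.monotonic() is not affected by wall-clock adjustments.
    deadline = time.monotonic() + max_wait_s
    while not condition():
        if time.monotonic() >= deadline:
            # The bailout: the caller decides how to recover.
            raise WaitTimeout(f"event did not occur within {max_wait_s} s")
        time.sleep(poll_interval_s)
```

The caller catches `WaitTimeout` and takes a recovery path (retry, fail over, restart) instead of spinning forever on a dead device or a crashed sibling thread.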
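For point 5, one way to keep a recent copy of operational state is to write it to a temporary file and rename it over the old copy, so that a crash mid-write leaves the previous checkpoint intact. This is only a sketch under assumptions: the file stands in for whatever non-volatile storage you actually have, the state is assumed to be JSON-serializable, and `save_checkpoint` / `load_checkpoint` are hypothetical names.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to path so that path always holds a complete checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push the data to the storage device
        os.replace(tmp_path, path)  # swap the new checkpoint in over the old one
    except BaseException:
        # Clean up the partial temp file; the old checkpoint is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path, default):
    """Return the last saved state, or default if none exists or it is unreadable."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return default
```

On restart, the system calls `load_checkpoint` first and resumes from the "warm" data rather than starting cold.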
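The watchdog in point 6 is a piece of hardware, so the following is only a software stand-in to show the "feed it or get reset" pattern. `SoftwareWatchdog` and `on_expire` are hypothetical names for this sketch; on real hardware the expiry action is the reset line, not a callback, and a software watchdog cannot save you if the whole process is wedged.

```python
import threading


class SoftwareWatchdog:
    """Calls on_expire() if feed() is not called again within timeout_s seconds."""

    def __init__(self, timeout_s, on_expire):
        self._timeout_s = timeout_s
        self._on_expire = on_expire
        self._lock = threading.Lock()
        self._timer = None

    def feed(self):
        """Restart the countdown. The main loop must call this periodically."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._timeout_s, self._on_expire)
            self._timer.daemon = True  # do not keep the process alive on exit
            self._timer.start()

    def stop(self):
        """Disarm the watchdog, e.g. during an orderly shutdown."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
```

The main loop calls `feed()` once per iteration. If an iteration hangs, the countdown runs out and `on_expire` fires, which in this sketch would be the place to trigger a restart from the last saved state.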
