There has been assloads of research on mitigating soft errors going back to the 1970’s. I’ve published some myself. There is no shortage of workable methods on masking transient errors in logic and bit flips in DRAMs. SEUs are a major problem for supercomputers, so their memory systems have sophisticated mechanisms for catching them.
If Cisco is blaming this on SEUs, that just proves their incompetence, since they obvious didn’t spend 5 minutes with Google Scholar looking at hundreds of GOOD papers (in the top conferences and journals) on this topic. Seriously.
PLUS, if something goes wrong, even if it IS a transient error, it’s FAR more likely to be a fixable bug than radiation. We had a weird bug in a DRAM controller whose state kept going invalid. We had to add another circuit to fix that. We *called* is a cosmic ray deflector, but the more likely causes, in order were (a) another bug we couldn’t find, (b) a timing violation caused perhaps by voltage or temperature fluctuation, or (c) crosstalk in the circuit. We would have kept looking, but this deflector circuit made it robust to hundreds of hours of slamming the memory system, so we let it go. (Also, it was graphics memory, so even if it did ultimately suffer a glitch some day, it would go unnoticed.)