The 'explanation' is that the demo triggered every device within earshot because, apparently, a device designed to perform possibly-sensitive actions on your behalf was given a model-line-wide, public audio trigger in order to make it feel more 'natural' or something, rather than some prosaic but functional solution like a trigger button/capacitive touch point/whatever; and that the device just silently fails stupid, with not even informative feedback, in the event of server unresponsiveness or network issues. Both of these seem...less than totally fine...for something explicitly marketed for public use in crowded environments on what we euphemistically refer to as 'edge' network connectivity.
This. The "someone says 'Hey Siri/Okay Google' on TV/radio/loudspeaker" problem is a well-known failure mode, and if they don't have reasonable mitigation in place by now, they don't know what they're doing, and their product shouldn't be taken seriously. Whether that mitigation is blocking it during meetings, doing handshaking to limit commands to the nearest device when multiple nearby devices detect the hot word at exactly the same time, making it recognize your voice and not other random people's voices, or any of dozens of other strategies for coping, having some mechanism in place to handle this should be considered a base requirement for any voice-based assistant.
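To make the "nearest device" handshake concrete, here is a minimal sketch of one way it could work. It is not how any shipping assistant is known to do it. It assumes each device that hears the hot word broadcasts a claim with how loudly it heard it, and that devices share a roughly synchronized clock. The names `Claim`, `elect_responder`, and `window_s` are hypothetical, made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Claim:
    """Hypothetical message a device broadcasts when it hears the hot word."""
    device_id: str
    loudness_db: float  # how loudly this device heard the hot word (assumed proxy for distance)
    heard_at: float     # seconds on a shared clock (assumed to be roughly synchronized)


def elect_responder(claims: Iterable[Claim], window_s: float = 0.2) -> Optional[str]:
    """Pick the single device that should respond to one utterance.

    Claims heard within `window_s` of the earliest claim are treated as the
    same utterance. The loudest claim wins; ties go to the higher device_id
    so that every device computes the same winner independently.
    """
    claims = list(claims)
    if not claims:
        return None
    first = min(c.heard_at for c in claims)
    contenders = [c for c in claims if c.heard_at - first <= window_s]
    winner = max(contenders, key=lambda c: (c.loudness_db, c.device_id))
    return winner.device_id


claims = [
    Claim("kitchen-speaker", -30.0, 10.00),
    Claim("phone", -12.5, 10.05),
    Claim("tv", -5.0, 11.00),  # outside the window, so treated as a different utterance
]
print(elect_responder(claims))
```

Every device runs the same function over the same set of claims and stays silent unless it is the winner, so no central coordinator is needed. Loudness is only a rough stand-in for "nearest", and a real system would also have to cope with lost broadcasts and clock drift.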