You make a good point, but I don't quite agree. 3D means there are three dimensions. A picture with a z-map (z-buffer) also has depth, but it is still just a set of 2D pictures: a color image and a depth image. A true 3D image (in the mathematical sense) has infinitely more information stored in it than a finite set of 2D images, because it has a value at every point along the third axis as well. A discretised 3D image therefore has three resolution values, e.g. 2048*1080*512. A hologram is another example: each pixel has a separate color value for each direction in which it sends light.
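To put rough numbers on that, here is a quick Python sketch that just counts stored values for each representation. The 2048*1080*512 resolution is the example from above; the 64 x 64 directions per hologram pixel is a number I made up purely for illustration, and the function names are my own, not from any library.

```python
# Rough sample counts for the representations discussed above.
# Resolution taken from the example in the post.
WIDTH, HEIGHT, DEPTH = 2048, 1080, 512


def image_plus_zmap_samples(width, height):
    """A 2D color image plus a z-map: two 2D pictures of the same size."""
    return 2 * width * height


def volume_samples(width, height, depth):
    """A discretised 3D image: one value per (x, y, z) position."""
    return width * height * depth


def hologram_samples(width, height, dirs_x, dirs_y):
    """One value per pixel per emission direction.

    dirs_x and dirs_y are hypothetical angular resolutions (assumption).
    """
    return width * height * dirs_x * dirs_y


zmap = image_plus_zmap_samples(WIDTH, HEIGHT)
volume = volume_samples(WIDTH, HEIGHT, DEPTH)
holo = hologram_samples(WIDTH, HEIGHT, 64, 64)  # 64 x 64 is an assumed value

print("image + z-map:", zmap)
print("3D volume:    ", volume)
print("hologram:     ", holo)
print("volume / (image + z-map):", volume // zmap)
```

Each count treats one color or depth value as one stored sample, so this only compares how the sizes scale, not actual file sizes.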
What I meant by "This" in your quote is that people not sitting in the 'sweet spot' get the wrong perspective sent to their eyes, which has nothing to do with movement. The brain has trouble with cues that don't match up, similar to what happens with car sickness. The mismatch between focal distance and stereoscopic distance that you mention must also be an important part of it, I agree!