From a psychological perspective, the "attentional spotlight" does not lend itself to having 10 points on the screen to focus on. Imagine you wanted to sort coins on a table. It's more natural to look at the coins individually and sort them sequentially than to look at them all together, place multiple fingers on different coins, and so on.
The gesture side of multitouch is also unconvincing. The video talks about tasks that are already extremely easy with a mouse and keyboard and take up very little time. I'm sure it's easier to scroll a mouse wheel than to move two fingers apart on a pad, and why would I want 10 fingers on the screen to swap a window when I can do it with one simple drag-and-drop with the mouse?