You're assuming that "thousands of Google coders, workers and managers" are auditing the parameters passed to tcpdump in some script or at least looking over the raw capture files rather than the output data.
If the people in charge haven't had time even once in three years to look at what they collected, they are idiots by anyone's definition. Intercepting other people's communications is a crime in many countries. It's perfectly legal to receive an AP's broadcasts that advertise it, but once you start capturing packets sent to (or from) other computers, you are receiving "legally protected" (but not physically protected) data that is not meant for you. Different jurisdictions have different laws on this subject, but intercepting other people's data is considered immoral in most human societies.
What they were collecting were the locations of access points. To collect that data, the vans carried equipment that captured data broadcast by those access points and processed it to determine their location. A moving vehicle is not within range of most networks for very long, so you can only capture a limited number of frames in that window. Software then analyzes the signal-strength data from those frames, along with the GPS log, to determine the locations of the access points. This only looks at the lower-layer segments of those frames; the higher-layer segments (including the payload) don't affect it.
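The locator step described above can be sketched roughly as follows. This is a hypothetical simplification, not Google's actual code: it assumes each captured frame has already been reduced to a (latitude, longitude, RSSI) tuple by joining the capture timestamps against the GPS log, and it uses a simple RSSI-weighted centroid rather than whatever estimator was really used. Note that nothing from the frame payload appears anywhere in the calculation.

```python
# Hypothetical sketch: estimate one access point's position from drive-by
# captures. Each sample is (lat, lon, rssi_dbm) -- vehicle position from the
# GPS log plus received signal strength from the frame's radio metadata.
# Only lower-layer metadata is used; the payload never enters the math.

def estimate_ap_location(samples):
    """RSSI-weighted centroid: stronger signals pull the estimate closer.

    samples: list of (lat, lon, rssi_dbm) tuples for frames seen from
             one BSSID while the vehicle was in range.
    """
    if not samples:
        raise ValueError("no frames captured for this access point")
    # Convert dBm (logarithmic) to linear power so the weighting is physical:
    # a frame heard 20 dB louder counts 100x as much.
    weights = [10 ** (rssi / 10.0) for _, _, rssi in samples]
    total = sum(weights)
    lat = sum(w * s[0] for w, s in zip(weights, samples)) / total
    lon = sum(w * s[1] for w, s in zip(weights, samples)) / total
    return lat, lon
```

With equal signal strength at two capture points the estimate lands at their midpoint; a 20 dB difference pulls it almost entirely toward the stronger reading.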
Google failed to go back and delete the raw capture files. Maybe they wanted to keep the raw data in case of future improvements to the analysis software. If that's the case, they still failed to reprocess the captures and edit those frames to remove any potential user payload data. I'm not aware of any functionality for doing this on the fly in any of the open-source traffic-capture tools. Ignoring all information from non-beacon frames would have been far less accurate, especially in areas where Wi-Fi is heavily used. There is nothing to indicate that they had any interest in user payload data, or that any of it was collected anywhere outside the raw capture files.
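Such a reprocessing pass isn't hard to imagine. Here is a hypothetical offline scrubber, not any existing tool: it assumes frames have already been extracted from the capture file as raw 802.11 bytes (real pcaps also carry radiotap headers, QoS fields, etc., which this simplification ignores), and it truncates data frames to their 24-byte MAC header, which keeps the addresses and metadata a locator needs while discarding the user payload.

```python
# Hypothetical post-processing pass: truncate every 802.11 data frame to its
# MAC header, discarding user payload while keeping the BSSID/addresses that
# location analysis actually uses. Simplified: assumes bare 802.11 frames
# with no radiotap header and a fixed-length (non-QoS) MAC header.

DOT11_TYPE_DATA = 2   # frame types: 0=management, 1=control, 2=data
MAC_HEADER_LEN = 24   # frame control, duration, addr1-3, sequence control

def scrub_frame(frame: bytes) -> bytes:
    # The frame type lives in bits 2-3 of the first frame-control byte.
    frame_type = (frame[0] >> 2) & 0x3
    if frame_type == DOT11_TYPE_DATA:
        return frame[:MAC_HEADER_LEN]  # drop everything past the header
    return frame  # beacons/probes carry no user payload; keep them whole

def scrub_capture(frames):
    return [scrub_frame(f) for f in frames]
```

A scrubbed capture would still support reanalysis with improved locator software, since only the payload bytes are gone.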
The whole purpose of using analysis software is so that you don't have to look over large amounts of meaningless raw unfiltered data... I'm willing to bet you have some sources of raw data that you never thoroughly review by hand either. For someone who has a job to do, wasting that much time isn't usually an option. Even for someone who doesn't, it could still require more time than they have, depending on the amount of data.
Anything capturing data is inevitably going to collect some data that is not intended for it. For example, if you've ever typed near an audio recording device, that captured audio can be analyzed to determine what you typed, even though it was never meant for the person doing the recording. Or if you've ever visited a tourist attraction, you've probably been captured in the background of other people's home movies, whether or not you intended to be.