The problem with data-driven science... is that data isn't evidence.
Correlative statistics are not evidence.
I think you are confusing "evidence" with "proof". Data, and more specifically, the patterns in data, most certainly are evidence. If that were not true, then there would be no reason to even try doing science.
Having data isn't an accomplishment.
Any scientist who has spent years obtaining a hard-won dataset would strongly disagree with you. Consider, for example, the ground-breaking data generated a few years ago by the Human Genome Project, or the current explosion of data about exoplanets. These data most certainly do represent substantial intellectual and technical accomplishments. Now, if what you mean is that simply downloading someone else's data from the Web is not an accomplishment, then I agree with you.
Scientists need to be willing to get their hands dirty and get the data themselves.
I think you will find that, in the hard sciences at least, that's usually how it's done. The researchers who write the papers are usually the same people who were involved in collecting the data. However, for very large-scale studies (e.g., global biodiversity research), there is no way that a single scientist, or even a single research team, could gather all of the necessary data. In these cases, the only way to make the research tractable is to integrate multiple datasets.
Your points about the importance of understanding where the data used in a study came from, how they were collected, and what biases they might contain are all well taken. However, ignoring any of these factors is simply sloppy science, regardless of whether the researcher collected the data personally or someone else collected them.