Star catalogs aren't data: they're the results of decades of observations, corroborations, corrections and debates over just exactly what that particular black spot on the white plate was. You want the raw telemetry from every telescope that isn't read out with a Mark I eyeball, and every plate ever taken and every scientist's observation notes from those that were? You want all the calibration data from WMAP, and all the histograms that were plotted to analyze them and turn them into corrections for the main data so they actually *mean* something?
In particle physics, "data" is the 1s and 0s from every piece of sensory equipment in the detector hall, beam area and points between: often millions of readout channels, each of which means something and has its own quirks and problems that need to be measured and understood with more and different types of data (calibration, cosmic rays, etc.). And these readings are taken at frequencies between thousands and millions of times per second. We often have to analyze the data to a preliminary level just to decide whether they're worth keeping to analyze properly later, because there's neither the bandwidth nor the storage space nor the computing power -- even now -- to keep them all. The LHC experiments store petabytes of data per month, and storage, access and transfer costs are significant: you pay for access to those data by contributing to the experiment.
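To make that "preliminary analysis just to decide what's worth keeping" concrete, here's a toy sketch of the idea. This is not any experiment's real trigger code -- the threshold, the summed-energy criterion, and all the names are made up for illustration -- but it shows the basic shape: a cheap cut applied to every event, so only a fraction ever reaches storage for proper offline analysis.

```python
# Toy sketch of a software "trigger": keep only events whose total
# deposited energy passes a threshold. All numbers here are arbitrary
# and purely illustrative, not from any real experiment.
import random

random.seed(42)

THRESHOLD = 50.0  # hypothetical energy cut, arbitrary units


def total_energy(event):
    """Sum the readings from all readout channels in one event."""
    return sum(event)


def trigger(events, threshold=THRESHOLD):
    """Cheap first-pass filter: keep only events worth storing."""
    return [e for e in events if total_energy(e) >= threshold]


# Simulate 10,000 events, each with 100 channels of random readings
events = [[random.expovariate(2.0) for _ in range(100)]
          for _ in range(10_000)]
kept = trigger(events)
print(f"kept {len(kept)} of {len(events)} events")
```

Real triggers are far more elaborate (multiple hardware and software levels, physics-motivated selections, deliberate prescaling), but the economics are the same: most raw readings are discarded before anyone could ever "request the raw data."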
OK, now let's assume you get the raw data. Now what? Good luck with that. There's a reason groups of scientists and expert contractors spend years and sometimes decades writing the reconstruction and analysis software for particle physics experiments: teasing useful results out of the data is hard. If we were to spend our time pointing out the rookie mistakes of every schmo who fiddled with the data for a while and thought he'd found something new, the work would never be done. "Heisenberg's Uncertainty Principle Untenable [icaap.org]," anyone?