What you are getting is the reconstructed data. To do anything scientifically valuable with it, you have to understand the intricate details of the reconstruction software, the trigger, the calibration, and so on. To be honest, I would be amazed if anyone outside CMS were able to do much with it at all. I'd also expect bandwidth restrictions on accessing the data, since the dataset is multi-PB (if it is the full set of Run I data).
We did a similar exercise with the D0 experiment at Fermilab several years ago, and it was of interest to practically nobody. I expect there may be somewhat more interest this time since it is LHC data, but I'd be surprised if anything useful comes of it, given the massive amount of work required to do a useful analysis. The best I can think of is that this might make a really nice undergraduate course project or, with some pre-written, high-level analysis code, perhaps even outreach material for high school students.