Acquired data vs. archived data set (Score 2)
The project represents a 'Big Data' problem of the highest order.
Before or after de-duplication of the data? Before, yes, obviously, but if that is still the case after de-duplication, then gaining much knowledge from this experiment may prove to be a fool's errand.