NetFiber asks: "I work as a security analyst for a large university. We have recently been tasked to scour our network in the hopes of finding and removing sensitive information such as credit card numbers, social security numbers, and such on all publicly available web servers. Our current method of analysis is to archive all the content (which often grows over 100GB) and later parse the data with various utilities and regexes that search for patterns and other pertinent information. So far, this process has proven to be rather cumbersome and time consuming. Does anyone have any experience collecting and sanitizing large amounts of web content? If so, what procedures/utilities do you use to accomplish this?"
Attend or create a Slashdot 20th anniversary party! DEAL: For $25 - Add A Second Phone Number To Your Smartphone for life! Use promo code SLASHDOT25. Check out the new SourceForge HTML5 Internet speed test. ×