
Submission: Automated ebook piracy tracking?

Angwe writes: "I work in IT/electronic publishing at a university press in the US. In a recent posting to the Association of American University Presses (AAUP) mailing list, one of the member presses shared a long discussion from an outside expert they hired about ebook piracy and some of the complications in tracking it down: the files are not hosted by the "search site", the links go through a "link launderer", and the actual hosting is done on a site like DirectDownload or RapidShare, whose admins will sometimes respond to DMCA requests, though users can usually circumvent this rather quickly.

It occurred to me that one of the major complications (tracking and proving the link between the search result and the file itself) could be overcome with a spider program whose output tracks pages/links in depth and collates each search result with the file it leads to. I was wondering if any Slashdotters out there know of some F/OSS software like this?

Full text of email:

---------- Forwarded message ----------
Date: Wed, Feb 4, 2009 at 9:29 AM

The "eBook Thief" site is a typical example of file upload/download piracy (which differs from torrent-based file sharing, the other way in which pirated books move around). Here are two more:

In this model, the website (e.g., "eBook Thief") directs visitors to a file-hosting site such as RapidShare, where someone usually unrelated to the website operators has deposited a file that is available for free and/or paid downloading. It is also common for the link from website to file host to be channeled through another type of website that specializes in hosting URLs that lead to file-host sites, presumably because this makes it more difficult for automated anti-piracy monitoring to discover which website is guiding visitors to which file-host site.

As an example, at present "Atlas Shrugged" is the most recent post on eBook Thief. If you click the "Download eBook here" link, another file is appended to the post, a file with two URLs. This 2-step process is common because it makes it impossible for automated anti-piracy monitoring to find a book's information (e.g., author, title, ISBN) in the same web page/file as a link to a file-host site.

To continue, you'll see two links presented, one to RapidShare and one to DepositFiles (both very popular file-host sites). You can go ahead and click on these; it won't do any damage to your computer. When I clicked through the DepositFiles link this morning, I arrived at DepositFiles only after a brief stop at a third site that wanted me to see an ad first. Don't download the file from either RapidShare or DepositFiles, though, since presumably this is illegal. It is also a process whereby viruses, trojans, and so forth can be sent along with the eBook on offer. However, 99% of the time this will indeed be the eBook it purports to be.

So, here the publishers of Atlas Shrugged have three (or four) problems: one website, two file-hosting sites, and, in the case of the path from website to DepositFiles, one intermediary "link-laundering" site.

The website, eBook Thief, is a tempting target. They even invite you to contact them to have posts removed under DMCA. Some websites will actually do so in response. But mostly they hope this "disclaimer" somehow makes it less likely they are guilty of piracy. It may also make visitors feel less guilty about stealing, I would hypothesize. However, the website probably isn't hosted in the U.S. It can be difficult to convince a web hosting firm in, say, Russia to shut down a site that directs visitors to pirated content.

Higher-priority targets would be DepositFiles and RapidShare. The latter in particular will usually remove a file in response to a DMCA request. I don't know of any examples of successful intervention against the "link laundering" sites. Nobody seems to pay much attention to their role in eBook piracy.

Unfortunately, even if RapidShare and DepositFiles remove the files, it may take only a few days before the files show up again, on another file-host site, and frequently even on RapidShare, under a new account, with a new file name. RapidShare is being pressured by German courts to do more to prevent pirated material from re-appearing, but actually this is a difficult and expensive task, for technical reasons too complicated to go into at the moment. (As an aside, now you can see why RIAA has given up on everything but trying to go after U.S. ISPs who carry all this piracy traffic for consumers in the U.S.)

By the way, you can't completely trust the Search functions on sites like eBook Thief. There are some tricks that are used to thwart visitors who may be looking for copyright violations (I'll elaborate another day, if you like)."
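
For anyone wondering what the "search-to-file collation" spider asked about above might look like, here is a minimal Python sketch, not a recommendation of any existing F/OSS tool. It crawls breadth-first from a search-site post, remembers which page linked to which, and prints back the full chain whenever it reaches a file host. Everything named here is an assumption for illustration: the `FILE_HOSTS` domain strings, the `trace_chains` and `is_file_host` helpers, and the idea that the caller supplies a `fetch(url)` function returning HTML. It only follows plain `<a href>` links, so redirects, JavaScript, and interstitial ad pages like the one described in the email would need extra handling.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical file-host list: the site names come from the email,
# the exact domain strings are assumptions.
FILE_HOSTS = {"rapidshare.com", "depositfiles.com"}


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def is_file_host(url):
    """True if the URL's host is a listed file host or a subdomain of one."""
    host = urlparse(url).netloc.lower()
    return any(host == d or host.endswith("." + d) for d in FILE_HOSTS)


def trace_chains(start_url, fetch, max_depth=4):
    """Breadth-first crawl from start_url.

    fetch(url) must return the page's HTML as a string, or None on failure.
    Returns a list of chains, each a list of URLs running from start_url
    to a file-host URL.
    """
    parent = {start_url: None}          # url -> page that first linked to it
    queue = deque([(start_url, 0)])
    chains = []
    while queue:
        url, depth = queue.popleft()
        if is_file_host(url):
            # Walk the parent links back to the start to rebuild the chain.
            chain = []
            node = url
            while node is not None:
                chain.append(node)
                node = parent[node]
            chains.append(chain[::-1])
            continue                    # never fetch from the file host itself
        if depth >= max_depth:
            continue
        html = fetch(url)
        if not html:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in parent:     # visit each URL once
                parent[target] = url
                queue.append((target, depth + 1))
    return chains
```

Passing `fetch` in keeps the crawl logic testable against canned pages; for live use it could wrap something like `urllib.request.urlopen`, with rate limiting and error handling added. Each returned chain is the evidence trail the submission is after: search-site post, intermediate link file or "link launderer", then the file-host URL.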
