Let them eat data! (Score 4, Interesting)
This may not be popular, but I think somebody needs to make the argument in favor of scraping the web unhindered.
1. Copyright law controls the distribution of copyrighted works. AI models are not distributing the works that they’ve accessed, any more than a seller of applesauce is distributing apples. You won’t find copies of those materials anywhere in the database, and whatever the AI coughs up is almost always mangled and mashed up beyond recognition. (And when something comes out that’s recognizable, that’s most often a concern of trademark law rather than copyright. Meaning, if your AI generates pictures of Batman and you use them, then DC may rightfully want a word with you. They won’t care how the model was trained; they’ll only care what you’ve done with it.)
2. Copyright law is supposed to economically incentivize the production of creative works. It’s hard to see how AI scraping undermines that principle. A trained model is not a direct replacement for any of the works that it was trained on, and it’s not breaking the various business models that got the training material posted in the first place. One could argue that generative AI is a supercharger for creative work and lets more of it be produced faster. That’s what we’ve always wanted, isn’t it?
3. The way generative AI works is analogous to the way that human beings learn and create. We observe and consume creative works, mentally digest them, and then we mix and match our favorite parts to create something new. AI is a more automated version of the same general process. So, why should it be acceptable for every human being to do this “by hand” (as it were) but not acceptable to use automation to the same effect?
Going beyond copyright, I understand there are practical concerns about massive scraping operations: the sheer amount of traffic and hits that they can produce, and the real costs that they can incur. I'm not addressing those here. Those are issues that'll have to be negotiated and worked out between the various parties.