Comment Re:Reminder that you are the product (Score 2, Informative) 19
"We're midflight in our data licensing deals and still learning, but what we have seen is that Reddit data is highly cited and valued"
In case anyone has any misconceptions, it's "Reddit data", not "Reddit users' contributions" or "community" or any other words people use to convince themselves they are part of something important.
Reddit gains an irrevocable, sub-licensable, royalty-free license for user-created content posted on its site. So while the term "Reddit data" is somewhat inaccurate, it is a passable approximation of the truth. The more correct term would be "Data that Reddit has an irrevocable, sub-licensable, royalty-free license to". But where's the fun in saying that?
"Reddit data" does obscure the fact that Reddit users retain full copyright ownership of their comments. So if a user wished to, they could independently license their content to GOOG, AMZN, META, MSFT et al. Theoretically, all the other users in a Reddit thread could make the same decision, cutting Reddit out of the licensing loop entirely.
But the logistics of organising this are onerous. If even a few crucial users in a thread refuse to license their content, there will be 'gaps', and the thread's intelligibility and utility for AI training will suffer.
But if replies briefly quote the specific content they are responding to (as I am doing here), the context becomes much clearer, and individual comments become more intelligible and useful on their own. Legally speaking, there should be no problem here because brief quotations fall under fair use.
Conceivably, GOOG and other browser makers could offer to store your user-generated content in a browser repository ("keep a record of what you wrote", like Windows Recall) or in the cloud, with the option of licensing the content to GOOG. They could also generate AI summaries of the context you were responding to.