Maybe not in the case of Pearson specifically, but this is actually a valid complaint involving copyright, and a legitimate use of copyright: preventing _commercial_ infringing use.
The language models are making use of copyrighted materials for what is very much commercial use.
Think of the interesting issues here:
- if i buy one copy of a book, but i'm training 1000 AIs in parallel, do i need to buy 1000 copies?
- the ML responses essentially plagiarize the source material. if a response substantially uses something from a copyrighted source, and changes a few words or rearranges the sentences, isn't that really copyright violation? after all, the response is making direct use of the material. they could deliver that response millions of times, and that's a lot of infringing.
I have to admit that I'm completely biased, though. These ML systems are leeches and parasites and commercial entities, so it's not obvious to me why they shouldn't pay handsomely for every scrap of copyrighted material they are taking advantage of.
This is most definitely not in the category of "i lend my book to my friend".