>Without a license, purely for developing software, and en masse, yes.
You don't need a license to compile data about an article, and that's all that an LLM is - data about the frequency of words and their order. It's highly transformative, and this sort of thing was already fought and won by Google when they digitized millions of books and made them searchable, without their copyright holders' permission and for commercial purposes. An LLM doesn't even retain the original work.
https://en.wikipedia.org/wiki/....
"The court's summary of its opinion is:
In sum, we conclude that:
1. Google's unauthorized digitizing of copyright-protected works, creation of a search functionality, and display of snippets from those works are non-infringing fair uses. The purpose of the copying is highly transformative, the public display of text is limited, and the revelations do not provide a significant market substitute for the protected aspects of the originals. Google's commercial nature and profit motivation do not justify denial of fair use.
2. Google's provision of digitized copies to the libraries that supplied the books, on the understanding that the libraries will use the copies in a manner consistent with the copyright law, also does not constitute infringement.
Nor, on this record, is Google a contributory infringer."