You know what a search engine is, right? It takes all the words in a document and stores them in a database, each with a link saying that this word was found in that document. The search engine has stored every word in the document, but in a way that makes it impossible for the search engine to reproduce the document. The legal precedent is crystal clear that this activity does not violate the document's copyright.
Now you have a baseline for storing every word of a copyrighted book without violating its copyright.
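To make that concrete, here is a minimal sketch of the kind of index described above, assuming the simplest case: each word maps to the set of documents it appears in, with no positions and no cached copies. The names `build_index` and `search` and the sample documents are made up for illustration, not taken from any real search engine.

```python
import re
from collections import defaultdict

def build_index(docs):
    """Map each lowercase word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z']+", text.lower()):
            index[word].add(doc_id)
    return index

def search(index, word):
    """Return the sorted IDs of documents containing the word."""
    return sorted(index.get(word.lower(), set()))

# Hypothetical sample documents.
docs = {
    "doc1": "the cat sat on the mat",
    "doc2": "the dog chased the cat",
}
index = build_index(docs)

print(search(index, "cat"))  # ['doc1', 'doc2']
print(search(index, "mat"))  # ['doc1']

# Word order and repetition are thrown away, so two different texts
# built from the same words produce the same index.
print(build_index({"a": "the cat sat"}) == build_index({"a": "sat cat the the"}))  # True
```

Every word of both documents is in the index, but nothing in it records which word came after which, so the original sentences can't be read back out of it.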
When the LLM "trains" on a copyrighted book, how does it store the data? Has it saved the original data, in order, where it can spit it back out on command? Or, like the search engine, has it stored relationships learned from the data that allow it to reason about the work but not reproduce it verbatim?
That's the correct question to ask when determining whether an LLM violates the copyrights of its training data. The plaintiffs failed to offer a credible answer to that question.