Forget for a minute that GPT-x is not anything like "AGI" at this point, and may never be; that may be beside the point, practically speaking.
Our minds are unarguably "NGI" -- when we read books (copyrighted or not), our minds/brains compile that information down into memories, such that we (given the aptitude and inclination) could write another book in the same style, with the same characters, etc. Presumably GPT-x+n could also do that at some point, even while still not being anything like an "AGI".
Does a copyright holder have the right to demand a brain-wipe (again, forgetting that this isn't yet possible) so that a person who has read a book would then lack the capability to create derivative works? Of course not.
Is that not what we're talking about here? Also forgetting for a minute how stupid current (virtually perpetual) copyright law is, it shouldn't be possible to violate copyright by reading a book and then having information from that book stored in our minds. Actually publishing derivative works is a different story, but the *capability* of creating derivative works isn't (I'd say can't be) a copyright violation. So how does one legislate that? I guess you could say that the LLM is allowed to "read" the books but should be prevented somehow (on the front end) from creating full-blown derivative works, but how does one define that? Are summaries allowed? Use of fictitious names and places? Parodies are protected under current law, so there's that.
If I ask ChatGPT "Who is Frodo?" and it says "A hobbit that goes on a quest to destroy the One Ring." is that infringement? Is it infringement if someone asks me that question and I give that same answer? Of course not.
Should the owner of an LLM be liable for works created thereby? Or is the liable party the person who asks it to "Write me a fourth Lord of the Rings novel, picking up where The Return of the King leaves off." and then publishes the result? My first instinct would be to say that it's the latter, since that amounts to computer-assisted plagiarism. What if they never "publish" it, but only read it for their own pleasure? Is that a violation of some kind? No? Is the fact that the LLM "distributed" that work to a single person actionable? Yes? No?
These are the murky waters we are now dipping our toes into. FWIW I'd say that if we have the "right" to read a book that we purchase or otherwise legally acquire, we have the right to compile statistics (by hand or with computer assistance) and create summaries of that book, and to share that data with others, whether in speech, on paper or in electronic form. That's almost certainly fair use. I guess what's at issue is whether someone is doing that for profit. Where is the fair-use line drawn? I don't see any easy place to draw it, since wherever we draw it there will always be edge cases, which will end up in court, just like all the past cases where someone sued someone else for copyright infringement.
Since we now have competing vested interests beyond "The People vs. Copyright Holders", maybe this is a watershed moment for copyright law to get back to something reasonable, like 7-year copyright terms (with a limited number of extensions for living authors: maybe three or four). I know, I know, cold day in hell and all that, but we can hope?