You seem to think that the automated storage of copyrighted works on systems that provide services unrelated to disseminating those works is illegal.
You might want to take that up with Google's entire business model.
Yes, if a model encounters the exact same text (or images, in the case of diffusion models or LMMs) often enough during training, then, just like a person encountering the same thing over and over, it can eventually memorize it. Does the model have, say, The Raven, or the Star Spangled Banner memorized verbatim? Yeah, probably, and it really should. Just like with you, however, memorizing something isn't a violation of copyright law - you have to actually commit a violation, to do something that isn't fair use. Developers are perfectly allowed to possess copyrighted data when it's processed in an automated manner to provide novel / transformative services. What they're not allowed to do is deliberately disseminate data they know to be copyrighted to third parties without the copyright holders' permission.
The problem is that actually getting the models to reproduce copyrighted data has, as a general rule, required attacks: people using the models in violation of their terms of service, often in convoluted ways that exploit bugs, to try to trick them into disclosing content they've learned verbatim. In such a scenario, if anyone is attempting to violate copyright law, it's the attacker, certainly not the model developer. It's akin to hacking one of Google's servers to dig up copyrighted data they've processed and then going, "AHA, here's PROOF that Google is breaking the law!" No, you idiot, YOU are breaking the law.