It should be even easier than that.
Archive.org should archive everything, including the robots.txt contents, at each scan.
The content displayed on the archive.org website itself, however, could still honor the robots.txt as it stood at the time of the scan, purely for "display" purposes.
This way, changing robots.txt to block search engines would not delete or hide any previously archived information.
New content would also still go into the archive, even if not displayed under the current robots.txt directives.
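To sketch the idea (in Python, and not anything archive.org actually runs): display-time filtering would consult only the robots.txt saved alongside each capture, never the live one. The URL, user-agent string, and snapshot structure below are made up for illustration.

    # Minimal sketch: every capture is stored; the robots.txt captured
    # with it only gates whether the capture is *displayed*.
    from urllib.robotparser import RobotFileParser

    def is_displayable(capture_url: str, robots_txt_at_capture: str,
                       user_agent: str = "ia_archiver") -> bool:
        """True if the capture may be shown, judged against the robots.txt
        text saved at crawl time (never the site's current robots.txt)."""
        parser = RobotFileParser()
        parser.parse(robots_txt_at_capture.splitlines())
        return parser.can_fetch(user_agent, capture_url)

    # Hypothetical stored capture record:
    snapshot = {
        "url": "http://example.com/page.html",
        "robots_txt": "User-agent: *\nDisallow: /private/\n",
        "body": "<html>...</html>",
    }
    if is_displayable(snapshot["url"], snapshot["robots_txt"]):
        print("show capture")
    else:
        print("capture stored, but hidden from display")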
Although it would require more work to do properly, this would potentially allow website owners to retroactively "unhide" previously blocked content in the archive as well.
Doing it properly would require some way to verify the domain owner, but that could likely be as simple as placing another specifically named text file in the website's root path, with contents provided by the archive.
Those contents could be as simple as the old-school "cookie"-style verification token that so many other services, such as Google, already use, or as complex as a standard that allows date ranges to be specified alongside directives.
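A rough sketch of what that verification handshake might look like (the file name "archive-verification.txt" and the whole flow here are hypothetical, Python just for brevity):

    # Hedged sketch: the archive hands the claimed owner a random token,
    # the owner serves it from a well-known file in the site root, and the
    # archive confirms it before honoring any retroactive "unhide" request.
    import secrets
    import urllib.request

    def issue_token() -> str:
        # Token the archive generates and shows to the claimed owner.
        return secrets.token_urlsafe(32)

    def owner_is_verified(domain: str, expected_token: str) -> bool:
        url = f"http://{domain}/archive-verification.txt"
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                served = resp.read().decode("utf-8").strip()
        except OSError:
            return False
        return served == expected_token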
But in any case, this would preserve copies of the website for future use, such as for when copyright protection expires.
Everyone has a differing opinion on just how long "limited times" should be in "securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries", but no one who wants to be taken seriously can deny that this expiration must happen at some point.
Since the vast majority of authors take no steps to preserve what will eventually become our property, the task of securing it clearly needs to fall on us.