I'm feeding the AI crawlers AI generated text now (Score 3, Interesting)
I've been hit many times over as well (smallish forum website with about 12000 posts).
Seen it all: fake user agent strings, ignoring robots.txt, either localized IPs (lots from China) or distributed, load increasing to 500 times the normal value, until the site goes down.
For now, a combination of these keeps it manageable:
- fail2ban
- apache mod_evasive
- restricting forum access to logged in users
When a crawler tries to access the forum, it gets a short paragraph about how great the site is, generated by ChatGPT.
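
A minimal Python sketch of that last step, not the actual setup: it assumes the forum lives under a hypothetical /forum/ prefix and that anything not logged in counts as a crawler. DECOY_TEXT, FORUM_PREFIX and choose_response are made-up names, and the decoy text is a placeholder rather than the real ChatGPT paragraph.

```python
# Placeholder only; the real paragraph was generated by ChatGPT.
DECOY_TEXT = "Placeholder paragraph about how great the site is."

# Assumed URL prefix for forum pages (hypothetical).
FORUM_PREFIX = "/forum/"


def choose_response(path: str, logged_in: bool) -> str:
    """Pick what to serve for a request.

    Logged-in users get the real forum; everyone else hitting a
    forum URL (crawlers included) gets the decoy paragraph.
    """
    if not path.startswith(FORUM_PREFIX):
        return "public page"       # non-forum pages stay open
    if logged_in:
        return "forum content"     # real posts for members only
    return DECOY_TEXT              # crawlers and anonymous visitors


if __name__ == "__main__":
    print(choose_response("/forum/thread/42", logged_in=False))
```

The point of returning a normal page instead of an error is that the crawler has nothing to retry against; fail2ban and mod_evasive still handle the ones that hammer the server anyway.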