Study Finds a Third of New Websites Are AI-Generated
alternative_right shares a report from 404 Media: Researchers working with data from the Internet Archive have discovered that a third of websites created since 2022 are AI-generated. The team of researchers -- which includes people from Stanford, the Imperial College London, and the Internet Archive -- published their findings online in a paper titled "The Impact of AI-Generated Text on the Internet." The research also found that all this AI-generated text is making the web more cheery and less verbose. "The proliferation of AI-generated and AI-assisted text on the internet is feared to contribute to a degradation in semantic and stylistic diversity, factual accuracy, and other negative developments," the researchers write in the paper. "We find that by mid-2025, roughly 35% of newly published websites were classified as AI-generated or AI-assisted, up from zero before ChatGPT's launch in late 2022."
"I find the sheer speed of the AI takeover of the web quite staggering," Jonas Dolezal, an AI researcher at Stanford and co-author of the paper, told 404 Media. "After decades of humans shaping it, a significant portion of the internet has become defined by AI in just three years. We're witnessing, in my opinion, a major transformation of the digital landscape in a fraction of the time it took to build in the first place."
Maty Bohacek, a student researcher at Stanford and one of the co-authors of the paper, added: "As AI-generated content spreads, the challenge is finding a role for these models that doesn't just result in a sanitized, repetitive web," he said. "Rather than forcing models to be perfectly compliant and agreeable, allowing them to have a more distinct personality or 'friction' might help them act as a creative partner rather than a replacement for human voice."
Might be too late already! (Score:5, Insightful)
Re: Might be too late already! (Score:5, Interesting)
Re: Might be too late already! (Score:2)
You being an asshole has nothing to do with AI (Score:3)
As someone who has been repeatedly banned by humans, can I cry you a river? When you blamed me for driving away users, did you see AI coming?
Ever think that if people are repeatedly banning you, you're the problem? "If you run into an asshole in the morning, you ran into an asshole. If you run into assholes all day, you're the asshole." Or, if you prefer, "the common factor in all your failed relationships is you". Either you're terrible at picking your audience...AKA, no one wants to hear your misogynist Men's Rights rants in a Pokemon Go subreddit...or if you're in the appropriate forum, maybe you're just someone who likes to spout things ever
Re: You being an asshole has nothing to do with AI (Score:3)
Re: (Score:3, Insightful)
Re:Same as it ever was (Score:4, Insightful)
Yeah, there's a lot of AI slop and it's a problem. But I am not going to pretend that most web content has ever been artisanal and well curated. Well, maybe back in the day when my GeoCities page for my dog was part of the most well regarded Webring. At least then Tom was still my friend.
Re: (Score:2)
I remember building a website in FrontPage 2000... and editing it in HTML (and hosting the website on a 200MHz Packard Bell with 32MB of RAM).
I learned coding the hard way... no classes or anything. By the time I got my graphing calculator (TI-83+), I flipped through the handbook and realized I already knew the code (BASICA, with math stuff tossed in), so I already knew it's programming.
Small blogs by anyone don't add up to anything, neither do niche creators.
If someone well-known by everyone (in the viewi
Re: (Score:2)
That entirely depends on the relative value, even assuming that everyone who views it will assess it using similar mechanisms.
The academic blog scene is still a thing, for example, but it's predictably only relevant to people who know each other by name and communicate about their blogs face to face at conferences.
Re: (Score:3)
Um, that's not my argument. It's that when looking at "per site" statistics the internet has always been a lot of low effort content. For every specialist or niche creator website there's been a site for an abandoned hot dog stand in Toledo, an astrology for pets microblog, abandoned instance of Mastodon for squirrel breeders, or fanfic journal with tenuous grasp on how normal human relationships function.
Elevating good content has always been hard and the era of search engines obscured how much of the web
Re:Same as it ever was (Score:5, Informative)
Wordpress is a framework for publishing web content. It's not really relevant here. You can publish slop using AI too. And the fact a website is Wordpress based does not mean the content is good or bad.
This is idiotic snobbery, and you should know better.
Re: (Score:2)
Of course using WordPress or even GeoCities doesn't mean the content is instantly bad, just easy to deploy. But the same holds for sites counted as AI-assisted, which include a technical blog post by a bilingual person using an LLM to fix improper idiom use. Knowing that 48% was just one framework that enables the publishing of web content, a significant percentage of which is low effort or even human slop, contextualizes the meaning of per-site statistics.
There's many wonderful WordPress sites, there were some g
Re: (Score:3)
So you're sayin... barriers to entry being lowered results in increased access to create slop?
There's some merit to that point.
Re: (Score:2)
Umm... it's very relevant to the article in question.
Back in the day, we'd put our whole site together in Frontpage, go into HTML edit mode, stuff some JavaScript in there so you couldn't download it... we could show any picture as a background, links could lead anywhere... you could host an entire message board in 20MB of code.
We don't need ads for Malwarebytes (I was a beta tester), just like we don't need ads for Spybot (also a beta tester)... if people need a tool like that (and, I don't mean a 'tool' l
So? Does someone have a problem with progress? (Score:3)
Somebody is writing things as if they expected something different to happen.
Yeah I am sure there are still people out there that hand code web sites.
Probably many more that use template-based tools that have been developed over years and they are familiar with them. Many of those.
For myself, if I started a new web site now I would use AI to do it because it is a better tool than any other. It is just a tool. You can get good or bad results from it like any other tool, depending on how skilled you are. Yes, knowledge and skill still matter. Same as before.
No news here. Move on.
Re: (Score:2, Troll)
Did you even read the summary? I know we're on Slashdot though, so ... fair enough. It's not about hand-coding websites, it's about the *content*. Imagine that somebody wants to share something with the world via the internet. Now, this sharing is at 66% efficiency, with the added bonus that the content will be stolen, rehashed and added to the already crowded competition for user attention. It's a bit grim, imo.
Re: (Score:1)
Pirating content is one thing, generating random bullshit is something else.
Re: (Score:1)
AI is reusing your content without proper attribution and without linking back, that's pretty much stealing even by pirate standards.
Re: (Score:1)
Good? cut out the middle man. we can chat with a bot ALL the time and stop surfing the web. why bother to fake websites and fake loaded search results? just have the bot do everything directly! add in some randomized tone/attitude by topic for variety like multiple sources do... conspiracy crazy people and idiot big mouths can be replaced with hallucinations, possibly with a lower occurrence.
new chat. repeat. foobar.
My mistake (Score:1)
maybe a browser extension (Score:2)
Re: (Score:2)
Re: (Score:3)
If you want to cut yourself off from the web ... a third is quite a bit. And coders are good at adopting new tech and at saving work. Of course they use AI tools where they are useful, and that use will only grow over time.
Up from zero? (Score:1)
Re: (Score:2)
Not to mention the affiliate marketing SEO spammers had their own little AI-based respinners going back a little further once the simple string substitution respinners were getting clobbered in google search results.
Summary suggestion isn't great (Score:2)
"Rather than forcing models to be perfectly compliant and agreeable, allowing them to have a more distinct personality or 'friction' might help them act as a creative partner rather than a replacement for human voice."
It would also make AI more difficult for humans to detect. Being able to spot AI is the reason that key parts of the internet are still usable and human trust in it hasn't been broken yet.
half my browsing is AI driven (Score:2)
If you don't feel like writing things, then I'm not going to read them myself. We will plug AI into AI as a big circular human centipede ouroboros.
I guess us humans will have to do things that don't involve mass media consumption.
R.I.P. late stage capitalism
Re: (Score:2)
Search results dominated by AI slop (Score:2)
The problem is not that there are a lot of AI slop pages being generated every day. The problem is that the search results you get back from Google are increasingly dominated by these AI slop pages. It's becoming difficult to avoid them and the misinformation they spew.
Re: (Score:2)
Go on Google News and start clicking links to stories. A growing fraction of them now read less like articles and more like a list of bullet points. I don't know for sure they're AI generated, but I suspect a lot are. This isn't random webpages. These are commercial sites that a few years ago would have been considered reputable news sources.
Re: (Score:2)
You're absolutely right. There is a certain style and format that makes them easy to spot. Or, more precisely: they are easy to spot for the small number of people who know what to look for. However, very soon they will become difficult to identify as AI-generated.
Re: Search results dominated by AI slop (Score:2)
The bullet point style was rising in popularity before the 2022 ChatGPT epoch. It is strongly favored by readers in the under-40 demographic, at least in the USA. I suspect it is becoming more common because of that as much as because of AI.
GIGO (Score:5, Insightful)
Re: (Score:2)
If they were all this funny I wouldn't complain.
https://m.youtube.com/watch?v=... [youtube.com]
This is a race to the bottom (Score:5, Insightful)
Many readers of this site are likely familiar with various sci-fi stories that deal with nanobots which have begun reproducing without limit, eventually consuming all resources and reducing their planet to "gray goo". This is the information equivalent: it will expand to occupy everything that it possibly can, overwhelming everything generated by humans. And when that happens, it will impact our shared view of reality, which is based on a (mostly) common set of facts.
And when nothing is real, anything can be real. This will not escape the attention of would-be fascists and dictators.
"Less Verbose"? (Score:3)
I find the repetitive, flowery crap that claims to be a website today to be quite useless. Multiple sites on a subject have carbon copied content (or at least lead-ins). It is the quintessential enshittification of the web.
Curious what this brings us next...
So F*$king Slow (Score:2)
Re: (Score:1)
Not all.
I edit my business's webpage with vim. It's plain html and has only two graphics (stored in the same directory).
And I get people complimenting me on its design, which I find amazing.
Re: (Score:1)
it's dead, jim (Score:3)
This has got to be a joke (Score:2)
What is the purpose of it all ? (Score:2)
Hosting a web site costs money, using an AI to do something costs money. What is the reward for spending this money ? I have a feeling that most of it is not what one would call good. Reasons that come to mind:
There must be more that I cannot think of now. What do you think ?
How good is their ranking? (Score:2)
But how good is their search ranking? If search engines are appropriately down ranking these sites, then no big deal.
Enshittification is already there (Score:2)
"More cheery and less verbose" ?! (Score:2)
I find it hard to believe that the web is becoming "less verbose" with the use of AI content. In my experience, it is very hard to get the AI to provide to-the-point answers. It goes on and on.
Now as for cheery, this might in fact be the case - after all, the AI is trained on human output, which, on the web, is about 99% marketing material (kidding, but you get the idea) - probably leaving out the haters...
Also, it seems most commenters here fail to realize this is about content, not about the technical impl
Scams Hosting (Score:1)
If second paragraph rewords content of the first.. (Score:2)