Publishers Seek Change in Search Result Content
explosivejared writes "The Washington Post is running a story on the fight between publishers and search engines over just what exactly is allowed to be shown by search results. From the article: 'The desire for greater control over how search engines index and display Web sites is driving an effort launched yesterday by leading news organizations and other publishers to revise a 13-year-old technology for restricting access. Currently, Google, Yahoo and other top search companies voluntarily respect a Web site's wishes as declared in a text file known as robots.txt, which a search engine's indexing software, called a crawler, knows to look for on a site ... [new] proposed extensions, known as Automated Content Access Protocol, partly grew out of those disputes. Leading the ACAP effort were groups representing publishers of newspapers, magazines, online databases, books and journals. The AP is one of dozens of organizations that have joined ACAP.'"
The Text I Actually Submitted (Score:5, Interesting)
So they tell you what they don't want you to see? (Score:5, Interesting)
Just don't do it in the US or someone will tell the judge: "The defendant knowingly circumvented the DRM - which is called ACAP - of our online newspaper".
ACAP - Anonymous Coward Anonymously Posting
Historical footnote: where robots.txt came from (Score:5, Interesting)
Back in 1993, when I was teaching myself Perl in my spare time (while working for a -- cough -- UNIX company called The Santa Cruz Operation -- no relation to the current Utah asshats of that name), I was practicing by working on a spider. Now, back then SCO's Watford engineering centre was connected to the internet by a humongous 64kbps leased line. And I was working with a variety of sources on robots, and it just so happened that because I was doing a deterministic depth-first traversal of the web (hey, back then you could subscribe to the NCSA "what's new on the web" bulletin and visit all the interesting new websites every day before your coffee cooled), I kept hitting on Martijn Koster's website. And Martijn's then employers (who were doing something esoteric and X.509 oriented, IIRC) only had a 14.4kbps leased line. (Yes, you read that right: a couple of years later we all had faster modems, but this was the stone age.)
Eventually Martijn figured out that I was the bozo who kept leeching all his bandwidth, and contacted me. Throttling and QoS stuff was all in the future back then, so he went for a simpler solution: "Look for a text file called
So if you're wondering why robots.txt is rather simplistic and brain-dead, it's because it was written to keep this rather simplistic and brain-dead perl n00b from pillaging Martin's bandwidth.
Ah, the good old days when you could accidentally make someone invent a new protocol before breakfast
Re:The Text I Actually Submitted (Score:3, Interesting)
However, I tend to agree with you, and when I don't see a relevant summary, I'm simply less likely to click through to the page, so this may well backfire on them. Either they're not understanding search users' usage patterns, or else they believe that this is so prevalent that nothing will have summaries, and searchers will be forced to click through to find anything.
Re:Here's a tip... (Score:3, Interesting)
It would make far more sense for these institutions to just take their sites completely off of the search engines via robots.txt and save up those slots in the search results for sites that want traffic. Or perhaps limit it to just the front page, but I think that one can still do that with a competently crafted robots.txt as well.
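For anyone wondering what those two options might look like, here is a rough sketch, not a recommendation. It feeds two hypothetical robots.txt files to Python's standard urllib.robotparser and asks whether a crawler may fetch a given URL. The example.com URLs, the /index.html path and the "ExampleBot" user agent are all made up for illustration, and the Allow line is a later extension to the original convention, so crawlers that only understand Disallow may not honour it.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching it over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Option 1: ask every crawler to stay away from the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Option 2: allow one front-page file and nothing else.
# Assumption: the front page lives at /index.html (hypothetical path).
# Python's parser uses the first matching rule, so Allow comes first.
FRONT_PAGE_ONLY = """\
User-agent: *
Allow: /index.html
Disallow: /
"""

blocked = build_parser(BLOCK_ALL)
front_only = build_parser(FRONT_PAGE_ONLY)

STORY = "http://example.com/news/story.html"  # hypothetical article URL
FRONT = "http://example.com/index.html"       # hypothetical front page URL

print(blocked.can_fetch("ExampleBot", STORY))     # article under block-all
print(front_only.can_fetch("ExampleBot", FRONT))  # front page under option 2
print(front_only.can_fetch("ExampleBot", STORY))  # article under option 2
```

Note that none of this is enforced: a crawler has to choose to read the file and obey it, which is the voluntary arrangement the article describes.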
Re:Historical footnote: where robots.txt came from (Score:4, Interesting)
I'm fascinated at the beginnings of the web and the people who drove it.
If you know any place where I can hear more of these please let me know. (reading your blog right now)
Re:The Text I Actually Submitted (Score:3, Interesting)
I hope they realize that restricting search engine crawlers with robots.txt this way really doesn't do much other than decrease the number of people who visit their site in the first place. I wonder how that will alter their revenue streams. Let them go ahead with it and the whole thing will be self-correcting.
It's not a bad thing. (Score:2, Interesting)
Now some sites will probably want to over control, but they'll lose out.
Re:So they tell you what they don't want you to se (Score:3, Interesting)
Personally, I don't really see the problem. You either want your site spidered or you don't. You don't get to control the presentation of the data that is spidered, only the search engines get to do that.
So the thing here is that Google takes its ordinary web spider, applies a little magic to it, and then displays the results as a news page. Big deal.
You either want your site spidered or you don't. You can't have your cake and eat it too.
Re:The Text I Actually Submitted (Score:3, Interesting)
I do. Every time I hear about something like this, the site goes on my CustomizeGoogle blacklist, never to be seen again. It was the slashdot policy of posting "registration required" links to the New York Times that got me started on this path, and honestly, I'm better informed for it. All these big "news" publishers deliver is sanitized, oversimplified, dumbed down, biased and superficial stories blended with propaganda and outright lies concocted by private interests who stand to gain by your being misinformed. They make you stupider for having been exposed to them. Anyone with integrity has already adapted or left long ago, and those that are left are personally responsible for the wreckage. I hope airplanes land on their heads.
Yeah, but... (Score:1, Interesting)
Re:A prediction (Score:3, Interesting)
You are right: If the search engines disappeared, the big news services wouldn't care. Actually, they would probably enjoy it, because people would go to the New York Times, Washington Post, and other big-name sites rather than seeing these smaller sites with better reporting and commentary. But you contradict yourself as well. You say that if the search engines disappeared, the internet would just create more, but then you say that if the big news services stopped providing news, the search engines would die. No they wouldn't. The internet would create more, filling the need.
If the news sites want to control their content better, fine. But I guarantee you the next whine you will hear from them is how Google isn't directing traffic to their websites and it must all be retribution by Google for being made to limit what it displays, rather than people clicking on sites where they can read the summary.
Re:The Text I Actually Submitted (Score:5, Interesting)
Your sig is particularly ironic here. If you want information to be free, you're welcome to offer to pay the salaries of all the journalists, reporters, cameramen, sound crews, and support folks who are out there all over the world collecting it. Go ahead and put your money where your mouth is.
I am.
I'll be launching a service in the new year to help actively creating artists make a profit off selling original works, leveraging the copyleft and mashup cultures to generate a fanbase and simultaneously devalue the global copyright pool.
For the right types of creators, the strategy of increasing the amount of budgets available for custom work by annihilating the cost of existing bodies of work is a valid one, and I intend to make it very easy for those types of people to do so as a side effect of their making money off the things that you cannot copy.
You'll excuse me if I wait till the new year to slashdot myself, but I assure you, I have sunk hundreds and hundreds of man hours and a lot of my own dough into putting my money where my mouth is, and when I'm ready, you will know all about it whether you like it or not, because it will be some noteworthy stuff.
So no. I don't think I'm overreacting at all. I like to think when it all pans out in the end I'm going to play some small but important personal role in bringing the old things crashing down as a matter of fact. And have the people doing the real work be richer for it.
Re:A prediction (Score:2, Interesting)