Submission + - What do we allow search engine robots to do? (menneske.no)
Anonymous Coward writes: "I run a pretty big sudoku site, and like every content provider on the net I want to be found in the search engines. But how much do we want to be in those indexes? So far this month (10 days) Yahoo Slurp has slurped about 150,000 pages from my site, or 2.2GB of data. If I add up all the search engines, the total comes to about 250,000 pages downloaded in 10 days. But with the growing number of search engines out there, what will this figure look like in 2-5 years? And how high would it be if they all crawled like Yahoo Slurp?
How far do we go in allowing the search engines to crawl our sites so we're searchable? How many pages per day, and how many search engines, do we allow? Is there a limit we should set on search engines using our bandwidth? We can of course set a Crawl-delay in robots.txt (see the example below), but what are our limits? I have a 10Mb/s uplink, so I'm not particularly bothered by this, but what about people who pay for the bandwidth they use?
(I have no ads on my site, so I won't earn a dime even if all of /. visited it.)"
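For reference, a minimal robots.txt sketch along these lines can throttle the heaviest crawlers. Crawl-delay is honored by Yahoo Slurp and MSNbot but ignored by Googlebot, whose rate has to be set through Google's webmaster tools instead; the delay values and the /solver/ path below are illustrative placeholders, not taken from the submitter's site:

    # Yahoo's crawler: ask it to wait 10 seconds between requests
    User-agent: Slurp
    Crawl-delay: 10

    # Everyone else: a smaller delay, plus keep bots out of
    # machine-generated puzzle pages (placeholder path)
    User-agent: *
    Crawl-delay: 5
    Disallow: /solver/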