Google To Create "Blog" Search; Potentially Remove From Main 311

Posted by Hemos
from the about-freakin'-time dept.
Skyshadow writes "Google, search engine of choice for pretty much everyone, has announced that it will begin a separate index for blogs and remove them from the normal index, handling them instead in much the same way as their usenet archives. This will hopefully put an end to the recent difficulties locating primary source material among the mountains of blogs which are clogging the ratings system." There have been comments from elsewhere saying they won't be removing them - but that remains to be seen.
This discussion has been archived. No new comments can be posted.

  • journals (Score:5, Interesting)

    by asv108 (141455) * <alex@phatauNETBSDdio.org minus bsd> on Monday May 12, 2003 @11:04AM (#5936486) Homepage Journal
    Will /. journals be included in this?

    Is there any chance of having an RSS feature for journals, for everyone or even just subscribers?

    • Re:journals (Score:5, Interesting)

      by jawtheshark (198669) * <slashdot&jawtheshark,com> on Monday May 12, 2003 @11:05AM (#5936499) Homepage Journal
      No... Check robots.txt [slashdot.org]
      • Re:journals (Score:3, Interesting)

        by jawtheshark (198669) *
        Oh, wait.... It says "User-agent: Mediapartners-Google*" can scan everything. This surprises me however. Still, that's not "GoogleBot", which I see from time to time in my apache logs.
        Anybody got an idea what "Mediapartners-Google*" exactly is?
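A minimal sketch of how per-agent rules in a robots.txt can be checked, using Python's standard `urllib.robotparser`. The sample file below is hypothetical and is not Slashdot's actual robots.txt; the `/journal/` path and the `example.org` URLs are assumptions made for illustration, and the agent name is written without the trailing `*` because this parser matches agent names by substring.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, NOT Slashdot's real file: one ad crawler is
# allowed everywhere, every other agent is kept out of /journal/.
SAMPLE_ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /journal/
"""


def can_fetch(agent, url, robots_txt=SAMPLE_ROBOTS_TXT):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("Mediapartners-Google", "Googlebot"):
        print(agent, can_fetch(agent, "http://example.org/journal/1"))
```

An empty `Disallow:` line means "nothing is disallowed", which is why a file can admit one named crawler while a `User-agent: *` block restricts the rest.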
        • Re:journals (Score:2, Informative)

          by Anonymous Coward
          You see that block of ads at the top of the page, that looks like some classifieds crammed into a banner?

          That's what "Mediapartners-Google*" is. Google driven ads for slashdot.
        • Re:journals (Score:5, Informative)

          by Cyberdyne (104305) * on Monday May 12, 2003 @11:14AM (#5936571) Journal
          Oh, wait.... It says "User-agent: Mediapartners-Google*" can scan everything. This surprises me however. Still, that's not "GoogleBot", which I see from time to time in my apache logs.

          Anybody got an idea what "Mediapartners-Google*" exactly is?

          Mediapartners-Google would appear to be Google's ad engine - it tries to determine "relevant" ads for the page by spidering it beforehand. Presumably, you would only see hits from that bot if you serve Google text-ads; GoogleBot is the crawler which drives the actual search engine.

          (Aside: Those text ads were quite tricky to filter out - not being images, there's no 'block images' option! Putting "127.0.0.1 pagead.googlesyndication.com" in /etc/hosts did the trick, though...)

          • Re:journals (Score:5, Interesting)

            by cygnusx (193092) on Monday May 12, 2003 @11:45AM (#5936826) Homepage
            > Those text ads were quite tricky to filter out

            You're entitled to block them if you wish, of course, but if the ads don't consume too many bits, and bring the site-owner some moolah, and don't interfere with your browsing, how does blocking text ads help?

            Knee-jerk ad-blocking will only kill free content on the net, imho.
            • Re:journals (Score:2, Insightful)

              by Stiletto (12066)

              Advertisements are intrusive no matter what form they take. Just because they use less bits and/or are smaller on the page doesn't change the fact that they are unwanted.
              • Re:journals (Score:3, Interesting)

                by AndroidCat (229562)
                There Ain't No Such Thing As A Free Lunch. Google provides an excellent free service, and uses relevant text-only ads to pay for it. I look at most web sites as a package deal. If their ads are too much of a PITA, I tend to avoid the site.

                Ah well, your option. Some people do find ads matched to the search to be a useful feature.

              • Re:journals (Score:3, Insightful)

                by delphi125 (544730)
                Compare and contrast:

                A) The ad for HPC I/O: A brief history at the top of this slashdot page.

                B) The ad I get when I search for slashdot on google (It says: "Google is hiring (expert software designers)". YMMV)

                C) The ad on Dutch TV which has some bimbo checking if her white trousers are bloody around the crotch area. (Several variations, for both tampons and pads, she looks over her shoulders to check from behind in a mirror or kicks up in front of a mirror). Note that this occurs at maximum volume first
          • Re:journals (Score:3, Insightful)

            by AndroidCat (229562)
            Putting "127.0.0.1 pagead.googlesyndication.com" in /etc/hosts did the trick, though...

            You might want to use 0.0.0.0 instead. That way you won't get an access attempt on localhost. I usually only block annoying ads (x10) or privacy problems (doubleclick). I don't see the point in blocking Google's text ads.

            One day I'm going to put a mini-server on 127.0.0.1 that serves up cute cat pictures instead of blocked banner ads. :^)
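For anyone maintaining more than one blocked host, the hosts-file lines discussed above could be generated rather than typed by hand. This is only a sketch: the helper name `hosts_entries` is hypothetical, and the default sink address follows the 0.0.0.0 suggestion in this comment.

```python
def hosts_entries(domains, sink="0.0.0.0"):
    """Return hosts-file lines that map each domain to the sink address."""
    # Normalise, drop blanks and duplicates, and sort for a stable output.
    cleaned = sorted({d.strip().lower() for d in domains if d.strip()})
    return [f"{sink} {domain}" for domain in cleaned]


if __name__ == "__main__":
    # The only hostname taken from the thread; add others as needed.
    for line in hosts_entries(["pagead.googlesyndication.com"]):
        print(line)
```

The printed lines would still have to be appended to /etc/hosts by hand (or with suitable privileges); the script does not touch the file itself.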

    • Re:journals (Score:4, Insightful)

      by fjordboy (169716) on Monday May 12, 2003 @12:12PM (#5937031) Homepage
      I'm not so concerned about the journals so much as just forums and discussion boards in general. The blogs don't bother me nearly as much as looking for something on google and the first 30 responses are just people spouting out opinions in messageboards....not unlike usenet. I've had to sift through page after page of forums and discussions to find the real information. I'm all for adding a blog.google.com or something, but I think that doing a similar thing with discussion boards and forums would be a good idea as well.

      However, I think there is a potential problem with blogs that also contain real content or at least original content. A lot of people have regular webpages that they just update regularly in a blog fashion...will there be a separation?
    • Re:journals (Score:3, Insightful)

      by cheesyfru (99893)
      The real question is whether Slashdot itself will be included in this. I don't see how Google will determine if a given website is a blog, and if so, which parts of it are. Slashdot looks like a blog. It has stories posted by humans. Stories can be commented on. It offers an RSS feed.

      Then there are sites like mine [joshw.org], which is part blog and part my website as a singer/songwriter. How would Google determine which parts are which? I'd be pretty peeved if the whole site was tagged as a blog.

  • blogs.google.com? (Score:5, Insightful)

    by fewnorms (630720) on Monday May 12, 2003 @11:04AM (#5936491)
    Thing is, some of these blogs actually contain some pretty handy info from time to time, as blogs are becoming more and more used as a cheap and easy alternative to a content management system imho ....
    • Re:blogs.google.com? (Score:5, Interesting)

      by GT_Alias (551463) on Monday May 12, 2003 @11:16AM (#5936584)
      Which is why Google is not eliminating them entirely, just moving them over to their own search.

      It's a reasonable solution, I think. Is it worth tainting the vast majority of the search results with useless blog entries just so that the (very) few blogs with good information will still show up?

      This solves their problem with bloggers manipulating search results, yet still keeps the information available to those who want it. Granted, you have to know to look for it, but it seems to me like a fair trade-off.

      • by ichimunki (194887)
        I don't see why a separate search would be useful. Perhaps if they had a keywords function that would apply to certain things, this would improve the ability to write a search in the first place. Something similar to the site tag where you could then do search "foo bar keyword: -blog" to get results for foo bars that were not tagged as being in blogs. Conversely you could search +blog to only get blogs. Perhaps this could tie in with their directory-based listings as well.
    • by simong_oz (321118) on Monday May 12, 2003 @11:22AM (#5936638) Journal
      some of these blogs actually contain some pretty handy info from time to time [my emphasis]

      yeh, that's true, but let's face it - the vast majority are complete and utter drivel and manage to make a cereal packet look like an interesting read.
      • by rf0 (159958)
        I must admit that I've sometimes found blogs more helpful than mailing lists, as they normally give instructions on how to do something, since the blogger just wants a personal copy for the next time they want to do it.

        Mailing lists, on the other hand, sometimes just target one small part of the problem; however, they are both definitely useful. Of course I'm also nosy so I do like to read other people's lives occasionally :)

        rus
      • Re:blogs.google.com? (Score:3, Interesting)

        by Sethb (9355)
        I don't know, my blog has some very useful information that Google serves out to a lot of people needing help, for instance, this page [editthispage.com] is a lifesaver when you hose your Win2000 install using Easy CD Creator, and a lot of people still e-mail me, 2 years later, to thank me for writing it up.
      • by poot_rootbeer (188613) on Monday May 12, 2003 @12:31PM (#5937169)
        let's face it - the vast majority are complete and utter drivel and manage to make a cereal packet look like an interesting read.

        But Slashdot is a weblog... oooh, I see.
  • 'Bout time (Score:4, Interesting)

    by Surak (18578) * <surak@ma i l b l ocks.com> on Monday May 12, 2003 @11:04AM (#5936494) Homepage Journal
    I, for one, am sick of searching material only to find that the page is some asshat's blog. Nothing against blogs, but you never know where this material came from.

    OTOH, what constitutes a 'blog'? Is Slashdot a blog? Is this a blog [witchvox.com]? The lines are constantly being blurred, and I'm not sure it'll be easy for google to make that distinction.
    • Re:'Bout time (Score:3, Insightful)

      by NReitzel (77941) *
      That's what makes google valuable, now isn't it? They consistently do a good job (better than most) of separating the wheat from the chaff from the link farms.
    • Re:'Bout time (Score:5, Insightful)

      by Qzukk (229616) on Monday May 12, 2003 @11:14AM (#5936570) Journal
      Probably the distinction they will make will be between publicly-available blogging space (livejournal,deadjournal,pitas, and so on) and a personal website that is or contains a blog. This would be the easiest way, since it comes down to setting aside a few hostnames for the new search engine to crawl.
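A rough sketch of that hostname approach, assuming a hand-maintained set of blog-hosting domains. The domain names below are guesses based on the services named in this comment, not a list Google is known to use, and `on_blog_host` is a hypothetical helper.

```python
from urllib.parse import urlsplit

# Illustrative only: assumed domains for the services named in the comment.
BLOG_HOSTS = {"livejournal.com", "deadjournal.com", "pitas.com"}


def on_blog_host(url, blog_hosts=BLOG_HOSTS):
    """Return True if the URL's host is a listed blog host or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in blog_hosts)


if __name__ == "__main__":
    print(on_blog_host("http://someone.pitas.com/"))
    print(on_blog_host("http://example.org/blog/"))
```

A check like this would miss exactly the case raised elsewhere in the thread: a personal site on its own domain that is, or contains, a blog.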
    • Re:'Bout time (Score:5, Insightful)

      by arvindn (542080) on Monday May 12, 2003 @11:19AM (#5936614) Homepage Journal
      what constitutes a 'blog'?

      I was wondering about that too. It's not black and white, of course, especially when you want to automate it. I can think of several indications that a page is a blog; some weighted linear combination of these factors should work well enough in practice if you spend some time tweaking the weights:

      • Updated frequently
      • Keywords like "blog", "weblog", "posted by", "comments", "permanent link", and so on.
      • Got dates all over the place
      • Is hosted on one of the popular blogging sites (blogspot, lj, /. journals...)
      • Links to and is linked from other weblogs.
      This last factor is important. If you start from a rough heuristic and execute an iterative algorithm, similar to how they calculate pagerank, your blog detection algorithm will get better.
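The idea above can be sketched roughly as follows. Everything here is an assumption made for illustration: the feature names, the weights, and the simple neighbour-averaging step, which is only loosely analogous to PageRank and is not how Google computes anything.

```python
# Hypothetical feature weights; as the comment says, these would need tuning.
WEIGHTS = {
    "updated_frequently": 0.2,
    "blog_keywords": 0.3,
    "many_dates": 0.2,
    "on_blog_host": 0.3,
}


def base_score(features):
    """Weighted linear combination of boolean page features, in [0, 1]."""
    return sum(w for name, w in WEIGHTS.items() if features.get(name))


def propagate(base, links, rounds=10, mix=0.5):
    """Blend each page's own score with the mean score of its link neighbours.

    base:  {page: base score from base_score()}
    links: {page: set of pages it links to or is linked from}
    """
    scores = dict(base)
    for _ in range(rounds):
        updated = {}
        for page, own in base.items():
            neighbours = [scores[n] for n in links.get(page, ()) if n in scores]
            if neighbours:
                mean = sum(neighbours) / len(neighbours)
                updated[page] = (1 - mix) * own + mix * mean
            else:
                updated[page] = own  # no links: keep the rough heuristic
        scores = updated
    return scores


if __name__ == "__main__":
    pages = {
        "a": {"blog_keywords": True, "many_dates": True, "on_blog_host": True},
        "b": {},  # no blog features, but linked with "a"
        "c": {},  # no blog features, no links
    }
    base = {page: base_score(f) for page, f in pages.items()}
    links = {"a": {"b"}, "b": {"a"}}
    print(propagate(base, links))
```

In this toy run, page "b" ends up with a higher score than its features alone would give it, purely because it is linked with a page that looks like a blog; a threshold on the final score would then decide which index a page goes into.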
      • by mcmonkey (96054) on Monday May 12, 2003 @12:39PM (#5937232) Homepage
        • Got dates all over the place

        Well, that rules out /. Anyone who spends a lot of time here certainly doesn't get dates all over the place.

      • Updated frequently ... "posted by" ... dates ... hosted on one of the popular blogging sites ... Links to and is linked from other weblogs

        Sounds like the news sections of most SourceForge.net projects I've run into. They're updated frequently (release early, release often), the maintainers frequently post status updates on given dates, SourceForge.net has a lot of them, and they link to other projects that use their code or that contribute code that they use.

        Is SourceForge.net a blog?

    • Re:'Bout time (Score:5, Interesting)

      by EinarH (583836) on Monday May 12, 2003 @11:20AM (#5936620) Journal
      The other day I searched Google for some radio stuff. (helping my father find some equipment).

      Then I noticed that Radio Userland appeared very high on Google. In fact, when you search for "radio"* they get a #5 at Google. As far as I know they have only existed for a year. And their popularity, as it appears on Google, looks very inflated because of the extremely large number of links in blogs.

      Checked out Daypop.com, which ranks articles/links based on the number of links in blogs. This is what I got:
      Searching All Weblogs for link:radio.userland.com... Found 3260 pages matching query.

      That's insane. When so many blogs link to the same page, its ranking on Google gets very high based only on blog popularity.


      *Searching for only radio is obviously a bad idea as Google returns some 40 m. hits.

    • Re:'Bout time (Score:5, Insightful)

      by Anonymous Coward on Monday May 12, 2003 @11:50AM (#5936872)
      I, for one, am happy of searching material only to find that the page is some asshat's blog...

      because what is important, in my point of view, is to GET THE ANSWER to what I'm looking for.

      And if the answer is in a weblog that belongs to "Linux-freaks.Adhzerbahidjan", it still is the answer I'm looking for...

      I mean things like "Proftpd doesn't seem to accept fxp connections", why the hell is this part of my distro not working as I wish...can only be proposed by people having the same problem and discussing it in a blog.

      Another reason I prefer Weblogs to, say, IRC is that I don't have to humiliate myself asking "basic" questions to the 15 year old Guru that is nicknamed "EvilRootBeer" , I just have to parse a few blogs and get my answer without ANY fine manual to read.

      "Nothing against blogs, but you never know where this material came from." Because you KNOW where the news from CNN is coming from ? I mean, they show proof and research material everytime they air a show, or a major groundbreaking news ("Mass destruction weapons found in Irak","Terrorist Bretzel Fails Coup d'Etat"..."

      at least with blogs and the net, you can try and cross check the data, whereas with tv, you usually only gulp some more mountain dew.

      I just wish you had to find your Linux docs using the manuals provided on the distro and absolutely no other access to raw data...
      • Re:'Bout time (Score:3, Insightful)

        by Surak (18578) *
        When searching for that kind of data, data from blogs is perfectly acceptable.

        But sometimes I search for non-tech related information (shocking, I know). In fact, I was searching for information about a rare debilitating disease that a doctor told my friend that she might have (can't remember the name anymore off the top of my head) a couple of months ago and I wanted to learn about it... I typed the name of the disease into google and the first link that came up was some asshat's blog about how his aunt
  • Great! (Score:3, Funny)

    by Negatyfus (602326) on Monday May 12, 2003 @11:05AM (#5936507) Journal
    Nobody wants to read your blog and this just proves that point!
    • Re:Great! (Score:4, Interesting)

      by Joe the Lesser (533425) on Monday May 12, 2003 @11:36AM (#5936756) Homepage Journal
      Somehow I can't drop the feeling that this will be very similar to a spam filter...
    • Re:Great! (Score:3, Insightful)

      by MagPulse (316)
      I write my journal for friends, but I make my entries public. I want people who know me to be able to find my journal and read it. But it's not written for the masses, and those interested in the content will definitely use the Google Blog search instead of the standard one.

      I welcome the change, and I'm glad people won't be seeing my journal that don't want to.
      • by British (51765)
        Uh, first mistake. There's some lengthy disclaimer on the net by someone who makes the correlation that his/her (it's gotta be a her) blogger is no different than a paper diary hidden underneath someone's bed. Wish I could cough up the URL, since it comes off as so pretentious.

        Mind you, I find it 100 times easier to read his/her blogger from the comfort of my own home, as opposed to breaking in someone's house and ganking a Teen Girl Squad-like diary.

        In my opinion, anyone's public-entry [blogger || lj || dj || diary
  • yay and aaah (Score:3, Insightful)

    by DaLiNKz (557579) on Monday May 12, 2003 @11:06AM (#5936509) Homepage Journal
    What about personal sites that may seem like blogs? Example: mine. I have a blog, but later on I plan to add some more content and such. Hopefully it doesn't remove my site from the main index, or will at least return it once the site becomes useful.
  • by DeHar (92476) on Monday May 12, 2003 @11:06AM (#5936514)
    This is a great idea, especially since many issues have much more commentary than source content. I love the quote "But what happens when the weblog fad dies down?"

    However, I hope they maintain links between the main search and the blog search. Finding primary sources, then a button linking to all blog comments on this topic would be a great research tool.
  • Good to weed out.... (Score:5, Interesting)

    by caffeinex36 (608768) on Monday May 12, 2003 @11:06AM (#5936519)
    Most of the useless information people put into blogs. Although, when you search for information, would you want to search 2 different locations? This is the whole claim to Google's fame. I have found that many times people post how-to's in their blogs along with other information.


    If it ain't broke...don't fix it

    -Rob
  • Ev from Blogger has already confirmed this as false.

    Our Register friends had a slow news day instead. :)

    • Re:Ev from Blogger (Score:4, Informative)

      by Anonymous Coward on Monday May 12, 2003 @11:30AM (#5936710)
      Right, go to this entry [evhead.com] at evhead, and view the source, you'll see:

      <span title="you know, in order to spread more 'Google censors Evhead' suspicions"></snip></span>
      <!-- Andrew Orlowski strikes with another brilliant theory [theregister.co.uk] designed to get attention from bloggers (even though the number of their readers is of course "statistically insignificant"). Well shit, I'm biting.

      Based on Eric Schmidt's mentioning of a blog search [yahoo.com], Orlowski suggests that Google will remove blogs from the main index.

      This shouldn't surprise many people, but as far as I know, Orlowski is full of crap. Again. If Google didn't find that blogs improved the results (and I don't know, I would assume they test these things, like, constantly), do you suppose they'd increase the frequency at which they crawl them, or decrease it? Yes, that's what I think.

      Too bad my headline isn't any truer than the Register's.-->

  • Personally.. (Score:5, Insightful)

    by xchino (591175) on Monday May 12, 2003 @11:08AM (#5936529)
    I've found some of the best information on blogs. I have no problem with them making a blog specific search, but like the Linux specific search I hope relevant sites can still be found from the main search. It would be a pain to have to search every individual google engine for one bit of info. As it is now, I can use the main search and be pretty sure that I'm going to get a relevant result regardless of what category the site falls under. If I'm looking up what IIRC stands for, I don't really care if I get the info from a JoeBlow's blog or from howstuffworks.com.
    • Re:Personally.. (Score:2, Interesting)

      by Fishstick (150821)
      Hmm, I'm hoping the results are excluded, and blog is a "tab" just like the web, images, groups, directory, news are now.

      I've found this mechanism to be really effective in helping me find what I want.

      I use the google toolbar - this defaults to a 'web' search. 95% of the time what I'm looking for comes up on the first page. If not, I can click on the 'groups' tab, where my search is repeated (like when I'm trying to figure out an error message or somesuch).

      If the thing I'm looking for is a business, or
  • Yes! (Score:5, Funny)

    by acehole (174372) on Monday May 12, 2003 @11:08AM (#5936530) Homepage
    Now I can find and read about people's mundane activities more efficiently.

    e.g.

    9:30am :- ate some toast

    9:40am :- went to the toilet

    10:50am :- left the toilet to check the number of hits on my blog

    11:45am :- got a phone call, was wrong number. Might get a real call one day.

    and so on and so forth..

    • Re:Yes! (Score:4, Funny)

      by Timesprout (579035) on Monday May 12, 2003 @11:18AM (#5936606)
      By your toilet reference I can see you have obviously mistaken a stream of diarrhea for someone's stream of consciousness .... easy mistake to make with many of the blogs out there.
    • Ummm... no (Score:4, Insightful)

      by neurostar (578917) <neurostar@noSPAM.privon.com> on Monday May 12, 2003 @11:22AM (#5936641)

      I think you're confusing a weblog with a "livejournal". A weblog is similar to slashdot (or warblogging.com [warblogging.com] and back-to-iraq.com [back-to-iraq.com]). In fact, my weblog (http://privon.com) deals with politics, science, and civil rights as well as opinion pieces I've written about various issues. A weblog is another source of information.

      What you're thinking of is commonly called a "livejournal" and it's exactly that - a journal. Some blogs are also journals. For example, I've got two 'blogs'. One is the one I mentioned above. The other is slightly more journal oriented, with me posting about things I've done that my family and friends (and possibly others) might find interesting. For example, I've recently posted about visiting the Trek Bicycles Demo Day as well as some of my latest photography experiences.

      It might be beneficial for you to review your definition of a blog. Blogs can be an excellent source of information, not just a diary.

      neurostar
      • Re:Ummm... no (Score:3, Insightful)

        by tekunokurato (531385)
        What are you talking about? Are you saying that the average content of Blogger is any different from the average content of Livejournal? They're just different branded terms for the same thing- a personal site following a chronological updated format, containing whatever people want to put in them. For example, in my livejournal [livejournal.com], which I call a livejournal because it uses code from www.livejournal.com, I write articles on politics, movies, creativity, or any other topic I happen to feel like writing abou
  • ID? (Score:2, Interesting)

    How are blogs being identified as opposed to non-blog pages? I can see how newsgroups could be moved to a separate search but blogs aren't easily identifiable. Will Google rely on bloggers to identify their sites to Google? I suppose that could work as the article states that bloggers want to legitimize what they do through such a move as Google is approaching.

    I also like the analogy made by the article to the voting system where a page votes for a topic: an expert site on turtles voting for turtles once a

    • Categorization algorithms that combine different features would work quite well here, I believe!

      There is a wealth of categorization systems out there. Generally, they "position" the sites in an imaginary, high-dimensional space, depending on whether keywords occur (and how often/prominent etc.), and on certain structural properties of the documents. You can then try to define separating hyperplanes, which are functions that divide the ("feature") space into separate compartments, so you can group docume

    • Re:ID? (Score:3, Interesting)

      by nycroft (653728)
      Probably by the source code located in the HTML template. For example, Blogger's code has to include case sensitive tags like [Blogger][/Blogger] to format the web-based posting. I'm not sure how they would tell for other types like Blog*Spot or Moveable Type. I assume they have some sort of the same types of tags. Or maybe by noting server applets related to the HTML template.

      O yeah, one more thing, Google bought Blogger, so that's another way they'll be able to tell.
  • by gleffler (540281) * on Monday May 12, 2003 @11:09AM (#5936537) Journal
    I just hope that Google does at least say "Hey, you might be able to find what you're looking for on our blog search" at the end or something - like they do now with Google Answers. I do applaud their effort to make their database even more relevant though, and is yet another reason I have to admit to being a shameless whore for Google.
  • by Snowhare (263311) on Monday May 12, 2003 @11:10AM (#5936538) Homepage
    I wonder if this is also intended to stop Googlewashing [theregister.co.uk]? Google has a history of trying to 'play fair' - and the power of a few well connected blogs to basically 'take possession' of any term works against that philosophy.
  • This is going to be nice...It's quite annoying to get a bunch of weblogs when you're looking for something

    They are probably going to have to expand their cluster in order to add another cache....Hopefully it won't impact search times...but google usually does a good job at adding in their new grids relatively well.

    Microsoft does something similar...they cache everything that is !(pro_microsoft) and (Linux); the problem is they don't let you search this index :)
  • Mirror (Score:2, Funny)

    by Hard_Code (49548)
    Here is a mirror:

    Google to Create Blog Search Engine? [216.239.53.104]
  • /. is a blog, no? (Score:3, Interesting)

    by Eponymous Coward (6097) on Monday May 12, 2003 @11:13AM (#5936559)
    Am I the only one who thinks it is funny to see all the anti-blog comments everytime a weblog related story is posted? IMHO, Slashdot is a weblog.

    I think I originally found Slashdot on RobotWisdom-- yet another weblog. But that was a couple of years ago...

    • Slashdot is great, but do you actually want to get slashdot comments as search results? As far as I can tell, google doesn't index them at all, anyway. To me, it makes sense to separate the search for primary material (like slashdot's links and features) from the commentary on it (the comments). Of course, slashdot's primary material is mainly references to other primary sources, so there's not much for the main google search to get here; the blog search could, however, pick up a lot.
      • by melonman (608440)

        To me, it makes sense to separate the search for primary material (like slashdot's links and features) from the commentary on it (the comments).

        I can't see how you could even begin to do this consistently. Most of the 'primary' (by your definition) material referred to on /. is summaries of or comments on something else. In many cases you could argue that it is 4 or 5 levels away from 'primary'.

        On the other hand, you often get genuinely creative stuff in response to someone else's article. In the academ

    • Re:/. is a blog, no? (Score:5, Interesting)

      by RobotRunAmok (595286) * on Monday May 12, 2003 @11:40AM (#5936790)
      /. is a blog, no?

      No. SlashDot aggregates news stories. It's the Web generation of what the BBS guys had in CompuServe Forums and GEnie Roundtables. The staff is paid to aggregate and thread stories that are of interest to a particular community. (Sometimes they aggregate the really, really good ones more than once.) Technically, SlashDot staff don't submit the stories, members of the community do. Bottom line: it's a professional operation. (g'head, g'head, make the jokes, it's Monday, get 'em outta yer system...)

      Personally, I would use the litmus test of "professionalism" when doping out what is a blog versus what is "legitimate" content. If the "blogger" makes his living as a writer or journalist, then the blog is "supplemental online material." If the site is, as we referred to the vanity publishing phenomenon back in the early '90's, someone's "homepage," but with the added baggage of semi-regular diary entries, then it's a Blog.

      Use of "blogging software" doesn't make someone a writer, or a journalist, and it certainly doesn't automatically grant its user something worth saying, or even something factual to say.

      It's great to see Google realizing this and clamping down.
      • by sethadam1 (530629) * <(moc.ebuttsrif) (ta) (mada)> on Monday May 12, 2003 @11:55AM (#5936913) Homepage
        Slashdot IS a blog. Because we're not talking about some Google employee sitting around and making a judgement call on every link on the net, it's obviously going to be automated by robots.

        Slashdot, like other blogs, pollutes search engine searches with their "permalinks," which, although they might be useful, certainly constitute a blog. In fact, one of the problems with blogs and search engines is that they generate thousands of clickable hyperlinks effortlessly. It's great for someone reading a blog and trying to bookmark a certain section - it's terrible for the guy who wants information on combatting spam through more effective use of his SMTP server and has to search through 30 pages of /. and K5 chatter to find some substance.

        Certainly, Google's criteria for what defines a blog might be helpful, but it seems to me like you're subjectively deciding which blogs are legitimate news sources and which are "some kid rambling on." Say whatever you like about the legitimacy of /., but make no mistake about it, it's a blog.
      • Take a look at RobotWisdom.com. This is one of the original weblogs and it seems very similar to Slashdot. Okay, there are no user comments (which arguably is where the value in Slashdot is), but the similarities are apparent.

        I would say that measuring the legitimacy of a site and its content by the number of banner ads and subscriptions is foolish and far too narrow.

  • hmmmm... (Score:2, Funny)

    by jeffy124 (453342)
    maybe it'll solve CmdrTaco's troubles about him getting emails [slashdot.org] from people looking to crack hotmail [slashdot.org].
  • blogs (Score:5, Interesting)

    by Blocked By Sand (623943) on Monday May 12, 2003 @11:13AM (#5936563)
    One of the biggest newspapers in Norway, where I live, has recently said they believe blogs to be the new 'killer app' for delivering information on the net. The problem with that is that the threshold for publishing 'news' is so low, anybody can do it. This makes it very difficult for people to find the info they are looking for. At the same time there is no guarantee the info is useful or even correct. A good reputation will be more and more important for businesses and sites on the net.

    This move by google tells me newspapers in Norway aren't the only ones seeing how influential blogs will/could become. This is a truly great step forward if Google could come up with a way of rating the different blogs. That way you could easily find serious tech-blogs.

    Wonder what rating /. would get though ;)
  • is ./ a blog? ebay? (Score:5, Interesting)

    by loomis (141922) on Monday May 12, 2003 @11:15AM (#5936575)
    As a previous poster briefly mentioned, what exactly is a blog? Would Slashdot forums be considered a blog? What about the myriad of ezboard message board forums out there, as well as other discussion websites? If the answer is no, it would be seemingly difficult and perhaps only of minor benefit to separate just the true "blog" sites while ignoring the other sites.

    And what about ebay? Quite often I am searching for info on an old piece of electronics I've picked up someplace, and I do a Google search, hoping to find information about the item. Well, all I get in return are ebay links to a similar item that was sold on ebay a few months ago. And even then, I click on the link, hoping to see what the item sold for (and thus get an appraisal), but the auction has been removed from the database due to it being several months old. Why index ebay pages? It's really frustrating.

    Loomis
  • by friedegg (96310) <`moc.bdgniltserw' `ta' `nayrb'> on Monday May 12, 2003 @11:16AM (#5936580) Homepage
    "GoogleGuy" (a real Google employee) commented on this [webmasterworld.com] on WebmasterWorld saying:
    I think Andrew Orlowski is taking a comment and taking it in the direction that he wants to go. I would take that article with a grain of salt.


    GoogleGuy, going for understatement. :)
  • by Thoguth (203384) * on Monday May 12, 2003 @11:16AM (#5936585) Homepage
    I really don't mind finding blog links when I search for something, as they usually at least link to some relevant sources.

    On the other hand, it is really a pain to search for help on something, and instead of getting a useful, authoritative document, I'll get a half-dozen archived unanswered mailing list posts from people with the same problem. I would much rather Google address this dilution from mailing lists.
    • A very valid point, mod parent up. I've faced the same problem. Incidentally, I haven't faced any problem from "mountains of blogs" clogging up the "ratings system": few people will link to a blog if it is content-free, so IMHO pagerank is enough for filtering out useless blogs. OTOH, pagerank doesn't work very well on mailing list archives, because links to the archives as a whole say nothing about how useful an individual post is likely to be.
  • by acarr0 (652849) on Monday May 12, 2003 @11:17AM (#5936598)

    The general consensus appears to view this tabbed filtering as a good thing. There are some valid concerns about missing out on good information as a result. Naturally one can go to the "Blog tab" to conduct a search but most people will likely tend not to do this.

    It seems to me that this may be an opportunity for Google to improve upon their user interface a bit. Since most folks use the simple interface provided on the main page, adding a few check boxes just below the text box would be a good idea. That would allow for the quick addition of groups and/or blogs to your search query.

  • by borkus (179118) on Monday May 12, 2003 @11:17AM (#5936600) Homepage
    One of the reasons that Google is filtering out blogs is "link-backs". These create huge rings of circular links, distorting page rankings. Because of this, I wonder if Google is going to look for template elements in the HTML of a page to determine if it is a blog. Or maybe anything that has a Radio Userland, Blogger, Greymatter or Movable Type medallion on it will be ignored.
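
To make the "rings of circular links" worry concrete, here is a minimal sketch using the simplified textbook form of PageRank. It is not Google's production algorithm, and the page names and the link graph are made up for illustration. It only suggests why a closed ring of blogs that link to each other might hold on to more rank than a source page with a single inbound link.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank. links maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[p] / n
                for q in pages:
                    new[q] += share
        rank = new
    return rank


# Hypothetical graph: three blogs linking in a ring, plus one page ("fan")
# that links to a primary source.
links = {
    "blog1": ["blog2"],
    "blog2": ["blog3"],
    "blog3": ["blog1"],
    "fan": ["source"],
}
rank = pagerank(links)
for page in sorted(rank, key=rank.get, reverse=True):
    print(f"{page:7s} {rank[page]:.3f}")
```

In this toy graph each ring member ends up ranked above the source page, because rank that enters the ring keeps circulating inside it.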
    • Excellent point.

      One thing this (the polluting of Google results with high-ranking, low-information blog comments) is proving is that ultimately evaluating the reliability of content is an AI problem. The blog issue is a problem in all web-of-trust models of evaluation: when one uses a consensus-based model to determine "truth," urban legends tend to rise to the top and detailed technicalities tend to sink to the bottom. Rating blogs can be done in two ways: intelligence, or statistics. And the rating of

  • Bad Idea (Score:5, Interesting)

    by rwiedower (572254) on Monday May 12, 2003 @11:22AM (#5936637) Homepage
    I work at a company [peyser.com] that has a blog-like recap [peyser.com] of political news of interest for our clients and friends. If google tries to separate all sites with blog-like content, won't this naturally reduce my rank without actually increasing the source of information? Or am I missing something? How is google going to search for blog-like sites?
    • I work at a company that has a blog-like recap of political news

      From the look of the site, I don't think that would be considered a 'blog, as such. Think more along the lines of /. here: if any fool with an opinion can post commentary which (though moderated) is not deleted or edited, it's probably a 'blog.

        I agree that it's not a blog. I just want to know how Google "knows" it's not a blog. Would it simply search for pages that change at least once a day and contain links to other websites rather than original content? Would it look for a field marked "comment" on each entry? Obviously, I'm interested in NOT being labelled a blog.

        If it's purely comment based, then what about sites like TPM [talkingpointsmemo.com]? It's clearly a blog. Or would the answer be somewhat voluntary, in which case it wouldn't actually work?

  • by arestivo (459117)
    Wouldn't it be better if they included blogs in their searches by default and then had a 'remove blogs from this search' link?

    I think this solution would make everyone happy.
  • Blur (Score:5, Insightful)

    by limekiller4 (451497) on Monday May 12, 2003 @11:23AM (#5936655) Homepage
    Why not just create a "-source" flag or, as has been suggested, "-noblog"? Why are blogs being marginalized as any less authoritative than other hits? Why is "-" (eg: ["trading cards" -hockey]) used for weeding out certain criteria but not employed here when the goal is the same? Could we at least have a flag for combining the two results?

    A comparison is being made between blogs and the newsgroups, which are worlds apart in a number of ways, not the least of which is the threaded nature of the groups.

    What defines a blog, anyway? What defines a not-blog? Is CNN.com a blog? Is it not a blog because many people write for it, because of the number of hits it gets or because it has press credentials? Which category does indymedia.org [indymedia.org] fit into?

    Will I only get news results when I search for "ferret care?"

    What if the source IS a blog? If the subject IS the blog, will a news site reporting on the blog wind up in the main search results while the subject itself -- the blog -- is only in the blog search?
    • I believe they do not care about blogs per se, but their ability to interconnect large numbers of pages via the "friends/enemies/whatever's most recent entries" lists that journal sites have.

      I am guessing they will just skip, and index separately, the large blog sites that contribute to vitiating Google's page ranking results. It's conceivable that the page rank system can be used to distinguish ranking anomalies characteristic of these sites and thus weed them out.

      I don't think this will affect people ru

  • Great idea. (Score:3, Interesting)

    by Musashi Miyamoto (662091) on Monday May 12, 2003 @11:25AM (#5936666)
    I love this idea... and I have been waiting for something like it for some time...

    Think about it... I would love to search the blogosphere to see how widespread certain news items have become, or how widespread a certain opinion is...

    You could use something like this to measure the spread of ideas (at least within a vocal and technologically savvy minority).

  • by crashnbur (127738) on Monday May 12, 2003 @11:27AM (#5936682)
    ...remove them from the normal index, handling them instead in much the same way as their usenet archives...
    One would think that the Google blog search would work more similarly to the Google News search, which searches headlines from online news publications all over the web from all over the world. Google Groups is, as you know, just usenet... Google News, however, like the new Google blog search, will be indexing sites on the world wide web (ostensibly removed from the normal index).

    Ehh, the point of this message is to inform the uninformed of the wonderfulness of Google News [google.com]. It automatically features prominent headlines from all over the web, and you can search for topics, keywords, etc. in the search bar and have results sorted by relevance or date. News articles are mostly excluded from the normal index, which makes Google News the best headline locator on the Internet, by far.

  • There's no indication whether or not blogs will be left in or out of search results. This is very different from USENET, which was never part of the web in the first place. Orlowski is far from an unbiased source on this, having published many articles critical of bloggers in general. While two sources are cited which are critical of the effect that blogs have had on the Google ranking algorithm, none are cited which show the contributions personal publishers have made to the info-sphere.

    Far more authoritative sources than I [weblogs.com] have already weighed in on this.

    While there's certainly a lot of inane content available in blog form, this isn't really any different than it was before. I have never had to wade through 500 pages of results to find an original source either. The whole thing reeks of FUD to me. Methinks that Orlowski and Roddy have their own axes to grind.
  • by Bartmoss (16109) on Monday May 12, 2003 @11:31AM (#5936718) Homepage Journal
    Alright, fair enough - but how do you identify a weblog? They can do this for blogger/blogspot/whatever that they bought, and maybe standard software like Movable Type etc. But what about sites based on slash, phpnuke or totally custom code? And where does a weblog begin and a news site end?

    Filtering out usenet news is relatively easy, but weblogs? Mhhh, I shall remain sceptical until I see it implemented.
    • That's a great question. Does a site with "News and Commentary" fit in the blog category if only one or two people write it?

      What if it looks like a blog, but has nothing but on-topic posts (whatever the news-site's topic may be)? It has too many opinion spots, though, so it can't really be purely news. Does the fact that it's about a subject, and not some person mean it's no longer a blog?

      The line between Blog-NotBlog is so fuzzy at times, I don't see how they can fairly make a distinction.

      After all,
  • by Anonymous Coward on Monday May 12, 2003 @11:34AM (#5936744)
    If they really want to make their search engine useful, they ought to separate out Web archives of mailing list discussions. Blogs usually link back to where they got the story, so with only a little digging, you can find the original material. Mailing list discussions, though, are often out of date, irrelevant, and lacking in easy-to-follow references. They annoy me much more when I'm looking for things on the Web.
  • by benploni (125649) on Monday May 12, 2003 @11:38AM (#5936769) Journal
    Thanks for making Google a link to Google's web site. I would never have been able to find it! Maybe I could have googled for it. Oh wait, nevermind.
  • Commenters have asked how Google will tell weblogs from other web content. Obviously there is not one universal way to do this so the search engine will have to look at a number of indicators:

    How often does the phrase "current mood" appear?

    How often does the phrase "listening to" appear on the same page as "current mood"?

    Does "George Bush" or "shrub" appear on the same page as "dictator", "simian", or "ass"?

    Is Wil Wheaton mentioned on the page?

    It's a start. Google will have to pay me for more...

  • Feedster.com [feedster.com] (formerly known as Roogle, for RSS Google) is a blog search engine that has been around for a while now. It'll be interesting to see how Feedster does once Google comes out with their engine. If it's shot to oblivion, it won't be the first time Google dominated a search engine niche.
  • by Bonewalker (631203) on Monday May 12, 2003 @11:52AM (#5936886)
    I can just see the red, blue, yellow, and green logo...BLOOGLE. Will the new term for searching blogs explicitly be "Bloggling", or will it be "Bloogling"?
  • Blogs are sites which are Hubs, i.e. they contain a lot of outgoing links to diverse sites. The source sites are usually Authorities: a lot of other sites link to them. IBM developed an engine called Clever, at around the same time as Google, that gives separate ranks for Hubs and Authorities.
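
The hub/authority split described above can be sketched in a few lines. This is a minimal illustration of the general iterative idea (a good hub links to good authorities, and a good authority is linked to by good hubs). It is not IBM's actual Clever code, and the page names and link graph are assumptions made up for the example.

```python
from math import sqrt


def hub_authority_scores(links, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        # Hub score: sum of the authority scores of the pages linked to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalize both score sets so the values stay bounded.
        for scores in (auth, hub):
            norm = sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


# Hypothetical graph: two blogs and a news story pointing at source material.
links = {
    "blog_a": ["paper", "vendor_docs", "news_story"],
    "blog_b": ["paper", "vendor_docs"],
    "news_story": ["paper"],
}
hub, auth = hub_authority_scores(links)
top_hub = max(hub, key=hub.get)
top_auth = max(auth, key=auth.get)
print("best hub:", top_hub)
print("best authority:", top_auth)
```

With separate scores like these, a search engine could in principle keep link-heavy blogs as hubs while ranking the pages they point to as the primary sources.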
  • I must use google hundreds of times a day and it seems to be as good at finding what I'm looking for as it always has been. I like the idea of being able to search only blogs, but is there a need to remove them from the main index too?

    All these specialized search engines are nice (usenet, images, blogs), but I still want the ability to search everything at once. Being able to find everything under the sun by typing "g [text]" in my browser's location bar is the best part about google to me. Please don't co
  • ephemeral content (Score:4, Interesting)

    by esme (17526) on Monday May 12, 2003 @12:14PM (#5937047) Homepage
    i don't know that i have any particular need to have blogs filtered out of the google index (i don't see them very often in the searches i do...).

    but filtering out ephemeral content in general would be good -- blogs would be included in this. so would mailing list archives, news stories, online stores, auctions, discussion groups, etc.

    when i'm searching, i almost always prefer a page that somebody authored and put up as a permanent resource (or as permanent as the web allows). the top-level pages of the ephemeral sites would probably be good to keep in the main index, though i'm not sure how you index, e.g., the /. homepage.

    -esme

  • Offtopic... (Score:4, Insightful)

    by jasno (124830) on Monday May 12, 2003 @12:18PM (#5937071) Journal
    I've never had any problems with blogs, but the archived mailing lists are what really bug me. Searching for something, only to have the first 10 pages of hits be duplicates in various archives of a list, makes finding relevant information a bit more difficult.
  • by marmoset (3738) on Monday May 12, 2003 @12:27PM (#5937153) Homepage Journal

    Oy. If Slashdot had managed to perform even a minimum amount of editorial diligence (which, pot, here's kettle, is what the Register rails on bloggers for not doing), they'd have found pretty quickly that this article is yet another installment in Andrew Orlowski's (an up-and-coming Dvorak-wannabe) ongoing jihad against weblogs. Don't believe the hype.

  • what is a blog (Score:3, Insightful)

    by mboedick (543717) on Monday May 12, 2003 @12:36PM (#5937209)

    Determining what is and what is not a blog will be a lot harder than determining what is and is not in a newsgroup.

    I think this is a bad idea. Google has made a mistake if they think what we currently call "blogs" are a novelty item. Blogs are the future of the web, even if a lot of people are using the technology for toy purposes today.

    I want to be able to search the entire web in a single index, blogs and all. If PageRank is giving too much noise and not enough signal due to blogs, then fix PageRank.

  • Not Quite (Score:5, Informative)

    by emmastory (660486) on Monday May 12, 2003 @01:04PM (#5937408) Homepage
    Google hasn't announced any such thing, at least as far as removing weblog content from the main search is concerned. If you read the article, you'll note that it's Orlowski speculating about a Slashdot comment, of all things - specifically, a comment from the William Gibson [slashdot.org] blog thread. evhead posted [evhead.com] about this Register article on Friday.

Life would be so much easier if we could just look at the source code. -- Dave Olson
