A New Tack In Search Engine Formulation

An unnamed correspondent writes: "PC World reports that 'big-shot Web directories such as Yahoo and LookSmart' are missing thousands of the best links, which a new startup, HotLinks, has in its directory by building it from people's bookmarks." This sounds like a smart idea (building from people's own bookmarks), but is it doomed to create in-breeding of links? That is, in a search engine based on bookmarks, will they be able to get enough "new blood"?

  • It's only natural that you have very few, if any, links to static pages, since you've visited them, read the information you needed, and you're not likely to come back (nothing changes!).

    Strike that. Reverse that. Since they're static, you're not likely to hit them very often, thus making you less likely to remember the URL, thus making you more likely to bookmark it.

    The sites that I hit daily, or even weekly, I can easily remember the URL. In fact, I've gotten to the point where it's so automatic for me to type in the URL for some sites that when asked what's the address for a site I visit often, I actually have to mentally type out the URL to remember (it's more finger memory than brain memory, I guess). The only dynamic pages in my bookmarks are to my dnet stats and the local weather page. The rest are static.

    I'm pretty certain that this is where the value is going to be located.


  • I often use the free (as in beer) search engine made by Copernic. []

    Basically they offer web searches based on categories, with each category using some of 80 different search engines to aggregate a result, with categories such as:
    - programming
    - tech news (searches sites such as slashdot and the register)
    - games
    - file search
    - computer security (searches astalavista for ex)
    - humour
    - mp3
    - e-mail addresses
    - shopping
    - news
    - newsgroups
    - the Web(America),the Web(UK),the Web(Mongolian Goat Herders etc-well you get the picture)

    Best of all it can filter out all the dead links before giving you its search results

    It uses engines such as Google, the Open Directory Project, AltaVista, HotBot, etc. As new search engines come along, they become available the next time you update Copernic via the net.

    They sell a commercial version, but that differs only in the ability to customize more and to get rid of the annoying adverts whilst you're surfing.

    If you want the commercial version just download the free one and get a license number from astalavista [] , register your software via the menu and lo and behold you have now got the Pro version.

  • More like self-reverential.

    I agree whole-heartedly with the newspaper comment. I also like finding articles that I am not looking for. That is why I still get print periodicals.

    Luckily most search engines are excellent at finding pages you aren't interested in. Truly one of the joys of the web, and I don't mean that sarcastically!
  • I could be wrong, but I believe that's how google [] works.
  • I should add that Hotlinks does not index nor search content on those millions of links it has.

    If you search for "four score and seven years ago" (the beginning of American president Abraham Lincoln's Gettysburg Address, for our international visitors), you get bupkus. Nothing. Any other search engine will turn up the Gettysburg Address in the first ten links.

    The first problem with this is that it requires too much pre-knowledge. I have to know that it is called the "Gettysburg Address" or get lucky in that someone might have titled their page or link "four score and seven years ago." If I was an Iranian student studying American slavery and somebody passed along the first few lines of the speech as something I should be aware of, Hotlinks is of no help whatsoever.

    The second problem with this is that, in looking at my own links, I see that the titles of the links are often completely irrelevant, inaccurate or vague. What use is a search based on page titles? Even Yahoo returns normal search engine results when it doesn't find anything in its own directory.

    Finally, if I want someplace to store my links, I'll use my own web site.
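The complaint about title-only search can be illustrated with a small sketch. It is not how Hotlinks or any real engine is built; the `build_index` and `search` helpers, the one-page corpus, and the placeholder URL are all assumptions made up for the example. It only shows why a query on the opening words of the speech misses when titles alone are indexed.

```python
from collections import defaultdict


def build_index(docs, field):
    """Map each lowercase word in docs[url][field] to the set of URLs containing it."""
    index = defaultdict(set)
    for url, doc in docs.items():
        for word in doc[field].lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query (simple AND search)."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical one-page corpus; the URL is a placeholder.
DOCS = {
    "http://example.org/gettysburg": {
        "title": "The Gettysburg Address",
        "body": "Four score and seven years ago our fathers brought forth "
                "on this continent a new nation",
    },
}

title_index = build_index(DOCS, "title")
body_index = build_index(DOCS, "body")

# The title index knows nothing about the speech's text, so it finds nothing.
title_hits = search(title_index, "four score and seven years ago")
# The full-text index finds the page from its opening words.
body_hits = search(body_index, "four score and seven years ago")
```

With the title index, only someone who already knows the name "Gettysburg Address" gets a hit, which is the pre-knowledge problem described above.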
  • for saying "tack" and not "tact" like so many people do. Let's take a different tact?! "But it's short for 'tactic'!" they cry. No, it's not.

    I know, I know, off topic. Had to be said.

  • by Alien54 ( 180860 ) on Saturday November 04, 2000 @07:03AM (#649929) Journal
    It depends on where they get the book marks....

    after all, what kinds of links would you get from everyone who worked at Microsoft? or Sun? Would there naturally be a corporate bias in the culture?

    or a regional bias, or whatever....

    what you would probably need would also be some rating by the internet age of the person (how long have they been online) because the people who have been around awhile probably have a more useful collection.

    and I also wonder how different this is from advertiser tracking of where you go by cookies.

    the best combination might be to combine cookie tracking with an internet search engine database. but there are drawbacks here as well.


  • Being the egotist I am, I decided to try them out by looking for my own pages. My home page was there, sort of. What's interesting, and indicative to me of just how badly their approach works, is that the one link I found is a good five years out of date. It found a bookmark to the site when it resided on my former employer's server. (I acquired my domain almost four years ago.) Worse, it includes an intermediate directory that was made unnecessary almost a year before that.

    All of which suggests to me that their attempt to eliminate human effort will produce a lot of old garbage (I don't clean up my bookmarks; do you?) and the obvious set of well known corporate sites.

  • by Jeffrey Baker ( 6191 ) on Saturday November 04, 2000 @07:27AM (#649931)
    Gee, that sounds an awful lot like Backflip [], who, by the way, are Fucked [].
  • if you don't like it, open your browser with a hex editor and change "onload" to something else. i do that for "onunload" myself, just to screw jerks who wanna pop stuff up when i leave.
  • IMHO...

    In my opinion spiders are better. A spider can go out on the web and get the latest links. Then it can keep them up to date. Google does this and so far they seem to have a very good search engine. They can also (I think they do) use 'clicks' to rank the links, bringing the most clicked-upon links to the top of the search results and giving you what people click on most. NBCi does this too with their global brain technology (whatever that is).

    I don't want a lot, I just want it all!
    Flame away, I have a hose!

  • I have been experimenting with using an algorithm based upon Minesweeper to provide accurate website match distributions which loosely correspond to a fractal representation of the Fresnel zone as depicted by Led Zeppelin. This, equatorial approach to polygamy can be closely equated to the waxing hole in the ozone layer giving cancer to penguins in Antarctica.
  • by mpskeeter ( 78844 ) on Saturday November 04, 2000 @08:14AM (#649935) Homepage
    No, I am not mpskeeter--I clicked on that link below and now I am him...sorry about that.

    Do a search for "password"--some of these geniuses have their banking and etrade usernames/passwords up there. Email and xdrive passwords are abundant.

    Also, an awful lot of these guys look at illegal pr0n. These bookmarks are right next to the ones showing their personal home pages with pictures of the wife and kids. The FBI and a divorce lawyer or two are gonna have a field day with this.

    I tried to contact one of the guys with his bank account open, but, for security reasons, his email addy is not on his profile...

    real smart website they got there.

  • and it isn't even able to find my own Hp, google shows it as number 4, on the first page, with two keywords in search...:-)

    I don't like their cookies, with my IP inside, even @home with dynamic IP...glad that my ~/.crashicator/cookies is set 400. But wait, it has changed to 600....:-(

    Think I have to change it with chattr, to prevent Netscape from doing this...

  • I use different computers too, but I use for my bookmarks...
  • umm, so what you're saying is, "yes"? (it will be doomed to create in-breeding of links)? Oh well, I guess "yes" wouldn't have gotten you as many karma points...
  • by Anonymous Coward
    oh yeah, well i found the password to Anonymous Coward!
  • hehe, that's funny
  • umm... haven't you just described google []?
  • As opposed to what? the anti-self-referential belief systems that comprise the internet or other search engines?

    I think hot links would be an excellent complement to the number of search engines I use on a daily basis.

    Stop torturing yourself with these incessant thoughts on psychosociological quandaries and get laid! :P

    people are starving to death out there somewhere! obsessing over things like this seems to be a bit extreme... :)
  • I'd *really* like to see lateral discovery added to this (props to Did 1,000 people have links to theFatProject [] ? I want to know what pages the majority of those linkers have in common. Maybe they'd all have links to the Body Modification Ezine []. Or perhaps 83% of them linked to the NSA parody [] site.

    This would be highly cool for finding eclectic stuff. Kinda like, well, the lateral discovery in Napster.
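The "lateral discovery" idea above amounts to counting which other URLs co-occur with a given bookmark. A minimal sketch follows, assuming each user's bookmarks are available as a set; the `co_bookmarked` function and the sample data are hypothetical, and nothing here reflects how Hotlinks actually stores or queries its data.

```python
from collections import Counter


def co_bookmarked(bookmark_sets, target):
    """For the users who bookmarked `target`, return (other_url, share) pairs.

    share is the fraction of those users who also bookmarked other_url.
    Results are ordered by share (highest first), then alphabetically.
    """
    linkers = [marks for marks in bookmark_sets if target in marks]
    if not linkers:
        return []
    counts = Counter(url for marks in linkers for url in marks if url != target)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(url, count / len(linkers)) for url, count in ranked]


# Hypothetical bookmark collections, one set per user (short labels, not real URLs).
USERS = [
    {"fatproject", "bme", "nsa-parody"},
    {"fatproject", "bme"},
    {"fatproject", "nsa-parody", "slashdot"},
    {"slashdot", "bme"},
]

related = co_bookmarked(USERS, "fatproject")
```

Here two of the three users who saved the target also saved "bme" and "nsa-parody", so those come out ahead of "slashdot". A real system would likely also want to discount sites that nearly everyone bookmarks.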


  • When I visit the Hotlinks site, I get a blank page. Oh, another clueless site that assumes everyone wants to enable Javascript and be bombarded with popup windows and other junk.

    Danny, who still prefers Netscape 3.04

  • It's called Google.

  • by mpskeeter ( 78844 ) on Saturday November 04, 2000 @05:01AM (#649946) Homepage
    Hmmm.. I'm posting this as 'mpskeeter', though
    my username at slashdot is totally different. :)

    Guess how? etc was
    a link on that site, enabling people to easily gather username/passwords.

    (Of course, bookmarking such a link is a _bad_ idea, it even says so on the login page) :)
  • Interesting concept. I tend to bookmark sites that have stuff I may need again and again -- drivers, products I don't currently need (or can't afford), software I want to download. I use bookmarks as a kind of temporary storage for that kind of stuff. I guess I'm not a big bookmarker, but in general I don't bookmark sites like slashdot,planetunreal,etc. that offer real content. From the comments above, I guess this is atypical. However, a search engine based on bookmarks like mine would suck. :)
  • by Anonymous Coward
    If you have tried submitting a URL to Yahoo in the last year you probably know how bad the service has become. I'm used to waiting a month or two, only to have the link not added, then re-submitting it again, and again, and again. Eventually it's accepted, but submissions seem to be largely prone to the reviewer's mood at the time. I wonder whether they have a quota of links they have to process each day. That would explain a lot.
  • Heh, did a search for slashdot on ebay and found this: m&item=485096924
  • by Anonymous Coward
    Good god, do a search on "username" and it is frightening what you will see. A lot of these users are probably not aware of what they have done and should have never bookmarked these pages, but still...
  • hasn't yahoo licensed Google's engine? Are you saying google sucks?

  • I searched for the following: Age of Empires II Age of Empires the Conquerors AOE II AOE ... No luck, and it kept wanting me to basically pick progressively smaller sets of terms from those I entered. The only thing that got me anywhere was a one-word search. Maybe I used it wrong :P
  • LookSmart has a new subsidiary called Zeal [] that is a dmoz-like volunteer-built directory that includes the ability to import your bookmarks into a directory you edit. (Note: if you sign up, use the referral code xeditor to donate five dollars to the Electronic Frontier Foundation)

    Another dmoz-like project is [] but it's just getting off the ground so I'm not clear on how it works yet. But apparently everyone who registers can set up their own virtual portal, and these links are then shared by anyone using Moveo.

  • At the risk of going against slashdot popular sentiment, I have to say that, if done right, this could actually be useful and work beyond the flaw you mention. Let me explain: most of the links of most people will, as you say, be stuff that you could find through the major search engines. In fact, there's no way (well, I should say, I haven't heard of a way yet anyhow), aside from a hand edited directory, that you could feasibly create a search engine which wouldn't return at least some of the results that you would find using the "big search engines". But the point is, they are trying to find a creative way to find "isolated webs", which is to say, little pockets of the web that aren't currently in the link-chain of the web-at-large, or are in the link-chain only one or two places, and are missed by the big "crawler" type systems.

    The fact is that some people will have some links in their bookmarks that the search engine would otherwise never have been able to find, or at least enable it to find those sites sooner.

    • []: Yahoo! directory + Google search engine
    • []: dmoz directory with PageRank technology + Google search engine
    • []: dmoz directory + Inktomi search engine (the engine Yahoo! used to use)

    OP could have just been saying "dmoz [] is better than Yahoo!'s directory."

    Bouillabaisse: It's all the rage among trolls! Here's a recipe []
  • Actually, it's called Google does do rankings based on linking, we [] do ranking based on (among other things) frequency of bookmarks, etc.

    FWIW, we are entirely Linux based too.

  • What probably led the VC to open up its wallets to Hotlinks was the common notion that web users themselves, in a free market of information, will seek out and bookmark the most useful sites. Problems with this notion were pointed out clearly by MoNickels; the popular market of information is not well exposed to a broad spectrum of new information, only selected pieces.

    It is interesting to consider, however, what insights *can* come from an examination of people's bookmarks and how they use them. If one puts aside the Big Brother-esque quality of a firm that tracks one's surfing habits (which I emphatically DON'T) there are a few insights that might follow from a look at the *people* who surf the Net, rather than attempting to "know the shape of the matrix," i.e. to chart a map of the whole web by size or utility.

    For example, what do people prioritize when they make a bookmark? In various genres (news, fansites, porn, etc.) what features should a site have to make it likely to be bookmarked? In various genres, which sites stay active for long periods of time? Does content or form play a more important role? When choosing which updated content sites to bookmark, do surfers prefer broad content, narrow in-depth content or both?

    I would oppose research on which kinds of surfers bookmark what kinds of sites. The point is a better web for a community of web users, and not creativity-damping corporate categorization. Much of the consumer research related to the web has been self-reported, for instance in the lengthy multiple choice surveys that appear all over the Net when registering for something. Let's not trust this research to firms like Hotlinks, though. Why don't some folks put some money together and offer free Internet, or even free computers, to those who agree to be tracked?

    -perdida Vote Ralph Nader, crack the money code in politics:
  • ... has moved to [].

    But by now, you probably think Hampsterdance is annoying. Very annoying. If so, go play Hampsterdeath [], a game based on the Grand Unified Whack-A-Mole Engine for *nix, DOS, and Windows.

  • First, on porn, at our site (2wrongs []) which, as I mentioned above, provides full text search with rankings based on bookmarks, we get a lot of porn links. People that enjoy pornography tend to be very quick to bookmark a site when they find a good one that is 1) free and 2) actually has high quality and/or interesting content. As a result, our "mature" vertical search finds some of what is probably the best (free) pornography out there.

    Next, on the bias towards default bookmarks, anyone who bases anything on the frequency of the occurrence of URLs in bookmarks probably does as we do and filters out the default bookmarks, so it's a non-issue.
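Filtering out default bookmarks before counting can be sketched in a few lines. This is only an illustration of the idea described above, not the poster's actual code; `bookmark_frequency`, the `DEFAULT_BOOKMARKS` set, and the placeholder default URLs are all assumptions.

```python
from collections import Counter

# Hypothetical placeholders standing in for the bookmarks a browser ships with.
DEFAULT_BOOKMARKS = frozenset({
    "http://browser-vendor.example/home",
    "http://browser-vendor.example/search",
})


def bookmark_frequency(bookmark_files, defaults=DEFAULT_BOOKMARKS):
    """Count how many users bookmarked each URL, skipping browser defaults.

    bookmark_files: iterable of URL collections, one per user.
    """
    counts = Counter()
    for urls in bookmark_files:
        # set() so a URL saved twice by one user still counts once.
        counts.update(set(urls) - defaults)
    return counts


USERS = [
    ["http://browser-vendor.example/home", "http://slashdot.org/"],
    ["http://browser-vendor.example/home", "http://slashdot.org/", "http://slashdot.org/"],
    ["http://browser-vendor.example/search", "http://dmoz.org/"],
]

frequency = bookmark_frequency(USERS)
```

The default entries never reach the counter, so a URL that ships with every browser cannot dominate the ranking just by being preinstalled.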

  • I put all my links on my homepage, and bookmark that. Or I can just remember its address.
  • rites/

    enough said.

  • Whether this is a novel technique or not, Hotlinks doesn't work very well! I submitted an admittedly narrow query ("beige g3 rom") to it and to Google. I got 7,000+ hits from Google but no hits from Hotlinks' search engine. I guess they haven't loaded many Mac users' bookmarks...
  • by KirTakat ( 110620 ) on Saturday November 04, 2000 @02:43AM (#649963)
    Perhaps a better idea would be a combination of the two, one that searched like a traditional engine, but used a listing of everyone's links to rank the search, i.e. if there were twenty sites on Linux drivers, but one of them was bookmarked by way more people, that would be the first listed one.
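The combination suggested here, a conventional text search whose results are reordered by how many people bookmarked each one, could look roughly like this. The `rank_by_bookmarks` function, the result names, and the counts are hypothetical and only illustrate the ordering step.

```python
def rank_by_bookmarks(matches, bookmark_counts):
    """Reorder text-search matches so the most-bookmarked URLs come first.

    matches: URLs in the order a conventional engine returned them.
    bookmark_counts: dict of URL -> number of users who bookmarked it.
    Python's sort is stable, so ties keep the engine's original order.
    """
    return sorted(matches, key=lambda url: -bookmark_counts.get(url, 0))


# Hypothetical results for a query about Linux drivers.
matches = ["drivers-a.example", "drivers-b.example", "drivers-c.example"]
bookmark_counts = {"drivers-b.example": 120, "drivers-a.example": 3}

ranked = rank_by_bookmarks(matches, bookmark_counts)
```

Pages nobody has bookmarked still appear, just lower down, so the crawler's coverage is kept while the bookmarks only influence the order.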
  • A system based on people's bookmarks is a great idea, and it cuts down on a lot of porn sites because no one really bookmarks their porn with everything else.
  • The bookmarks they are reading are those from the websites where you can save your bookmarks online - they are already public property.

    A good thing I think, when your M$ OS blows its top it is the one thing that people forget to backup.
  • by Crewd ( 199804 ) on Saturday November 04, 2000 @02:46AM (#649966)
    I can see it now, people spamming with their bookmarks of and Bouillabaisse.

  • by kfg ( 145172 ) on Saturday November 04, 2000 @02:51AM (#649967)
    Of course, I'll also have to come to grips with the idea that they're rooting around in my bookmarks, won't I?

    I mean, crawling the web for publicly accessible sites is one thing, crawling my data to see where I like to go is another.

    Carnivore: The search engine.

    It would certainly come up with the most popular sites though. I'm not even sure that is a good thing. Anybody out there got any bookmarks that they wouldn't want their mom to see? What if it turns out we ALL have that one and it comes up on mom's search? Of course mom might be there already herself and doesn't want US to know.

    On the whole it seems like a "Not a best of the web, but at least very popular with the masses" type of deal and otherwise of limited use as a real research tool. Part of the new "Power shopping on the web" paradigm.
  • by waimate ( 147056 ) on Saturday November 04, 2000 @02:52AM (#649968) Homepage
    This would, of course, create a self-referential belief system for geeks, wherein few new notions would enter the collective consciousness, and the group view of the world would be skewed by, er, the group view of the world.

    It's like being able to choose what things you want to appear in your own daily newspaper - it's inherently flawed because the most interesting things one encounters are often those one didn't expect to be interesting.

    Similarly the very best things to find with a search engine are those things which are not common knowledge. The job of a decent search engine is to flush out gems, not popular opinion.

  • well, at least they will have the most up to date list of preconfigured Microsoft and Netscape search engines, media lists, communities etc.
    I *like to type w*w*w*.*s*l*a*s*h*d*o*t*.*o*r*g once in a while, and since i constantly use different computers i never got to use bookmarks really. some are in my head and as long as they can not read my memories, they're pretty lost...
  • by Tal Cohen ( 4834 ) <> on Saturday November 04, 2000 @02:53AM (#649970) Homepage
    A vast amount of information is still kept in static pages. Now, please check your bookmarks: how many point to static pages, and how many to dynamic, constantly-updated pages? It's only natural that you have very few, if any, links to static pages, since you've visited them, read the information you needed, and you're not likely to come back (nothing changes!).
  • Well, I'm sure it's gotta be good - after all, it uses Unique Directory Technology in their Scalable Internet Search Infrastructure!
  • They will send all their 500000 customers a modified version of "i love you" which instead of deleting jpgs will send the bookmark files to them :)

  • by DeadSea ( 69598 ) on Saturday November 04, 2000 @03:05AM (#649973) Homepage Journal
    Whenever I find a new search engine, part of what I rate it on is how well it can find my homepage, and how easy it is to get my homepage listed.

    As far as regular search engines go, it was much faster to get google to crawl my site and list it than anything run by, inktomi, altavista, or northernlight. I am very happy with google.

    As far as directories go, Yahoo lists two of the 7 sites that I maintain. I have managed to get dmoz listings for 6 of the 7, two of which, i didn't submit myself.

    This new directory only appears to have one of my sites, and at a URL that has been inactive for almost two years at this point. I'll have to see how easy it is to get stuff listed, but so far I am not impressed.

  • Most people don't store their porn bookmarks in their regular bookmarks? He? Am I that abnormal then? In my Netscape bookmarks I created two folders one called "Hmmmm, Guess?" and another which is called "Red Checkout". The first folder contains my regular porn sites, the second contains the ones I need to check out.

    As for staying on topic: I don't know the bookmarking habits of most people, but actually 50% of my bookmarks are sites which I saw but did not have the time to explore. The days I have time, I explore the sites, decide if they are interesting enough to keep and put them in a bookmark folder that represents a category (like "Humour", "Computers", "TV", "Programming" etc.). I think that a search engine based on bookmarks has potential, but I fear that most people do not organize their bookmarks well enough and that many, many links in it will be out of date.

    As a final note: remember that such an engine will be very biased to the "default" bookmarks that are provided by the browser manufacturers. I know most slashdotters clear those on first sight, but a normal user often doesn't bother and just adds his own to the existing ones. I saw some bookmarks/favourites that took up the whole screen when opened. They mostly don't even know how to remove broken bookmarks, sad but true.
    The weakness of such search engines is that they rely on human input, and we are far from perfect, aren't we?

  • I don't know - perhaps they're for sale on eBay? :^)
  • Isn't this basically how Google's scoring works? They score pages according to how "well-linked" they are, particularly to other well-linked pages. OK, this is using bookmarks, but the premise is the same, isn't it? As soon as Blogger [] has completed its plans for world domination, most people's favorite links will be online anyway (and they'll all be 'that cool new dancing hamster page').
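Scoring pages by how well-linked they are, with extra weight for links from pages that are themselves well-linked, is the idea behind PageRank-style ranking. The sketch below is a textbook power iteration under simple assumptions (a damping factor of 0.85, a fixed iteration count, a tiny hypothetical link graph); it is not a description of Google's production system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    if not pages:
        return {}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: three pages link to "c", and "c" links to "a".
GRAPH = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(GRAPH)
```

In this graph "c" scores highest because three pages link to it, and "a" outranks "b" and "d" because its single inbound link comes from the well-linked "c". A bookmark-based variant would treat each user's bookmark file as one more page of outbound links.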
  • Considering how well google works, is this that bad? I get the feeling this thing might be able to grab a few things that google might miss...
  • Hello, my name is Al Gore.
  • by Anonymous Coward
    yeah, I got the same thing, had to turn webwasher off, I don't like sites like that !

    webwasher is a nice little program, check it [] out. It does a good job of filtering ./ banners.
  • Hey,

    I don't know about you people, but I don't often use bookmarks., and are the sites I think are good. They aren't bookmarked; I can type the URL faster than I can fiddle with those stupid bookmarks. The sites I bookmark are those that look a little interesting, but either aren't good enough for memorisation or have a long, confusing URL like tml. Wouldn't a better judge of site quality be the number of repeat visitors?


    ...another comment from Michael Tandy.

  • I am sure I can't be the only person out there who rarely or never uses bookmarks! Hm, I've had this browser config for a year now, and I've accumulated a grand total of three bookmarks. Now since I found all of these through other means, what good is a new search engine? Now I know there are people who put practically every site they've ever visited in their bookmarks (which makes it surprising this site doesn't seem to have any porn) but I only bookmark things I want to find instantly six months from now, or things I might forget to look at otherwise.
  • If you want some of the old Loudo's music, just look for me on Napster (steverebar).

    I've also found that the only Roger McGuinn song is My Back Pages from Bob-Fest and the only Jimmie Dale Gilmore song is the duet with Lucinda Williams. And for some reason, nobody likes Emmylou unless she's singing with Dave Matthews or Dolly Parton.

    At least I can find The Real Slim Shady whenever I need it.
  • If so many people have the sites in their bookmarks that they'll be listed on the site's search engine -- who's going to be left to search for the sites, since they're already in everyone's bookmarks?!
  • This is an interesting idea, and if we used bookmarks like we are supposed to, then I imagine that it might. However, it won't. Here's why:

    1. It requires active participation of internet users. The beauty of other search engines is that you can set a bot to go out crawling, and when it's done, you have a bunch of links. This idea requires that I visit this site, and give them permission to access my Hard Disk. How many users are going to do this?

    2. What percentage of websites out there are even in someone's favorites? I can't imagine that every site is in someone else's favorites.

    3. I use favorites to keep track of sites that I won't remember the URL for. Say I read something and want to come back to it later, but I don't know where it is. I use it like temporary storage. It's faster to type in a URL, especially now with the various autocomplete functionality out there. Thus, if Hotlinks raided my bookmarks, they would find a link to Slashdot postings by John Carmack, a few articles on Image Processing and Edge Detection, and the full text of The Little Prince. It's very specific information, and not the kind of information that Hotlinks is looking for.

    I think this idea is in trouble. Who doesn't use Google anyway?

  • Stay away from [].*, as it has been taken over by spammers. Next time, try [].* filtered by "show messages that contain jpeg attachments." Any good newsreader should be able to do this.
  • "as they can not read my memories" I bet someone out there is working on that though....
  • One time I found a link on backflip that included username/password to an etrade account and yes it worked - I logged in and looked at the stocks. Sad thing is the username/password were in the TITLE of the link! There were similar cases but this was the worst.

    That isn't a good thing. I called the owner and left message -- not sure what came of it.

    -- .sig --
  • HotLinks' innovative patent-pending technology
    Sorry, I'll stick to dmoz + google for search.

    PKB. Google's technology is just as patent-pending [] as HotLinks'.

  • ....and nearly 60% of the links on page one were dead links, 404's and long passed-by eBay auctions....
  • by DrWiggy ( 143807 ) on Saturday November 04, 2000 @10:39AM (#649990)
    A person's bookmarks are an insight into that person. From them, not only could you work out their main interests and hobbies, but their sense of humour, perhaps their political persuasion, and almost certainly their sexual tastes. The information being submitted to hotlinks is invaluable in terms of demographic analysis.

    Of course, the same information could be gathered with the use of persistent cookies and normal search engines - what people search for are just as useful, but when people click away and "surf" what they decide to keep close to hand probably gives a better insight of the person.

    Just a thought.
  • Surely not that revolutionary? Many people have their bookmarks file on their webspace (I do []) and engines such as google [] rank pages by how many links they have. They even take into account the "quality" of the page that is doing the linking.

    The PC World article specifically criticises google for being "unable to organize" the links very well, yet seems to use the same technique itself. What gives?

    A paper about the inner workings of google is available in HTML [].
    I went to look at the HotLinks link in the original article and it was all blank :-( oh I see it uses some form of scripting to redirect you ... whatever happened to 30x return codes?

  • by po_boy ( 69692 ) on Saturday November 04, 2000 @10:48AM (#649992) Homepage
    To me, the power in this is not that you can search through other peoples' bookmarks, it's that you can store your bookmarks here.

    I use about 3 or 4 different computers in a week, and I actually do use bookmarks in my browsers. This means that I end up bookmarking stuff on one machine, and then not having it at another when I use it.

    I'm not sure how many other people have a similar problem, but this service appears to solve it. The average slashdot reader and myself have webservers and the ability to hack together a few perl scripts, or the knowledge to find and mail our .netscape/bookmarks.html files around to keep our bookmarks synchronized and always available if we want, but how can most people do this?

    I would imagine that this service would be useful to the average multi-computer user. Is this the best solution you have seen for this problem? What other methods do you employ to move bookmarks from one machine to another and keep your bookmark files in sync with each other? Is this the best type of solution we can provide to users for this kind of problem? Do you think that it's widespread enough that a good solution would be used by many people?

  • Then you are starting to venture down that terrible terrible path of the "clueless user needing to be forcibly prevented from doing anything that could be potentially harmful."

  • I'm sorry, but this annoys me everytime I see it, almost as much as the Playstation 2 being confused with the failed IBM PCs. (PS2 != PS/2)

    It's HAMSTER - WITHOUT THE P. HAMPSTER *is* *not* *a* *word*. Look it up! []
  • Going by user-agent headers on hits to my webspace []:

    1. Googlebot: 1240
    2. Lycos spider: 21
    3. Slurp: 16 [inktomi]
    4. Scooter: 6 [altavista]

    Googlebot indexes my site more often and in greater depth by far than the other search engines.

    Now this is just one site - anybody else got robot stats?
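For anyone wanting to produce comparable robot stats, tallying hits by User-Agent is a short script. This sketch assumes the User-Agent strings have already been pulled out of the server log; the `ROBOTS` substrings follow the robot names listed above, while `count_robot_hits` and the sample agent strings are made up for illustration.

```python
from collections import Counter

# Substring to look for in the User-Agent -> search engine it is credited to.
# Matching on these substrings is an assumption, not a guaranteed format.
ROBOTS = {
    "Googlebot": "Google",
    "Lycos": "Lycos",
    "Slurp": "Inktomi",
    "Scooter": "AltaVista",
}


def count_robot_hits(user_agents):
    """Count hits per search engine from an iterable of User-Agent strings."""
    counts = Counter()
    for agent in user_agents:
        lowered = agent.lower()
        for marker, engine in ROBOTS.items():
            if marker.lower() in lowered:
                counts[engine] += 1
                break  # credit each hit to at most one engine
    return counts


# Illustrative agent strings, not copied from a real log.
SAMPLE_AGENTS = [
    "Googlebot/2.1",
    "Googlebot/2.1",
    "Scooter/3.2",
    "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)",
]

hits = count_robot_hits(SAMPLE_AGENTS)
```

Ordinary browser hits fall through without being counted, so the totals cover crawler traffic only.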

  • Checked it out...ow, change the color!

  • I have no idea what half of my bookmarks are. In the past, whenever I found a cool site, I'd bookmark it. But now I see "Welcome to [bizarre company name]" in my bookmarks list, and start to wonder what it is. I have actually started to maintain an HTML document - when I find something bookmark-worthy, I just bring up an xterm and vi, and add a line of code describing what it is.

    SUWAIN: Slashdot User Without An Interesting Name

  • For a second I thought this was going to implement an idea [] I had a long time ago, namely to use one's bookmarks to rank search engine results. That is, give higher rankings to pages that are similar to pages I've bookmarked.

    However, in this case, it's not just my bookmarks, it's everyone's, and the bookmarks are the source of the database, not just its ranking technique. As many have already pointed out, this seems really closed and inbred.

    Joe Ganley []
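
    The bookmark-ranking idea in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: "similar" is taken to mean word-set overlap (Jaccard similarity), which the comment does not specify, and the function names, URLs, and sample text are all hypothetical.

```python
def words(text):
    """Lowercased set of whitespace-separated words in a page's text."""
    return set(text.lower().split())


def jaccard(a, b):
    """Overlap between two word sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rerank(results, bookmark_texts):
    """Reorder (url, text) results by similarity to the closest bookmark.

    sorted() is stable, so results with equal scores keep the order the
    search engine originally returned them in.
    """
    bookmark_words = [words(text) for text in bookmark_texts]

    def score(result):
        page_words = words(result[1])
        return max((jaccard(page_words, b) for b in bookmark_words), default=0.0)

    return sorted(results, key=score, reverse=True)


# Hypothetical data for illustration.
results = [
    ("http://example.com/travel", "cheap flights hotel deals"),
    ("http://example.org/steiner", "steiner tree routing algorithms"),
]
bookmarks = ["graph algorithms steiner tree"]

for url, _ in rerank(results, bookmarks):
    print(url)
```

    Note that this only reorders what an engine already returned; it does not build the database from bookmarks, which is the distinction the comment draws.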

  • Hampster [] is a surname; hamster [] is a species.
    Pearson is a surname; person is a species.
  • by Malevolent ( 231436 ) on Saturday November 04, 2000 @03:29AM (#650000)
    ..most people find websites using existing search engines - therefore the vast majority of bookmarks will be from sites already ordered high up in conventional ranking systems.
  • Ack. What about the predefined internet links, such as Hotmail and Microsoft's links inside of Internet Explorer? If they are on every computer's preset ie links then they would theoretically be on top of many searches in this site. This is not a good thing (TM).
  • Ok, but then that just reduces its value even more through a great deal of sample bias.
  • Of course, every dot-com has to have one or two patents just to keep the VCs happy. But how useful will Hotlinks' patents be?

    Suppose the EU decides not to embrace software patents. Then I could open up a search engine over here, which uses the exact same technology—and there would be nothing that Hotlinks could do.

    Even if the EU does adopt software patents, there will be places that don't. China anyone...?

    How useful are patents going to be when it is no longer necessary to have a physical presence in any particular country?

  • They don't have any links to what must be the most widely sought item on the Internet. At least, it's the most widely spammed item...
  • If they want to get my business, then they damned well should NOT use nonstandard proprietary HTML qualifiers like onload in their main page's body construct!

  • by MoNickels ( 1700 ) on Saturday November 04, 2000 @03:50AM (#650006) Homepage
    The problem with the bookmark approach is that it will tend to result in the Jukebox Phenomenon.

    The short version of this is that current Top 40 radio station rotation systems are reputed to stem from the analysis of a jukebox supplier who noticed the same 40 records kept getting played over and over. This is because when a record gets played once, it tends to get played again, resulting in circular reinforcement, with hits one through 100 charted in a steeply declining curve. This is how current radio programming, music marketing and MTV work today: reinforcement.

    The problem with this approach (in music or data) is that popularity is no guarantee of accuracy, appropriateness or utility. This is represented in the music world by the high cost (real and otherwise) of successful entry into the market. New music (data) is not popular enough to be included, but it can't easily be included without becoming popular.

    Personal bookmark collections tend toward the same phenomenon. Besides the inaccuracy stemming from factory-included links (which I would hope they account for), the bulk of entries will result from links in turn resulting from searches on existing search engines, which are, no matter how big, closed data sets: they have boundaries and do not include the entire web. These searches are also happening in only a few places, resulting in the JP. Hotlinks will thus tend to include sites that have already appeared elsewhere. A certain number of "missing" pages will be newly included (the user's own sites, work sites, sites of friends) but very few "missing" pages of other kinds, particularly low-traffic pages (such as those with refined and highly specialized content: deep governmental directories, university research labs). In other words, Hotlinks' approach is not much different from Google's number-of-times-linked approach or bulk submitting on an engine's "add your site" link, just a larger population sample.

    Napster experiences the Jukebox Phenomenon: If I look for Loudon Wainwright III songs, I tend to find lots of iterations of the same three songs and not much else: Dead Skunk, I Wish I Was A Lesbian and the duo with Iris Dement. But if I want to find, say, any song off of the Therapy album, it tends not to come up because it is not as popular. This is because the JP has propagated the popularity of the same three songs. An ideal data source would include the entire data set, popular or not. (I am aware Napster cannot be, and is not designed to be, a complete data set.)

    If one's goal is to include more web sites, a more accurate approach than Hotlinks' would be to scavenge users' History files. That would, in my case, include a few hundred additional sites a week, although I'm sure the privacy issues would be a problem. If one's goal is to return the most accurate results, an even better approach would be infinite page caching in which a new iteration of a page does not replace the previous entry, but is added to it. In this way, one could search across history as well as data.
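
    The "infinite page caching" idea can be sketched as an append-only store: each new crawl of a page is added next to the older copies instead of overwriting them, and a search can be restricted to a time window. This is a toy illustration only; the class and method names are hypothetical, timestamps are assumed to be ISO date strings (which sort correctly as plain text), and matching is a simple substring test.

```python
from collections import defaultdict


class PageHistory:
    """Append-only cache: every crawled version of a page is kept."""

    def __init__(self):
        # url -> list of (timestamp, text), in the order they were stored
        self._versions = defaultdict(list)

    def store(self, url, timestamp, text):
        """Add a new version without replacing earlier ones."""
        self._versions[url].append((timestamp, text))

    def search(self, term, since=None, until=None):
        """Return (url, timestamp) for every stored version containing term."""
        term = term.lower()
        hits = []
        for url, versions in self._versions.items():
            for timestamp, text in versions:
                if since is not None and timestamp < since:
                    continue
                if until is not None and timestamp > until:
                    continue
                if term in text.lower():
                    hits.append((url, timestamp))
        # Oldest first, with the URL as a tie-breaker.
        return sorted(hits, key=lambda hit: (hit[1], hit[0]))


# Hypothetical usage.
cache = PageHistory()
cache.store("http://example.org/news", "2000-10-01", "Kernel 2.4 test release")
cache.store("http://example.org/news", "2000-11-01", "New search engine startup")

print(cache.search("kernel"))
print(cache.search("search", since="2000-10-15"))
```

    A replace-on-update cache would have lost the October text entirely; here it stays searchable, at the cost of storage that grows with every crawl.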
  • I don't think that it would be a problem if you ensured that you took bookmarks from people who didn't exclusively use that search engine.

    This search engine formulation requires that other search engines exist and are used by others. If it ever gets popular it will find that increasing its popularity will increase its resistance to getting more popular, an interesting exercise in game theory...
  • this one too
  • One good thing would be to crawl like the AltaVista spider, but to display the search results in order of the number of links to each site. For example, if I search for "Linux", then the most-linked Linux site would probably be displayed as number 1, and so on.
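
    A minimal sketch of that ordering, assuming the crawler has already produced a list of (source, target) link pairs and a set of pages that match the query. The function name and the sample URLs are made up for illustration.

```python
from collections import Counter


def rank_by_inbound_links(matching_pages, links):
    """Sort matching pages by how many distinct pages link to them."""
    inbound = Counter()
    seen = set()
    for source, target in links:
        # Ignore self-links and repeated links from the same source.
        if source != target and (source, target) not in seen:
            seen.add((source, target))
            inbound[target] += 1
    # Counter returns 0 for pages nobody links to.
    return sorted(matching_pages, key=lambda page: inbound[page], reverse=True)


# Hypothetical crawl data.
links = [
    ("http://a.example/", "http://linux-one.example/"),
    ("http://b.example/", "http://linux-one.example/"),
    ("http://b.example/", "http://linux-one.example/"),  # duplicate, counted once
    ("http://a.example/", "http://linux-two.example/"),
    ("http://linux-two.example/", "http://linux-two.example/"),  # self-link, ignored
]
matches = ["http://linux-three.example/", "http://linux-two.example/", "http://linux-one.example/"]

for page in rank_by_inbound_links(matches, links):
    print(page)
```

    Counting raw links like this treats every linking page as equally important, so it is easy to game with link farms; it is only meant to show the basic ordering.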
  • And this one as well.
  • Where's your Slashdot bookmark?

    I always type it in. Much faster than reaching for the mouse.

  • I tried it on 3 of my most recent queries and got nothing useful. They linked to AltaVista when they couldn't find anything.
  • October stats for a site I manage:

    1. Googlebot/2.1: 1436
    2. Slurp/si: 1199
    3. Mercator-1.0: 823
    4. Scooter/2.0 G.R.A.B. V1.1.0: 671
    5. Gulliver/1.3: 570
    6. Ask Jeeves: 548
    7. CrawlerBoy: 517
  • I tried to post this as a subject with no luck, so I guess I will dump it here.

    Web searching isn't what it should or could be because the "indexing industry" can't seem to provide the two required elements, complete coverage and a logical way to extract relevant results, in the same index. The big search engines (Fast, Google, Inktomi...) do index a lot of content, but it can be nearly impossible to find a "class" of web site if what you are looking for can't be tied to an exact phrase or exclusive keyword. For example, attempting to find all sites that provide a "message board about internet navigation" using a "search engine" is made impractical by the fact that there is no way to filter out the pages with irrelevant references and the fact that there is no common language (keywords or phrase) used on all message board sites to filter in. When Al Gore invented the Internet he apparently didn't realize that different people can describe the same thing using very different words, and that computers aren't likely to like that.

    Enter the categorized "web directory" (Yahoo, Dmoz, Go, Looksmart, Hotrate,...) to save us from this lack of foresight. Unfortunately, these "humans do it better" alternatives are "edited" by "humans do it slower with arrogance" communities of a selected personality who are more interested in eradicating "evil"--as they sometimes perversely define it--than doing their job of labeling and categorizing. Worse, this ever-growing population of "samey-vertical-directories" all have the insane perception that web page managers have nothing better to do than fill out redundant "add url" forms all day. The result is that the bad under-coverage of the big search engines is magnified some 100 times by the best of these "tip-of-the-iceberg representations" of what's on the internet.

    The solution is to make "editor" censorship opaque by opening and centralizing the site submission process and to reduce the noise and chaos of "dumb" search software by standardizing the language used to describe web site content. In my vision of a better World Wide Web, a site would only have to introduce itself once to the "indexing industry" to be fairly placed in the accessible site universe. The foolish waste of each index or directory doing the same basic "spam check" would be eliminated and done once with greater reliability and objectivity. This public domain list of addable URLs would be simply indexed based on a 5 or 6 keyword description where all but one of the words would have to come from a small pool of precisely defined, clear keywords. No longer would relevant sites be made inaccessible because someone thought a laptop computer was a notebook computer or an insect was a bug. The absence of standardized language and enforceable rules against lying wrecked the good idea of meta tags, but we still can have the logical navigation they promised.

    I believe the creation of a "simple, complete universal index" is a no-lose proposition. Unfortunately, I by myself can't make it happen, and as providing this service is not likely to be a for-profit enterprise, few "show me the money" web professionals are offering help. I am confident that the web-using public would require little convincing to embrace the idea of cleansing the roots of web navigation of destructive corporate self-interest. The unanswered question is, will they ever have a vote?

    I have written more stuff [] on this subject, and I do have some improved ideas regarding how the index could be structured and regarding spam prevention, but I have decided I am not going to just give them away until I see some hope that change is possible.
  • Good point. These are filtered out before generation.
  • I am the founder and CTO of HotLinks. I want to thank everyone for their insights, both positive and negative. I will also offer some comments in response to issues raised in these posts:

    First and foremost, I would like to make an important distinction that is often lost on the media, but should be no problem for the Slashdot community: HotLinks is not a search engine. We have no crawlers. We are building a web directory, like Yahoo, Snap, ODP, or LookSmart, and unlike AltaVista, Inktomi, or Google. Our goal is not to compete with existing search engines to index the entire web, but rather to create a topical web directory like Yahoo that is more scalable and comprehensive. If you could not find your home page on HotLinks, this simply means that none of our 500,000 users has bookmarked your page yet, not that our "search engine" is broken.

    Research done by AltaVista and Google researchers has shown that 60% of search queries are "broad", i.e. only one or two keywords. These common queries are well-suited for human-edited web directories and less suited for crawler-based search engines, which excel at more precise searches. This is why most navigational portals include both a web directory and a crawler-based search engine, and why Yahoo and LookSmart employ hundreds of editors to create web directories manually.

    As far as the default bookmarks pre-populated by Netscape or Microsoft, of course we automatically filter those out, as well as anything similar.

    Several postings pointed out that HotLinks could become too "self-referential". This could be a problem if HotLinks users only bookmarked sites that they found while searching HotLinks. However, this will not be the case. People will bookmark sites that they hear about from friends, that they find on other search engines (including 3rd party search engines that we integrate with our site just like Yahoo or LookSmart does), or that they find by clicking on links from a site they did find at HotLinks. There is no reason to expect that HotLinks users will contribute bookmarks of sites found only through searching HotLinks. Our members' bookmarks can point to anything on the web, even sites in the "invisible web" that would not be found with regular search engines. There are many people who use HotLinks for bookmarking but not searching, and many who use HotLinks for searching but not bookmarking. These two groups have overlap but are ultimately independent of each other.

    As far as Slashdot readers who don't use (or organize) bookmarks because they only access Slashdot and two other sites or just type in the URLs by hand, or do use bookmarks but don't need HotLinks because they post their bookmarks on their personal web site or web log, with all due respect, neither of these types of behavior is typical of web users in general vs. Slashdot readers. So even if Slashdot readers don't use bookmarks or don't need HotLinks, all statistics from NFO, Jupiter, Netscape, SRI, etc. show that anywhere from 60% - 99% (depending on the study) of all Internet users use bookmarks. Jupiter Communications reports that 75% of web users online for more than two years navigate the Internet primarily via bookmarks. Studies also show web users having an average of 50+ bookmarks, and HotLinks users have closer to 100 bookmarks on average. These regular web users are also not likely to run their own web sites or web logs.

  • This would, of course, create a self-referential belief system for geeks, wherein few new notions would enter the collective consciousness, and the group view of the world would be skewed by, er, the group view of the world.

    Hmmm.... sounds like what Slashdot has become.

  • by Chuck Chunder ( 21021 ) on Saturday November 04, 2000 @04:00AM (#650018) Homepage Journal
    You mean the "nonstandard proprietary HTML qualifier" onload as documented in the html specs []?
  • "back in the day" it wouldn't have been a problem even if the users did only use their search engine. It seemed as though every site had a Links section linking to entirely unrelated sites that the html monkey found interesting...

    Now, most web sites try to hold on for dear life. Links captured in frames... duplicating external content on their own site...

    I guess people would call it evolution.

    I think it's sad.

  • Sounds alright, except you never get anything unless other people have bookmarked it. So does that mean you would never find new sites? What if I were to come up with some incredible idea, and put it on the web tomorrow? How would anybody find it, except by me directly telling them? I don't publicise my sites, but Google and AltaVista and the like have them. This would never have them! Sounds like the engine would be doomed to fail from its conception, unless they start doing something else in addition.
  • They dice [] up your brain and sift through the contents to see what you were after.

"If it's not loud, it doesn't work!" -- Blank Reg, from "Max Headroom"