Modeling Linking on the Web

An Anonymous Coward writes "Amazon has a much greater market share among online bookstores compared to the greatest market share for offline stores. How is this possible? Because the web changes how people find information. There are millions of links to Amazon on the web, which makes it more likely for people to find Amazon when surfing the web, or when using search engines which typically use link popularity in ranking. This makes it harder for new businesses to compete. Researchers have discovered that across the entire web, links are distributed according to a "power law" which leads to "rich get richer" or "winner's take all" behaviour where a small number of sites get the vast majority of links and traffic. A new study just released by NEC shows that this behaviour varies in different communities, and shows how to predict competition in different areas. For example, you can see how much tougher competition is among booksellers compared to photographers."

  • by michaelmalak ( 91262 ) <michael@michaelmalak.com> on Thursday April 18, 2002 @07:54AM (#3364397) Homepage
    Wouldn't this prove that the quest for eyeballs was no more crazy than the quest for a starlet to become a Hollywood star, the quest for a high school quarterback to make it to the NFL, or the quest to win the lottery?
    • The quest to win the lottery is a poor business model, and not the way to support multiple people. Most actresses start out (and end up!) as waitresses; most football players get a real job after they fail to make it to the NFL. And most dot-commers got a wake-up call when the money ran out.
  • leads to "rich get richer" or "winner's take all" behaviour where a small number of sites get the vast majority of links and traffic

    No kidding. Look at the hit counter on my homepage, and compare it to Slashdot's. I've probably gotten as many hits in the last year as Slashdot has gotten since I started typing this reply.

    I'm off to work now, so I don't have time to check, but does the article address the massive amount of advertising that Amazon did to get where it is today? Inertia has to come from somewhere.

    • Put up a few obvious stats on the nature of the universe on your homepage, submit to slashdot and I'm sure your counter will increase nicely (or not so nicely, it'll be up anyway).
    • Links aren't everything. Consider word-of-mouth (viral marketing) if you will. I've made 2 purchases from Amazon, both despatched and received very quickly, so I'm likely to return to them. It's one thing to attract customers and quite another to keep them. Get enough disgruntled customers and word-of-mouth can work against you. That's the beauty of the internet.
  • by Shivetya ( 243324 ) on Thursday April 18, 2002 @08:00AM (#3364410) Homepage Journal
    Amazon does a lot to get their name out. So it's very reasonable that most people would tend to look towards them for books.

    Ask people off the street where they would buy books on the internet... and you're bound to get many replies of Amazon or "I don't know".

    I suppose by their logic that the only place to be on the web is AOL... but on the street you could get that response as well.

    Advertising does pay. Links on the net may lean towards one provider over another, but most of them were bought by one method or another.

    One reason the rich get richer is because they are optimists; they are willing to do it now. The best time to start a business is always TODAY... the best time to get your name out there is ALWAYS today.

    • absolutely. Amazon understood a basic principle of commerce and marketing... BRANDING.

      Most people associate even the concept of purchasing something online with Amazon... they were pioneers, and managed (using linking, etc) to embed themselves in the mind of the consumer. I must admit; even though I may like to think of myself as one who resists marketing, when I need to buy a book online, where's the first place I think of going? Amazon.

      Insidious, ain't it?

      I am the village idiot i don't have anything to do with this pathetic little opera i just felt like passing through... - PDQ Bach

  • trust (Score:5, Insightful)

    by selderrr ( 523988 ) on Thursday April 18, 2002 @08:01AM (#3364414) Journal
    the internet is a very young medium and most online buyers just don't trust much... yet. I'm a regular buyer, but wouldn't really consider buying a CD from a company that just showed up and is hosted on some obscure domain.

    Wait until online buying settles down a bit, and everyone gets used to buying stuff everywhere. Then wait for VISA to become more secure (read : Hardware cardreader devices at home, which is inevitable) or for banks to do direct transactions between you and the book reseller. THEN we can start to evaluate models.
    • What exactly would you consider an 'obscure domain'? One that you haven't heard of? One of the problems with internet consumers right now is that they think that just because they've heard of a name means it's safe to use. What about the customers of Expedia.com or Verizon who had their info (credit card numbers included) posted on webpages as a result of a malicious break-in?

      Would you have considered Verizon to be an 'obscure' domain?

      It just might be that the lesser-known companies might be a bit safer in the long run, simply because (and yes, this is highly speculative) the malicious script-kiddies (ok, and some actual malicious hackers) haven't heard of them.
      • true. absolutely true.

        But John Q. Average doesn't know that... In Belgium there were 2 online resellers of CDs and books. One of them made a deal with a big bank and a movie complex to have its URL printed on tickets and receipts. Good publicity, indeed, but much more important: the association of the URL with something trustworthy.

        I think online internet sales is a market where, right now at least, there IS bad publicity because people are still too suspicious and won't take risks.
    • Re:trust (Score:2, Insightful)

      by GigsVT ( 208848 )
      How would a hardware card reader at home help anything? Haven't you even followed the whole satellite-stealing saga? If you give the end user control of the hardware, you can't trust it, period.
      • Well, I admit that you give hackers more chances to hack when they can get the hardware in their hands, but you'll have to admit that the number of dudes capable of that is far lower than the number who can set up a pr0n VISA number trap or card skimmers in a restaurant.

        I can hardly believe that using HW at home to authenticate/secure connection & card integrity is going to increase abuse. And with such devices, the transaction company has more means of tracking you.
      • Re:trust (Score:5, Insightful)

        by swillden ( 191260 ) <shawn-ds@willden.org> on Thursday April 18, 2002 @10:15AM (#3364993) Homepage Journal

        How would a hardware card reader at home help anything? Haven't you even followed the whole satellite-stealing saga? If you give the end user control of the hardware, you can't trust it, period.

        Geez, I get tired of correcting this misperception on /.

        The pay TV systems are the worst thing that has ever happened to smart cards. There are numerous reasons why they're insecure, and those reasons do not apply to other systems, particularly not credit card systems.

        First, pay TV engineers have an essentially intractable problem. No security system is unbreakable and the problem is particularly difficult when the attacker has unlimited access to key components of the system (as is the case with all smart card systems). However, the pay TV guys have the additional complication that satellite television is *broadcast*. That means that every card is doing the same thing with the same keys and the same data, which in turn means that breaking one card essentially breaks all of them (or at least a large number of them).

        To try to manage this risk, the pay TV designers use flashable cards and a security-by-obscurity approach. In most environments flashable cards are a very bad idea, because you want to make sure that attackers cannot fiddle with your software. In this situation, it permits the system designers to modify their system frequently and remotely, to make it obscure again and shut down all the existing pirate cards. OTOH, flashable cards are easier for attackers to fiddle with. The result is that every update is broken within a few months, but most breaks are disabled within a few months, and that keeps the whole thing in the news all of the time.

        Nearly every other smart card system can take a different approach, one in which (a) the card is expected to be secure for a few years and then replaced and (b) the break of a single card does not compromise the whole system.

        Card-unique keys, key rotation, audit mechanisms, hotlists and moderately frequent on-line validations are all tools used by careful designers to ensure that breaking one card does not break the whole system, and that broken cards can quickly be detected and shut down.
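
A rough sketch of two of the tools listed above, card-unique keys and a hotlist, might look like the following. It is only an illustration under stated assumptions: the HMAC-SHA256 derivation, the challenge-response exchange, and every name here (`derive_card_key`, `card_response`, `issuer_verify`, the serial format) are hypothetical and are not taken from any real payment scheme.

```python
import hashlib
import hmac


def derive_card_key(master_key: bytes, card_serial: str) -> bytes:
    """Derive a per-card key from the issuer's master key and the card serial.

    Assumption: HMAC-SHA256 stands in for whatever diversification
    function a real issuer would use.
    """
    return hmac.new(master_key, card_serial.encode("utf-8"), hashlib.sha256).digest()


def card_response(card_key: bytes, challenge: bytes) -> bytes:
    """What the card computes: a MAC over the issuer's one-time challenge."""
    return hmac.new(card_key, challenge, hashlib.sha256).digest()


def issuer_verify(master_key: bytes, card_serial: str, challenge: bytes,
                  response: bytes, hotlist: set) -> bool:
    """On-line validation: reject hotlisted cards, then check the response."""
    if card_serial in hotlist:
        return False  # a card known to be broken or stolen is shut out
    expected = card_response(derive_card_key(master_key, card_serial), challenge)
    return hmac.compare_digest(expected, response)
```

Because each card only ever holds its own derived key, extracting the secret from one card says nothing useful about any other card, and adding that card's serial to the hotlist shuts it down at the next on-line validation.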

        When you add those tools to the security measures offered by the card itself (such as SPA/DPA resistance, environmental monitoring, security-sensitive layering, chemically-similar protective shields, etc.) the result can be a highly secure system.

        The job of a smart card system designer is easy to state: Design a system such that the cost of breaking a single card is much higher than the value of breaking that card. A good estimate for the cost in expertise and equipment to break (extract the secrets from) a single, modern, well-designed card is about $250,000, give or take a little. As the card ages and new attacks are devised, that cost goes down. Meanwhile, card manufacturers make changes to defeat the new attacks, raising the bar again for new cards.

        For credit cards, this is an easy thing to do. Not only are the cards unique, used on-line frequently (whether at home or at a merchant) and replaced regularly, the possessor of the card is generally motivated to maintain the secrecy of the information in it. Further, the job of the card is identification and authentication, rather than decryption, which is a completely different problem from that of pay TV cards.

        Secure smart card credit cards are easy for a variety of other reasons that I won't go into, but perhaps the most important is that there is already a significant and mostly workable security infrastructure in place that doesn't need the cards themselves to provide any security. This model breaks down to some degree for on-line transactions and a chip card is a perfect way to close that loophole.

        It's true that nothing is completely secure, but it's also true that nothing has to be completely secure. Success in security is when the attackers decide to find an easier target.

    • The problem is that consumers don't have a good way to gauge trust right now. The current standard is "which brand do you recognize the most?", which mostly means "which brand has the most money to spend on advertising?". We may eventually have a real web of trust. Until then, it's not about trust, it's about money. (ergo, the rich get richer)
    • This is an interesting perspective, and it may explain the power law effect in commercial sites.

      However, the power law also holds true for non e-commerce related sites. I.e. yahoo, google, etc. In effect, you're arguing the example ("amazon") rather than the actual theory.
    • the internet is a very young medium ...
      Wait until online buying settles down ... wait for VISA to become more secure ... THEN we can start to evaluate models.

      Yes, if you wait long enough you will be able to know with certainty... but by that point, how useful will that knowledge be??

      From studies already available, it's looking like the final conclusive model will end up along the lines of Coulda, Woulda, Shoulda. If you believe these numbers, it looks like there may yet be time left to learn from the success of Amazon and others and apply it in other categories (hey, I oughta take that advice for my little site....) Doing it without massive debt would be the real trick!

    • Credit cards are not the major problem. If you get scammed online, you can get a refund from Visa. A bit of work, but not a huge deal, and it is not very common. (Visa kicks out merchants who scam.)

      The real problem is not going to go away: will the merchant ship my order in a correct and timely fashion? This is the *exact* same problem faced by the old "mail-order catalog" business. That business is over a hundred years old and the problem still persists; reputation matters a lot.

      How to fulfill orders promptly and correctly is a *huge*, difficult problem, involving massive automation, computerization, use of bar codes, etc.

  • Scale-Free Networks (Score:4, Interesting)

    by bonoboy ( 98001 ) on Thursday April 18, 2002 @08:01AM (#3364415) Homepage Journal
    This is also called a scale-free network, and the research on it, by Albert-Laszlo Barabasi (currently at Notre Dame U), is in this week's New Scientist. (Apologies, it's not on their site yet - www.newscientist.com) He's applied it to many systems other than the web as well, from viral transmission on the net and human populations to the vulnerability of "hubs" in genetics (losing a few, like p53, would take out damn near everything due to their pervasiveness) and even quantum mechanics.
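
For anyone who wants to see the "rich get richer" effect rather than take it on faith, here is a minimal sketch of preferential attachment, the growth rule usually associated with scale-free networks. The assumptions are mine, not the article's: each new site makes exactly one link, it picks its target with probability proportional to the target's current weight (one plus its inbound links), and the function name and parameters are hypothetical.

```python
import random
from collections import Counter


def preferential_attachment(n_sites: int, seed: int = 0) -> Counter:
    """Grow a toy web of n_sites (>= 2) where popular sites attract more links.

    Returns a Counter mapping site id -> weight, where weight is
    1 + number of inbound links.
    """
    rng = random.Random(seed)
    weights = Counter({0: 1, 1: 1})
    # One "ticket" per unit of weight: drawing a ticket uniformly picks a
    # site with probability proportional to its weight.
    tickets = [0, 1]
    for new_site in range(2, n_sites):
        chosen = rng.choice(tickets)
        weights[chosen] += 1      # the chosen site gains an inbound link
        tickets.append(chosen)
        weights[new_site] = 1     # the newcomer starts at the baseline
        tickets.append(new_site)
    return weights


if __name__ == "__main__":
    weights = preferential_attachment(10_000)
    ranked = sorted(weights.values(), reverse=True)
    print("five heaviest sites:", ranked[:5])
    print("median weight:", ranked[len(ranked) // 2])
```

Running it shows a handful of early sites far above the baseline while the typical site never gets a single link, which is the qualitative shape the story describes.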
  • by Anonymous Coward on Thursday April 18, 2002 @08:01AM (#3364418)
    Where does the pornography industry fit into the chart?

    And can you provide a few links there as well? Just to "sample"

  • Google? (Score:5, Insightful)

    by cetan ( 61150 ) on Thursday April 18, 2002 @08:02AM (#3364419) Journal
    I don't know if this applies or not, but what about the rise in popularity of Google?

    Clearly they came in late, after all the other search engines were established, but they ate up market share and eye balls because they were (and are) better.

    So I could imagine that there are some Amazon-killers out there or that at least there could be...
    • Re:Google? (Score:2, Insightful)

      by dthable ( 163749 )
      I think the key is that a bookstore, like Amazon, doesn't have a nice single location for data. All of the book information needs to be input somewhere. Google can just create a robot to go collect information. I think this single difference makes an Amazon-killer more difficult than a Google-killer.
    • Re:Google? (Score:3, Insightful)

      by selderrr ( 523988 )
      I really hope there are a lot of Googles out there to kick Amazon's fluffy butt: Google has proven that ad-free online information (and in a way commerce) actually works. Works better even than the ones with ads.

      Amazon gives me headaches with all their fsking commercials and screaming colors. I don't go there unless I really can't find my info anywhere else. The opposite with Google : their layout is so nice I sometimes find myself just googling away for fun... like the old days when the internet was still fun to explore.
      • I don't think Google has proven anything of the sort.

        Have they made money yet?

        Until they do, they aren't a success in the sense that they will be around for a long time. Just being good isn't enough. You also have to make money.

        • Have they made money yet?

          Google has been profitable for quite some time... Just do a search on Google with the keywords "Google Profitable"...
      • A few years ago, Yahoo was like google. Very little advertising (just one banner ad), and none of that other stuff (email, web hosting, news, etc).

        Give google a few years...
      • You have to understand that the number of autistic geeks who have a problem with Amazon's "fsking commercials and screaming colors" is just too small to be of any consequence. Most people, including myself, just don't have a problem with it for what we get in return.

        Also, what's with the anti-Amazon sentiment? What exactly is wrong with a company surviving in part due to ad revenue? Does the immature desire for online companies to try to function in this world without advertising revenue still exist? Do you not know that Google has paid ads, in text form, on their site as well? And that they derive a lot of revenue by being the engines under AOL's and Yahoo's search engines?
    • Re:Google? (Score:4, Interesting)

      by Lee Bottemiller ( 305781 ) on Thursday April 18, 2002 @08:46AM (#3364538)
      So I could imagine that there are some Amazon-killers out there or that at least there could be...
      As much as I want to agree, I can't. Google's quick rise to power was because Google offered something much better for very little extra effort.

      When Google hit the scene, search absolutely sucked, so there was an existing itch that needed scratching, and to scratch that itch, all you did was search from a different page.

      So the user got a massive reward from a tiny change. But with Amazon, there's not that much pain in the mind of the online consumer. One-click ordering. Recommendations. Reviews. A very usable site by many standards.

      But switching from Amazon to another bookstore requires a good amount of hassle... MUCH more effort than switching search engines and offers a much smaller reward than Google offered to frustrated internet searchers.

      • This is a good point. For the general public to switch to another bookstore would require the new store to match all of the current functionality of Amazon plus have more discounts and better overall pricing. I don't see that happening anytime soon.

        It's nice to hope though :)
        • For the general public to switch to another bookstore would require the new store to match all of the current functionality of Amazon

          Not likely, considering Amazon's penchant for patenting their (dead obvious) functionality. Hooray for the USPTO! Yeah!

    • The rise in popularity of Google was at least in part due to the actual great product they have produced. It is beyond a doubt better than most, if not all, of the search engines out there, especially the old-school ones that were its first competition. The article describes that it can be hard to get into an online market, not that it is impossible. Google did what needed to be done (create something innovative) to get into and dominate its market.
    • Re:Google? (Score:3, Interesting)

      by wfrp01 ( 82831 )
      Google is interesting for another reason also. If you have a high Google rank, you increase your probability of getting linked to. Which will increase your page rank. Ad infinitum.

      I love Google, but I wonder sometimes if at the same time it makes it easy to find good sites, it also inherently keeps other good sites down (relative to the high ranked sites).
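
The first half of that loop (more links, higher rank) can be sketched with the textbook power-iteration form of PageRank. This is a simplified illustration only, not Google's production ranking: the damping factor of 0.85, the iteration count, and the toy page names are assumptions.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Textbook PageRank by power iteration over a page -> outlinks mapping."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            outgoing = links.get(page, [])
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


if __name__ == "__main__":
    web = {
        "big": ["small_a"],
        "small_a": ["big"],
        "small_b": ["big"],
        "newcomer": ["big"],
    }
    for page, score in sorted(pagerank(web).items(), key=lambda kv: -kv[1]):
        print(f"{page:10s} {score:.3f}")
```

The second half of the loop (higher rank, more visibility, more people linking) is a claim about human behaviour and is not modelled here; the sketch only shows that each extra inbound link pushes a page's score up.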
  • Perhaps not... (Score:2, Interesting)

    by keithd1998 ( 259872 )
    While this is probably true for the most part, things can change it.

    I'm in Ireland and since the euro came in to being I've stopped ordering from the UK or US versions of Amazon, and other retailers for that matter.

    With currency and shipping costs they were killing me. I now look for eurozone suppliers.

    I'd still love a good Irish online store though, just to save shipping, but they are all crap half-assed efforts.

    At least my french and german will improve while browsing.
  • This isn't a new concept. Welcome to Capitalism.
    It takes money to make money. Always has and always will.
  • by KGIS ( 307368 )
    I don't think that competition for photographers is a good example because location plays a big part in choosing a photographer. I know that some photographers travel a lot but they are mostly chosen because of their reputation (often in spite of their webpages ;) ). Also, the market share of offline stores is lower due to logistics. For example, all offline booksellers must have a location to display the books and a way to get them there. This costs money.
    I also think that the "rich-gets-richer" thing is more about economics than about the number of links a site gets. Amazon buys bigger quantities, and ships more quantities which means that the economies of scale are a HUGE factor. I would venture that the number of links will be a trailing indicator of the relative size of a retailer much more than a contributor to this relative size.

    To sum up logistics, economies of scale...intro economics anyone?
    • Another thing to consider is that booksellers have a rigid stock they can deal in: books. The one book that a seller gets is the same book that every seller gets, and so they have fierce competition for eyeballs to sell you their copy of that book.

      They may specialize and carry more of certain kinds of books, but you won't see much difference between The Hunchback of Notre Dame as carried by Amazon, Barnes and Noble, or Borders.

      Meanwhile, each and every photographer has his or her own unique vision which they commit to film. Two photographers can snap pictures of the same subject, but composition of each shot would be different.

      Booksellers usually sell the same goods. Photographers usually sell their own product, which can't be found anywhere else.

      • I agree with you completely. The article is making comparisons that don't really work well. You may be able to make some comparisons between different commodities, but the number of links that Amazon has is (mostly) an effect of its success, not a contributing factor.
  • power law (Score:2, Interesting)

    by splorf ( 569185 )
    The power law may be the same as the Pareto distribution, which models the distribution of income in an economy. Economists have observed it experimentally for decades though the theoretical reasons for it have only recently become understood. The underlying mechanism could well amount to the same thing. Some economists might like to take a look at the web link data. There are probably interesting comparisons to be made between link distribution and income distribution.
    • Re:power law (Score:4, Informative)

      by soboroff ( 91667 ) on Thursday April 18, 2002 @08:40AM (#3364517)
      The difference between a Pareto distribution and a power law distribution is that in a Pareto distribution, the probability P[X > x] ~ x^-k (that is, the probability that an observed value is greater than x is proportional to an inverse power of x), whereas in a power law it is P[X == x] itself that is proportional to an inverse power of x.

      And a Zipf law is a power law on ranks, rather than values.

      Lada Adamic of HP has an excellent how-to on power law distributions [hp.com] you might find interesting.
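
For the curious, the three forms above are related by a short calculation. This is a sketch in the continuous approximation, where $k$ is the Pareto exponent from the comment above, and $a$, $b$, $N$ (the number of items) and $x_r$ (the size of the item at rank $r$) are symbols introduced here for illustration.

```latex
% Pareto form: the tail (complementary cumulative) probability
P[X > x] \propto x^{-k}

% Power-law form: differentiate the tail to get the density
p(x) = -\frac{d}{dx} P[X > x] \propto x^{-(k+1)},
\qquad \text{so } a = k + 1

% Zipf form: the item of size x_r has rank r \approx N \, P[X > x_r]
r \propto x_r^{-k}
\quad\Longrightarrow\quad
x_r \propto r^{-1/k},
\qquad \text{so } b = \frac{1}{k}
```

So the three laws describe the same data from different angles, and a Zipf exponent near 1 corresponds to a power-law density exponent near 2.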
  • I would speculate that Slashdot is one of those rich getting richer, like Amazon. It has reached a critical mass of readers. All the other message boards, and I have been to many, are relatively sublime in comparison.

    Slashdot type boards might be an exception since their interest - their value (or perceived value) to readers, their content, is defined by the probability of a reader writing something, which is the most interesting factor to take into account. Of course, once enough readers are there, statistical odds of a story getting a full complement are very high, and thus good content. Slashdot seems to be past that critical mass.

    Then come the trolls. ;)
    • Could it be like posts being moderated up on Slashdot, moving towards the top of the page, and then being moderated further by ignorant moderators who don't bother to view 'newest first'? The theory that many moderators only view the first page of comments (those which are old and high-scoring) explains why earlier posts are more likely to end up +5, and the seeming scarceness of +3 and +4 scores.
    • IMO it is the responsibility of news sites to pick up the less known projects and present them to a wider audience.
      That's why I like the stories on small innovative projects best. The fact, that a new Apache release is out, will reach you anyway.

      Although user contributions adhere to the same critical mass principle, good editing can counter the effect and thus provide news that you won't find everywhere else. I hope Slashdot will be able to keep its quality...

  • Same as any offline business. Unless you have a unique idea you will be competing against the established big players. Even location issues are the same in that if you added a bookshop to Slashdot it's like owning a bookshop in a populated area that doesn't have a large chain. Also change happens fast on the net and any business model may become obsolete quickly - even Amazon's. For example if publishers joined together for an online bookshop they could undercut Amazon by selling direct to the public.
  • SOC (Score:2, Insightful)

    I have a horrible feeling that the words "self-organized criticality" are heading inexorably towards this discussion. Please, just say no.

    No, seriously: it doesn't look like the article is making that particular connection, thank goodness. Per Bak's theory of self-organized criticality predicts power law distributions of many things under many conditions, so people (or, specifically, Bak fans) often get all excited whenever they see a power law and start saying, "Hey, it's SOC!" But there are many ways to get power laws, and of course the logic doesn't work: "SOC gives power laws; thus if it has a power law, it was caused by SOC."

    If you want to know more about SOC, you can check out Bak's modestly-titled book, How Nature Works.
  • I did a dissertation on something similar a while back, due to a paper by S. Redner:

    Redner, S. (1998). How popular is your paper? An empirical study of the citation distribution. [bu.edu]

    This paper looked at how the citing of academic papers follows a power law distribution - that many papers are never cited, some are cited a few times, but a tiny percentage get cited a massive number of times.

    My dissertation actually looked at whether this could be applied to the more general concept of ideas: whether there are lots of good ideas, some never catch on, some catch on a little bit, whilst a tiny proportion explode with popularity. I wrote a little java app to model it.

  • For example, you can see how much tougher competition is among booksellers compared to photographers.

    No, no, no. This is an abstract discussion of new avenues in Communication Theory as exposed by the Internet. Therefore we must apply Internet-specific terminology to our discussion. "Photographers" should read "Pornographers." Thank you.
  • It's not only links in the Internet that are distributed according to a power law, it's rather a pattern that turns up in many domains. One example is Zipf's law [wikipedia.com] for all kinds of linguistic data, and...much more, although I can't think of an example at the moment ;-)
    • i just ran across zipf's law in this article:

      http://www.theatlantic.com/issues/2002/04/rauch.htm

      seems that this applies not only to natural phenoms (earthquake size X rank ~= constant) but also human activities: city size (#10 has roughly 1/10 pop. as #1, #100 ~= 1/100) and even "corporations and firms in a modern economy are Zipf-distributed".

      so it's not surprising that the web is zipfian...
  • One of the things we hear a lot on Slashdot is how evil RIAA/MPAA and the media companies behind them are. And what's often brought up is the fact that these companies are screwing the artists, that these artists would do better to establish a direct relationship with their audience, and that the internet is perfect for that.

    I've always been sceptical of this argument because I think the effect of promotion is a lot greater than people generally believe. A musician playing live can generate a little local word of mouth - but how do you translate that into a wider appeal?

    This adds another reason to be sceptical. So maybe getting onto Amazon's stocklist/recommendations will become more important than getting on a radio playlist?

    • It's more about time than about the promotion.

      Think about it like this:

      You have a good band, that people want to hear. If you self promote, you can only reach a few hundred people at once to begin with. And only say 1% will tell a friend. And only 1% of them will tell a friend.

      So then you self promote to radio stations. Figuring you send a copy of your demo to a few hundred radio stations, and only about 1% of them will play it. But they are playing it out to hundreds more people who also might tell a friend. And so on.

      Then you have to figure out how to get the CDs to people to sell for you, or be in all these places when it plays on the radio (impossible).

      Now a BIG record company can just take a good band, and with one move drop it on 1000s of Viacom, GE, Disney, AOL/TW TV and Radio outlets, 2461 magazines, 100,000 websites, and then move the product into 200,000 record stores the same day.

      And so everyone in the human population who will possibly like the band hears it all at once or hears it from a friend within WEEKS. This wouldn't be a problem but for the fact they (the big record companies) are doing it solely as entertainment, solely for profit, rather than the classic uses of music (artistic, social change, etc.). Which is depressing.

      The independents now at least have the internet, which raises the POSSIBILITY of reaching all those people at once, but there is a high noise to signal ratio, and the audience is also the director, meaning they are not CAPTIVE. So, you have to be monumentally interesting, entertaining, AND know how to get people to come to your website to sell music online.

      Doing all these things is definitely easier than the classic touring and underground zine methods of old, but it's still difficult.

      No one ever said being a professional musician is easy. It's a FULL TIME *JOB*, and the harder you work, the more success you will find. But, just as not all businessmen become rich executives, not everyone can top Billboard..

      But dammit, it is the funnest hard job I've ever had.
      • What's wrong with doing music solely for entertainment and solely for profit? It ends up being art sometimes anyway whether you like it or not. As for social change, well a lot of times the genre that's trying to enact social change just becomes a new fad instead, which negates the whole purpose of it in the first place.
    • I would imagine a lot of this has to do with the simple psychology of people in large numbers.

      In general it's a combination of decades of getting information and entertainment from a small number of corporate sources, as well as the tendency to gather where others are.

      Still, while this paints a picture in which the majority will always go back to a few sources, ignoring the rest, at least in the digital age, those of us who do want to go discover an unheard of musician, writer or artist from remote locations have a much easier time doing so.

      • Still, while this paints a picture in which the majority will always go back to a few sources, ignoring the rest, at least in the digital age, those of us who do want to go discover an unheard of musician, writer or artist from remote locations have a much easier time doing so.

        Whilst I don't necessarily disagree I question the 'much easier' part. They still have to have some way of letting you know that their site exists and that it contains something you might like. That'll mean advertising in some form.

  • This article is basically a fancy way of confirming the tyranny of the majority. Google's PageRank, as good as it is, both a) suffers from and b) perpetuates the tyranny of the majority (aka "the rich get richer", the "power law"). IE, the more links, the higher the pagerank, the more relevance, the more hits, the more links...
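
    A rough sketch of that feedback loop, assuming a toy model in which every new site links to one existing site picked with probability proportional to its current inbound links plus one. The function name, site count and seed are made up for illustration; this is not Google's PageRank or the NEC model.

```python
import random

def simulate_links(num_sites=2000, seed=0):
    """Toy 'rich get richer' model: each new site links to one existing
    site, chosen with probability proportional to (inbound links + 1)."""
    rng = random.Random(seed)
    inbound = [0]   # inbound[i] = number of links pointing at site i
    targets = [0]   # site i appears (inbound[i] + 1) times in this pool
    for new_site in range(1, num_sites):
        chosen = rng.choice(targets)
        inbound[chosen] += 1
        targets.append(chosen)     # one more ticket for the site just linked
        inbound.append(0)
        targets.append(new_site)   # the newcomer gets its single base ticket
    return inbound

links = simulate_links()
ranked = sorted(links, reverse=True)
top_share = sum(ranked[:20]) / sum(links)
print(f"most-linked site: {ranked[0]} links")
print(f"top 1% of sites hold {top_share:.0%} of all links")
```

    The only design choice here is the "plus one", which gives brand-new sites a nonzero chance of being linked at all.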

    Teoma seems to be aiming at this chink in Google's armor.

    From Teoma's page [teoma.com],...

    Teoma uses Subject-Specific Popularity(SM). Subject-Specific Popularity ranks a site based on the number of same-subject pages that reference it, not just general popularity, to determine a site's level of authority.

    Using vectoring algorithms to find themed hives of related content, Teoma partitions the power law into manageable chunks. IE, the rich get richer, but at least a dominant site in one field doesn't get artificially inflated relevance when querying an unrelated field. At least in theory. (Kinda like laws are supposed to keep a monopoly from illegally entering other markets, but I digress.)
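
    To make the idea concrete, here is a minimal sketch that counts inbound links only from pages on the query's subject. The page names, topic labels and links are invented, and Teoma's real algorithm is not described beyond the quote above, so treat this as an illustration of the principle only.

```python
from collections import Counter

# Hypothetical toy web: each page has one topic label.
topic = {
    "bigstore": "books", "bookblog": "books", "lit-zine": "books",
    "camfan": "photo", "lensshop": "photo", "photoclub": "photo",
    "portal": "general",
}

# Hypothetical (source, target) links.
links = [
    ("bookblog", "bigstore"), ("lit-zine", "bigstore"),
    ("portal", "bigstore"), ("camfan", "bigstore"),
    ("camfan", "lensshop"), ("photoclub", "lensshop"),
    ("photoclub", "camfan"),
]

def general_popularity(links):
    """Count every inbound link, regardless of who is linking."""
    return Counter(target for _, target in links)

def subject_popularity(links, topic, subject):
    """Count only inbound links that come from pages on the given subject."""
    return Counter(target for source, target in links if topic[source] == subject)

print(general_popularity(links).most_common(1))
print(subject_popularity(links, topic, "photo").most_common(1))
```

    In this toy data the big bookstore wins on raw link count, but for a photo query the lens shop comes out on top because most of the bookstore's links come from off-topic pages.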

    This is working for Teoma: I (and others) are finding useful stuff on Teoma that Google didn't.

    Google is already aware of this particular limitation of PageRank, as can be seen from what they suggest programmers submit to their programming contest [google.com]...

    Entries in the Applications track generally deal with the semantics of the data. Some examples include:

    Detecting common templates in pages, and separating out the common structure from the individual content.
    Classifying links on a page.
    Detecting pages that are near-duplicates of one another.
    Clustering pages by topic or type.

    Even with all that, I still think that humans are the best filters (and isn't a search engine just a programmable filter?). I suspect the rise of weblogs might have something to do with the usefulness found in tapping into some weblogger's idea of what's useful/cool/interesting.

    So perhaps the best way to find good info is a cross between a human and a content-vectoring search algorithm. Maybe that's why Ask Jeeves bought Teoma [com.com].

    • PageRank is only used as the sole criterion for ranking pages when they are all within the same category - ie, in the Google Directory listings.

      In general search results, Google ranks a page based not just on its PR (ie on how many pages link to it and their overall quality) but on the TITLEs of those pages and the anchor text of the links. So Microsoft could put the string "book reviews" on their home page but they'd still rank behind me on a search for "book reviews" [google.com], because none of their incoming links are from pages about book reviews.


  • by inKubus ( 199753 ) on Thursday April 18, 2002 @08:35AM (#3364506) Homepage Journal
    Well, in "Cybernetics and Society: The Human Use of Humans" by Norbert Wiener, the author talks about messages--communication. Links to a web site are messages given to people using the web that a given site/page/whatever exists. They are easy to make use of, since all you have to do is click it and whatever is on the other end is given.

    This is all fairly obvious. The neat thing is how these messages and the messages they point to interact. Dr. Wiener says that the more unique a message, the more "important" it is. This is simply because an overused message (a cliche) quickly becomes filtered by the human mind, and loses its meaning.

    Take the Amazon example, for instance: How often do you click a LINK to Amazon? Yes, there are hundreds of thousands of links to Amazon, but I would guess most of them NEVER get clicked. Why? Because there are too many of them. The first time I saw one, I followed it, but now I just ignore them. I almost never click on Amazon links because I know it goes to some bookstore.

    When I do go to Amazon, I just type the DOMAIN name into my browser, and go directly there, and do my own searching, follow my own links.

    So, basically, there is an upper limit to the number of links before they essentially become useless. Of course, this upper limit is dependent on the total number of users who haven't seen the links, which is increasing every day as new people come on to the web. As the number of links reach this critical mass, more and more people are just typing in the domain name rather than following a link.

    This is Google's essential flaw. It does not recognize that a site like Amazon does not need an entry in a search engine. There are enough links out there already for just about anyone to find it. Google should instead group searches around a bell curve distribution, where the sites with the medium number of links have the highest relevance, with underlinked and overlinked sites falling off the ends.
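
    One way the bell curve idea could be sketched: weight each site by a Gaussian on the log of its inbound link count, so mid-linked sites score highest. The function name, the peak of 1000 links and the spread of one order of magnitude are arbitrary assumptions picked for illustration, not anything Google uses.

```python
import math

def bell_curve_weight(inbound_links, peak_links=1000, spread=1.0):
    """Gaussian weight on log10(link count), highest at peak_links.

    spread is the standard deviation in orders of magnitude."""
    x = math.log10(max(inbound_links, 1))   # treat 0 links like 1 link
    mu = math.log10(peak_links)
    return math.exp(-((x - mu) ** 2) / (2 * spread ** 2))

for n in (10, 100, 1000, 10_000, 1_000_000):
    print(f"{n:>9} links -> weight {bell_curve_weight(n):.3f}")
```

    The weight would be multiplied into whatever text-match score a query already produces; on its own it says nothing about relevance.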

    How are new sites found out about and linked enough to show up in an engine like Google? Advertising. Mostly word of mouth and link ads, and in certain cases print and television advertising (although this is less effective, because it requires additional steps from the user to make use of the message, i.e. remembering the domain name at a much later time and then typing it in, which is why the .com ad explosion on TV failed to do anything).

    Really, to be effective, you need to have 10-20 contacts online, have each link to your site. Spread the word as much as you can. And save your ad budget until your word of mouth traffic reaches critical mass. Then spend it on bandwidth.

    Really, time is the only key. Oh, and having something useful or funny.

    Anyway, this quickly turned to the theories of getting a lot of hits, and I apologize, but you can see that the middle is the best place to be, and maybe Google will recognize this. This would do a LOT for online commerce, and the economy in general. Support bell curve relevance.

    • This is Google's essential flaw. It does not recognize that a site like Amazon does not need an entry in a search engine. There are enough links out there already for just about anyone to find it. Google should instead group searches around a bell curve distribution, where the sites with the medium number of links have the highest relevance, with underlinked and overlinked sites falling off the ends.

      Unfortunately, mindshare IS the key. You have to keep the product in front of people. Why do you think McDonalds still spends zillions on advertising? Everyone knows about them, there is one on every corner. Yet every week there is some new special, or new Happy Meal toy.
      Because if they stopped advertising, they would quickly get rolled over by BK and Wendy's. 18 months, and they would be out of business.

      For any company to survive, it needs to keep its name right out in front, everywhere possible.
    • Kubus, are you serious?

      How often do you click a LINK to Amazon? Yes, there are hundreds of thousands of links to Amazon, but I would guess most of them NEVER get clicked. Why? Because there are too many of them.

      The vast majority are book recommendations, via the affiliate program, usually related to the material of the linking site.

      So, basically, there is an upper limit to the number of links before they essentially become useless.

      The search engines currently evaluate quality of a page based on the number of links, and the quality of the linking pages (and maybe some other calcs to check for fraud). Hardly what many people would call "useless". It's also a really good way to find tiny needles of valuable information among the giant haystack of approx 2e9 mostly mediocre web pages.

      This is Google's essential flaw ... Google should instead group searches around a bell curve distribution, where the sites with the medium number of links have the highest relevance, with underlinked and overlinked sites falling off the ends.

      In the vast majority of cases, Google's strategy works well. It's hard to argue with success, though it seems to come effortlessly in this post. Were you smoking something?

      There are a lot of "howto" and FAQ style documents on the 'net, often times many copies of the same document, some older than others. Google really does have the magic of finding the short list of these documents that are the most up-to-date and relevant. How do you suppose that works??

      Really, time is the only key. Oh, and having something useful or funny.

      Now even if "time is the only key" (whatever that means), it follows from the rest of the message about as well as the typical ramblings of Jon Katz.

      Well, that's enough inflammatory remarks for one day. I don't know what compelled me to reply to this silly message mod'd to "+4 Insightful" (yeah, right!)

    • Don't forget that a LOT of people will search google for "amazon.com" and never, ever look at the address bar.

      • exactly - my personal browsing habits fall in line with the original poster in this thread...but I think we (geeks) are now the minority on the web...my mom, for example, is what I would consider the typical eConsumer...and I once asked her to type in an address of a website I was working on...she didn't know where to type it...I was amazed to find out that her browsing habits were mainly directed by links on her home page (set by her ISP) that she follows and makes bookmarks...links are key.
  • Webcomics (Score:3, Informative)

    by BitwizeGHC ( 145393 ) on Thursday April 18, 2002 @08:38AM (#3364512) Homepage
    This is true of Webcomics as well. Ask someone what their favorite Webcomic is, and they will almost invariably respond with one of the following: User Friendly, Penny Arcade, PvP, Sluggy, Sinfest, Megatokyo or Exploitation Now. With the exception of Penny Arcade, I have found the total combined quality (art + writing + humor) to be fair at best, and atrocious at worst (guess what the worst is; hint: think of a little dustball with feet). But these sites are linked to from all over, and they often link to each other, creating "flash crowds" from Slashdot, other comic sites, personal home pages, etc.

    There is a class of "second tier" comics which have nice little followerships: Little Gamers, Sexy Losers, Polymer City and Cool Cat Studio (really, any Keenspot comic that isn't Sinfest or EN) are among these. Everyone else, myself and my comic included, is "third tier", i.e., tumbleweeds rolling across their allotted server space.

    Then there is Pokey, which stands conspicuously on its own. HOORAY.
  • Amazon.com has not only done massive amounts of advertising, but has also expanded what they sell. Used to be books and music, now they sell an astounding amount of all sorts of stuff. I bought a snowshovel from them a couple of years ago when all the retailers in my area were out (freak snowstorm) and had it delivered to my house before the retailers were expecting their deliveries. Kitchen appliances, power tools, magazine subscriptions - you name it, Amazon.com probably sells it.... and their service has always been top notch. Living in a rural area has limitations, but you learn to separate the wheat from the chaff in online shopping. They deserve all the hits & eyeballs they are getting.
  • by dipfan ( 192591 ) on Thursday April 18, 2002 @08:42AM (#3364524) Homepage
    Researchers have discovered that across the entire web, links are distributed according to a "power law" which leads to "rich get richer" or "winner's take all" behaviour

    Ah, no that's not what it shows:
    Quote from the research: "In fact, pure power law scaling appears to be the exception, rather than the rule." What the research shows is that "winner takes all" varies across the web between categories.

    Also, the researchers have (I suspect) a mistaken view of "competition" and competitiveness. They rate the Amazon category of book-sellers as "more competitive" - when in fact it may be less competitive in economic terms, being dominated by one or two sites/sellers. Whereas the photography category may be "more competitive" because of a larger number of rivals of about the same size.

    What they mean, I suspect, is that the publications/Amazon category has higher barriers to entry - the amount of advertising etc being a greater sunk cost, and likely to deter any aspiring internet book retailers. In purely economic terms, that makes the category less competitive. As an illustration, ask yourself if the operating system software market is more or less competitive because it is dominated by one large brand in Microsoft?

    • What they mean, I suspect, is that the publications/Amazon category has higher barriers to entry - the amount of advertising etc being a greater sunk cost, and likely to deter any aspiring internet book retailers. In purely economic terms, that makes the category less competitive. As an illustration, ask yourself if the operating system software market is more or less competitive because it is dominated by one large brand in Microsoft?

      Less competitive, unless you're Microsoft, in which case it's "the most competitive industry in the world".
  • They kick back 5 to 15% to whomever provides a link that leads to a sale. That's not small beer. They make it easy for anyone to provide these links. So of course they're all over the place.
    • it's small change though.. in the past two weeks, I've sent amazon 14,316 clicks and 8952 unique visitors.. 46 items ordered. For that I'll see about $40 (my current revenue only displays items shipped not ordered, so no exact numbers there, 37 items shipped is $28.57, so I might be a little optimistic).

      A cost-per-click banner ad that only pays $0.05 per click would have cost Amazon $715.80. In this case, the cost is closer to $0.0027 per click. It's quite cheap advertising compared to almost anything else on the web currently.
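
      The arithmetic, spelled out using the click count and the rough $40 payout estimate from above (the variable names are just for illustration):

```python
clicks = 14_316            # clicks sent to Amazon over two weeks
cpc_rate = 0.05            # dollars per click for the comparison banner ad
estimated_payout = 40.00   # rough affiliate earnings estimate, in dollars

# What the same traffic would cost at a flat per-click rate.
cpc_cost = clicks * cpc_rate

# What Amazon effectively pays per click through the affiliate program.
effective_cost_per_click = estimated_payout / clicks

print(f"CPC banner cost:          ${cpc_cost:.2f}")
print(f"Affiliate cost per click: ${effective_cost_per_click:.4f}")
print(f"Banner is about {cpc_cost / estimated_payout:.0f}x more expensive")
```

      With the payout at about $40 the per-click figure lands just under $0.003; a lower payout, such as one scaled up from the $28.57 shipped so far, would push it lower still.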

      Factor in the fact that affiliates don't get referral fees for used items, auction items, and several other categories and you realize that their 5% referral is coming off their highest profit items. Any time a visitor buys a used item from Amazon, they don't have to ship anything, just collect a payment.
  • Amazon's dominance comes from advertising, good software design, and a decent distribution network. The reason why so many sites link to it is that their comprehensive and "reader-reviewable" book database is relatively open for arbitrary and not necessarily sales related searches. This has made it a "standard" for people to link to when referring to a book. If some enterprising individual or collective were to come up with an open database of books and a method of presenting it over a popular medium, like the web a la IMDB or AMG, then Amazon's power over folks would be much reduced, giving Powell's or B&N a chance.

    OTOH, in a commodity market, like for the pulp of books, why would anyone really care who they are buying from?

  • The article goes into how those with the most links get more power in the market. That's essentially how Google works: the more sites that link to your site, the higher you are in the search rankings. Maybe the popularity of Google is partially responsible for this situation.
  • You mean to tell me that the more people come into your store, the more sales you will make?

    I never realized that...
  • by SweenyTod ( 47651 ) <sweenytod@@@sweenytod...com> on Thursday April 18, 2002 @09:00AM (#3364588) Homepage
    Is called VisIT. It produces a graphical representation of how sites link together, based around any given query. It was used quite successfully to demonstrate how Scientology had spammed Google [operatingthetan.com], by creating multiple domains all linking back to their main web page.

    It's a freebie download and you can get it here. [uiuc.edu]
  • April 15, 2002: New NEC study to appear in Proceedings of the National Academy of Sciences, 99(8), 5207-5211, April 2002.
    Download a preprint in PostScript or PDF formats. Contact: Dr. David Pennock, dpennock@research.nj.nec.com, (609) 951 2715.

    They bought a domain name specifically to publish (or help publish) the findings in their article. The site had 3 pages on it (depending on how you count) -- the home page, an example page, and links to the article itself in two different formats. In their article, they say "For more information, see http://modelingtheweb.com", so maybe they plan to publish more as time goes on, or add links to other folks modeling the web, or even host such articles directly ... but maybe not.

    I wonder if this foreshadows a coming trend -- write an article, publish it in a print journal, and get a domain for it. Somehow this makes me sad. "We can't just publish an article and put it online, we have to get a top level domain for it." *sigh*
  • Zipf (Score:2, Interesting)

    by limekiller4 ( 451497 )
    The Atlantic has a GREAT article about this effect:

    http://www.theatlantic.com/issues/2002/04/rauch.htm [theatlantic.com]

    An excerpt:
    "Every so often scientists notice a rule or a regularity that makes no particular sense on its face but seems to hold true nonetheless. One such is a curiosity called Zipf's Law. George Kingsley Zipf was a Harvard linguist who in the 1930s noticed that the distribution of words adhered to a regular statistical pattern. The most common word in English--"the"--appears roughly twice as often in ordinary usage as the second most common word, three times as often as the third most common, ten times as often as the tenth most common, and so on. As an afterthought, Zipf also observed that cities' sizes followed the same sort of pattern, which became known as a Zipf distribution. Oversimplifying a bit, if you rank cities by population, you find that City No. 10 will have roughly a tenth as many residents as City No. 1, City No. 100 a hundredth as many, and so forth. (Actually the relationship isn't quite that clean, but mathematically it is strong nonetheless.) Subsequent observers later noticed that this same Zipfian relationship between size and rank applies to many things: for instance, corporations and firms in a modern economy are Zipf-distributed."

    It's one of the best articles I've read in a long time, demonstrating how they've managed to model not only extinct populations accurately (who knows how much after-the-fact tweaking went on, but...) but race riots and honesty in social groups.

    Add to that, I spent a good fifteen minutes trying to find it again, so someone had better read it. It's just under 10,000 words.

    PS - I strongly doubt it'll get slashed, but if it does, here is the Google cached copy.
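
    The idealized rank-size rule from the excerpt can be sketched in a couple of lines: the item at rank r is expected to be about 1/r the size of the top item. The 8,000,000 figure for the largest city is a made-up example, and as the article itself notes, real data does not follow the rule this cleanly.

```python
def zipf_expected(top_value, rank):
    """Idealized Zipf rule: the rank-r item is 1/r the size of the rank-1 item."""
    if rank < 1:
        raise ValueError("rank starts at 1")
    return top_value / rank

largest_city = 8_000_000  # hypothetical population of City No. 1
for rank in (1, 2, 3, 10, 100):
    print(f"City No. {rank:>3}: about {zipf_expected(largest_city, rank):,.0f} residents")
```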
  • We need websites, like Google's Directory, that show all the webpages out there that are in a certain category. For example, bookstores [google.com]. Buyers not only see amazon.com but they can see all the other bookstores out there too.

    I think when people do a search for something in google, google should not only return links to webpages, but also links to directories on their site that contain other sites that are not very popular yet.
  • ...links are distributed according to a "power law" which leads to "rich get richer" or "winner's take all" behaviour where a small number of sites get the vast majority of links and traffic....
    It's never winner take all! I have yet to find anyone who can't use the internet to boost their revenues, despite Amazon and the others. The rich get richer, but to say 'winner's (sic) take all' is misapplying a cliche.

    And how is this different from the RealWorld(tm)? Franchises do better than mom-n-pop both because of efficiencies of scale AND because national ad campaigns give them an edge. Trust of a new location is immediate, too. After a few years, being established gives one an additional edge. The only one of these that mom-n-pop can hope to win on is the last one.

    If Barnes and Noble were to offer a better associate program, Amazon would lose associates. If someone cleverly merges the ideas of Napster and Ebay, or otherwise improves on Ebay's business model, the vocal segment of frustrated ebay vendors would leave, too.

    ... see how much tougher competition is among booksellers compared to photographers."
    This last part even makes sense. When I wander into a 'net community of experts on an unfamiliar topic (like when I decided to buy a film/slide scanner last month), they often suggest three or four vendors for specialty material. Why? Because Amazon doesn't carry the stuff, and neither will 2 of the 3 mentioned vendors. Experience simply tells them that when they get ready to buy, they call those numbers first.

    I know way too many niche vendors that have doubled their revenue by wisely using the internet. It isn't winner take everything, it's a matter of earning attention in a MUCH bigger market.

  • Certainly lots of sites link to Amazon. Amazon knows about more titles (and other merchandise) than any other single site, with the possible exception of Barnes & Noble [bn.com]. They've got LOTS more reader reviews than any other site including B&N. Their return policy is favorable to purchasers (that is, it's comparable to most brick and mortar bookstores). They frequently reduce shipping and offer coupons. What's not to like? Assertion of the one-click patent and their privacy policy changes (claiming they own the records of what you've purchased) are about it. Yeah, the privacy policy thing stinks, but even the brick and mortar stores track what you're buying. If you want privacy, pay cash.

    Personally, if Bookpool [bookpool.com] has what I want in stock, I'll buy it from them. The prices are nearly always less than Amazon's. BUT, Bookpool only sells technical books, and nothing in the way of Christian theology (my other big reading interest). Barnes and Noble throws in free shipping with two or more books in an order, but the prices are usually higher.

    People flock to Amazon because it's simply a valuable service.

  • It seems to me that the article is nonsense. There are a lot of links to Amazon because Amazon is running a very active business in selling books.

    CmdrTaco said recently that efficient search engines like Google [google.com] make the web flatter and more democratic. It has been my experience that this is true.

    In search results, my book, What should be the Response to Violence? [futurepower.org] is often ranked just below stories from large news companies.

    If you search for "books", you will find Amazon. If you search for something more specific, you may find a small, specialized bookstore.

    The rich still get richer, but those who are not rich can now be heard.
  • If I were starting an online bookstore, I'd pick one subject I knew well, and try to build a reputation as the center of that online community.

    I think most users are not yet sophisticated enough to easily find alternatives. Also, they *expect* large mass-market retailers instead of high-quality specialists -- that's how they've gone shopping most of their lives.

    Once they gain the skill, and they learn to expect better than mass-market, they may turn to specialized online bookstores (from politicaleconomybooks.com to trashromancenovels.com to thinkgeekbooks.com).

    For example, many users I support don't type addresses in the URL field. Most can't search efficiently ('how do you spell "google"?'), much less find sites that don't appear at the top of the search results.

    They'll learn of course, and their kids are bloggers and already use the specialized sites. Then Amazon may find that they do everything, but nothing well.

    • A bookstore is a bookstore. How is a "high-quality specialist" going to sell you a better book than a mass market store?
  • Amazon's competition BarnesandNoble.com has not
    accommodated their customers nearly as well.
    1. time-to-live is 0 for many of Barnes and
    Nobles' interactions.
    That is, when you hit most of BarnesandNoble's
    webpages, your DNS server is not supposed to
    go to its cache, but go to BarnesandNoble's
    DNS servers. As a result, almost
    all BarnesandNoble webpages fail for me
    since I use internet's 13 root name servers;
    eg, A.root-servers.net
    Using BarnesandNoble is impossible for me
    unless I use DNS servers besides those that
    are the foundations of internet.
    Here are some time-to-live (ttl) for their sites,
    barnesandnoble.com >0
    www.barnseandnoble.com =0
    store.barnesandnoble.com =0
    shop.barnesandnoble.com =0

    2. BarnesandNoble offers no used books.
    The 1991 no-longer-printing book,
    Passionate Attachment
    by a President Johnson advisor who
    perpetually recommended ceasing involvement
    in Vietnam,
    George Ball
    was essentially a book prevented from success
    by political maneuvers.
    While I couldn't buy this book at all at
    BarnesandNoble, the listing at Amazon.com
    included a used book listing on the main entry.
    This contribution to the reading community by
    Amazon helps greatly.
    For this, Amazon has been attacked by the
    publishing industry, yet Amazon knows its
    primary customers are those who pay for books.

    3. BarnesandNoble has no obvious listings of
    suggested books in specialty areas.
    But Amazon's listing for "Greene Econometrics"
    has a few, mostly PhD students, listings of
    "Great Books in Economics".

    4. BarnesandNoble has no user reviews.
    They have only the optimistic reviews of the

    Indeed, this seems a recurring theme of
    BarnesandNoble to keep their publishers
    in mind before their primary customers.
    Amazon has kept its lead by helping its customers
    make appropriate choices, without biasing the
    customer towards someone else's wishes.
    Amazon then succeeds when its competition
    hinders its customers in numerous ways.

    ---Jameson C. Burt
  • This is valuable research. It's important to understand the implications. If you're a little guy and you are looking for a Web-based career, you have basically two choices: 1) become a photographer and remain independent, or 2) work for the big guys.

    Anything between these two extremes is a very slippery slope.

    You can use this research to argue that the practice of ranking sites by counting incoming links, a practice that began only a few years ago, is fundamentally altering the nature of the Web.

    This new Web is one of the reasons for the popularity of the weblogs -- it's the only way for the little guy to participate. It used to be that you felt like you were participating by putting up your own Web page. This is no longer true, because a new Web page from an average person no longer draws traffic. No traffic means no feedback, and no sense of participation.

    More research like this will encourage search engines to discover algorithms that do less damage to the Web, one would hope.
  • I have a website (shameless plug: http://www.best-sf.com) focusing on science fiction and fantasy. Every title listed links to both Amazon and B&N for users who want to buy it. I can tell from the hits I'm getting in the two affiliate programs that users prefer Amazon by a good ten-to-one ratio, even though both links are located in the same position on the page.

    It's a snowball effect: a site becomes dominant, people become familiar with the brand, that makes them more likely to buy there, which makes it more dominant. What interests me in particular is that B&N's offline dominance apparently doesn't travel well.


  • I buy books, etc. from Amazon over Barnes & Noble and other online book, etc. stores for one simple reason. The quality, quantity and ease of online customer reviews.

    The comments are what really make the difference. There are many online stores that have customer ratings, but they are just stars and no comments, or very few, if any, customer reviews.

    Comments are important in a review since it helps me, as a buyer, identify which reviews are valid. For example, if I see a customer give a book one star because it doesn't stoop down to the beginner level (which is many times the case) that review instantly becomes a 5 star in my mind and I'll probably buy the book.

    Amazon has by far the most customer reviews and the most detailed customer reviews. I usually buy based on these reviews and have never been disappointed. That is the only reason I shop there more than other places.
  • can't wait till someone finds a hack to put your own links on any website..
    would kinda defeat the link popularity thingy..
    maybe search engines would have to actually look at content.
  • This isn't just a phenomenon created by the institutions of the web (i.e. the link based search engine rankings, cross-linking, contextual linking, and purchased ad and link real-estate).

    I believe there is also a deeply psychological phenomenon that is exacerbated by the anonymity of the web. Specifically, in the real world we can go into a shop and see the merchandise, look into the eyes of the staff, see the condition of the store, etc. These things foster an ability to trust both the product and the business because we've experienced it firsthand and have taken in clues as to the quality and trustworthiness of the products and the people. On the web, however, that function is limited to the professionalism of the interface and the name recognition/popularity/pervasiveness of the company.

    Perhaps the answer for companies that aren't in the forefront on the web is to develop relationships with online communities that might be related to their products, make greater use of customer testimonials and provide better descriptions and actual quality photographs of merchandise.
  • Since Google rewalks the internet roughly once every month, all it takes for a competitor to have equal link-share is to completely spam the internet with their links (maybe going down the list of sites that post Amazon links and offering to pay each and every one of them to change it to a link to your site) and proceed to wait one month. One month for a total reversal of mind share is absurd in the "real world", so it looks like the internet isn't doing quite so bad after all.
  • amazon is so popular because, unlike all other web sites, it's been designed by people who knew what they were doing. that's all there is to it, but it's amazing how difficult it is for people, including computer scientists, to grasp this very simple point.

  • Isn't this just a variation on the psychological premise that people can only deal with 7 (give or take) things at a time? This leads to the conclusion that there could only be a few major companies in any market because more than half a dozen is too many for the customers to keep track of. Therefore, the goal of any company is to be the one with the most recognizable name in the particular market they're shooting for.
  • The Internet, as had been envisioned by Wired dreamers in the early days, was seen as a great sea where a thing could be accessed by any other thing. A thing being a person/user/hub/server whatever. At the time, digital things had no value, only the machines that made it possible were of value... and not too valuable. The Net itself was the "product", not what was on it.

    The point was to give a massive distribution super power to anyone connected. And vice versa. Anyone could access you too! Tools were used to find specific things, and they worked well.

    Then some jack-ass (I think his name was Bill something...) jumped up and exclaimed; "Hey! You should sell that software! Not give it away for free stupid!" and thus the current situation formed.

    Commercial applications were born. Now children, we all know that the difference between a commercial thing and... something else...*..., is that the commercial thing could care less about giving you anything and is, in fact, designed to do exactly the opposite within the confines of the law.

    Google is meant to "guide" you to commercial sites, most likely Google partners, not present information categorized in a logical and academic fashion. That would be giving it away for free, wouldn't it? What are you? A TERRORIST OR SOMETHING! HUH!?!?? HUH??? WELL??? *aims laser bomb*

