Web: 19 Clicks Wide 114
InitZero writes "The journal Nature reports that the web is only 19 clicks wide. What it fails to mention is that at least one of those must be through Kevin Bacon." The graphic at the beginning of the article is gorgeous in a Mandelbrot style; now if I could just have it as a 24 x 30 print.
19 hops "on average"?... (Score:3)
The Internet the first in-organic life? (Score:1)
While looking at the mess of tangled wires in our company's engineering patch room, we have often commented on the near-organic appearance of the patch cables interconnecting our company. Now, this is just a very small subsection of the entire Internet. Doesn't it seem possible, with hundreds of thousands of patch rooms, protocols, and processors out there, that something could evolve?
It would start with a few anomalous packets zipping back and forth, reconfiguring routers to interconnect into a giant super-being. Its first triumph as supreme net-being would be to spam us all in every known language: "Could you please stop pinging, it gives me indigestion... *burp*!"
-AP
Interesting. (Score:1)
I also administer the NDLUG (http://www.ndlug.nd.edu/ [nd.edu]) web server, and noticed massive spidering from the same machine on campus.
Now I read this article and see this quote:
"The Web doesn't look anything like we expected it to be," said Notre Dame physicist Albert-Laszlo Barabasi, who along with two colleagues studied the Web's topology.
So, I guess I don't have much of a point, but it's kind of cool to see that something actually came of some people in the College of Science abusing our poor 486 webserver [nd.edu]...
Re:Interesting article (Score:2)
Northern Light [northernlight.com], as I recall. I don't know that that claim is verified in any credible way, however.
Distance from microsoft.com to slashdot.org (Score:2)
Start: www.microsoft.com
Re:Example program (Score:1)
"Trailblazing" through link-space was a prime motivator of Bush's Memex vision: finding new paths between separate "pages" of information was the same as discovering new relationships between discrete pieces of knowledge. In fact, knowledge can be thought of as connections between previously unlinked sets of facts.
In all seriousness, finding a link-path between two separate pages is a thorny issue. First, you are dealing with a directed graph, and as the posts above point out, a link-path from A to B probably won't contain the same set of pages as a link-path from B to A. Then there is the issue of _which_ link paths are useful (and I believe there are some which could potentially be useful) and which aren't; this is largely a decision made based on the weighting you've placed on the Web Pages in question. Finally, there is the issue that you have to have link-structure information sitting around for a good chunk of the Web before something like this could actually work.
But it would be neat!
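For what it's worth, the search itself is the easy part. Here's a minimal sketch (the toy link graph and page names are made up, not from any real crawl) of finding a shortest link-path with breadth-first search on a directed graph; note how A -> D works but D -> A fails, exactly the asymmetry mentioned above:

```python
from collections import deque

def link_path(graph, start, goal):
    """Shortest chain of clicks from start to goal via BFS.
    Returns None when no directed path exists."""
    if start == goal:
        return [start]
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        for nxt in graph.get(path[-1], ()):
            if nxt == goal:
                return path + [nxt]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# A toy directed link graph: A -> B exists, but B -> A does not.
links = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}
print(link_path(links, "A", "D"))  # ['A', 'B', 'D']
print(link_path(links, "D", "A"))  # None: the links are one-way
```

The hard parts remain exactly the ones listed above: weighting which paths are useful, and having link-structure data for a big chunk of the Web in the first place.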
Re:Interesting article (Score:1)
I personally don't think this was a really meaningful survey, because of the large range it found. Also, does one really care how far away you are from an arbitrary page? I certainly don't. Generally I care how far my information is from a search engine, which is generally 1-2 clicks, and how easy it is to find said information based on results.
Rarely do I start from a random page, and try to get to information by clicking through links.
Beyond Cool (Score:1)
Re:The Internet the first in-organic life? (Score:1)
No but I can tell you something else interesting (Score:1)
The dialog box has a number of buttons, of which the fourth one down is "Open Source". However, the one on my version doesn't work -- ie it does not open the source of Microsoft Windows.
Sorry to have wasted your time, really.
jsm
Power law functions can be random (Score:2)
Stochastic (random) functions can be characterized by a range of power-law functions and other spectral shapes. The randomness of a power-law function is given by the randomness (e.g., mean and variance) of the individual components of the spectral function. For instance, draw an x-y plot consisting of a straight line with a negative slope. The y-axis is the amplitude, while the x-axis is the frequency (or the inverse wavelength). Now suppose that this straight line represents the "average" value, with random fluctuations existing about it. This is a power-law random function.
Sorry if this is oversimplified. BTW, fractals are characterized by a power-law function. OTOH, true fractal functions have constraints on what the power-law slope can be (Hausdorff dimension).
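If it helps, the construction above can be sketched in a few lines. The slope alpha = 1.5 and the lognormal jitter are arbitrary assumptions, purely to illustrate "random fluctuations about a power-law mean":

```python
import random

# Sketch of a power-law random function: the average spectrum follows
# amplitude(f) = f ** -alpha, with random (lognormal) fluctuation about
# that mean line. alpha = 1.5 is an assumed slope, not from the article.
alpha = 1.5
random.seed(0)

freqs = [2 ** k for k in range(1, 9)]  # frequencies 2 .. 256
spectrum = [(f, f ** -alpha * random.lognormvariate(0, 0.3)) for f in freqs]

# On a log-log plot these points scatter about a straight line of
# slope -alpha: a power-law mean with randomness on top.
for f, amp in spectrum:
    print(f"{f:4d}  {amp:.6f}")
```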
Now for something silly. What is the degree of freedom from Gore (Father of the Internet) to Slashdot (the bastard child of the Internet)?
Re:Asimov's second foundations - first (Score:1)
The mathematical aspect was indeed the most interesting part of the Foundation books, and it always surprised me that Asimov played it down as the series continued. Sort of like how the problem of trying to implement an absolute ethical system in a being was the most interesting part of his Robot stories, yet he wormed his way out of that (zeroth law, etc.).
-
Re:I fail to see how this could be useful. (Score:1)
1. We are given a real-world (probabilistic) distribution of link distances between pages (i.e. given two randomly chosen pages, what is the probability that the shortest/longest link distance between them is X?)
2. From the visualizations, we can see that the web is a graph containing a number of densely connected components which are themselves only fairly loosely connected to one another, and that this behavior is fairly scale-independent.
These two tidbits could lead to impressively improved Web crawlers. You could decide to stop following links once you've gone 25 deep, for example; you could try and determine on-the-fly if more than one of your crawler processes is working on the same densely connected component of the Web and combine their efforts (or move one of the processes over to a new uncharted component), thus effectively searching more of the web. Using similar statistics for distribution of in-link and out-link counts, you could improve crawler heuristics so that pages with a number of out-links significantly deviant from the mean are given more weight for future crawling.
Oh well, just some random thoughts.
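The depth-cutoff idea in point 1 above can be sketched as a toy breadth-first crawler. The 25-click cutoff and the `get_links` stand-in are illustrative assumptions, not anyone's real crawler:

```python
from collections import deque

MAX_DEPTH = 25  # cutoff suggested by the measured link-distance distribution

def crawl(seed, get_links, max_depth=MAX_DEPTH):
    """Breadth-first crawl that stops following links past max_depth.
    get_links is a stand-in for 'fetch page, extract outbound links'."""
    seen = {seed}
    frontier = deque([(seed, 0)])
    order = []
    while frontier:
        url, depth = frontier.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # don't follow links past the cutoff
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append((link, depth + 1))
    return order

# Tiny simulated web: each "page" links only to the next one in a chain,
# so the crawl should visit exactly depths 0 through 25.
chain = {i: [i + 1] for i in range(100)}
visited = crawl(0, lambda u: chain.get(u, []))
print(len(visited))  # 26
```

The component-detection idea would sit on top of this: compare the `seen` sets of two crawler processes, and merge or redirect when they overlap heavily.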
Re:Islands (Score:1)
Since we're talking about a discrete set (a directed graph), you can forget about "upper limit" and just say "maximum of".
So what they are measuring ("the average distance between two random pages") does *not* match the mathematical definition of a diameter, in spite of their claims.
I must admit I dunno what is the right term for what they are measuring, though.
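For the curious, the distinction is easy to show on a toy example: the study's figure is the average of all-pairs shortest path lengths, while the mathematical diameter is their maximum. The four-page directed graph here is made up:

```python
from collections import deque

def shortest(graph, a, b):
    """Directed BFS distance (in clicks) from a to b; None if unreachable."""
    dist = {a: 0}
    q = deque([a])
    while q:
        node = q.popleft()
        if node == b:
            return dist[node]
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                q.append(nxt)
    return None

web = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": ["A"]}

dists = []
for a in web:
    for b in web:
        if a != b:
            d = shortest(web, a, b)
            if d is not None:
                dists.append(d)

average = sum(dists) / len(dists)  # what the study reports ("19 clicks")
diameter = max(dists)              # the graph-theoretic diameter
print(average, diameter)           # 1.75 3
```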
Re:But they fail to mention... (Score:2)
But on a more serious note (yeah, right), I once thought about the following. It is too bad that this site is not more pro Microsoft (boy, does this company's name have Freudian meaning, right Billy boy?). Then Rod can put up a link to assembler info. It would be:
http://slashdot.org/asm
Re:Here's another interesting site (Score:1)
I don't really know much about this stuff, but apparently somebody/thing at SURAnet (server: mae-east.ibm.net, located Vienna, VA) is "causing packets to be lost" on their way to Frisco...
Personal web sites (Score:1)
Every llama has their own website...what are you talking about??
Distance from slashdot.org to microsoft.com (Score:1)
www.microsoft.org [microsoft.org]
:)
Only one click! (Score:1)
Re:The Source code for that mandelbrot set. (Score:2)
Re:Clustering (Score:1)
I read that article, and I remember that it sounded a lot like what Google already does.
Re:Cool Article. (Score:2)
Those who don't care to do so, well, won't.
The real limit is on the number of people willing to be creative.
I don't think it has anything to do with "cool" technologies - whether I have JavaScript rollovers on my site or not doesn't affect the quality of the content itself.
D
----
But they fail to mention... (Score:5)
Re:Cool Article. (Score:2)
Using the tree analogy:
- Yes, the tree will get a *lot* bigger.
- Yes, the tree can only get so big.
- Yes, leaves (pages) and branches (sites) will fall off and hit the WWG (world-wide ground).
- Yes, there is a gardener, but he's only interested in a branch or two.
- No, I haven't gotten much sleep lately
The Lord DebtAngel
Lord and Sacred Prince of all you owe
More Pictures of the Web (Score:1)
Tried this before (Score:1)
My effort never quite got off the ground though
Re:Shape of the web (Score:2)
If you were asking about geometrical shape, that would be true. But topological shape is a little different. Topologically, a coffee cup and a donut both have the same shape: each has exactly one hole.
Similarly, their power-law reference means that the web is fractal in dimension, which is not the usual 1-2-3-4 dimensionality commonly meant. I would imagine this dimension is somewhere between 1 and 2: it's a set of lines (one-dimensional) that are almost dense enough to fill an area (two-dimensional).
"But even if you only assume two or three dimensions, why 'clicks wide'? "
It sounds like they are looking at the web as a graph, a series of points (web pages) connected by edges (links). The width of a graph might be found like this:
For each pair of points in the graph, find the shortest path along edges between the points (in terms of number of edges.) The maximum length among all these shortest paths is the width of the graph.
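That procedure translates directly into code. Here's a sketch using one breadth-first search per node (the five-page ring of mutually linked pages is a made-up example):

```python
from collections import deque

def bfs_dists(graph, src):
    """All shortest click-distances from src, via breadth-first search."""
    dist = {src: 0}
    q = deque([src])
    while q:
        node = q.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                q.append(nxt)
    return dist

def width(graph):
    """Maximum, over all reachable pairs, of the shortest path length."""
    return max(d for src in graph
                 for d in bfs_dists(graph, src).values())

# Hypothetical five-page ring: each page links to both neighbours.
ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
print(width(ring))  # 2: opposite pages are two clicks apart
```

On the real Web the graph is directed and some pairs are mutually unreachable, which is part of why "width" is slipperier than it first appears.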
Another good question (Score:2)
I fail to see how this could be useful. (Score:2)
And, ooh, the web exhibits properties of exponential growth, with some sites that have many more links to other sites. Like I couldn't figure *that* one out. Some people post their bookmarks, and lists of links, and other people only link within their interests. A graph of this might look interesting if done correctly, but I still don't see how this would be that useful.
The graph at the top was pretty, though, it looked like an IFS fractal. They look like stuff found in nature too, so I guess that gives this article a context to exist...
Re:Islands (Score:4)
Seen this before (Score:1)
Re:Cool Article. (Score:1)
-Rich
Re:Islands (Score:1)
Re:Dimensions of the web (Score:1)
The fractal dimension is an invariant of the topological space, so its embedding in a superspace like 2D or 3D Euclidean space is not important in terms of its fractal dimension.
I wasn't trying to imply that the lines have area, but a space-filling curve has fractal dimension close to 2 because it "nearly" fills an area, and does so in the limit. It was a rough, and possibly poor, attempt at an analogy to this situation. You may be right that fractal dimension isn't important here, but my guess is that the increase in diameter for a given increase in nodes that was mentioned in the article could be calculated based on that dimension, as it relates to how self-similar structures scale.
The map is not the Internet (Score:2)
I agree with many here that the analysis described in the article is not much to talk about, and yeah, not much use except for generating some cool fractalized images. But the basic premise behind their work may be the only real way of performing a true mapping of the Net's shape and growth, something that will be important in the years to come.
You can't build a useful map of the Internet's structure in the way you map the streets that wind through your town. Future search tools will require a fair amount of intelligence not in the way they go about a search, but in the way they 'think' about a search (there is a difference). Topographical mapping -- not index cataloging -- will help developers figure out these ways of thinking.
Re:Shape of the web (Score:1)
I think the Web analogy is the most accurate, with multiple routes to each site.
Maybe I'm just thinking two-dimensionally; lousy brain.
WWW, galaxies, and Slashdot's black holes! (Score:2)
"It's alive! It's alive!!!"
Seriously, though, that's very interesting, but it's actually obvious when you think about it. The reason why galaxies are not distributed randomly is that there are centres of attraction that begin as random fluctuations in an evenly-distributed environment; but as matter condenses, these types of patterns emerge.
Now, replace "gravity" with "number of hits". A site with a lot of hits, of course, represents a centre of interest, where people congregate. And naturally, they will either link to the site, or try to get linked from it.
And so, the same patterns emerge.
Hey, that means Slashdot is kinda like a black hole generator! Once it aims its beam at a site, it submerges it with hits until the site reaches critical mass and implodes, dropping out of the known Universe!
"There is no surer way to ruin a good discussion than to contaminate it with the facts."
Re:People are closer? (Score:1)
I believe this is false. Counterexample:
There are people in remote regions who do not have contact with many people outside their groups. Say there exists a tribe (let's call them A) in a forest in Indonesia. Suppose the only contact this group has with the outside world is through some anthropologists. Now suppose there is another tribe (B) in the same region that has contact only with A. Assume there are also tribes C and D in the same situation as A and B, but in a totally different region, maybe Africa. Now consider the degrees of separation between a child in tribe D and a child in tribe B. Clearly it would be something like child(B) -> parent -> tribe A -> anthropologist -> ? -> anthropologist -> tribe C -> parent -> child(D). In order for the six degrees of separation to hold, the two anthropologists have to know each other directly. This isn't necessarily true given the number of anthropologists around.
I believe the six degrees of separation came about when someone figured that everyone knows at least 30 other people. Therefore a given person is separated by one person from 30*30 = 900 people. Analogously, a person is separated by 6 people from 30^7 ≈ 22 billion people. However, this doesn't take into account the redundancy in the relationships.
For example, many of the people that your friends know are from small cliques, so the real relationships look like many tightly interconnected clusters with a few connections between clusters.
BTW, I think I've been doing too many math proofs.
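Since we're doing math: the branching estimate above is easy to check (30^7 comes out at about 22 billion, comfortably above the world's population):

```python
# Back-of-the-envelope "six degrees" arithmetic: if everyone knows
# roughly 30 people and acquaintance circles never overlapped, the
# population reachable within k hops would be 30 ** k.
KNOWN = 30

reach_2 = KNOWN ** 2   # one intermediary: 900 people
reach_7 = KNOWN ** 7   # six intermediaries: 21,870,000,000

print(reach_2, reach_7)
# In reality cliques overlap heavily, so the true reach grows far
# more slowly than this pure branching estimate suggests.
```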
RFC: A New Game, Webpardy (Score:1)
First example: from the /. home page to... (let's see, something really obscure...) this one [thegoodnam...etaken.com]. Ready, set, go!
Of course, me and my investors hope that this path is also packed full of computer-generated Web ads, since this is how we get rich^W^W make it more fun for everyone!
Re:People are closer? (Score:1)
Ahh! But! We are touching upon a _very_ important issue here: latency. Ours is a tad high today, since she's in Europe and I'm in California. It'll probably be around noon EDT before I get a response out of her.
An ICMP-like roundtrip could maybe be shorter than that (I have her hotel number), but the annoyance that that would generate would probably void the possibility to get a full blown TCP-like connection setup any day soon.
Breace.
Re:Islands (Score:3)
Re:People are closer? (Score:2)
You had to create a rather special example, although a real-world worst case would probably involve anthropologist -> ? -> anthropologist replaced by merchant -> city resident -> city resident -> merchant, which is just a little longer.
I think an example of the real-life short circuits is my wife's friend of a friend. My wife is from another country. We quickly found a friend of a friend of hers five miles from our home in this country. It seems unlikely, but the real math is something like this:
Re:People are closer? (Score:1)
On the other hand, I'm sure that on average people know a lot more than 30 people. If you count unidirectionally knowing someone, it will be even higher.
Breace.
Separation (Score:1)
Re:The Internet the first in-organic life? (Score:1)
So in that vein, we may have already created our own encounter with a different life-form. But not the first -- the memetic order is, in the book, a separate order of life that survives in the normal universe as a mental parasite. That is, it's an idea (meme) that can spread and is hard to get rid of.
Great book, but you really need to read the first two -- Brightness Reef and Infinity's Shore to understand the third.
-- Dirt Road
Silliness (Score:1)
Me Fit Where? (Score:1)
hops (Score:1)
Islands (Score:2)
Interesting article (Score:2)
Hubs (Score:2)
FYI: Southwest Airlines doesn't use a hub-based system of flights, but does direct flights between cities. Many flights are thus 'direct' but not 'non-stop'. They're also pretty cheap. Don't factor the cheapness into your analogies.
Re:Hubs (Score:2)
That said, other than webrings, search engines could be your Southwests. They get you to places directly, but they traverse the web themselves through a series of links.
Another possible interpretation: any real surfer is her own Southwest, who, when trying to find a piece of information, hits various hubs, and follows links to the source in a somewhat roundabout but usually successful manner. The analogy's weak.
There's a much closer analogy to Southwest Airlines when you look at the Internet rather than the Web; there you do have well-defined hubs (the big backbone routers) and carriers, who, admittedly, use each other's networks. It would be as if Southwest could get you from Dallas to Porvoo, Finland by flying you on its planes through its hubs to New York, then getting you on a British Airways plane to London, then a FinnAir plane to Helsinki, then a FinnAir prop to Porvoo. Feel free to extend that analogy...
People are closer? (Score:1)
Anybody else knows more about this?
(I can't verify it right now cause she's not here...
Breace.
Cool Article. (Score:2)
"I think that we might end up in an era where, just as
people today have their own e-mail addresses, people will
have their own Web sites," he said. "But eventually it will
taper off. Eventually it has to be self-limiting."
That last sentence makes me think he isn't too sure of the Web's self-limiting qualities. I personally don't think it will ever taper off. Just about the time it starts to get stale, the netizens will get a new toy (a la JS rollovers, Java applets, Flash, Shockwave, whatever). There will always be too much excitement and new technology.
And, just as it seems we've run out of things to do, we might actually have a moon base with a couple hundred thousand miles of 100BaseT. Voila, a brand new web to play with.
Re:Cool Article. (Score:1)
So, yeah, I agree with Zantispam. :)
The Source code for that mandelbrot set. (Score:4)
Joseph Elwell.
No way... (Score:1)
Search engines (Score:1)
Re:People are closer? (Score:1)
Shape of the web (Score:3)
Hmmm...
To figure a shape to the web I would think you would first have to decide how many dimensions it has. Perhaps by assigning a dimension to each method of getting to a page, or perhaps by counting each hyperlink into a page as a separate dimension. Either way it could get pretty hairy pretty quick.
For example, is a hyperlink on a search engine different in some way from a hyperlink on a personal page? How about a web directory? Bookmarks?
But even if you only assume two or three dimensions, why 'clicks wide'? Seems more like 'clicks deep' to me. I always think of clicking on a hyperlink as 'drilling down'. Showing my age again I guess...
Jack
Re:Search engines (Score:3)
But then do you calculate width by starting at point A and continuing to point B? Because if that were true, then the search engine argument would still be relatively benign, as you would still have to reach a search engine from page A in fewer "clicks" than it would take you to go straight.
If the "true" diameter is required, one could measure in any fashion as long as we agree on a definition of "click" (which I define, for myself, as only mouse presses). So sites such as Yahoo might bring unrelated pages closer. But without typing, would many pages really be fewer than 19 clicks away? Yahoo is still categorically sorted, so to reach an unrelated site you would need to traverse back up the Yahoo category tree after leaving the first site.
Joseph Elwell.
Distance + Content = ? (Score:2)
It would be interesting to see if, for example, internet traffic patterns show any kind of focus or foci about certain domains or sites or even specific boxen, and how those machines are distributed in real space. . . where, essentially, are our eyeballs and electrons going?
As for "dimensions," a 3D rendering would be the easiest to comprehend. Perhaps a sphere representing the globe, with an atmosphere of satellite link channels, and a substrata of bandwidth pipes and routers. Or a flat geometric field with peaks to represent the Big Iron, fractal spires twisting off as homepages and smaller sites. And isolated islands or floating moons of self-contained networks, or pages that go nowhere.
Don't mind me, I just finished reading Snow Crash, Diamond Age, and Idoru, and would enjoy a virtual walk through the data we're all accumulating.
Rafe
V^^^^V
One thing I don't buy (Score:1)
Round trip? (Score:1)
Of course, it's no problem surfing the web continuously for 80 days (with a T3 and a tank of coffee), but how far would you get by then?
Where does it all start? End?
-
The internet is full. Go away!
Re:Search engines (Score:2)
Re:Shape of the web (Score:1)
OK, I'll buy what you are saying. In fact I can imagine this 'shape' better by thinking of it as a 'cyclic directed graph' made up of nodes and edges than I can from the picture in the article.
But I am thinking of this in programming terms because cyclic directed graphs are a data structure I understand (somewhat). It isn't something I could easily describe to a non-programmer, even using a white board. The thing is, this view of the web is nothing new! As a data structure the web has always been, was even designed to be, a 'network' (read 'cyclic directed graph').
So now we are back where we started. We haven't learned anything new from the article other than the fact that the average number of links between any two 'nodes' is 19. A number that is meaningless because it is perturbed by the large number of sites that link many pages to a single home page (all pages one or two clicks away). For purposes of developing new search and crawling algorithms, it would have been more useful to show the number of links out of a site versus links within it.
Jack
Clustering (Score:3)
Wait, how old again? (Score:1)
I just think the people that made the foundations (like the creators of ARPAnet, etc.) should be given a little credit, that's all.
points (Score:2)
Re:Interesting article (Score:2)
Also, one must factor in whatever percentage exists for pages/sites that may link to the outside but are not linked back to. These one-way linkages would skew the click average.
The REAL question is: (Score:1)
"The number of suckers born each minute doubles every 18 months."
Here's another interesting site (Score:2)
And with this link, you're but a click away 8^)
http://visualroute.datametrics.com/ [datametrics.com]
Re:Another good question (Score:2)
I, for instance, have two websites I use to distribute private files and such to friends. But to the best of my knowledge, neither of these has links to it.
They need to recruit a few ISPs and examine all the pages on them (after stripping usernames) to see how many webpages don't have links to or from them, then use that figure to adjust their estimate of the total number of pages.
(They need to get an estimate of the number of pages they can't find by spidering, and the only way to do that is go to the source...)
Re:Hubs (Score:1)
Dear Lord no! There's too many of us here now.
Re:I fail to see how this could be useful. (Score:2)
However, it might be good to identify growth and stagnation if we're going to be that complex about it. Hmm.
Well, I guess it's food for thought, anyhow. But I don't think this is a radically new vision of the web...
Re:The Internet the first in-organic life? (Score:1)
-AP
Re:Islands (Score:1)
Sounds like Gibson (Score:1)
"The Matrix has a shape" etc..
Maybe when we have discovered that shape, we will discover life on Alpha Centauri as well...
(No not the game
Flaw: Based on static, public hyperlinks? (Score:2)
What about sites with logins, where hundreds of pages are hidden from public view?
It seems to me that most of what's interesting about the emerging behaviour of the web is buried within one of those two types of sites... discuss?!
Nice posters of Web topology that you can buy (Score:1)
(No, I don't work for them
Example program (Score:2)
Re:hops (Score:1)
Re:The Source code for that mandelbrot set. (Score:1)
Clicks more interesting (Score:3)
Internet Distance Maps Project [umich.edu].
For more pretty pictures, check out the Internet Mapping Project [umich.edu].
sixdegrees (Was: Re:People are closer?) (Score:1)
I killed the browser window at that point.
Once my SO twigged to the fact that you *have to* spam people in order to join, she felt as bad as I did. Needless to say, she hasn't been back.
WWW != Internet (Score:2)
Hmm.. (Score:1)
Re:Wait, how old again? (Score:2)
Re:Me Fit Where? (Score:2)
It was on that badass little graphic. There are links to info about LOC records in Bind8 that hold longitude and latitude info. You can then be plotted on the map. Or something like that. I just went for more of the purtty pictures.
Re:Cool Article. (Score:1)
The whole process gives a whole new dimension to e-mail and general communication, with this posting as an example. I've started to think and communicate in hypertext.
Dimensions of the web (Score:2)
A second factor is that you must consider what you are calling dimensions. A representative graph may be made in any number of dimensions - flattened to two, or made in 3D. But the dimension in the fractal sense is a different animal. No matter what dimension you draw it in, the fractal dimension stays the same. Yes, the dimension would be between 1 and 2, because as the number of links -> infinity, the 'perimeter' does too, but the lines certainly don't have an area!
The shape of the web, however, is not about fractal dimensions. It's about summarizing and arranging the points and connections in such a way that clustering and localization phenomena begin to emerge. With 800 million+ nodes, this task is nearly impossible - however, an analogous structure of fewer nodes and clusters can be made that will have visible patterns.
Re:Wait, how old again? (Score:1)
If you're interested in testing the 6 degrees... (Score:1)
Plankton Utility (Score:1)
Re:Hubs (Score:2)
Yahoo does a good job of sending feelers out to other pages, but does a shitty job of getting linked to. A page like Netscape.com [netscape.com] probably gets a lot of references (This page brought to you by blah blah.) if not as many connections out.
We just need more pages that say This page brought to you by
Why Google ROCKS!! (Score:2)
Google, which ranks its results by link "importance"
Hmmm, follow the herd or go for something "important" (perhaps the perfect word for a search, a clue if you will)? Seems like a pretty simple decision. I suggested it to the folks in my company (along with www.m-w.com and babelfish) and they love it. I'm feeling lucky.....
(Get Andover to buy it, or maybe the other way around...)
Re:People are closer? (Score:2)
Yeah, but at least she's only one hop away from you. She's at least two hops away from most of us, so I would have to say that the best person around here to ask is .. um .. you.
---
Have a Sloppy day!
Re:No way... (Score:1)
Overall this reminds me of a game we used to play in the college computer clusters. You get a bunch of people to start at a really innocent site, say, whitehouse.gov, and then race to see who can get to playboy.com the fastest just by clicking. Pretty interesting stuff really.. it's amazing how close some really conservative sites are to hardcore pr0n, when measured in number of clicks.
misleading (Score:2)
the web is only 19 clicks wide.
To me, this means the maximum distance between any two sites is 19 clicks. Sort of the way that people claim that only six degrees separate us from any other person in the world. This would be an impressive display of the "web" aspect of the world wide web.
But this isn't the claim at all. If you read the article, it says that
there's an average of 19 clicks separating random Internet sites.
Different story altogether.
Asimov's second foundations (Score:2)
Every time I see an article about a statistical study of something created by man, I get a flashback to Asimov's Second Foundation: how mathematics can generally describe man, events, history...
This study bears some distant similarity to it: statisticians studying the average distance between two randomly chosen Internet sites. The catch is that the entire structure is created by man; there really isn't much randomness in it. Compared to the Bacon thing, where you may have met someone who knew someone at one point or another while walking down the street, having a link from your web page to another web page is a completely conscious action.
Which brings me to the counter-argument that news sites such as Slashdot or c|net have their content (and thus their links) influenced by random events of the outside world, such as tornadoes or floods.
Which now brings me to a conclusion before I head back home: how long will it be before someone attempts to measure the amount of randomness in the web, that is, the influence of events that cannot be predicted (to some extent or other, to be determined later) by man? Is the web something that could over time become completely predictable?
I really should reread these Asimov books.