anaesthetica's Journal: The Math behind PageRank

Journal by anaesthetica on Wednesday December 06, 2006 @02:55PM

The American Mathematical Society is featuring an article with an in-depth explanation of the type of mathematical operations that power PageRank. Because about 95% of the text on the 25 billion pages indexed by Google consists of the same 10,000 words, determining relevance requires an extremely sophisticated set of methods. And because the links constituting the web are constantly changing and updating, the relevance of pages needs to be recalculated on a continuous basis.

This discussion has been archived. No new comments can be posted.

The Math Behind PageRank

10,000 words (Score:5, Funny)

by ambivalentduck ( 1004092 ) writes: on Wednesday December 06, 2006 @06:48PM (#17139072)

But 9,000 of those words are slang for parts of the human anatomy. Go figure.

- Re:The two that matter (Score:1, Interesting)
  
  by Anonymous Coward writes:
  
  There's only two that really reflect the power of Pagerank: Click here. [google.com]
  About 1.2 billion pages, and surprise surprise, Acrobat Reader tops the list, followed by a who's who of internet applications and plugins. But around result #30 it gets a bit more interesting, and when you're a few dozen pages in, "new patterns begin to emerge."
  
  And to explain why not to use "click here", I found this [w3.org] buried on page 45. Thanks for the proof pudding guys, it's delicious.
- Re: (Score:1)
  
  by binaryacid ( 979368 ) writes:
  
  Not directly related to this reply, but putting it here for visibility. Not self-promotion. Just would like to provide some useful reference:
  The Anatomy of a Large-Scale Hypertextual Web Search Engine
  http://infolab.stanford.edu/~backrub/google.html [stanford.edu]
  - This paper tells you what PageRank really is, by the original author.
  Efficient Computation of PageRank
  http://dbpubs.stanford.edu:8090/pub/1999-31 [stanford.edu]
  - This paper tells you how they efficiently compute it
  
  And as far as I know about information retrie
- - - - Re: (Score:2)
        
        by MadAhab ( 40080 ) writes:
        
        On the other hand: explain Gallagher and Carrot Top. "Apparently" they are funny, because they have "careers". Yet everyone with an actual sense of humor knows they are just waiting to unhinge their jaws and swallow you whole.
PageRank doesn't seem to be based on keywords (Score:4, Informative)

by dada21 ( 163177 ) * writes: <adam.dada@gmail.com> on Wednesday December 06, 2006 @06:50PM (#17139112) Homepage Journal

I have sites with a PR of 6, and I can tell you that they got that way because of inbound links from other sites. In fact, when other sites dropped those links, my PR dropped (to 5, and even to 4). Getting more inbound links brought the PR back.

Think about those links, too. How often do you use common words in an HREF? I don't think there's a lot of weeding out of common words since the link to a site is usually either its name, or a description containing some important keywords.

I love seeing these technoscientists think they understand PageRank, but just like TimeCube, they're way, way off.

- - - Re: (Score:1)
      
      by Raenex ( 947668 ) writes:
      
      I'm behind on my Slashdot reading, but I wanted to offer you a supportive comment even if it isn't timely. You're right, the original poster only read the summary and got modded up for a stupid comment based on not RTFA.
      That said, your comment contained more insult than explanation (yeah he didn't RTFA, but point out the discrepancy in his argument). The more inflammatory your message, the less likely it will be considered. I know, it's tempting to flame, and I do it myself now and then, but not near
- Re: (Score:3, Informative)
  
  by markov_chain ( 202465 ) writes:
  
  There has been a PageRank paper out there since 2000 or so, so it's not exactly a secret how it works. Basically an initial set of relevant pages is pulled from the database and ranked by doing some computation on a connectivity matrix. The trick is to come up with a good initial set; and unless they managed to implement an all-knowing oracle they probably do it by doing a keyword search. Here's where the article summary makes sense; if most pages have the same keywords, a keyword search is going to come
  - Re: (Score:3, Interesting)
    
    by Anonymous Coward writes:
    
    It's not secret [google.com].
    - Re:PageRank doesn't seem to be based on keywords (Score:5, Funny)
      
      by kimvette ( 919543 ) writes: on Wednesday December 06, 2006 @09:42PM (#17140822) Homepage Journal
      
      You got the link wrong. Here is the correct URL. [google.com]
      
- Re: (Score:2)
  
  by zootm ( 850416 ) writes:
  
  If you're referring to the article, it focuses on the "links" aspect when describing the PageRank algorithm. The summary on here is pretty misleading in that way.
- Re: (Score:2)
  
  by Pollardito ( 781263 ) writes:
  
  Think about those links, too. How often do you use common words in an HREF?
  interestingly, it appears that Adobe Acrobat leads the list of results [google.com] when you search for "here" on Google (you can download it here [adobe.com]).
  
  and who would have expected this [google.com]
- Re:Pagerank is cool (Score:5, Interesting)
  
  by silentounce ( 1004459 ) writes: on Wednesday December 06, 2006 @08:02PM (#17139982) Homepage
  
  Interestingly enough, google thinks so, too. [google.com]
  
  Of course, yahoo has its own opinion. [yahoo.com]
  
  Although, altavista seems to almost agree. [altavista.com] Check the second non-advertised result.
  
  I do find this [google.com] amusing though. Third place, how humble.
  
  I didn't expect such interesting results. The site with the search term in its url was tops for av and yahoo, but not google. Yahoo ranked the wiki entry above google, but av reversed that decision, google of course thought itself was more important than the wiki. Google's own reference site was number one in its own search and near the top in the other two, but pagerank.net wasn't even in the top 10 for google's search. I'm not sure what conclusions can be drawn from all that, but it is definitely food for thought.
  
  - Re: (Score:2)
    
    by ben there... ( 946946 ) writes:
    
    I do find this amusing though. Third place, how humble.
    What I found interesting about that link was the description listed for google's entry:
    Google - 11:54pm
    Enables users to search the Web, Usenet, and images. Features include PageRank, caching and translation of results, and an option to find similar pages.
    www.google.com/ - 5k - Dec 5, 2006 - Cached - Similar pages
    Where did they get that text from? It's not anywhere to be found in the source [tinyurl.com]. Did they cheat? Or are they just tricky?
    - Re: (Score:2)
      
      by McDutchie ( 151611 ) writes:
      
      Where did they get that text from? It's not anywhere to be found in the source. Did they cheat? Or are they just tricky?
      They got it from the Google category [dmoz.org] at the Open Directory Project at dmoz.org [dmoz.org], mirrored at directory.google.com [google.com]. Google is a user of dmoz.org data but has completely de-emphasized that as of late.
      It's actually against the dmoz license agreement to use their data without a link back to the source, but nobody seems to care.
    - Re: (Score:2)
      
      by flood6 ( 852877 ) writes:
      
      Whenever possible, Google uses the DMOZ description [dmoz.org] for the snippet shown in the results.
  - Re: (Score:1)
    
    by krishn_bhakt ( 1031542 ) writes:
    
    Things change with key word "search engine"... MSN is 1st! [ http://www.google.com/search?hl=en&lr=&q=search+engine [google.com] ]
  - Re: (Score:1)
    
    by MotF Bane ( 1016229 ) writes:
    
    Use the Google search, try for "best search engine". They don't list themselves. Then go try Yahoo search, try for "best search engine"....
Bad summary (Score:5, Interesting)

by Knights who say 'INT ( 708612 ) writes: on Wednesday December 06, 2006 @07:06PM (#17139318) Journal

The article specifically says the PageRank eigenvector is only recalculated once a month, approximately. Even though Google uses some clever numerics to calculate the eigenvectors to a 25 billion by 25 billion matrix by iteration, it still takes several hours to finish.
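
To make "calculating the eigenvector by iteration" concrete, here is a minimal sketch of plain power iteration on a made-up four-page web. The page names, the link structure, and the function names are assumptions for illustration only; nothing here reflects Google's actual numerics, and a dense matrix like this one would be hopeless at 25 billion rows.

```python
# Hypothetical four-page web (an assumption, not data from the article).
PAGES = ["A", "B", "C", "D"]
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}


def link_matrix(pages, links):
    """Build M where M[i][j] = 1/outdegree(j) if page j links to page i."""
    index = {page: i for i, page in enumerate(pages)}
    n = len(pages)
    matrix = [[0.0] * n for _ in range(n)]
    for source, targets in links.items():
        for target in targets:
            matrix[index[target]][index[source]] = 1.0 / len(targets)
    return matrix


def power_iteration(matrix, tol=1e-12, max_iter=1000):
    """Repeat r <- M r from a uniform start until r stops changing."""
    n = len(matrix)
    rank = [1.0 / n] * n
    for iteration in range(1, max_iter + 1):
        new_rank = [sum(matrix[i][j] * rank[j] for j in range(n)) for i in range(n)]
        change = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if change < tol:
            break
    return rank, iteration


RANK, ITERATIONS = power_iteration(link_matrix(PAGES, LINKS))
```

On this toy graph the vector settles at roughly 0.4 for A, 0.2 for B, 0.4 for C, and 0 for D, which has no inbound links.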

- Re: (Score:2, Funny)
  
  by The Zon ( 969911 ) writes:
  
  Even though Google uses some clever numerics to calculate the eigenvectors to a 25 billion by 25 billion matrix by iteration, it still takes several hours to finish.
  
  Please. I can do that on paper in, like, five minutes.
  - Re: (Score:1)
    
    by trentblase ( 717954 ) writes:
    
    Please, I can do that in my mind in, like, 5 seconds.
  - Re: (Score:1)
    
    by sasdrtx ( 914842 ) writes:
    
    42
- Re: (Score:2)
  
  by Firehed ( 942385 ) writes:
  
  Several hours for 25b x 25b? Jeez, it took Slashdot the better part of a day to update the comment id field type in their database... 16.7m by 1. OSTG, we demand that the servers running Slashdot be upgraded to something that could actually withstand a Slashdotting!
  - Re:Bad summary (Score:5, Insightful)
    
    by martin-boundary ( 547041 ) writes: on Wednesday December 06, 2006 @09:19PM (#17140646)
    
    It's nowhere near like that. A web matrix is very sparse, so if you did a true 25Bx25B matrix power iteration, you'd be multiplying zero by zero a gazillion times. Optimization is about not doing things you don't need to do, and optimizing PageRank is about figuring out clever ways to not do the full multiplication. Moreover, PageRank is calculated in parallel over a computer farm. Overall, you can expect a single iteration to take on the order of an hour, and you can expect around 50-80 iterations before Google gives up and says it's converged. You can also try and reuse the previous "converged" PageRank vector to cut down on the 50-80 iterations after you've crawled new pages.
    If google used a single computer to do all the work, and truly did 80*25B^2 operations, they'd be morons.
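
A rough sketch of the sparsity point above: store only the links that exist and loop over those, so one iteration costs about as many operations as there are links rather than n squared. The ring-shaped web, the size N, and the names below are hypothetical, and the sketch assumes every page has at least one outgoing link.

```python
def sparse_step(out_links, rank):
    """One iteration r <- M r that touches only the links that exist.

    out_links[i] lists the pages that page i links to (assumed non-empty).
    Returns the new rank vector and the number of multiply-add operations.
    """
    new_rank = [0.0] * len(rank)
    operations = 0
    for source, targets in enumerate(out_links):
        share = rank[source] / len(targets)
        for target in targets:
            new_rank[target] += share
            operations += 1
    return new_rank, operations


N = 1000
# Hypothetical sparse web: every page links to the next two pages, wrapping around.
OUT_LINKS = [[(i + 1) % N, (i + 2) % N] for i in range(N)]

RANK, SPARSE_OPS = sparse_step(OUT_LINKS, [1.0 / N] * N)
DENSE_OPS = N * N  # what a full matrix-vector product would cost

print(f"sparse: {SPARSE_OPS} operations, dense: {DENSE_OPS} operations")
```

Here the sparse step does 2,000 operations where the dense product would do 1,000,000, and the gap widens as the page count grows while links per page stay small.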
    
    - Re: (Score:1)
      
      by Patent-Monkey ( 1036772 ) writes:
      
      Interestingly, Google does a lot of reindexing using existing searches and then builds upon a search listing and a page indexing review. For example in US Patent 6,526,440 [patentmonkey.com], "The search engine obtains an initial set of relevant documents by matching a user's search terms to an index of a corpus. A re-ranking component in the search engine then refines the initially returned document rankings so that documents that are frequently cited in the initial set of relevant documents are preferred over documents tha
Nouns maybe? (Score:4, Insightful)

by Bryansix ( 761547 ) writes: on Wednesday December 06, 2006 @07:07PM (#17139344) Homepage

It seems like it would be the nouns, pronouns, etc. that Google should be paying attention to. Who cares about all the verbs, adjectives, etc. that just muddy the indexing waters?

- Re: (Score:1)
  
  by kramulous ( 977841 ) writes:
  
  I believe that a race is on at the moment for semantic searching. Not only nouns, verbs etc, but whether the phrases are subjective or objective. I know a blog search company that is working on this. They wanted to borrow some of my code.
- Re: (Score:2, Insightful)
  
  by abshnasko ( 981657 ) writes:
  
  Searching for pill and the pill should yield very different results. Yes nouns are more important, but articles and other words cannot be disregarded.
  - Re: (Score:2)
    
    by gfody ( 514448 ) writes:
    
    The is a stop word [wikipedia.org] and will most likely be excluded from your search term.
    - Re: (Score:1)
      
      by WaXHeLL ( 452463 ) writes:
      
      It's not entirely excluded.
      
      "the pill" and "pill" are two different queries because matching the whole phrase will get you more relevant results. This is built into the code that interprets queries (this is completely different from PageRank, which deals with cross linking between sites to get the highest probability of relevance -- AFTER the query is interpreted and a set of pages is generated). Almost all search engines work that way.
- Re: (Score:1)
  
  by WaXHeLL ( 452463 ) writes:
  
  RTFA please. It deals with determining relevance, not the optimal method of indexing pages.
  
  In regards to your comment:
  Verbs play an extremely important role when dealing with relevancy based on phrases.
  
  The small snippet that was posted was just cut and pasted from the opening hook of the article. It just leads into a mathematical discussion how to sort through the thousands of results that are returned.
  - Re: (Score:2)
    
    by Bryansix ( 761547 ) writes:
    
    I actually thought about that after I posted. I know all the words are important for indexing. I'm just saying that looking at keywords and placing more importance on those is a part of the mix too. Those keywords are almost always nouns.
- - "The Who" vs the who (Score:1)
    
    by kenb215 ( 984963 ) writes:
    
    It seems that in searching "The Who", only that exact phrase is returned, but when searching the who, both words are searched, i.e. "the" appears as if it is being searched like a normal word here. If you try searching for the best, "the" is counted when used as part of the phrase "the best", but appears not to be counted when it appears by itself. The Google algorithm is apparently a lot more complicated than the usual explanations are.
- Re: (Score:1)
  
  by svindler ( 78075 ) writes:
  
  So if I want to look for dwarf throwing I'll have to wade through all dwarf related pages because throwing is not relevant for the pagerank?
A bit late? (Score:1)

by kramulous ( 977841 ) writes:

I read about this some time ago ... I think the paper was entitled "The 10 billion dollar Eigenvector: The math behind google" or something to that effect. Sorry, but I've got a new laptop and cannot find the exact title. It was an excellent introduction for beginner computational scientists for an application of the eigenvector. I forget the American University responsible.
- Re: (Score:2)
  
  by mochan_s ( 536939 ) writes:
  
  Here's the bibtex reference.
  @article{bryan:569,
    author = {Kurt Bryan and Tanya Leise},
    collaboration = {},
    title = {The $25,000,000,000 Eigenvector: The Linear Algebra behind Google},
    publisher = {SIAM},
    year = {2006},
    journal = {SIAM Review},
    volume = {48},
    number = {3},
    pages = {569-581},
    keywords = {linear algebra; PageRank; eigenvector; stochastic matrix},
    url = {http://link.aip.org/link/?SIR/48/569/1},
    doi = {10.1137/050623280}
  }
  - Re: (Score:1)
    
    by kramulous ( 977841 ) writes:
    
    Cheers for that. Helpful.
    
    Kind regards
I joke a lot on Slashdot, but serious question (Score:3, Interesting)

by CrazyJim1 ( 809850 ) writes: on Wednesday December 06, 2006 @07:10PM (#17139396) Journal

I skimmed the article and didn't find what I wanted to find. If you make a webpage that you want ranked high, what do you do? Do you make 100 geocities accounts and provide links to your main website, or what? I'm just wondering this out of curiosity, not out of need.

- Re: (Score:1)
  
  by Larry Lightbulb ( 781175 ) writes:
  
  At a very basic level a site's page rank is a reflection of how relevant other sites think it is, and is based on how important the sites are that link to it. Get a link from the BBC, CNN, or somewhere like that and it's worth thousands or millions of links from Geocities sites.
- Re: (Score:1)
  
  by mojodamm ( 1021501 ) writes:
  
  That's kinda what I thought at first as well, but looking over the lower two-thirds of the article, I started to get a different impression. They talked about a 'strong web' idea, where if your webpage is disconnected from the 'main' web and set up in a sort of 'secondary web' with just your Geocities accounts, for instance, linking to it, then the actual websites that interconnected within your site matrix would rank a 0 overall.
  
  Not sure if this is correct or not, just the impression that I got from what
  - Re: (Score:1)
    
    by Hyperspite ( 980252 ) writes:
    
    If you read the entire article carefully, they deal with that by changing the way they search through the web. Instead of following every link, they assign a probability of .85 to following it. This makes their eigenvectors have nonzero entries because the search can jump out of the strong web and get back on track (if the random number falls into the .15 category it goes to a random indexed page from the entire internet). So yea, making a web of geocities accounts wouldn't do much more than you'd think it
- Re: (Score:1)
  
  by x_MeRLiN_x ( 935994 ) * writes:
  
  That wouldn't work, because they'd all be coming from the same domain.
- Re:I joke a lot on Slashdot, but serious question (Score:5, Informative)
  
  by Anonymous Brave Guy ( 457657 ) writes: on Wednesday December 06, 2006 @07:52PM (#17139894)
  
  The underlying idea behind page rank is pretty well-exposed at this point, and is described in TFA. Essentially, it's a big set of simultaneous equations: each incoming link to your page gets a score that is roughly the rank of the source page divided by the number of outgoing links on that page, and then the rank of your page is roughly the sum of the scores of all incoming links.
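
  Written out as a formula, the description above comes to roughly the following. The symbols are my own shorthand rather than the article's: r(p) is the rank of page p, L(q) is the number of outgoing links on page q, and the sum runs over every page q that links to p. This is the basic undamped form only.

  ```latex
  r(p) \approx \sum_{q \to p} \frac{r(q)}{L(q)}
  ```

  There is one such equation per page, which is where the "big set of simultaneous equations" comes from.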
  
  Various fudge factors are introduced along the way. For example, if you break Google's rules about displaying the same content to bots as to humans, you can get slapped right down. More subtly, newly registered domains take a modest hit for a while. More nobody-knows-ly, Google's handling of redirects is unclear: information about exactly what adjustments are made is pretty scarce, and there's a lot of conjecture around. One thing that's pretty certain is that they penalise for duplicate content, which is why some webmasters do apparently unnecessary things like redirecting http://www.theircompany.com/ [theircompany.com] to http://theircompany.com/ [theircompany.com] or vice versa.
  
  So, if you want to get a page with a high rank yourself, then ideally you would get many established, highly-ranked pages to link to your page and no others. In your example, all those Geocities sites wouldn't help a lot, because (a) they'd have negligible rank themselves, and (b) they'd be penalised for being new and lose some of that negligible rank before they even started. Many times negligible is still negligible, and so would be your target page's rank. OTOH, get a few links from university sites, big news organisations and the like, and your rank will suddenly be way up there. Alternatively, get a grass-roots movement going where a gazillion individuals with small personal sites link to you, and the cumulative effect will kick in.
  
  - Re: (Score:3, Interesting)
    
    by TheLink ( 130905 ) writes:
    
    "if you break Google's rules about displaying the same content to bots as to humans"
    
    I notice many sites that do that and don't get slapped down - esp subscription sites. And seems Google doesn't cache those, so it's probably collusion.
    
    You see the keywords and paragraphs in the search, but click on it you get a login page.
    
    They should have to pay a special rate and be marked differently from the other search results. It's a waste of time otherwise.
    - Re:I joke a lot on Slashdot, but serious question (Score:5, Interesting)
      
      by oni ( 41625 ) writes: on Wednesday December 06, 2006 @08:39PM (#17140342) Homepage
      
      I notice many sites that do that and don't get slapped down - esp subscription sites.
      
      I wonder, if I changed my useragent to be whatever the googlebot reports itself to be - would I get by the registration screen on websites like the NYTimes??
      
      - Re: (Score:2)
        
        by kimvette ( 919543 ) writes:
        
        No, because they check the IP you're coming from as well now - they grew wise to user agent spoofing years ago.
        
        Google for the "bugmenot" Firefox extension.
        
        Re: (Score:2)
        
        by jZnat ( 793348 ) * writes:
        
        Googlebot doesn't use the same IP address all the time (several servers running Googlebot I'd imagine), so filtering based on IP addresses would be infeasible (at least according to Google).
        
        Re: (Score:1)
        
        by Jotii ( 932365 ) writes:
        
        Still, Google has a few IP-ranges which are only for Google.
      - Re: (Score:3, Informative)
        
        by XorNand ( 517466 ) * writes:
        
        As pointed out, the Times site isn't fooled, but there are a good many out there that are fooled. Sometimes if you ever do a Google search, one of the results will contain a keyword or two. However, when you click on the link, you'll find yourself redirected to a subscription page. Useragent spoofing can frequently show you the same page that Google indexed.
        
        If you're a FF user, grab the Useragent Switcher extension [mozilla.org] and add in a UA of "Mozilla/5.0 (compatible; googlebot/2.1; +http://www.google.com/bot.html)".
    - Re: (Score:3, Interesting)
      
      by suggsjc ( 726146 ) writes:
      
      Here is an email with associated response I received from Google on roughly this topic.
      This is a very general question. I'm creating a website. It is going to be a blogging platform. Obviously, the content of the site(s) is the most important thing. I've already started making the content of my site dynamic in the sense that I tailor it to the requesting agent (via the user-agent header). My intention for doing this is to make sure that the content renders correctly for *any* browser that accesses the sit
  - Thanks for all the replies (Score:2)
    
    by CrazyJim1 ( 809850 ) writes:
    
    I now have a nice basic understanding of Google's page ranking system. That's all I was asking for.
  - Re: (Score:2, Insightful)
    
    by l0cust ( 992700 ) writes:
    
    Thanks for the informative post. I have one question though. How does it help find the relevant information unless that information just happens to be on a popular page too? What I mean to say is that the idea behind grading/filtering systems like PageRank is to provide the most relevant information about the thing you are trying to search on the net. Now suppose Mr. A is looking for some obscure Indian text written in Sanskrit and Mr. B has (recently or not) put up a website with that text as one of the co
  - Interesting Appendix: Page and Brin on Advertising (Score:1)
    
    by jbourj ( 954426 ) writes:
    
    One of the references for the article is The Anatomy of a Large-Scale Hypertextual Web Search Engine (http://infolab.stanford.edu/pub/papers/google.pdf), published in Computer Networks and ISDN Systems. At the end of the paper, they have a very interesting appendix: "Advertising and Mixed Motives"
    Currently, the predominant business model for commercial search engines is advertising. The goals of the advertising business model do not always correspond to providing quality search to users. For example, i
- Re: (Score:2)
  
  by linhux ( 104645 ) writes:
  
  If those 100 geocities pages each have a PageRank of 0 (which they would if they aren't linked to from other high-ranking pages), their total contribution to your main page PageRank will be 0.
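
The zero-contribution argument can be checked with a small sketch of the basic, undamped model. The page names ("hub", "news", "target", and the hundred "farm" pages) are made up for illustration. Note that this deliberately leaves out the 0.85 damping mentioned elsewhere in the thread, which would give every page a small nonzero rank.

```python
def undamped_step(out_links, rank):
    """One step of the basic model: each page splits its rank over its out-links."""
    new_rank = {page: 0.0 for page in out_links}
    for source, targets in out_links.items():
        for target in targets:
            new_rank[target] += rank[source] / len(targets)
    return new_rank


# Hypothetical web: two established pages linking to each other, plus a
# target page whose only inbound links come from 100 brand-new farm pages.
FARM = [f"farm{i}" for i in range(100)]
OUT_LINKS = {"hub": ["news"], "news": ["hub"], "target": ["hub"]}
OUT_LINKS.update({page: ["target"] for page in FARM})

# Start uniform, then iterate a few times.
RANK = {page: 1.0 / len(OUT_LINKS) for page in OUT_LINKS}
for _ in range(3):
    RANK = undamped_step(OUT_LINKS, RANK)

print(RANK["target"], RANK["hub"] + RANK["news"])
```

Nothing links to the farm pages, so their rank drops to zero after the first step, and the target's rank follows one step later: one hundred links times zero is still zero.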
- Re: (Score:1)
  
  by cvos ( 716982 ) writes:
  
  a webpage that you want ranked high, what do you do? Do you make 100 geocities accounts and provide links to your main website
  No, this would definitely not work. The reason is that 100 new geocities websites would have a value of 0, so using the PageRank algorithm you would effectively have 100 links X 0 PR. Incoming links only have a positive impact if they have weight independent of other websites. This is why it is so crucial to have your own website in the oldest dataset possible. It takes a long time for websites created in 1995 to disappear.
Does PageRank count? (Score:2, Interesting)

by matr0x_x ( 919985 ) writes:

As a self proclaimed SEO expert - I honestly don't believe PageRank counts nearly as much as it did a few years ago! You'll find lots of PR5 sites ahead in the SERPS of PR9 sites!
- Re:Does PageRank count? (Score:4, Insightful)
  
  by Trieuvan ( 789695 ) writes: on Wednesday December 06, 2006 @07:35PM (#17139726) Homepage
  
  The pagerank that's reported from toolbar is really old. Google never wants to let you know the real number or it will be easy to spam ...
  
  - Re: (Score:1)
    
    by HalfBrown ( 1024165 ) writes:
    
    The pagerank that's reported from toolbar is really old.
    I think that at least part of this is indicative of the "Google Sandbox [wikipedia.org]" (if you believe it exists). I've noticed, with the Google Toolbar in IE and Firefox, that some sites seem to have stagnant PRs (even with noticeable increases/decreases of traffic), but others move along in a relatively consistent manner.
    Just my 2 cents.
- Re: (Score:1)
  
  by dbmasters ( 796248 ) writes:
  
  PageRank is worthless in terms of SEO. What it can do is tell you if there is a problem, if you have a PR of 0 or 1 or something, but thinking it somehow affects your SERPs is a delusion far too many people fall into. Concentrate on SERPs, not PR, ASAP for SEO on the WWW.
  - Re: (Score:2, Funny)
    
    by Anonymous Coward writes:
    
    Concentrate on SERPs, not PR, ASAP for SEO on the WWW
    
    I searched on Google but I cannot find what "on", "not", "for" and "the" mean...
- Pagerank (Score:5, Funny)
  
  by Skythe ( 921438 ) writes: on Wednesday December 06, 2006 @07:37PM (#17139752)
  
  Because about 95% of the text on the 25 billion pages indexed by Google consists of the same 10,000 words, determining relevance requires an extremely sophisticated set of methods.
  
  They use a set of nested if-else statements
  *ducks*
  
  - Re: (Score:1)
    
    by darekana ( 205478 ) writes:
    
    They use a set of nested if-else statements
    
    No, that would be waaay too many if-elses to write by hand...
    they use IoC and code generation tools.
Old guys bully new comers. (Score:1)

by eat bugs ( 529223 ) writes:

I asked some math website to put a link to http://www.mathpotd.org/ [mathpotd.org] Math Problem of the Day -- they don't bother to do so. They know the math and use it.
- - Re: (Score:1)
    
    by eat bugs ( 529223 ) writes:
    
    A system error caused the problem. It didn't insist it be 1/3 -- it was because choice B (which didn't correspond to any choice) was given as the correct answer. The website support has corrected the problem.
Here it is... Google's PageRank formula (Score:1, Funny)

by Reverend99 ( 1009807 ) * writes:

SELECT advertiser, description, link, adcost
FROM tblAdvertisers
WHERE adword LIKE %searchstring%
ORDER BY adcost
- you forgot.. (Score:5, Funny)
  
  by gfody ( 514448 ) writes: on Wednesday December 06, 2006 @08:18PM (#17140118)
  
  ORDER BY adcost DESC
  
  - Re: (Score:1, Troll)
    
    by Reverend99 ( 1009807 ) * writes:
    
    Yeah... you and the sans-humor moderator who called this "Flamebait" should get together. I'm sure you'd make a match made in anal heaven.
- Re: (Score:1)
  
  by 8ball629 ( 963244 ) writes:
  
  I'd be mad if I were advertising...
  
  Shouldn't it be "ORDER BY adcost DESC"?
- Re: (Score:1)
  
  by kaizenfury7 ( 322351 ) writes:
  
  Interesting, I never knew the formula was
  
  #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '%searchstring% LIMIT 0, 30' at line 1
OK, but... (Score:1, Informative)

by indigest ( 974861 ) writes:

The algorithms behind PageRank are no secret. Why not just read about them from the source [stanford.edu]?
Only three articles about Google on one page? (Score:3, Funny)

by colourmyeyes ( 1028804 ) writes: on Wednesday December 06, 2006 @10:01PM (#17141000)

I think we can get four or five tomorrow.

evolution (Score:2)

by drDugan ( 219551 ) * writes:

Great article.

The character of online content is changing now rapidly. We used to be in an Internet where mostly only the site provider determined the content on the pages they served (/. being a notable, early exception). Now, with the rise of "2.0" systems, user-generated content, and empowerment of the individual - the content being served on many sites is coming into sites from wide groups, and being moderated and curated by those groups.

So... a thought: as user-submitted and group-moderated content
- Re: (Score:2)
  
  by the_womble ( 580291 ) writes:
  
  I could not disagree more. Most of the sort of information people search for is not user generated: when did you last do a Google search for which a Slashdot comment was the appropriate answer?
  
  The only exceptions that I can think of (from my searches) are forums that have answers to software problems. Google seems to have no problem finding these for me.
  - Re: (Score:2)
    
    by Vintermann ( 400722 ) writes:
    
    Sometimes you want to search through your old posts. Not all sites let you do that (slashdot does if you pay up, I think), and often forums are even norobots space.
  - Re: (Score:2)
    
    by drDugan ( 219551 ) * writes:
    
    The meme that Google helps us find all the information is a huge marketing Spin.
    
    Compared to "exactly the information you want, when and how you want it" - Google sucks. It is better than anything else now, but it still is not anywhere close to really solving the information access problem generally.
It's the World's Largest Matrix Computation (Score:2, Informative)

by MadMagician ( 103678 ) writes:

For a different, somewhat more technical, but more succinct discussion, Cleve Moler [of Matlab fame] wrote another view [mathworks.com] of this topic, about 5 years ago.

The math is the same, of course, but two points of view may provide a greater sense of perspective. So to speak. And Cleve is always worth listening to.
- Re: (Score:2)
  
  by jfengel ( 409917 ) writes:
  
  Actually, I'm not so sure it's the largest matrix computation. Weather and nuclear bomb simulations are done with matrix algebra, and it wouldn't surprise me to discover that they do some months-long calculations with even larger matrices.
Other google technologies (Score:1, Redundant)

by quakehead3 ( 988738 ) writes:

Pigeon Rank: http://www.google.com/technology/pigeonrank.html [google.com]
- Why doesn't Brin get some credit?? (Score:1)
  
  by moeinvt ( 851793 ) writes:
  
  ??
  
  Seems unfair that something Brin and Page developed together would bear only one of their names.
  
  "Page-rank"
  
  ??
Pages that don't exist anymore (Score:2, Interesting)

by namco ( 685026 ) writes:

I've seen links on google searches that don't exist anymore but were ranked highly when they DID exist and still exist in the top 10 of the query. What happens to those? Do they stay at their ranking till they get overtaken by other more popular pages on the same search? Get their ranking slowly reduced because they don't exist?
- - Re: (Score:1, Offtopic)
    
    by treeves ( 963993 ) writes:
    
    I don't have any Mod points right now, but isn't a reply to an Offtopic post pretty much automatically offtopic? Go ahead and mod me Offtopic, I'll consider that an affirmative answer.
- Re: (Score:1)
  
  by CptPicard ( 680154 ) writes:
  
  Frankly, any university with a CS program worth anything will have students take a linear algebra course in math as the first thing. It's a good weed-out-the-weak exercise early on, gets you up to speed with university-level mathematics, and the stuff in itself comes in handy, for example in computer graphics. Being good at manipulating matrices has a lot of use in algorithmics too.
  
  Please, try to impress me about Stanford some other way once you've progressed further ;-)
- Re: (Score:1, Insightful)
  
  by Anonymous Coward writes:
  
  Blah. Ugly red clothes... Go Bears!
  - Re: (Score:1)
    
    by retiarius ( 72746 ) writes:
    
    shameless math plug(s) from my alma mater:
    
    - cal berkeley leads stanford in william lowell putnam competition fellows
    
    - as for killer math events
    
    stanford had streleski (v.i.z. wikipedia)
    but berkeley topped him with kaczynski (!)
    
    seriously, best
- Re: (Score:2)
  
  by RegularFry ( 137639 ) writes:
  
  Why does that make PageRank broken? That's not the problem it tries to solve. Google might be broken for slavishly adhering to PageRank, but that's a different matter entirely...
- Re: (Score:1)
  
  by namco ( 685026 ) writes:
  
  Porn has no quality (just cheese ;p) and is very popular! I'd love to see the page rankings on those every month!

