Slashdot
Google TrustRank
Posted by
CmdrTaco
on Tue Apr 26, 2005 07:16 AM
from the trust-no-one dept.
Philipp Lenssen writes "Google registered a trademark for the word "TrustRank", as Search Engine Watch reveals. Is this a sign we can expect a follow-up to Google's PageRank? An earlier, possibly related paper on TrustRank is available; it proposes techniques to semi-automatically separate good pages from spam by the use of a small selection of reputable seed pages."
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Will this have anything to do (Score:1, Insightful)
(http://www.rememberteh.name/ | Last Journal: Thursday September 29 2005, @10:59PM)
more censorship, unimpressed (Score:1, Interesting)
(http://axi0m.gclanparty.com/)
Re:more censorship, unimpressed (Score:5, Insightful)
(http://mailinator.net/ | Last Journal: Tuesday December 06 2005, @05:55PM)
Re:more censorship, unimpressed (Score:5, Informative)
This is a basic problem of filtering web-content. How do you avoid throwing out the baby with the bath water? I'm running into that problem in designing a custom filter to keep my son from inadvertently seeing pornography as he looks for his "r0mz," but that's peanuts compared to Google's dilemma.
The fact is, spam filtering is inherently censorship. This kind of interference will always have a negative impact on the marketplace of ideas that is the modern internet. On the other hand, as a side effect, removing blogs from search results (as this trust metric very likely will) may increase the usability of Google overall. I suspect there will be some people who are not as happy about that as I am.
Re:more censorship, unimpressed (Score:5, Insightful)
Google's primary responsibility now is to its shareholders, which means increasing the chance that you and I find exactly what we are looking for, not unabashedly displaying every peddler that serves up content over HTTP.
Re:more censorship, unimpressed (Score:5, Interesting)
(http://weill.org/ | Last Journal: Saturday October 01 2005, @01:18PM)
Google's primary responsibility now is to its shareholders. Google makes money from advertising. If Google can encourage you to patronize its advertisers instead of trusting its index for everything (which right now is pretty easily gamed), then Google makes more ad revenue and shareholders are happy.
Re:more censorship, unimpressed (Score:5, Insightful)
First of all, the only protection that is guaranteed you here is that the gov't will make no law abridging the freedom of speech.
Google, as influential as they might be, are not the government (insert 'Do No Evil' joke here). Therefore, they are not bound to this "Freedom of Speech" argument.
Secondly, "Freedom of Speech" is not the universal, higher-being-ordained, preserve-at-all-costs idea that we have transformed it into.
Freedom of speech does not give you the right to spray-paint your slogan all over my front door, nor, in this case, does it give you a 'right' to be listed on Google. Nor do you have a 'right' to have your name printed on the front page of your local paper in 36pt font.
Not being listed in Google does not amount to censorship by any definition of the word. The net existed before Google, and people still managed to find web sites. Google gives (through PageRank or whatever mechanism they choose) free advertisement to 'good' sites. They would have every right to display only sites that pay money, if they so desired. You have absolutely zero (0) 'rights' to be listed for free on Google.
Trotting out the Freedom of SPEECH argument is nothing more than whining about Big Brother coming to get you because what you have to say isn't worth hearing. Guess what? If you want to be heard, say something that's worth listening to. All that glitters is not gold, and much that is said (or printed) is worthless drivel. Much like this post.
Re:more censorship, unimpressed (Score:5, Insightful)
2) You have the option of not using Google. Yahoo is a completely independent search engine now.
Re:more censorship, unimpressed (Score:4, Insightful)
(http://www.ajs.com/~ajs/)
Yes.
Why is it that everyone is constantly striving to find Google's evil? Ranking the relevancy of pages to a search is Google's job. By ranking spam as relevant to my search they have failed. Using the concept of a web of trust to establish relevancy is a fairly obvious solution and has well established analogs in other fields (e.g. PKI).
If you're looking for evil, try GE, GM, or Unilever. Google doesn't even begin to rank on the evil-o-meter.
Conjunction? (Score:2, Interesting)
(http://www.barkinbarnyardkennels.com/)
Re:Conjunction? (Score:5, Informative)
(http://www.macrocosmictech.com/blog)
Potential abuse? (Score:4, Interesting)
Google may be better off as they are currently, leaving all sites initially equal in influence before the PageRank calculation.
Then again, Google has a great track record for testing their ideas before committing them to general service...
Re:Potential abuse? (Score:5, Interesting)
To me, this signals that Google is trying to become a more involved company; instead of just sitting back, caching and searching the internet, they are now trying to serve you best and give you the results you are looking for. I would imagine that with TrustRank, you will see a little star or something near a link on Google's home page, and the star would indicate if it is something in your field that you would be looking for. For example, if you were a biologist and searched for a certain kind of fish, say "Blue Tuna", it would put stars next to sites with the fish's breeding habits, etc., but if you were a general consumer, it would provide links to the local fishery.
The internet is an extremely powerful tool, and search engines have simply evolved to the point that they are now "dumb technology". Without more user intervention (and not simply by throwing in more keywords and praying), they will continue to be as they are now. Once the company better knows what we'll be looking for, they can better serve us. And that's all I see this new tech as being.
psst, wanna buy some TrustRank? (Score:2, Funny)
Questions (Score:3, Interesting)
Will the owners of the pages / sites deemed to fall within the set of trusted seed sites get any money for all their hard work (i.e. hand-maintaining pages of links)?
What if such an owner decides to link to a page of commercial or spam links - will they get any money from the owner of the linked site? Is this a possible method of abuse?
Will that cool poster of links between websites now become 3D to give trusted links more prominence?
Re:Questions (Score:5, Informative)
(http://www.pjrc.com/ | Last Journal: Thursday June 27 2002, @04:31PM)
How is this different from applying a weighting to PageRank?
It attempts to detect clusters of pages that have few inbound links, while also propagating "trust" scores to all other sites through their link structure. For sites that have many inbound links (high-scoring in PageRank), the authors claim this modification tends to classify spam and reputable sites differently.
Will the owners of the pages / sites deemed to fall within the set of trusted seed sites get any money for all their hard work (i.e. hand-maintaining pages of links)?
No.
However, they will get better search engine visibility, which is quite valuable.
What if such an owner decides to link to a page of commercial or spam links - will they get any money from the owner of the linked site?
The paper suggests using only highly reputable organizations with long-term stability for the seed pages: government organizations, universities, and very well-known companies.
The analysis in the paper is based on a per-site graph, not per-page, by the way. They lacked the resources to try these computations on such a large data set.
Is this a possible method of abuse?
Presumably, the small set of seed pages/sites will need to be monitored by staff employed by the search engine company. If one of the trusted seed sites "went bad", it would need to be removed from the list.
Will that cool poster of links between websites now become 3D to give trusted links more prominence?
Probably not.
I can already imagine this... (Score:3, Informative)
(Last Journal: Wednesday August 18 2004, @07:52AM)
I can see this already....
This page contains very objectionable content.
If you are easily offended, don't enter.
Blah, blah, blah.
Blah, blah, blah.
Do you agree to these conditions?
Yes [goatse.cx] No [disney.com]
Probably the other way. (Score:4, Insightful)
(http://pigeon.psy.tufts.edu/ | Last Journal: Wednesday April 14 2004, @11:57AM)
If you're linked by a trusted page, then your rank goes up, but there's no penalty for being linked by untrusted pages - your PageRank stays the same.
Similar to Advogato's? (Score:5, Interesting)
TrustRank link broken, session expired (Score:5, Informative)
A good sign (Score:4, Insightful)
Re:A good sign (Score:5, Insightful)
(http://www.pjrc.com/ | Last Journal: Thursday June 27 2002, @04:31PM)
Let's not get overly optimistic about what this is going to do for the web... such as:
By developing this tool, Google is helping to clean the Internet up and enable it to become the massive source of pure information it has such potential to be.
What exactly is "pure information" anyway?
Consider my little website [pjrc.com]. Lots of pages about how to design electronic stuff. But we sell components that support those activities, so it's not 100% "pure", is it? You could consider all those pages as a giant ad for the stuff on the store section of the site. But most people would consider my pages on the more informational side (and the vast majority really are).
About once every 2 or 3 weeks, I get a call from one of these search engine optimization companies. Not sure if it's the same couple of companies... I usually just say "no" and ask to be on their do-not-call list. They're mostly a bunch of slimy people and probably don't honour such requests.
But sometimes, the idea is tempting. I resist because I believe it's unethical, and ultimately a bad long-term investment. Still, to anyone selling via the web, even a tiny little 2-person company like mine, the sales pitch is quite compelling. Pay some fee, traffic goes up, more sales, increase in revenue offsets the cost for the SEO's work. Maybe it's not so bad if they don't stoop to cheating.
Still, I resist because I know it's not a black and white distinction. It's a fuzzy line between the obviously good techniques (improving site structure, rewording page titles, etc) and the obviously bad (cloaked pages). I also just don't trust them.
But even the distinction between "pure information" and "spam" is fuzzy. I'd like to think I'm leaning towards the "pure information" side, but we do indeed sell products. It wasn't always that way... in the mid-90's, the site was smaller and hosted at a university and no products were sold. I had several people begging me to sell them a few of the parts needed for a project. Eventually, a friend started selling some stuff (prices were high, service poor), and so I took it over. Satisfaction with the site has improved dramatically since then!
Still, it's a fuzzy area between pure information and purely commercial, or advertising or spam.
I can tell you it's a lot more work crafting really good web pages than just writing a check to a seedy SEO company. But if these ranking algorithms really do improve to perfection, the response is probably going to be more and more pages appearing in that gray region. Increasing sales can pay for a lot of man hours to author more material that's compelling for visitors and truly does help them to solve their problems (especially if they buy the described products).
So, in a best case scenario, these algorithms reaching perfection (seems unlikely) is probably going to lead to a lot more very good content, but content that revolves around pitching products (e.g., infomercials), and not "pure information".
Ignore them Lord Google (Score:1)
I always find it annoying to find irrelevant pages. If this works I'm happy, else I won't be mad at my Lord. Just a little disappointed.
Is the paper even the same thing? (Score:5, Interesting)
Trustrank explained (Score:4, Informative)
(http://www.udviklingschef.dk/ | Last Journal: Sunday April 18 2004, @02:52PM)
What happens is that humans select some webpages which they trust. The idea is that these trustworthy webpages only link to good sites. So, the trustworthy webpages are used as seeds for a regular webcrawler.
At first glance, this looks like a low pass filter to me. I.e., the same result could be achieved by cutting all PR 5 sites.
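To make the seed idea concrete, here is a minimal sketch of trust spreading outward from a hand-picked seed set along outbound links, in the style of a seed-biased PageRank iteration. It is an illustration only, not Google's implementation: the function name `trust_scores`, the damping factor 0.85, the iteration count, and the example domains are all assumptions.

```python
def trust_scores(links, seeds, damping=0.85, iterations=20):
    """Propagate trust from seed pages along outbound links.

    links: dict mapping a page to the list of pages it links to.
    seeds: set of hand-picked trusted pages (hypothetical input).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    # Only seed pages receive "free" trust; every other page starts at zero.
    seed_share = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_share)
    for _ in range(iterations):
        nxt = {p: (1.0 - damping) * seed_share[p] for p in pages}
        for page, targets in links.items():
            if not targets:
                continue
            # A page splits its damped trust evenly among the pages it links to.
            share = damping * trust[page] / len(targets)
            for target in targets:
                nxt[target] += share
        trust = nxt
    return trust


# Hypothetical link graph: one trusted seed and a closed two-page spam ring.
links = {
    "university.example": ["lab.example", "library.example"],
    "lab.example": ["library.example"],
    "spam1.example": ["spam2.example"],
    "spam2.example": ["spam1.example"],
}
scores = trust_scores(links, seeds={"university.example"})
```

In this toy graph the spam ring never receives trust because no seed-reachable page links into it, whereas a plain PageRank run would still give the ring a share of the rank.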
in toolbar (Score:2)
(http://sourceforge.net/projects/karekol/)
Yahoo behind trustrank (Score:3, Funny)
(http://www.julefrokost.info/ | Last Journal: Wednesday April 07 2004, @03:52AM)
Bring lawyers, guns and popcorn (Score:3, Insightful)
(http://home.primus.ca/~ronsharp/tororg.html)
* When I say sleazeballs and tweaking, I mean the people who will try outrageous stunts to game the system, rather than the consultants who will help you increase rank by the stunning tactic of actually improving your site. Radical, but sometimes it works.
Gmail spam filter? (Score:3, Interesting)
(Last Journal: Friday December 05 2003, @03:51PM)
Because we gmailers are picky.
It would probably have to be integrated with something else, because I bet there are a few pr0n mailing lists that lots of people have.
PageRank is already no more what it used to be (Score:3, Interesting)
(http://highc.org/)
Even though it contains way too much rant for my taste, google watch [google-watch.org] is worth a full read by all
WTF? (Score:2)
(http://jonr.light.is/ | Last Journal: Saturday April 06 2002, @12:22AM)
Semantic Web... (Score:2)
If "Google trusts fooPage" becomes a standard, recognised triplet, I see no reason why this won't be extended to "Google trusts userX", which becomes "ebay trusts userX" etc.
It's very possible they're looking to the future, and have more in mind than "there's probably no pr0n on this page"...
Question. (Score:2, Interesting)
This info is not intended to be read by a human. (Score:5, Interesting)
(http://www.mathmatt.com/)
I guess we weren't supposed to read this. And you shouldn't have read *this*!
Personalised trust metrics (Score:5, Interesting)
(http://www.fluid-it.com/)
A possible system? (Score:2, Informative)
(http://www.chrima.ath.cx/)
weights in a trust system?
Links in messages identified as spam could be given a negative
weight. That weight could be determined by the number of people
identifying messages with that link as spam. Links from those sites
would be given less trust than a completely unknown page, unless they
are positively weighted themselves or linked to by a positively weighted
site. Links found in non spam messages could be given positive weights
by the same rules.
This would also have the advantage of offering spam filtering rules
based on trustrank weights. Setting a minimum trustrank would allow the
system to weight the email by checking the links in the email, and using
their trustrank for the message itself. The automated spam filtering
Gmail offers could thus affect trustrank, increasing the impact of both
systems (email and searching) and possibly allowing it to be extended
to Google Groups/Usenet filtering.
Potential Examples
(moving each weight given by linking 1 point towards 0)
site1 [+5] - url found in 5 non spam messages
site2 [-5] - url found in 5 spam messages
site3 [+4] - url linked to from site1 (5 + -1)
site4 [-4] - url linked to from site2 (-5 + 1)
site5 [0] - url linked to from site1 and site2 (5 + -5)
site6 [3] - url linked to from site1, site3, and site2. (((5 + 4) + -5) + -1)
Email1 [-5] - contains links to site2, site4, and site6 (((-5 + -4) + 3) + 1)
Not perfect perhaps, but workable and easy to combine with a simple
rule set for weighting parts of a url to create an 'intelligent' system
guided by user preferences.
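The rule stated above ("moving each weight given by linking 1 point towards 0") can be sketched in a few lines. The helper names `toward_zero` and `inherited_weight` are made up for illustration, and the sketch reads the rule as: decay every incoming weight by one step, then sum the results. Under that reading it reproduces the bracketed totals in the examples.

```python
def toward_zero(weight):
    """Move a weight one point closer to 0 (the decay applied per link)."""
    if weight > 0:
        return weight - 1
    if weight < 0:
        return weight + 1
    return 0


def inherited_weight(source_weights):
    """Weight of a page or message: sum of its sources' decayed weights."""
    return sum(toward_zero(w) for w in source_weights)


# Directly observed weights from mail classification.
site1 = 5    # url found in 5 non spam messages
site2 = -5   # url found in 5 spam messages

# Weights inherited through links.
site3 = inherited_weight([site1])
site4 = inherited_weight([site2])
site5 = inherited_weight([site1, site2])
site6 = inherited_weight([site1, site3, site2])
email1 = inherited_weight([site2, site4, site6])
```

A spam filter built on this would then compare `email1` against a user-chosen minimum weight.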
My vision for trust... (Score:2)
(Last Journal: Wednesday August 18 2004, @05:22PM)
Trust for things like email senders and web sites shouldn't be centralized. My web of trusted entities, which should be easy to maintain (unlike, say, blacklists or whitelists) and should evolve semi-automatically, should be based on the interaction of my trusted sites/entities, and, in turn, their trusted sites/entities. Sort of like TrustRank, but where each person determines their own initial seed of trusted sites/entities. Of course, if you didn't want to deal with choosing seeds, you'd just pick Google as your trusted site.
This is of course a horribly abstract idea, and I have no idea how I'd implement this for 1 or a million users, but hey, you gotta start with the vision.
Sounds like a confused algorithm (Score:3, Insightful)
Matches - spam - offtopic, sorted by relevance
not
Matches sorted by f(pagerank,trustrank)
Google used PageRank plus on-page text as a measure of how relevant a page is, but that's not reliable anymore because the result set contains spam pages.
The 'trusted' value tells you nothing about relevance; it only gives the likelihood of the page being spam or not spam. If it's spam you want it removed; if it's not spam, then its PageRank determines its relevance, not some function of PageRank and TrustRank.
I.e., they should not promote or demote pages based on TrustRank; they should simply define a cutoff value K: if the trust is less than K then it's likely spam and should be removed.
Since spam follows money terms, they should have K(keyphrase), so they can change the value of K for each keyphrase to remove the spam. Otherwise they will filter non-money terms where no spam exists, and their algo can only do harm!
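As a rough sketch of the cutoff idea: drop any match whose trust falls below K for the searched keyphrase, then order what remains by PageRank alone. Everything here is hypothetical, including the function name `rank_results`, the field names, the scores, and the cutoff table; it only illustrates filtering by K(keyphrase) rather than blending the two scores.

```python
def rank_results(matches, keyphrase, cutoffs, default_cutoff=0.0):
    """Remove likely spam using a per-keyphrase trust cutoff K,
    then sort the survivors by PageRank only."""
    k = cutoffs.get(keyphrase, default_cutoff)
    kept = [m for m in matches if m["trust"] >= k]
    return sorted(kept, key=lambda m: m["pagerank"], reverse=True)


# Made-up scores for three matching pages.
matches = [
    {"url": "a.example", "pagerank": 0.9, "trust": 0.01},
    {"url": "b.example", "pagerank": 0.5, "trust": 0.60},
    {"url": "c.example", "pagerank": 0.7, "trust": 0.40},
]
# Only "money terms" get a non-zero cutoff.
cutoffs = {"cheap loans": 0.3}

money = [m["url"] for m in rank_results(matches, "cheap loans", cutoffs)]
plain = [m["url"] for m in rank_results(matches, "blue tuna", cutoffs)]
```

With a default cutoff of 0, searches on non-money terms are left unfiltered, which matches the point about not touching queries where no spam exists.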
maybe for Gmail (Score:2, Informative)
Censorship? (Score:1)
(http://www.alt-control.net/)
I haven't read the article, but the name suggests they will do something similar to how PageRank works: not actually trimming the results, but re-ordering them. It doesn't hide any content, just displays the content that is more likely to be what you want higher up.
Or am I the confused one here?
hmmm ...... (Score:2, Interesting)
Probably wouldn't be that difficult to get around it, but it might help a bit.
Vipul's Razor? (Score:1, Informative)
certain domains given a preference? (Score:1)
(http://www.fanteja.com/blog | Last Journal: Tuesday May 03 2005, @07:15PM)
Bayesian (Score:1)
(http://www.kevinmarsh.com/)
CollaborativeRank (Score:1)
Google and Semantic Web (Score:1)
Re:Cheeseh... (Score:4, Funny)
(http://mailinator.net/ | Last Journal: Tuesday December 06 2005, @05:55PM)
Re:Cheeseh... (Score:5, Insightful)
(http://nanotree.sourceforge.net/)
Should Google just throw away their many years of research, and start from scratch?
I find this trust-based approach interesting, but I wonder how it's gonna work for smaller sites (which the few trusted seeds will never link to). I guess the smaller sites don't really have a problem as it is, because only specific search terms are targeted.
There's also the problem of allowing new websites into the game, but I guess that's for the Google developers to figure out.
Re:Cheeseh... (Score:1, Informative)
Re:Cheeseh... (Score:5, Informative)
Imagine going into your Gmail account settings, adding a string of a few websites you deem to be "superior" or of better quality, and then let TrustRank grab the collection of all of these, note where the highest votes go, and use these as more "Trustworthy" search results. Or, using PageRank, it simply adds an option "Vote these sites higher because they are linked to the user defined site settings."
Both schemes make search engine spamming more controllable by Google (simply by terminating accounts linked to spammers), and could have an interesting effect. Can't wait to see what happens with TrustRank.
Re:Cheeseh... (Score:2)
Hey, we're not talking about MS here, so drop the cynical patches line thank you. And "old tech" can still bring you the best general web search results out there, no matter av,yah,msn,dp,whatever. Ad 1, we don't exactly know what they will use this "new" word for, what solution will it cover under its terminology. Ad 2, if Google seems to work on something, that's always a bit of joy
Re:Google argghh! (Score:2)
Might impact the development of one of the most critical tools on the Internet, but you're right, that'd just be news for nerds.
Oh, wait...