
Google Letting Users Rank Search Results
Myriad writes "C|Net News is running an article about Google testing out a new system which would let users rank pages. From the article, 'Two weeks ago, Google began quietly testing a Web page voting system that, for the first time on a large scale, could eventually let Web surfers help determine the popularity of sites ranked by the company's search engine.'" As someone who has a lot of experience with systems where users self-rate content, let me just wish Google the best of luck. Especially since for many unscrupulous businesses, ratings in search engines directly translate to dollars.
Just disqualify the money element... (Score:2, Interesting)
Thanks,
Travis
forkspoon@hotmail.com
Re:Just disqualify the money element... (Score:2)
Users who perennially search outside of corporate sites could customize their settings so that corporate sites are only included when they explicitly select them. Could it work? I don't know.
Re:Just disqualify the money element... (Score:4, Insightful)
Google already has a 'customized' interface that allows users to do things like change language, etc...
I think the suggestion of separating corporate and non-corporate searches has its merits. I hate searching for an anime fanfiction and being directed to Best Buy's website because they happen to carry the anime title I mentioned in the search query.
It has its problems too, however. Tagging each of the pages in Google's truly massive search database with a corporate or non-corporate tag is a non-trivial problem. For obvious reasons, website owners cannot be trusted to tag their own pages.
You're also opening a can of worms here, since many website owners will protest either a commercial or a non-commercial tagging.
Even if you tagged sites by domain, you'd still have hundreds of thousands... possibly millions of domains, not to mention sites that carry both corporate and private content like Geocities, Tripod, or other free webhosts.
Then you have to consider what to do with semi-for-profit pages. Many pages have 'tip jars' now. Many open-source software development pages have information about for-profit works, or are developed by for-profit organizations. Should companies like Red Hat be excluded from non-profit searches? Probably. How about Ogg Vorbis? That's not nearly so clear. How about web comics, almost all of which give away their content freely but sell merchandise, dead-tree books, or other premiums?
In the end, I think I'd rather put up with sorting through twenty or so highly relevant results to find the one I wanted than have to search twice to make sure I got all the possible relevant results.
Unscrupulous Businesses? (Score:5, Funny)
Re:Unscrupulous Businesses? (Score:5, Funny)
Re:Unscrupulous Businesses? (Score:4, Funny)
As a social misfit, disgruntled programmer, and militant loather of all things e-business, let me assure everyone that I will personally make your businesses appear on fuckedcompany in one (1) month.
Thank you.
Re:Unscrupulous Businesses? (Score:2, Insightful)
Now, if one were to write Code Red III which forms a distributed network and ranks up websites that are injected into the network...
a problem worth solving... (Score:2, Flamebait)
The problem is.. (Score:2, Interesting)
Re:The problem is.. (Score:2, Informative)
Nothing but abuse... (Score:2, Insightful)
I wish it would work, but it will be an abysmal failure... in fact it wouldn't surprise me if some corporations hire people just to "vote" for their sites...
Just look at ANY top 50/100 voting site and you'll know what I am talking about.
Re:Nothing but abuse... (Score:2)
Re:Nothing but abuse... (Score:2, Insightful)
Re:Nothing but abuse... (Score:2)
With a google-written client communicating with their server, they should be able to come close (or at least make it very difficult to vote twice). There are lots of techniques that could work...dynamically generated keys, encryption, etc.
"If you think encryption will solve your problem, you don't understand encryption and you don't understand your problem." Bruce Schneier
Anything that could be done by your hypothetical client could also be done by a person who has used a debugger on it. It's just not theoretically possible to prevent something like that with an authentication key embedded in the software. Or per-client keys...remember, they have to get the key somehow. How do you restrict it to one key per person? That's back to the original problem.
Ugh, ugh, ugh. (Score:2, Insightful)
This is particularly repugnant, especially given the goals set in the article (Google wants to make the search engine process more of a democracy, etc.) Is anybody else tired of soulless marketdroids essentially destroying all the good things that are the Net(C)(TM)(R)?
On the bright side, maybe there's room to add Slashdot-styled moderation and meta-moderation to Google rankings - imagine a "+1 Funny" rank for the Onion or a "-1, Offtopic" page rank for every time you go surfing for something honest and end up at Yet Another pr0n Site.
Re:Ugh, ugh, ugh. (Score:2)
And imagine a "-1, Offtopic" page rank for every time you go surfing for pr0n and end up at Yet Another Honest Site!
Google Attack Engine (Score:5, Informative)
http://www.theregister.co.uk/content/6/23069.ht
Plenty of room for abuse (Score:2, Interesting)
Options (Score:4, Insightful)
Better yet, they could have a slashdot-like user customization mechanism (i.e., where the user can set the threshold and moderate/vote a search result in many ways).
Anyway, I wish them luck too (Google rules).
Vote early and often. (Score:2, Insightful)
How you do it: After putting the page up, write a tool to hit google's voting engine over and over and over... giving yourself good ratings.
Question: How would the system prevent this type of abuse from happening - especially the opposite approach - rating competitors' sites poorly to drop them in the list?
Devil's Advocate Question: If you don't allow this abuse to occur, doesn't that then unfairly give extra ranking to sites based on age? A new site won't have accumulated as many votes as an old one yet, and so the ranking would always favor old (and likely to be out of date) sites over new ones.
Re:Vote early and often. (Score:2, Interesting)
Also, I think they would know a thing or two about normalizing the data to correct for the age of a site.
Re:Vote early and often. (Score:2)
If I were Google, I would just keep a table mapping IP addresses to voting records. So your second vote for a page merely replaces your first vote, instead of counting as another vote. Would that be enough?
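In code, that table could be as simple as this (a toy Python sketch of the idea; persistent storage and NAT/proxy headaches left aside):

votes = {}  # (ip, url) -> +1 or -1

def cast_vote(ip, url, value):
    # a second vote from the same IP simply overwrites the first
    votes[(ip, url)] = value

def score(url):
    return sum(v for (ip, u), v in votes.items() if u == url)

cast_vote("10.0.0.1", "example.com", +1)
cast_vote("10.0.0.1", "example.com", +1)  # replaces, doesn't add
print(score("example.com"))  # prints 1, not 2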
Re:Vote early and often. (Score:2)
Re:Vote early and often. (Score:2)
A) Insert a unique coded cookie to mark voters
Cookies can be deleted.
B) Prevent multiple votes from the same IP address or NIC address (allowing for NAT servers)
IPs can be spoofed and some companies own a LOT of IPs.
C) Instead of counting every vote, select random votes from different internet demographics, similar to TV rating methods.
The TV ratings systems that I know of require a lot of demographic information about the people doing the ratings. I know that I wouldn't be willing to give out information about myself in order to rate some website. Others probably would though, so who knows how that would turn out.
How this new system might *reduce* abuse (Score:5, Interesting)
What I wanted then was a "moderate" button I could click beside the link to indicate that it was spam. With a voting system like this, Google could locate and remove spam a lot quicker. Maybe that's what this is all about.
Doug Moen.
Re:How this new system might *reduce* abuse (Score:3, Interesting)
Great, but .. (Score:5, Insightful)
To establish such a system, Google needs to get users to create accounts. A more feasible solution may be cooperation with instant messaging providers, using their identity pools and friends lists as filter criteria. But if they want people to create accounts, they need to turn Google into a community. The first step toward that would be an automatic discussion forum for every major website.
That, again, would create a lot of traffic, so they might be better off using a peer-to-peer app residing on users' systems instead, which would also allow them to add website-specific real-time chat, file sharing, micropayments and other nifty things. It would also make it easier to create responsive user interfaces, which is always a problem with web UIs.
Re:Great, but .. (Score:2)
Which would require user accounts, as you said, but I wouldn't have any problem at all with having a Google cookie on my browser. Once you've got that, then maybe something like Amazon.com's system would work without setting up explicit user groups: "The following pages were high-ranked by users whose page rankings were similar to yours: ..."
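A crude sketch of what that could look like under the hood (invented data; a cosine-style similarity over vote vectors stands in for whatever Amazon actually does):

from math import sqrt

ratings = {  # user -> {page: +1 (happy) or -1 (sad)}
    "you":   {"a.com": 1, "b.com": -1, "c.com": 1},
    "alice": {"a.com": 1, "b.com": -1, "d.com": 1},
    "bob":   {"a.com": -1, "b.com": 1},
}

def similarity(u, v):
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][p] * ratings[v][p] for p in common)
    return dot / sqrt(len(ratings[u]) * len(ratings[v]))

def recommend(user):
    # score unseen pages by how much similar users liked them
    scores = {}
    for other in ratings:
        if other == user:
            continue
        sim = similarity(user, other)
        for page, r in ratings[other].items():
            if page not in ratings[user]:
                scores[page] = scores.get(page, 0) + sim * r
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("you"))  # ['d.com'] -- liked by the user most like you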
Re:Great, but .. (Score:2)
Certainly the current Google ranking system (which counts incoming links) should always carry greater weight in any combined approach.
Re:Great, but .. (Score:2, Insightful)
I wouldn't call this working. Not in a million years would I say that. It depends entirely on the mentality of the original moderators. Dissident opinions (or in Google's case, if they were fucking stupid enough to implement such a system, sites promoting different views) from the original moderators are not modded up, thus you never gain the ability to moderate. Meta-moderation is a hack job at best, a fucking beast of a problem at worst. And nothing fixes the original problem of Slashdot's moderation system. Groupthink is promoted, dissident views are demoted, no matter how well reasoned. You don't believe me? Check the score of this post after a few hours.
Re:Great, but .. (Score:2)
Re:Great, but .. (Score:2)
And who is driving your browser? I may be a sh*thead (people that know me well know that is a fact, you just got a lucky guess), but I drive my own browser and point it where I want to go.
By the way, read my comment more carefully and think "math". (Perhaps you are a sh*thead too.)
Re:Great, but .. (Score:2)
Re:Great, but .. (Score:2)
i mean, if I had then I wouldn't have had to bother reading your post, would I? (unless it gets modded up in the future)
Re:Great, but .. (Score:3, Insightful)
1) The antitrust actions against Microsoft were gross abuses of government power.
2) RMS is fundamentally mistaken on the nature of property. The case for intellectual property is in fact stronger than the case for physical property, since IP is entirely the product of the creator's labor, while physical property includes preexistent matter to which no one can claim a natural right.
3) Money doesn't corrupt governments; governments corrupt money. "Unchecked corporate power" isn't a problem on its own. The problem is that whenever a government allows itself to move beyond laissez-faire, it creates an incentive for the entrenched corporate powers to pay for regulations and laws that protect them and squash competitors.
Re:Great, but .. (Score:2)
Re:Great, but .. (Score:2)
I'm not interested in taking the time to see how other people rated other sites
If Google stored a cookie with your ID and the visited page was smart enough to include a special "Google Moderate" link, you could easily and quickly rate a site when you visit it. If it is easy, you might do it.
I would welcome that ranking option as something that I could turn on or off on Google when I do a search.
Re:Great, but .. (Score:2, Funny)
Judge: OK, Bill, in addition to spreading Windows more effectively than finely ground anthrax in a crop duster over Los Angeles, you are also going to have to allow Google to integrate their services with your operating system.
Bill: Damn, I'm good.
Re:proxy based recommender system (Score:2)
The main problem with doing such things over a proxy is the lack of control you have over Google's database. While they can easily filter or sort their search results by any new criterion, you cannot. What you could do is have a central database of search phrases and related rated sites (which could have some semantic intelligence; still, I don't think you would get many hits on anything besides "sex" and "mp3").
This database could then be accessed through UI controls that the proxy, which could be remote or local, adds to the search result pages of major search engines. As you did a search on one of the supported engines, the proxy would also query the db and add the highly rated sites to the top of the result list. Trust could be implemented in a similar fashion, but the more complex your system gets, the more awkward the proxy method becomes, especially when Google changes their output format.
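The merge step itself is simple enough; roughly (illustrative Python, made-up data):

ratings_db = {  # phrase -> {url: average rating}, purely illustrative
    "mp3": {"goodarchive.example": 4.8, "spamsite.example": 1.2},
}

def merge(phrase, engine_results, threshold=4.0):
    rated = ratings_db.get(phrase, {})
    top = [u for u, r in sorted(rated.items(), key=lambda x: -x[1])
           if r >= threshold]
    return top + [u for u in engine_results if u not in top]

print(merge("mp3", ["spamsite.example", "other.example"]))
# ['goodarchive.example', 'spamsite.example', 'other.example']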
My preferred method would be making the rating controls and display independent from any particular site, having a small system-tray application instead that allows you to rate the currently viewed site (also filing it in a certain category -- fast UI is essential here). Then you could "browse recent ratings" and add users who share your tastes to your trusted user list manually, or let the server tell you about users who have rated things similarly to you, which would allow you to "browse recent ratings by friends" or "browse recent ratings by friends in category X". This concept could, of course, be extended to other document types.
Might work if... (Score:5, Interesting)
Re:Might work if... (Score:2, Insightful)
It'd be good to see something resembling peer review on the web after all.
Why not just monitor clickthroughs? (Score:2)
Why not just monitor which links searchers choose?
Re:Why not just monitor clickthroughs? (Score:2)
Re:Why not just monitor clickthroughs? (Score:3, Informative)
My only problem with the current implementation is that it's only supported in MSIE. It uses the google toolbar app. If only there were a google toolbar for Mozilla.
Re:Why not just monitor clickthroughs? (Score:5, Interesting)
Re:Why not just monitor clickthroughs? (Score:2)
LS
Re:Why not just monitor clickthroughs? (Score:4, Interesting)
Duh, you can adjust for that.
Compute the expected distribution of click-throughs by position in the result set; any page outdoing that number gets modded up, and any page underperforming it gets modded down...
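For instance (toy numbers, Python):

baseline = {1: 0.40, 2: 0.20, 3: 0.10}  # average CTR per result position

def adjustment(position, impressions, clicks):
    # positive means the page outperforms its slot; negative, underperforms
    expected = baseline[position] * impressions
    return (clicks - expected) / impressions

print(adjustment(2, 1000, 350))  # 0.15: mod it up
print(adjustment(1, 1000, 100))  # -0.3: mod it down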
Re:Why not just monitor clickthroughs? (Score:2)
Re:Why not just monitor clickthroughs? (Score:4, Insightful)
Because that doesn't tell you whether someone clicked and didn't like the page. Just because one person clicks on a junk page doesn't mean that page should be rated higher for the next person.
I like this feature (already implemented) (Score:2)
I do like this feature; it truly shows the worth of a web page in other people's eyes, not just the eyes of the webmaster who created the thing...
Even just flagging bad links (Score:2, Interesting)
I know that their spiders go through the database and verify links, but I'd be willing to bet it takes months to go over it once. Why not let users flag links as broken and have the spider verify/remove those first?
Just cleaning up the broken links could improve the search results.
Help out Project Gutenberg!
Distributed Proofreaders [dns2go.com] http://charlz.dns2go.com/gutenberg
Win32 (Score:5, Informative)
The problem is that this GoogleBar only plugs into Internet Explorer, so *nix geeks won't be able to rate sites.
It consists of small faces on which you click (happy or unhappy).
-J
Googlebar for non-ie (Score:4, Informative)
Well, yes and no. There is currently a project [mozdev.org] on Mozdev that aims to duplicate some if not all of the functionality of the toolbar for Mozilla, and while the current version 0.4 is still somewhat lacking, a new version that duplicates the look as well as the major search functionality (though not pagerank etc) is on the way soon, apparently. However, since this is an independent project and not affiliated with Google, I'm not sure if it would be able to access the rating system. Still, Mozilla users DO have the toolbar, and, since mozilla is cross-platform...
Re:Win32 (Score:2)
This is not true. Or rather, while it *is* true that the happy/sad face voting feature is only available on the toolbar, which is only available on Win32, I have gotten search results recently from Google which included at the bottom a questionnaire about the accuracy of the search results, allowing me to rank various items. I don't know if it was a temporary thing or a random selection, but it was on the page itself, and I was on a Mac. No toolbar, but still soliciting user feedback.
Other neat (cough) features: (Score:2, Troll)
This is truly idiotic, since robots.txt has never been a default part of any web server installation I've ever done; creating the file is completely voluntary, and every webmaster should be WELL AWARE of what it does (by virtue of the fact that they had to create it). I mean, duh, guys.
Yeah, so I'm off topic. But I just got the spam this morning, and I used to respect Google quite a bit. Witnessing them resorting to spam emails begging us to let them spider our sites really tarnished their image, so let me rant a little. :p
Oh, and let's not forget Google suggesting robots.txt [google.com] as a method to protect sensitive data recently [cnet.com]. It'd be nice if they could decide whether they want us to create robots.txt or not.
Re:Other neat (cough) features: (Score:2)
I think it's perfectly clear in there: the line that states that if this is your intention, ignore the rest of the email. I can understand your stance, but I hope you take into account that this was simply an informational email, and my guess is a lot of people don't know you can set up a robots.txt entry to allow one crawler but block others.
They still advocate and encourage robots.txt, and I personally find their email handy, well-worded, and non-intrusive. If they change their policy and start emailing these to you on a regular basis, let us know, because then it's violating your personal space and I back your stance 100%. Till then, one email doesn't hurt, and it does provide useful information as well as acknowledging (not begging at all) that you may be perfectly content blocking the crawler bots.
Re:Other neat (cough) features: (Score:2)
> the email.
I think I understand it perfectly. They noticed a website has a robots.txt file, so they sent an email to the webmaster making the webmaster aware the file existed (as if they weren't already), in effect asking us to remove it. It was veiled under the guise of being nice and polite and thoughtful, but they still requested it.
> but I hope you take into account that this was
> simply an informational email -
Informational schminformational. Spam is spam is spam, and for me to remove robots.txt would be directly to their benefit: it makes their engine more accurate, which makes people happier, which makes more customers, which makes them more money.
Just because they worded it nicely doesn't make it less of a spam email.
Re:Other neat (cough) features: (Score:2)
Second point: They never ask you to remove it; they suggest that if you want to let Google search your site, you add a User-Agent: Googlebot entry. Big difference.
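For reference, the robots.txt they're suggesting looks something like this (an empty Disallow means "allow everything" for Googlebot, while every other robot stays blocked):

User-Agent: Googlebot
Disallow:

User-Agent: *
Disallow: /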
Third point: SPAM is SPAM, but Google sending one email to a site is not. It isn't a commercial email, even though more traffic to their site and happier customers do generate revenue. It would be spam if they advertised anywhere in there that you can be listed as a sponsored link for the low, low cost of $4.95 for the first hit and $0.99 for each additional one. It wasn't.
It was unsolicited email, yes. It is not bulk, nor commercial. Therefore it is only 33.3% spam.
Re:Other neat (cough) features: (Score:2)
I dunno, I view these kinds of emails like the DMV notices or USPS notices that get mailed out: benign, pointful (usually), and 99.9% of the time wasteful. However, on the electronic frontier they are caused by marketing dorks. As are most of the problems in the electronic space.
Re:Other neat (cough) features: (Score:2, Insightful)
User-Agent: *
Disallow: /
Think about it - how many people do you think are out there with a half-clue who decide that they want to prevent evil robots from indexing their site without realizing that they therefore won't wind up on search engines? Apparently Google has run into this situation and now e-mails webmasters who have potentially accidentally blocked all robots from indexing their pages.
Now there may be a valid reason to completely block your site from all robots. But think about how pointless it really is - how many webmasters really want to drive away search engines? Most people want to show up on search engines, especially people whose site shows up as a domain (i.e., http://slashdot.org/ as opposed to http://www.wherever.edu/~they/started/).
Seriously, why did you block the entire domain from web crawlers? While there definitely are good reasons, it seems sensible for Google to send an "are you really sure you want to do that" message, especially since the linked "spam" was sent to someone who apparently had four domains blocked off from search engines. This sounds like something that an amateur webmaster may have accidentally done without thinking about the consequences. In which case the e-mail makes sense: "Did you really mean to do that? If so, ignore this message - if not, here's a way to fix it."
I really think you're overreacting to a fairly innocent e-mail.
Re:Other neat (cough) features: (Score:2)
Go read the email before posting lies.
Thank you, have a nice day.
Re:Other neat (cough) features: (Score:2)
Blah blah blah.
The email, which I don't trust 100% to be from Google, says, and this is verbatim:
By pointing this out to the reader, they clearly show that the purpose of the message is to get people to allow the bot to crawl their site. Disguising it as an educational message about the format of the robots.txt file doesn't make it better. The intent, in the end, is to be able to index more pages.
What's your problem? My belief is that you wouldn't think Google had a commercial interest in indexing more pages even if the message had said "THIS IS A COMMERCIAL UNSOLICITED MESSAGE".
Re:Other neat (cough) features: (Score:2)
First off, let's relate directly to the spam definition you posted. Google may (under the assumption it is Google) be sending these out, but it is far from a flood, given that robots.txt files are not a commonplace occurrence on sites and the email correlates the addresses for the different sites it flags. Secondly, there are no counts or reports of this being a repeated mailing, so no action is required of anyone to cease receiving it.
Following the guidelines from spam.abuse.net (What is Spam?), this email really doesn't satisfy any of the direct criteria. I would say it is an annoying notification, similar to when you install Windows (my girlfriend got a new laptop with XP; let me tell you, it is irritating) and it prompts you for updates and such. This is prompting you with information once. It's not a deluge that will cost anyone any amount of money (unless you are on a metered line with a 300bps modem). If you read the full What is Spam article, I think you will see that sending one email per server admin, while correlating the email addresses to prevent the same address from getting blasted by 20 emails, doesn't qualify (without that correlation, I would say you'd have a much stronger argument that it is spam).
The commercial interest in no way ties to the recipient of the email, and their organization is in no way structured around the email. Yes, it may help get more sites indexed, but it is a far cry from email marketing. Calling any unsolicited email (which I agree it is) spam is dangerous, because think of it this way: every time you send a resume to a company just to see if they have openings, that's the same thing - yet not spam, right?
Re:Other neat (cough) features: (Score:2)
Thank you sir, that was the response I wanted, earlier in the thread.
Yes, I did indeed read the full What Is Spam? page, and I picked out the first line of it because it comes closest to how I define spam: something that wastes my time and annoys me, takes up space in my inbox, was not requested by me, and is sent out of greed (of any degree). I very seldom get two or more copies of any one spam, so I can't see how the "this is a one time mailing" excuse could be taken seriously in any type of correspondence, spam or not.
Many spams don't tie any commercial interest to the recipient. A letter saying "buy our product" does not tie any commercial interest to the reader. It only delivers the old "give us money" message.
The text that we're talking about follows the same lines. It says that Google can't index the site, implying that it would be a good thing if they could. So, in effect, we have a spam, something that says "you might benefit from using our product". Note that you very possibly will benefit from having your site indexed by Google but, and that's the important part, don't you think that the webmaster would be capable of figuring that out herself?
I don't quite agree. I would argue that you shouldn't send them a CV if you didn't know they had vacancies (many (most) companies publish their job openings on the web).
A concern (Score:2, Insightful)
I have seen all kinds of warez sites that force you to vote in order to get to parts of the site. Others could have frames that forge a vote each time a visitor comes to their site. While this is an intriguing idea, I don't see how it could work.
The whole idea of Google's PageRank [google.com] was to count each link from another indexed site as a vote. What was wrong with that scheme? Doesn't everyone currently think Google is the best engine out there? If so why "fix" it?
I like the suggestion someone else made about showing the vote results but not having them actually affect the search results.
Taco fumbles again! (Score:2, Insightful)
The Problem (Score:2)
The existing Google ranking system is already exploited by users who set up hundreds of dummy sites that all link to a certain site using a variety of keywords, thus feeding the G! machine bogus "popularity" information.
A ranking system will just make this easier to do. Your average skript kiddie could easily bombard Google with a heap of "Yeah this is great!" ratings for his site, thus bumping it up many notches.
User-ranking systems work only as long as there's no huge incentive to game them. Slashdot doesn't have *too* many problems because nobody really cares that much if they get rated "+6 - Rad!"... however, there's a much greater motivation to have one's website come up tops in one of the most popular search engines...
Give Google some credit (Score:5, Insightful)
Think about it. According to the article, the system is currently just collecting information; it isn't affecting rankings -- yet. So in a couple of weeks Google will look at this new data, look at the corresponding pages, then figure out what should be done. Why are we assuming that they will just do a linear mapping between the number of happy faces and relevance?
I wouldn't put it past them to map relevance dynamically with a far more complicated function. User rankings are just another non-random data stream, and all information (even negative information) is useful, as long as one strips it of its labels and looks at it blindly. Can you say neural networks?
Re:Give Google some credit (Score:2, Interesting)
Re:Give Google some credit (Score:4, Funny)
This is not useful, but there are alternatives... (Score:2)
What is more interesting is what a few companies have been doing recently in the search engine world (there really still is business after the dot-com fallout, even if it isn't profitable). At my work, we recently looked into a product by a company called Recommind. Their search engine was able to find similar words in documents, and could give you related documents that didn't share any keywords. It could even distinguish between java (the coffee), Java (the language), and Java (the island near Jakarta)! Pretty cool stuff. Combine that type of "concept matching" instead of "keyword matching" with Google's technology, and you've got the next-generation search engine.
All very cool stuff. I hope they don't kill it.
They will likely have the same problem as mp3.com (Score:2, Interesting)
Metamod? (Score:2)
Especially since for many unscrupulous businesses, ratings in search engines directly translate to dollars.
But we've all seen first hand how easy it is to stop unscrupulousness through meta-moderation!
How to make automated votes expensive (Score:5, Interesting)
It's not that hard to make it really expensive to forge votes. For instance, check out the captcha project [captcha.net] at CMU. (Basically, it generates images that are difficult for a computer to recognize, but easy for a human, and challenges the user to respond to them in some way to prove that they are human.) If they could find the right balance of convenience for humans and difficulty for Perl scripts, I think they'd have a great thing going. I have always wanted this feature in a search engine.
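The image side is easy to sketch (a toy Python version using the PIL imaging library; real captchas distort far more aggressively than this):

import random, string
from PIL import Image, ImageDraw, ImageFont  # pip install pillow

def make_captcha(length=5, size=(160, 60)):
    text = "".join(random.choices(string.ascii_uppercase, k=length))
    img = Image.new("RGB", size, "white")
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()
    for i, ch in enumerate(text):  # jitter each character vertically
        draw.text((15 + i * 28, 20 + random.randint(-8, 8)), ch,
                  fill="black", font=font)
    for _ in range(8):  # noise lines to frustrate naive OCR
        draw.line([(random.randint(0, 160), random.randint(0, 60)),
                   (random.randint(0, 160), random.randint(0, 60))],
                  fill="gray")
    return text, img

answer, img = make_captcha()
img.save("captcha.png")
# The server keeps `answer`; the vote only counts if the user types it back.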
Disabilities interfere with these tests (Score:3, Interesting)
For instance, check out the captcha project [captcha.net] at CMU.
I looked at captcha and found that it may generate problems with disability legislation in some jurisdictions. For instance:
The only accessible test (fbw) doesn't always work, and the other three are not accessible to those with disabilities. Watch somebody get sued under the ADA.
Re: (Score:2)
They aren't stupid (Score:5, Informative)
Rather than using the votes to tinker with the specific rankings of particular pages or sites, he said, the feature would most likely be used to bolster the relevance of overall results.
"It will most likely have more of an aggregate impact," Krane said. "We have indexed more than 1.6 billion Web pages, so it is extremely inefficient to go after individual pages."
Also remember that this is only one of many of Google's tools to improve relevance. You can already do your part to stop spammers by reporting them to search-quality@google.com. [mailto]
what would be useful... (Score:2, Interesting)
e.g., my site used to be called '/dev/random', but i changed the name when i realized that it was in the search engines for that term and that most people who were searching for '/dev/random' probably weren't looking for my weblog. i'd love to have some kind of 'anti-keyword' meta tag that i could use to tell the googlebots that i'd rather not be associated with that search term anymore.
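it might look something like this (completely made-up syntax; no engine actually supports such a tag):

<meta name="anti-keywords" content="/dev/random">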
i know... somewhat off topic and boring... sue me.
Meta-Rating (Score:4, Insightful)
Oh, by the way, if you're already a Slashdot moderator and want to know if you can Meta-Moderate, just check
-J
Re:Meta-Rating (Score:2)
Google Meta Mod? (Score:3, Funny)
Another piece of the Global Brain (Score:3, Informative)
It's all part of the process of creating a more "intelligent" web.
OpenDirectory (Score:5, Insightful)
Re:OpenDirectory (Score:2)
Come on, other open source projects can get good work done with volunteers. Anybody have an insight into why Open Directory apparently cannot?
Because Open Directory turns away volunteers. A lot of them. I've offered to edit four times now, all in neglected categories that were missing very obvious sites (that were nowhere else in Open Directory). I've been turned down every time with no explanation. I've submitted a dozen sites to perfectly appropriate categories and not one of them has appeared. Not one. Heck, the international company I work for, which is a fairly well-known partner of IBM, isn't in Open Directory at all, despite me submitting them twice. On the plus side, half of our competitors aren't there either, despite me submitting a few of them.
Has this been patented? (Score:2)
Epinions Web of Trust (Score:3, Interesting)
Basically, you can see which users rated an article as useful. If you think certain people have tastes similar to yours, you put them in your Web of Trust. You'll get articles presented in a different order depending on whom you trusted.
It is actually more complicated than that, as there are Epinions "Experts" who are judged by Epinions to give good ratings. I think Amazon has a similar system (and way more users, but the system still seems to work OK).
The big problem is that the internet at large has so many bloody users and so many bloody pages... I think introducing groups of users, or groups of groups, that you trust might be a better way for the Web of Trust idea to scale to the internet at large.
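In miniature, the ordering might work like this (invented names and data; Python):

trust = {"you": {"alice", "bob"}}  # whom each user trusts
useful_votes = {                   # article -> users who rated it useful
    "review1": {"alice", "carol"},
    "review2": {"alice", "bob"},
    "review3": {"carol", "dave"},
}

def order_for(user):
    # articles endorsed by more of *your* trusted users come first
    t = trust.get(user, set())
    return sorted(useful_votes, key=lambda a: -len(useful_votes[a] & t))

print(order_for("you"))  # ['review2', 'review1', 'review3']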
Google makes a subtle statement about drugs (Score:2)
However, recently the ad which appears for marijuana [google.com] changed to NewScientist.com [newscientist.com], a science journal which has been publishing much more balanced and thorough information on weed, some of which advocates that weed is less dangerous than alcohol. Also the top result is NORML [norml.org], a legalization-advocacy group. (This is probably not due to tampering w/ the search engine, but is interesting)
I believe that The Powers That Be within Google have taken the more moderate, academic drug stance, as opposed to gov't-sponsored propaganda. Google's pretty influential, Internet-culture-wise. Food for thought.
(Offtopic, sort of, i know, but I saw a Google story and had to run with it!)
Possible solutions to search-spam (Score:2, Interesting)
2) Use the Yahoo!-style system of requiring you to type in the word from an image in order to create an account. Keep changing the way the image is formed. This should *help* prevent automated account creation.
3) Give people a certain number of points per day / week / month (ala
4) Make it so that everyone has to balance out +ves and -ves - that is, somehow make sure that they can't just do one or the other.
5) Make it so that each account can only rate a particular site once. Now this requires quite a bit of storage, because you've got to store every individual rating instead of just a counter, but that way you can prevent multiple ratings of some corporate site.
Note that this precludes the idea of rating a site based on how appropriate it is for a particular search, which is admittedly one of the really exciting parts of this (that is, if I search for Transistors and get www.electronics.com, then I rate it 'Good'; if I search for Open Source and get www.electronics.com, then I rate it 'Bad').
With the once-per-site system I instead just rate www.electronics.com according to how good the site is, not how relevant it is. Maybe that's what they're aiming for, maybe it's not.
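Both variants fit the same shape of table; roughly (illustrative Python):

ratings = {}  # key -> {account: +1 or -1}

def rate(account, key, value):
    # one rating per account per key; re-rating overwrites (item 5)
    ratings.setdefault(key, {})[account] = value

rate("u1", "electronics.example", +1)                   # site-only scheme
rate("u1", ("transistors", "electronics.example"), +1)  # per-query scheme
rate("u1", ("open source", "electronics.example"), -1)  # same site, bad fit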
I think that would help stop it but it all depends on the security of the account creation process - if it's easy to spam then the whole system becomes a waste of time.
It also doesn't prevent the problem of people being paid for ratings, which is possible, or for a company getting every single one of its employees to vote for the company. Thinking about that, one solution could be to just say that a company's rating can't go above a certain level and can only increase at a certain speed.
Or you could have metamoderation. This sounds more and more like Slash based code all the time!
It won't be too hard (Score:2, Interesting)
Another (Different) Rating Method (Score:3, Interesting)
If Google (or another search engine) set up all result links to go through an internal Google page that quickly redirected the user to the target site, it could rank based on how many people actually visited each site, instead of on potentially biased user ratings.
Of course, shady websites could still influence it, either by hitting the pages themselves or by crafting their page so that the Google-selected snippet is tempting to search engine users, but the system still has the advantage of not requiring active participation from users.
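A toy version of such a redirector (Python standard library; a real one would batch its logging and sanity-check the target URLs):

from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

hits = {}  # target url -> visit count

class ClickCounter(BaseHTTPRequestHandler):
    def do_GET(self):
        # result links point at /click?u=<target>; log, then bounce onward
        q = parse_qs(urlparse(self.path).query)
        target = q.get("u", ["/"])[0]
        hits[target] = hits.get(target, 0) + 1
        self.send_response(302)
        self.send_header("Location", target)
        self.end_headers()

HTTPServer(("", 8000), ClickCounter).serve_forever()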
Just my $.02
Re:Good idea but... (Score:2)
Re:Good idea but... (Score:2, Insightful)
The article clearly states that Google will use the results to supplement, not replace, current methods. So, if someone wishes to manipulate the results, they will have to combine several forms of cheating to succeed.
The article also states that methods will be used to prevent this sort of abuse, though Google doesn't say what they are (for obvious reasons -- why do the spammers' work for them?).
But there are obvious ways to defeat abuse. One is IP matching, culling results originating from a single domain. Another would be to use only a random representative sampling of votes, rather than every vote, in counting results. Another is simple human oversight (or good AI) looking for unusual ranking changes.
Google's been great so far in avoiding the crapfloods. I doubt if they'd cut their own throats. The fact that they are testing this technology rather than just rolling it out is a good sign. When's the last time you heard of a search engine testing before implementation?
Barely-relevant anecdote:
The year that Excite debuted, I found my own credit card number, expiration date and phone number in their database. By pattern matching I found the same for a couple of dozen other people who had all patronized the same online bookstore (idiots momentarily had their customer database on the webserving machine, excite's spider found it).
It took about a week to find someone at Visa who knew what the Internet was (a security VP). He informed me that Excite had been designed with no means to edit the database. I found that hard to believe -- still do -- but my personal info remained findable for several weeks thereafter.
Re:Good idea but... (Score:3)
Personally I think their system ain't broke, though, so why fix it?
Re:Good idea but... (Score:2)
Re:Good idea but... (Score:2)
Re:Good idea but... (Score:2)
Re:The demise of a good search engine? (Score:2, Insightful)
Re:The demise of a good search engine? (Score:2, Interesting)
here it goes. (Score:3, Informative)
What's funny about this is that the search engines already know this. The Marketing Director at the company I work for [centric.com] told me that this hasn't worked in a couple of years. Some engines send a second agent out to see if the page at that link is the same as the one that got indexed. I think this is a case of whoever wrote the article not being up to date on search engine technology.
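The second-agent check is easy to picture; something like this (illustrative Python; note a naive version like this also trips on any dynamic page):

import hashlib
from urllib.request import Request, urlopen

def fingerprint(url, user_agent):
    req = Request(url, headers={"User-Agent": user_agent})
    return hashlib.md5(urlopen(req).read()).hexdigest()

def looks_cloaked(url):
    # fetch once as a "bot" and once as a "browser", then compare
    as_bot = fingerprint(url, "FriendlyBot/1.0")
    as_human = fingerprint(url, "Mozilla/4.0")
    return as_bot != as_human

print(looks_cloaked("http://example.com/"))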
1st Ranked on "web page cloaking" for money (Score:2)
Hmm, well, before you posted with a description of "cloaking", I ran a Google search on - web page cloaking - [google.com] and got this result as the first hit:
Website,web page cloaking and stealth technology [webprominence.co.uk]
Which is some company trying to make money from doing this. See the next page in their pitch: http://webprominence.co.uk/promotion/costs.htm
Now, if cloaking is easily defeated by a second bot, and the Googlebot has this, shouldn't they at least put _their own_ link in the top slot just saying, "by the way, this kind of stuff is dishonest, misleading and doesn't work" [and thus possibly a fraud]?
Yeah, sure, no-one's going to do this for the n categories across 1.6bln indexed pages for every possible scam, but I'd have thought this kind of thing would be obvious "customer protection / advice" on Google's part. I'd _want_ people to know my search engine couldn't be scammed (at the very least not by techniques I could defeat while indexing a possible cheater).
On second thoughts, I'd be just as happy to let the lame would-be tricksters / cloakers / whatever waste their money.
On the other hand, isn't it that kind of thought (as I just had) which has left the whole web / Internet up for grabs by the slickest over the dumbest, and which ultimately hurt the bright people who cared?
Moderation and meta-mod on the scale of the *web*???!!! Man, that sounds crazy to me. Slashdot scaled n-fold... Can't bear to think about it... gotta go...