Journal Sanity's Journal: Removing bias in collaborative editing systems
First, what do we mean by "bias"? It is a difficult question to answer exactly; examples would include political left or right-wing bias, nationalist bias, anti-Microsoft bias, and bias based on race. The dictionary definition is "A preference or an inclination, especially one that inhibits impartial judgment." Implicit in the mechanism I am about to describe is a more precise definition of bias; it is the aptness of this definition that will determine the effectiveness of this approach.
Visitors to websites such as Amazon and users of tools like StumbleUpon will be familiar with a mechanism known as "Automatic Collaborative Filtering" or ACF. Amazon's recommendations are based on what other people with similar tastes also liked; this is an example of collaborative filtering in action. There are a wide variety of collaborative filtering algorithms, ranging widely in sophistication and processor requirements, but all are designed to do more or less the same thing: anticipate how much you will like something based on how much similar people liked it. One way to look at it is that collaborative filtering tries to learn your biases and anticipate how they will influence how much you like something.
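As a rough illustration, the core of user-based collaborative filtering can be sketched in a few lines of Python. The toy ratings data, item names, and function names below are invented for illustration; real systems use more robust similarity measures and much larger matrices.

```python
# A minimal sketch of user-based ACF: find users with similar voting
# histories, then predict a vote as a similarity-weighted average of
# their votes on the item in question.

def similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sum(a[i] ** 2 for i in shared) ** 0.5
    norm_b = sum(b[i] ** 2 for i in shared) ** 0.5
    return dot / (norm_a * norm_b)

def predict_vote(ratings, user, item):
    """Predict `user`'s vote on `item` from similar users' votes."""
    num = den = 0.0
    for other, votes in ratings.items():
        if other == user or item not in votes:
            continue
        w = similarity(ratings[user], votes)
        num += w * votes[item]
        den += w
    return num / den if den else None

# Toy data: "ann" votes just like "joe"; "bob" shares no rated items.
ratings = {
    "joe": {"a": 1, "b": 2},
    "ann": {"a": 1, "b": 2, "story": 1.5},
    "bob": {"c": 5, "story": 4.0},
}
print(predict_vote(ratings, "joe", "story"))  # → 1.5
```

In practice, production recommenders mean-center each user's votes and use Pearson correlation or latent-factor models rather than raw cosine similarity, but the weighted-neighbor structure is the same.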
My idea was to use ACF to determine someone's bias towards or against a particular article, and then attempt to remove the effect of that bias from their vote. The effect of their bias is assumed to be the difference between their anticipated vote based on ACF, and the global average vote for that article. Having determined this, we can then take their vote, and remove the effect of their bias from it.
Let's look at how this might work in practice. Joe is a right-wing Bill O'Reilly fan who isn't very good at setting aside his personal views when rating stories. Joe has just found an article discussing human rights abuses against illegal Mexican immigrants. Joe, not particularly sympathetic to illegal Mexican immigrants, gives the article a score of 2 out of 5. On receiving Joe's rating, our mechanism uses ACF to determine what it might have expected Joe's score to be. It notices that many of the people who tend to vote similarly to Joe (presumably also O'Reilly fans) also gave this article a low score, meaning that according to our ACF algorithm Joe's expected vote was 1.5. Now we look at the average (pre-adjustment) vote for the story and see that it is 3. We then take Joe's anticipated bias for this story to be 1.5 minus 3, or -1.5. Subtracting this bias from Joe's raw vote of 2 gives an adjusted vote of 3.5 - which means that Joe's vote for this story is actually above average once his personal bias has been disregarded!
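The adjustment above reduces to a single line of arithmetic. Here is the Joe example as a small sketch, assuming a 1-to-5 scale; `acf_predicted` stands in for whatever value the ACF algorithm produces, and the function name is mine, not part of any real system.

```python
# Bias is the gap between the ACF-predicted vote and the global
# average; the adjusted vote subtracts that bias from the raw vote.

def remove_bias(raw_vote, acf_predicted, global_average):
    bias = acf_predicted - global_average
    return raw_vote - bias

# Joe's case: raw vote 2, ACF predicted 1.5, global average 3.
adjusted = remove_bias(raw_vote=2.0, acf_predicted=1.5, global_average=3.0)
print(adjusted)  # → 3.5: Joe's estimated bias of -1.5 is removed
```

Note that the same formula rewards a voter whose raw vote runs against their predicted bias (as Joe's did here) and dampens a vote that merely confirms it.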
So, how well will this system work in practice - and what is it really doing? What are the implications of this mechanism for determining someone's bias? Is it fair?
I don't pretend to have the answers to these questions, but it might be useful to think of it in terms of punishment: when your vote is adjusted by a large amount, you are being punished by the system, since your vote has an effect different from the one you intended.
The way to minimize this punishment is for the votes the ACF algorithm predicts for you to be as close as possible to the likely average vote. The worst thing you can do is to align yourself with a group of people who consistently vote in opposition to the majority.
I have been trying to think of scenarios where encouraging the former, or penalizing the latter, might be harmful, but so far I haven't come up with anything. What kind of collective editor would such a system be? What kind of negative side effects might it have? I am curious to hear your opinions.
Why do you want to remove bias? (Score:2)
This prevents that... (Score:2)
Re:This prevents that... (Score:2)
That's a different problem (Score:2)
You still don't get it (Score:2)
No it doesn't (Score:2)
Ok, so you are not removing bias (Score:2)
No (Score:2)
Also - what does "win" mean? That assumes a binary outcome - which isn't the case in many situations. What if stories were presented to the user in the order of their score?
Re:No (Score:1)
Say you and four other people always vote down stories written by authors with long last names (to pick a silly example). So the system should predict your likely vote for an article based on what the four other short-namers voted. This pattern continues for a long time.
Suddenly you start reading a lot of articles about baseball. You still always vote with the short-namers, except on articles that mention astroturf, in which case you alwa
You misunderstand what the ACF is doing (Score:2)
I think you are getting confused because you forget that ACF finds biases that you share with other people - an ACF would be meaningless if it only looked at your behavior in isolation.
Basically ACFs work by finding other people that share your apparent voting preferences, looking at how they voted on the new thing, and using their vote to anticipate what your vote is likely to be.
If there were a group of people that shared your lon
Re:You misunderstand what the ACF is doing (Score:1)
Re:You misunderstand what the ACF is doing (Score:1, Troll)
Re:No (Score:1)
Some bias is good (Score:2)
Yes, you heard me correctly. A bias toward logical, informative, objective posts is one we should encourage. The system you describe doesn't seem like it would distinguish this bias from others, to the detriment of the community. Bias is not just measurement error; it's perspective as well. If someone truly loves right-wing posts, that unexpectedly high rating would signal not a particularly good post but rather a particularly right-wing one - resulting in exactly the wrong response for both left-wing an
Re:Some bias is good (Score:2)
This would be a universal pattern in voting, not really a bias: it is unlikely that only a small group of people would vote for such articles while everyone else votes against them, and so it wouldn't be identified as a "bias" as this system understands the term.
kinds of bias (Score:1)
What is the mathematical difference between me voting against an article because I'm thinking "this article is written in a boring style" vs. because I'm thinking "this article is too right-wing for me"?
Maybe you could have a group of trusted moderators to first score each article on one pre-defined axis or another. For example "1 to 10: how strongly anti-Microsoft is this article?" or "1 to 10: how pro-Democrat is this article?" Then when the public vote
Re:kinds of bias (Score:2)
That is what the ACF algorithm does - it looks for patterns in my voting as it relates to other people's voting; nobody has to explicitly tell the algorithm to watch out for particular types of bias.
Because if you think it is boring - then that is likel
Re:kinds of bias (Score:1)
What about polarized issues? Say for some reason nearly all people vote strongly for or against a particular article based solely on their preferred political party. The "average" user will appear to be a moderate because all the left-wing and rig
Re:kinds of bias (Score:2)
In this case the article would be scored based on its merits once the effect of votes motivated by left or right-wing bias had been removed. If your goal is for peo
excellent article (Score:1)
Mental Masturbation. (Score:1)
So, if something is objectively bad (generally agreed to be stupid), it receives a poorer average rating. So then, when each adjusted vote is averaged together, (If it
Over biasing (Score:1)
Re:Over biasing (Score:2)
Wiki (Score:2)
There have been discussions to have a news s
Re:Wikipedia is run by censors (Score:1)