Sanity's Journal: Removing bias in collaborative editing systems

A few weeks ago a friend of mine who had been thinking about reader-edited forums (like K5) posed an interesting question. He was concerned about how people's biases would influence their voting decisions and wondered whether there could be any way to identify and filter out the effects of such bias. Of course, in some situations bias is expected, such as in political elections; in other situations, however, such as when a jury must vote on someone's guilt or innocence, or when a Slashdot moderator must vote on a comment, bias is undesirable. After some thought, I came up with a proposal for such a system.

First, what do we mean by "bias"? It is a difficult question to answer exactly; examples would include political left or right-wing bias, nationalist bias, anti-Microsoft bias, and bias based on race. The dictionary definition is "A preference or an inclination, especially one that inhibits impartial judgment." Implicit in the mechanism I am about to describe is a more precise definition of bias; it is the aptness of this definition that will determine the effectiveness of this approach.

Visitors to websites such as Amazon and users of tools like StumbleUpon will be familiar with a mechanism known as "Automatic Collaborative Filtering" or ACF. Amazon's recommendations are based on what other people with similar tastes also liked; this is an example of collaborative filtering in action. There is a wide variety of collaborative filtering algorithms, ranging widely in sophistication and processing requirements, but all are designed to do more or less the same thing: anticipate how much you will like something based on how much similar people liked it. One way to look at it is that collaborative filtering tries to learn your biases and anticipate how they will influence how much you like something.
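To make this concrete, here is a minimal user-based collaborative filtering sketch in Python. It is only an illustration, not Amazon's or StumbleUpon's actual algorithm: the function names and the choice of cosine similarity are my own assumptions. It predicts a user's vote on an item as a similarity-weighted average of the votes of other users who have rated that item:

```python
from math import sqrt

# Ratings are stored as {user: {item: score}}.

def similarity(a, b, ratings):
    """Cosine similarity between two users over the items both have rated."""
    common = set(ratings[a]) & set(ratings[b])
    if not common:
        return 0.0
    dot = sum(ratings[a][i] * ratings[b][i] for i in common)
    norm_a = sqrt(sum(ratings[a][i] ** 2 for i in common))
    norm_b = sqrt(sum(ratings[b][i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predicted_vote(user, item, ratings):
    """Anticipate `user`'s vote on `item` from the votes of similar users."""
    weighted = total = 0.0
    for other, scores in ratings.items():
        if other == user or item not in scores:
            continue
        sim = similarity(user, other, ratings)
        if sim > 0:
            weighted += sim * scores[item]
            total += sim
    return weighted / total if total else None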

My idea was to use ACF to determine someone's bias towards or against a particular article, and then attempt to remove the effect of that bias from their vote. The effect of their bias is assumed to be the difference between their anticipated vote, as predicted by ACF, and the global average vote for that article. Having determined this, we can take their actual vote and subtract the effect of their bias from it.
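The adjustment itself is tiny. A minimal sketch, assuming the ACF prediction and the global average are already available (the function name is mine):

```python
def adjusted_vote(actual, expected, average):
    """Remove a voter's anticipated bias from their actual vote.

    bias     = expected vote (as anticipated by ACF) - global average vote
    adjusted = actual vote - bias
    """
    bias = expected - average
    return actual - bias
```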

Let's look at how this might work in practice. Joe is a right-wing Bill O'Reilly fan who isn't very good at setting aside his personal views when rating stories. Joe has just found an article discussing human rights abuses against illegal Mexican immigrants. Joe, not particularly sympathetic to illegal Mexican immigrants, gives the article a score of 2 out of 5. On receiving Joe's rating, our mechanism uses ACF to determine what it would have expected Joe's score to be. It notices that many of the people who tend to vote similarly to Joe (presumably also O'Reilly fans) also gave this article a low score, meaning that, according to our ACF algorithm, Joe's expected vote was 1.5. Now we look at the average (pre-adjustment) vote for the story and see that it is 3, so we assume that Joe's anticipated bias for this story is 1.5 minus 3, or -1.5. We use this to adjust Joe's vote of 2, giving an adjusted vote of 3.5 - which means that Joe's vote for this story is actually above average once his personal bias has been disregarded!
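Plugging Joe's numbers into the sketch above:

```python
# Joe's actual vote is 2, his ACF-expected vote is 1.5, and the story's
# average vote is 3, so his bias is 1.5 - 3 = -1.5 and his adjusted
# vote is 2 - (-1.5) = 3.5.
adjusted_vote(actual=2.0, expected=1.5, average=3.0)  # -> 3.5
```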

So, how well will this system work in practice - and what is it really doing? What are the implications of this mechanism for determining someone's bias? Is it fair?

I don't pretend to have the answers to these questions, but it might be useful to think of it in terms of punishment: when your vote is adjusted by a large amount, you are being punished by the system, because your vote will have an effect different from the one you intended.

The way to minimize this punishment is for your votes, as predicted by the ACF algorithm, to be as close as possible to what the average vote is likely to be. The worst thing you can do is align yourself with a group of people who consistently vote in opposition to the majority.

I have been trying to think of scenarios where it might be bad for people to do the former, or bad for them to do the latter, but so far I haven't come up with anything. What kind of collective editor would such a system be? What kind of negative side effects might it have? I am curious to hear your opinions.


Comments Filter:
  • Some things just should not be posted, no matter how excellent they are. This applies to subjects that are of interest to very few, and therefore the rest are biased against those subjects. That bias is useful and should be kept.
    • ...if the story is genuinely only of interest to a minority then this mechanism will actually work against such a story, since it will determine that the minority were voting in a manner that was unusually favorable and remove the effect of their bias.
      • If you just let everyone vote, the majority would automatically send the story off to the eternal bitfields. However, in your case the majority do not give it a lower score than they would have given a random badly-written story on the same subject, so their votes are purely bias, which will be removed. The minority recognizes the quality and votes far higher than it would for a badly-written story on the same subject. Its bias will be removed, but there will still be a positive vote left afterwards.
        • If too many people are voting negatively overall then just have them vote on a curve - that is a different problem to that which this mechanism tries to address.
          • It is not the case that too many people are voting negatively. The story in the example should be voted down. Removing the bias makes that not happen.
            • If a majority is voting the story down then it will be voted down - negative votes will only be disregarded if they are from a minority group who always votes negatively relative to everyone else on that type of story.
              • You are just removing bias that disagrees with the majority bias. What's the point? The majority doesn't need help to win.
                • by Sanity ( 1431 ) *
                  You are redefining "bias" to mean virtually any voting preference. This algorithm removes the influence of votes that are based on an identifiable pattern rather than on the merits of the story.

                  Also - what does "win" mean? That assumes a binary outcome - which isn't the case in many situations. What if stories were presented to the user in the order of their score?

                  • I'm confused about how the ACF algorithm handles multiple biases.

                    Say you and four other people always vote down stories written by authors with long last names (to pick a silly example). So the system should predict your likely vote for an article based on what the four other short-namers voted. This pattern continues for a long time.

                    Suddenly you start reading a lot of articles about baseball. You still always vote with the short-namers, except on articles that mention astroturf, in which case you alwa
                    • Can ACF really detect and predict this combination of bizarre biases?

                      I think you are getting confused because you forget that ACF finds biases that you share with other people - an ACF would be meaningless if it only looked at your behavior in isolation.

                      Basically ACFs work by finding other people that share your apparent voting preferences, looking at how they voted on the new thing, and using their vote to anticipate what your vote is likely to be.

                      If there were a group of people that shared your lon

                    • I guess I didn't phrase my question very well. What if you correlate perfectly with one group half of the time, and with a completely separate group the other half of the time? Does ACF have any way to recognize this kind of complexity?
                      What if you correlate perfectly with one group half of the time, and with a completely separate group the other half of the time? Does ACF have any way to recognize this kind of complexity?
                      In short: yes.
                  • And also assume that there are exactly four other anti-astroturfers who you typically vote with but only on articles mentioning astroturf.
  • Yes, you heard me correctly. A bias toward logical, informative, objective posts is one we should encourage. The system you describe doesn't seem like it would distinguish this bias from others, to the detriment of the community. Bias is not just measurement error; it's perspective as well. If someone truly loves right-wing posts, that unexpectedly high rating would signal not a particularly good post but rather a particularly right-wing one - resulting in exactly the wrong response for both left-wing an

    • A bias toward logical, informative, objective posts is one we should encourage. The system you describe doesn't seem like it would distinguish this bias from others, to the detriment of the community.

      This would be a universal pattern in voting, not really a bias: it is unlikely that only a small group of people would vote for such articles while everyone else votes against them, and so this wouldn't be identified as a "bias" as the system understands the term.

      If someone truly loves right-wing po

  • How do you determine what kinds of bias to track?

    What is the mathematical difference between me voting against an article because I'm thinking "this article is written in a boring style" vs. because I'm thinking "this article is too right-wing for me"?

    Maybe you could have a group of trusted moderators to first score each article on one pre-defined axis or another. For example "1 to 10: how strongly anti-Microsoft is this article?" or "1 to 10: how pro-Democrat is this article?" Then when the public vote
    • How do you determine what kinds of bias to track?

      That is what the ACF algorithm does - it looks for patterns in my voting as it relates to other people's voting, nobody has to explicitly tell the algorithm to watch out for particular types of bias.

      What is the mathematical difference between me voting against an article because I'm thinking "this article is written in a boring style" vs. because I'm thinking "this article is too right-wing for me"?

      Because if you think it is boring - then that is likel

      • So if the set of articles I voted for, or the set of articles I voted against, has a significantly different "shape" from those same sets for the imaginary "average" user, then I am defined as having a bias (or multiple biases... the system can't tell how many)?

        What about polarized issues? Say for some reason nearly all people vote strongly for or against a particular article based solely on their preferred political party. The "average" user will appear to be a moderate because all the left-wing and right-wing votes will cancel each other out, and everybody will be flagged as "biased" by the system.
        • What about polarized issues? Say for some reason nearly all people vote strongly for or against a particular article based solely on their preferred political party. The "average" user will appear to be a moderate because all the left-wing and right-wing votes will cancel each other out, and everybody will be flagged as "biased" by the system

          In this case the article would be scored based on its merits once the effect of votes motivated by left or right-wing bias had been removed. If your goal is for peo

  • Nice work. I'll comment more on it later when I can. I have to pack, though, so not right now.

  • Even assuming that this works perfectly, if you are just looking at the net average rating, it is totally useless. This is because you have to measure bias against what the majority, or others, think. If this is the case, then you are only taking into account the extent to which someone is going to disagree with the average opinion on any issue.

    So, if something is objectively bad (generally agreed to be stupid), it receives a poorer average rating. So then, when each adjusted vote is averaged together, (If it
  • What happens if Joe knows that ACF is in use and rates the story lower than he would normally in order to get the score that he wants to become the post-ACF score?
    • Why would he do that when all he has to do is give it the higher score in the first place? The ACF doesn't do anything that Joe couldn't do himself.
  • Wikis are very good for creating non-biased articles. If there is a socially enforced policy of neutrality, like on Wikipedia [wikipedia.org], controversial opinions are attributed to their adherents. Wikipedia's articles on controversial subjects like abortion and euthanasia are surprisingly neutral. Sometimes one side appears to have the better arguments; in that case, the other side simply has not worked on the article much yet. But the tone of articles is consistently fair.

    There have been discussions to have a news s

"It's the best thing since professional golfers on 'ludes." -- Rick Obidiah

Working...