'Algorithms Are Like Convex Mirrors That Refract Human Biases' (venturebeat.com) 169

Emil Protalinski, writing for VentureBeat: At the Movethedial Global Summit in Toronto yesterday, I listened intently to a talk titled "No polite fictions: What AI reveals about humanity." Kathryn Hume, Borealis AI's director of product, listed a bunch of AI and algorithmic failures -- we've seen plenty of that. But it was how Hume described algorithms that really stood out to me. "Algorithms are like convex mirrors that refract human biases, but do it in a pretty blunt way," Hume said. "They don't permit polite fictions like those that we often sustain our society with." I really like this analogy. It's probably the best one I've heard so far, because it doesn't end there. Later in her talk, Hume took it further, after discussing an algorithm biased against black people used to predict future criminals in the U.S.

"These systems don't permit polite fictions," Hume said. "They're actually a mirror that can enable us to directly observe what might be wrong in society so that we can fix it. But we need to be careful, because if we don't design these systems well, all that they're going to do is encode what's in the data and potentially amplify the prejudices that exist in society today." If an algorithm is designed poorly or -- as almost anyone in AI will tell you nowadays -- if your data is inherently biased, the result will be too. Chances are you've heard this so often it's been hammered into your brain.

This discussion has been archived. No new comments can be posted.


Comments Filter:
  • Pretending... (Score:5, Insightful)

    by Rockoon ( 1252108 ) on Saturday November 16, 2019 @04:07AM (#59419476)
    Pretending that the data is lying.... is living in a fantasy world.
    • by AHuxley ( 892839 )
      What use is creating a new network, system, or service with a CoC that stops the data from getting used?
      Criminals and bad people will get listed because they are criminals.
      From jail, prison, federal, state, city, mil... all the images.
      Add in what CCTV has to report and play back...
      More criminals doing crime.
      What's the AI with a CoC going to do?
      Hide the stats, the faces, the numbers, the results due to a CoC?
      • Data is data. Data cannot be biased. The people behind generating it MAY be biased, but data cannot be.

        My argument is simple. If one were to assume that black people commit crimes more frequently (because of higher chances of incidents leading to arrest, for example), and look at the data and it shows that to be true, it may be that black people really are more prone to committing crimes, or that they are just discriminated against by police officers. We can't know that yet.

        Here comes
        • by HiThere ( 15173 )

          Saying that data cannot be biased is true only in the sense that data does not have agency. In every other sense data not only can be biased, but normally is. The problem is we often don't realize in what ways.

          One example is a disease that, in conjunction with other factors, kills people so quickly they never get to the hospital. As a result the data collected will show those other factors to have no bearing on the course of the disease, or may even show them as positive effects. (This example is based

          • But you missed my point. The advantage of AI and automation is that they can only reduce or eliminate the bias that existed in the data's collection methods. I illustrated this with two examples which should result in less bias than would otherwise have existed.
    • Re:Pretending... (Score:5, Insightful)

      by MostAwesomeDude ( 980382 ) on Saturday November 16, 2019 @04:40AM (#59419508) Homepage

      Okay, but pretending that the data is honest is pretending that objectivity exists. Data is just bits; it's not capable of discerning its own truth. (This is formally known as Tarski's undefinability theorem.)

      More importantly, we interpret data relative to a model. Depending on what the model says, the data might support many different hypotheses. But if the model is garbage, then so is any interpretation. For example, race doesn't exist except as a social construction, but that doesn't stop AIs from believing that race is real. Similarly, there are more sexes than just men and women, but AI designed without knowledge of intersex people will be blind to them and always try to force them into one of two groups.

      • For example, race doesn't exist except as a social construction, but that doesn't stop AIs from believing that race is real.

        Are you claiming social constructions aren't real? And that they don't affect things?

        • Re: (Score:2, Insightful)

          Or sickle cell anemia.
        • By definition, social constructions don't need to correlate to anything external to them in the real world. If you stopped actively supporting them, they would stop affecting things instantly.

          • by green1 ( 322787 )

            Incorrect, they only stop affecting things if EVERYONE stops actively supporting them, but even then, some can have lasting effects for generations afterwards.

            If some group is actively demonized for a time and as a result denied opportunities for education or employment, and then has the normal resulting social problems from that situation, even if every single person stops doing it, the people from that group will still be at a disadvantage because they will be poor, undereducated, and with social p

      • If binding them to a traditional gender gives accurate predictions, it's not wrong.

        That's the whole point of the article -- the AI doesn't care about social niceties (and legal ones, where we deny government the power to racially discriminate, separate from any correlations).

      • Re: (Score:3, Insightful)

        by Rockoon ( 1252108 )
        Sorry, but the data IS the correct model.

        What you are really arguing is that some saintly person should develop a model that defies the data, and that, in spite of the data, the saint is right.

        If you argue for better data, that's one thing, but that's not what you are doing. You are arguing for a better model that somehow isn't built on the data.

        The data continues to not lie.
        • by Sique ( 173459 )
          No. The data is just a result of the model that was used to gather the data. If that model is flawed, your data is flawed.

          If you want to estimate Donald Trump's chances of winning reelection, and you ask only people in Hawaii, San Francisco and Seattle, you will get data. But that data is in no way a good predictor of Donald Trump's reelection chances. The data is lying to you, because the model you use to gather the data doesn't fit the problem.
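
          A minimal sketch of that kind of sampling bias, with made-up support rates and population shares (purely illustrative, not real polling data):

          import random

          random.seed(0)

          # Hypothetical support rates and population shares -- invented for illustration only.
          support_rate = {"Hawaii": 0.30, "San Francisco": 0.15, "Seattle": 0.20, "Rest of US": 0.48}
          population_share = {"Hawaii": 0.004, "San Francisco": 0.003, "Seattle": 0.002, "Rest of US": 0.991}

          def poll(regions, n=10_000):
              """Ask n random people, drawn only from the given regions."""
              weights = [population_share[r] for r in regions]
              votes = sum(random.random() < support_rate[random.choices(regions, weights=weights)[0]]
                          for _ in range(n))
              return votes / n

          print("Poll of Hawaii/SF/Seattle only:", poll(["Hawaii", "San Francisco", "Seattle"]))
          print("Poll weighted across everyone: ", poll(list(support_rate)))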

          • If your data comes from models, then you're doing it wrong. Results from models are just that: predictions of what could happen given the data. They are not data themselves.
        • Sorry, but the data IS the correct model.

          The data may be correct, but it's not a model. For instance, data can show correlation, but you can use it to create a model that assumes causation. This could lead to someone developing a loan application test that decides that someone is not eligible because they live in a certain ZIP code area. In this case, the data may be correct (that particular ZIP code has low score, and person is living in that ZIP code), but your conclusion (this person should not get a loan) could still be wrong.
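
          A toy sketch of that loan example (all numbers invented): defaults here depend only on income, income happens to correlate with ZIP code, and a ZIP-only rule still rejects a well-off applicant from the "wrong" ZIP.

          import random

          random.seed(1)

          # Invented toy data: defaults are driven by income alone, but income correlates with ZIP.
          def applicant():
              zip_code = random.choice(["90000", "10000"])
              income = random.gauss(30_000 if zip_code == "90000" else 60_000, 10_000)
              defaulted = random.random() < (0.40 if income < 40_000 else 0.05)
              return zip_code, income, defaulted

          people = [applicant() for _ in range(5_000)]

          def default_rate(zip_code):
              group = [d for z, _, d in people if z == zip_code]
              return sum(group) / len(group)

          print("Observed default rate, ZIP 90000:", round(default_rate("90000"), 2))
          print("Observed default rate, ZIP 10000:", round(default_rate("10000"), 2))

          # A ZIP-only "model" rejects everyone from the higher-rate ZIP -- including a
          # high-income applicant whose own numbers predict low risk.
          high_earner = {"zip": "90000", "income": 85_000}
          decision = "reject" if default_rate(high_earner["zip"]) > default_rate("10000") else "approve"
          print("ZIP-only model says:", decision)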

          • Correlation is enough to optimize the probability for a desired outcome for a given random person. You don't need causation ... it's often impolite to use Bayesian mathematics in a social environment, but when push comes to shove we all have things we value over politeness.

        • by HiThere ( 15173 )

          No. Data is data, not a model. They are two very different things.

          Models take a set of inputs and make predictions about the results. Data doesn't make predictions, and any filtering of it you do adds bias. Whether this is appropriate is difficult to tell, because when you sample from a domain you can get outliers that will skew the interpretation. The ideal way to deal with this is to take the entire domain, but this is almost never practical, so you need to decide how to deal with problematic sample

      • Re: (Score:3, Insightful)

        by mi ( 197448 )

        race doesn't exist except as a social construction

        Wow... So all those obvious and undeniable hereditary differences we observe between different groups of people:

        • Melanin levels.
        • Eye shapes and other facial features.
        • Predisposition to different diseases.

        are all "social constructions"?

        Let me ask the control question: How many sexes are there? — among mammals in general and humans in particular?

        • By the same differences you can have an Italian race, a French race, a Netherlands race, etc. And the scientific consensus for hundreds (or thousands) of years is "2 sexes".
          • by mi ( 197448 )

            By the same differences you can have an Italian race, a French race, a Netherlands race

            Indeed, you can! Moreover, by scientific consensus there are races among animals and plants [wikipedia.org] too — hereditary differences substantial enough to be noticeable, but not pronounced enough to make them different species.

            To call them "social constructs" is to deny Science.

            And so is claiming that they can affect only outward appearance.

          • Name a disease restricted to the Netherlands people. And the scientific consensus on the number of sexes was and is correct - 2.
        • Let me ask the control question: How many sexes are there? - among mammals in general and humans in particular?

          Repeat after me: sexual dimorphism is a bimodal distribution, not a partition.

      • Re:Pretending... (Score:4, Informative)

        by Cipheron ( 4934805 ) on Saturday November 16, 2019 @10:06AM (#59419980)

        "race doesn't exist except as a social construction, but that doesn't stop AIs from believing that race is real".

        If it was as simple as deleting the "race" variable from the data-set then the problem wouldn't exist. Don't you think people already thought of that? The AIs *don't* know that race even exists because to avoid this very thing we're not entering "race" into the algorithms to start with.

        The problem is that, for example, the police algorithm pins black people as a higher re-arrest risk *even if* you don't include race as a variable in the model. The actual problem is that there are other variables which predict whether you'll be re-arrested which are also correlated with race, and the algorithm merely predicts that people with those traits (who on average are more likely to be black, even though the algorithm has no idea about this) are more likely to be re-arrested.

        The point being that "sanitizing" the data of things like race, gender etc won't appease people concerned with biased algorithms. They care that the outcomes satisfy their specific biases, not whether you can prove you used good methods.
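
        A toy sketch of that proxy effect (every number here is invented): "race" is never fed to the model, but a correlated feature is, so the predictions still split along racial lines.

        import random

        random.seed(2)

        # Invented toy world: group B is more often subject to heavy policing, and
        # re-arrest depends only on policing intensity -- never on race itself.
        def person():
            race = random.choice(["A", "B"])
            heavily_policed = random.random() < (0.7 if race == "B" else 0.3)
            rearrested = random.random() < (0.5 if heavily_policed else 0.1)
            return race, heavily_policed, rearrested

        people = [person() for _ in range(20_000)]

        # The "model" sees only the policing feature: predicted risk is the observed
        # re-arrest rate for that feature value.
        def risk(heavily_policed):
            group = [r for _, p, r in people if p == heavily_policed]
            return sum(group) / len(group)

        risk_by_flag = {flag: risk(flag) for flag in (True, False)}

        for race in ("A", "B"):
            scores = [risk_by_flag[p] for rc, p, _ in people if rc == race]
            print(f"Average predicted risk, group {race}: {sum(scores) / len(scores):.2f}")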

        • Also one point, if the AI "believes race is real" we're still anthropomorphizing too much. If you enter a field for "race" then the AI doesn't put any special significance on that. Adding that field would be as relevant to the AI in terms of value-judgements as your shoe size.

      • by HiThere ( 15173 )

        To say that race doesn't exist except as a social construct is incorrect. That said, the relevant alleles don't seem to be correlated with much that is significant. But if you find that someone is an extremely fast runner, that someone is more likely to be black, so race is significant as other than a social construct.

        Most racial gene alleles are concerned with appearance. Except for those the amount of overlap is huge. But there ARE differences. If you need to pick someone to live under low air pressur

      • You really put my mind at ease there saying race doesn't exist! Thing is, my wife gave birth to two beautiful black children, and she swears that she hasn't been unfaithful. Knowing that race doesn't exist except as a social construction helps me make sense of that!
    • by xOneca ( 1271886 )
      There may be innocent people in the data, just like there are innocent people in jail.
    • Re: (Score:3, Insightful)

      by Sique ( 173459 )
      Data is lying. Pretending that it is not is living in a fantasy world.

      There is for instance selection bias. Which data you gather depends heavily on the way you gather data. If you fish with a net with meshes 4 inches wide in a lake, you will not catch any fish smaller than 4 inches. If you use the caught fish to find out the average size of fish in that lake, the data lies to you.

      And if you uncritically use the data to train an AI, the AI will show all the biases that went into designing the ways to ga
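
      A quick sketch of that fishing-net example (made-up size distribution, just to show the effect):

      import random

      random.seed(3)

      # Invented lake: fish lengths in inches.
      fish = [random.gauss(4.0, 2.0) for _ in range(100_000)]
      fish = [length for length in fish if length > 0]

      # A net with 4-inch meshes never catches anything smaller than 4 inches.
      caught = [length for length in fish if length > 4.0]

      print("True average length in the lake:", round(sum(fish) / len(fish), 2))
      print("Average length of caught fish:  ", round(sum(caught) / len(caught), 2))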

      • This ignores the most basic tenet of all statistics: your sample data is a measure of what you sample, no more.

        The data is not lying. The data represents exactly what it is. Some people lie about what it is, and people like you often misunderstand what it is - but that's the fault of those people, not the data.

        If you tape the subways to find crimes, you will find crimes on the subway - because that's what you are measuring. It's no more "the data is lying" to say you cannot use it to estimate hi

        • by Sique ( 173459 )
          Yes. And what you sample depends on your method of collecting the data. So all your data reflects is your method of collecting it. Everything else is up to interpretation.
      • by HiThere ( 15173 )

        It is more correct to say that the data is corrupt. It is subject to numerous biases and distorting filters. So the data is difficult to read and, to an extent, it tells more about the distortions placed upon it than about the domain it should cover. Unfortunately, to read that message it would be necessary to have an accurate recording of what the data should have said were it not biased and otherwise distorted.

        All that can be determined from the existing data is a composite im

      • If someone says "the data is lying," they are almost invariably trying to deceive you.

        A scientist would rather try to get better data, or understand what the data actually represents, instead of trying to discard it altogether.
    • The problem is in biased sampling initially done by humans and then used as training data for a computer. This actually impacts all aspects of neural networks. For a trivial situation, imagine if you flipped heads 95% of the time. A network could just predict heads all the time and be 95% accurate.

      I have a neural network that just approximates a math equation (no humans involved) and if I don't sample very well I bias the network and destroy its ability to give me good answers.

      You see this in other types of n
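
      A toy version of the coin-flip point above (not the parent's actual network, just a sketch): a "model" that always guesses the majority class scores about 95% accuracy while having learned nothing.

      import random

      random.seed(4)

      # Imbalanced toy labels: roughly 95% "heads".
      labels = ["heads" if random.random() < 0.95 else "tails" for _ in range(10_000)]

      # A degenerate model that ignores its input entirely.
      predictions = ["heads"] * len(labels)

      accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
      print(f"Always-predict-heads accuracy: {accuracy:.2%}")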

      • by HiThere ( 15173 )

        You oversimplify. One major reason for the distortion of the data is that only those who are caught become entries. E.g. very few bribe-takers or givers will be included, and the prevalence of the uncaught cannot be determined.

  • by karlandtanya ( 601084 ) on Saturday November 16, 2019 @04:12AM (#59419484)

    Lenses refract.

  • The attached clickbait is not even related to the rest of the article.

  • by mrwireless ( 1056688 ) on Saturday November 16, 2019 @04:23AM (#59419496)

    I think mirrors are still too optimistic.

    In his NIPS Test of Time award acceptance speech, machine learning researcher Ali Rahimi compared modern 'data science' to alchemy:
    https://www.youtube.com/watch?... [youtube.com]

    The best comparison I've found within the humanities is to compare machine learning to horoscopes. After all, ML is often used to sort people into groups and then offer an inflated sense of predictability of these people's lives. (I can't find the source now though).

    • After all, ML is often used to sort people into groups and then offer an inflated sense of predictability of these people's lives.

      What inflated sense are we talking about?

      That's the rub. A model that says "80% likely to also..." and is based on a million samples... the 80% isn't "overinflated". A human's interpretation of that 80% figure might be, but that's a whole different thing, and in no way is it damning of the model.

      Good models aren't trying to be "fair" or "just", and trying to interpret them that way, or trying to will them into being that way, is in fact one of the many human flaws of interpretation.

      Every "math is racist" a

    • Lots of uses of machine learning are pretty much alchemy. At its core a neural network is a polynomial approximation for an unknown function. When you make your network with 100K variables, you are saying there is some math equation that, given the input and these 100K variables, can give the output you need.

      If you are approximating a very complex PDE then a neural network can be extremely useful. Things like fluid dynamics simulation, n-body etc can be run millions of times faster and you can verify the

    • by JBMcB ( 73720 )

      Are you talking about ML as in Meta Language or Machine Learning? If it's Machine Learning, then don't use ML as an acronym, as it's already taken by Meta Language.

      • Are you talking about ML as in Meta Language or Machine Learning? If it's Machine Learning, then don't use ML as an acronym, as it's already taken by Meta Language.

        Hate to be the bearer of bad news... the war is over and your side lost.

    • The best comparison I've found within the humanities is to compare machine learning to horoscopes. After all, ML is often used to sort people into groups and then offer an inflated sense of predictability of these people's lives. (I can't find the source now though).

      Machine learning works by finding trends (i.e. things that are true greater than 50% of the time), and using those trends to make predictions about future events.

      Horoscopes work by finding things that are true for almost everyone (i.e. true

  • But what's the point of the summary? Do people really believe their crappy websites are "making a difference"? Please, bitch get a sense of reality and perspective, nobody cares about your blog.

  • by BAReFO0t ( 6240524 ) on Saturday November 16, 2019 @04:34AM (#59419504)

    No matter how much this gets peddled in the public eye, bias is not a bad thing, and cannot be avoided anyway.

    The brain, being a neural net, is a cascade of biases to turn input into patterns and patterns into actions, altering said biases in the process, as a basis for future processing. It is literally a bias machine.
    And "algorithms" ("Now with more MOLECULES!") are a way of writing those down, to externalize such a process.

    That is their point and their use.
    If they didn't bias anything, they wouldn't *do* anything, and would essentially be the identity function. Like multiplying by one. Leave it away and nothing changes.

    So this notion of "Ooohh, teh ebil biasus!" is silly and shows a fundamental cluelessness of the author.
    There is no such thing as unbiased information processing. Then it would not be processing, now would it?

    What a person saying "Yah biasus!" means, is "It does not bias things MY way!". While selfishly implying that their personal bias is "neutral"... *the* universal global rule set for everyone ... because it is, for them.

    And their wish is for everyone to obey that, or be forced/shamed into it. (The purpose of TFA.)

    Well, I'm sorry. Other people have interests that differ from yours. Boo hoo.
    Doesn't mean it is evil. Just watch out for who wrote your code and who your news source is. Open bias is reliable. You can correct for it.
    And please check and accept sources outside your filter bubble every once in a while!

    You know what is really evil and creepy and manipulative? Sources that claim to be "unbiased" or "fair and balanced".
    But that might just be my bias. And I like my bias.

    • by AHuxley ( 892839 )
      Consider the decades of FBI data sets.
      Look at each state's crime data and images from all over the USA.
      What real-time CCTV of who is doing the crime all over the USA shows the world.
      Now the AI and its CoC will do what? Not collect stats? Say a person committed a crime, but the AI won't provide any details of criminals in the area due to the CoC?
      Under the CoC only a select few can see the numbers and images, and the media is told of a person of interest?
    • by Halo1 ( 136547 )

      No matter how much this gets peddled in the public eye, bias is not a bad thing, and cannot be avoided anyway.

      I fully agree. However, the issue is that this bias gets hidden or downright denied as things get automated. Especially in cases like neural nets, where even the designers/engineers don't have the slightest clue how the system comes to a given conclusion.

      Such algorithms pick out people at airports to be searched, assign risk factors to people regarding criminal behaviour or financial stability, determine which stocks to buy/sell, consider whether your gait is suspicious or normal, how people get sentence

      • the issue is that this bias gets hidden or downright denied

        Yes, as I said, that is evil. I fully agree.
        What I thought was implied, but I failed to actually say, was that in the case of somebody claiming to have no bias, and hence said evil intention, a worst-case evil bias is reasonable to assume.

        where even the designers/engineers don't have the slightest clue at how the system comes to a given conclusion

        That is a good and interesting point. But actually, people mostly don't know that about themselves eithe

    • by xOneca ( 1271886 )
      "Human bias" just means "bias of Hume" ;)
  • Anybody who has ever had kids will tell you they work in a similar way to the 'naive algorithms' described in the article. Their simple world view reflects the complexities of adult society in an interesting way.

  • Refract... (Score:4, Interesting)

    by bradley13 ( 1118935 ) on Saturday November 16, 2019 @04:54AM (#59419530) Homepage

    The use of the word "refract" already foreshadows the quality of the work. Mirrors don't refract, lenses do.

    Most of the "failures" I have seen have been failures of political correctness. Objective pattern recognition notices, for example, that certain groups are not very good about paying back their loans. That certain groups are more frequently criminals. That certain groups do poorly in school. Or whatever.

    Now, it's absolutely true that group tendencies say nothing about any particular individual. However, if group information is all you have to go on, then it is entirely reasonable to make use of that. Only...you aren't supposed to. It's not PC. The only way out of this is to delete the non-PC attributes entirely from the data sets before processing. Of course, that pretty much destroys the purpose of the processing to begin with.

    • Re:Refract... (Score:4, Interesting)

      by SuricouRaven ( 1897204 ) on Saturday November 16, 2019 @05:49AM (#59419600)

      Even deleting the 'non-PC attributes' isn't as easy as it might seem. Sure, you can carefully omit the 'race' field from your criminal records before feeding them into the predict-o-matic - but there are a lot of attributes that correlate with criminality. The algorithm will swiftly work out that people in certain parts of town are more likely to commit crimes, and that people on low family income are more likely to commit crimes, and that people of poor educational attainment are more likely to commit crimes... and we run again into the most fundamental problem: we want to enjoy all the benefits that prejudice allows, declaring people guilty or innocent based on statistical correlations, while hiding the dirty business away inside a machine-learning black box so we can pretend these correlations are objective and not feel guilty about locking people away in prison for years based on where they happen to live or how much money their parents make.

      • so we can pretend these correlations are objective and not feel guilty about locking people away in prison for years based on where they happen to live or how much money their parents make.

        We lock people away in prison for years, when we do, because they commit crimes, not because of any of that stuff.

        • In this instance, the concern isn't really the locking-them-away part but the duration - the use of dubious AI to judge how likely someone is to reoffend, and thus how many years to imprison them for, or whether they make parole. It sounds like a great idea when you think of the computer as an infallible oracle, but all it's really doing is hiding away disappointingly shallow judgements.

          If the 'AI' were a human in the same position, we probably wouldn't be too happy with such a judge. "I see here that your mother

      • by AmiMoJo ( 196126 )

        There aren't really any benefits to that kind of prejudice, people just think there are because they are making a common mistake. All it does is make society waste money locking people up and ensuring that they can never be productive members of society again instead of fixing the actual problems and saving money in the long run.

        It's simply much easier to sell punishment of the "bad" people than it is to sell helping them.

      • Sure, you can carefully omit the 'race' field from your criminal records before feeding them into the predict-o-matic - but there are a lot of attributes that correlate with criminality. The algorithm will swiftly work out that people in certain parts of town are more likely to commit crimes, and that people on low family income are more likely to commit crimes, and that people of poor educational attainment are more likely to commit crimes.

        Then the algorithm would be no different than the results sociologists get when they investigate crime and control for all of these factors instead of just naively looking at race. Once you do that, you find out that being black doesn't really have anything to do with criminality, because it is precisely socioeconomic status and family makeup (single-parent families) that are the major drivers. It only appears as though the algorithm is racist because there are more black people who grow up in s

      • The algorithm will swiftly work out that people in certain parts of town are more likely to commit crimes, and that people on low family income are more likely to commit crimes, and that people of poor educational attainment are more likely to commit crimes... and we run again into the most fundamental problem: we want to enjoy all the benefits that prejudice allows, declaring people guilty or innocent based on statistical correlations, while hiding the dirty business away inside a machine-learning

    • However, if group information is all you have to go on, then it is entirely reasonable to make use of that.

      No, it's not. Treating people differently based on the characteristics of the group is assigning collective guilt to specific individuals. That is not reasonable; it's not how we would like to be treated, and so it's incumbent on us not to treat others in this way.

      What's more, it's especially unreasonable when talking about a few percentage points of difference. Saying that some property is associated w

    • This is not about political correctness.

      This is about things like not sending people to jail unjustly.
      https://www.propublica.org/art... [propublica.org]

    • Well, a specific problem is the assumption that a group showing up disproportionately in the criminal statistics actually commits disproportionately more crime. It is true that some groups historically, and even today, are scrutinized far more closely than others. That difference in scrutiny obviously leads to more prosecutions among those groups. A good example is drug use: actual scientific studies have shown that the difference in drug use between blacks and whites is pretty small. But if you look at the numbers Blacks are imprisoned for drugs at 6

  • What kind of mad bogey-man BS is this?? Don't use algorithms, they are bad? How do you even argue against that? Your opponent has declared himself mad before you even gather your thoughts.

  • Comment removed based on user account deletion
  • by cascadingstylesheet ( 140919 ) on Saturday November 16, 2019 @09:30AM (#59419908) Journal

    "They don't permit polite fictions like those that we often sustain our society with."

    Well, and that's the "problem". Except the taboos are so strong that she can't say that the polite fictions are that certain groups aren't more criminal, that it's all due to bias, and so forth.

    Analyzing data does cut through the polite fictions, but she can't even say what those polite fictions are. That does trigger a lot of cognitive dissonance and thrashing about.

  • Mirrors reflect, lenses refract.
  • by gurps_npc ( 621217 ) on Saturday November 16, 2019 @10:11AM (#59419998) Homepage

    I loved the story of how the British decided where to add armor to their aircraft during World War II.

    They looked at places that had bullet holes in the planes that returned to base. Those places they left alone, because clearly the plane could survive being shot there.

    Everywhere they never found a hole, they added armor.

    You can use the algorithms the same way. Feed them your current data, look for irrational decisions they make, and modify your current processes to solve those problems.
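
    A toy sketch of that survivorship effect (completely made-up hit and survival probabilities):

    import random

    random.seed(5)

    # Invented model: each plane takes one hit, either on the engine or the fuselage,
    # and engine hits are far more likely to bring the plane down.
    def sortie():
        hit = random.choice(["engine", "fuselage"])
        survived = random.random() < (0.2 if hit == "engine" else 0.9)
        return hit, survived

    flights = [sortie() for _ in range(10_000)]
    holes_seen_at_base = [hit for hit, survived in flights if survived]

    print("Fuselage holes on returning planes:", holes_seen_at_base.count("fuselage"))
    print("Engine holes on returning planes:  ", holes_seen_at_base.count("engine"))
    # Naive reading: "the fuselage gets hit more, so armour the fuselage."
    # The hits you never observe -- on planes that went down -- tell the real story.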

    • by jeremyp ( 130771 )

      Interestingly, there's no real source for that story and, if you look at late war British bombers e.g. the Lancaster, they tend to have very little armour anywhere. Only the pilot has any real armour protection.

      The Americans, on the other hand, did put significant amounts of armour on their bombers and there is a traceable story that matches what you said.

  • Watch Cathy O'Neil, a data scientist, talk about Weapons Of Math Destruction [youtube.com] (also here [youtube.com]).

    She lays out how algorithms can be biased in ways that you never imagined.
