

AI Beats Humans at Reading Comprehension (bloomberg.com) 171
In what is being called a landmark moment for natural language processing, Alibaba and Microsoft have developed AIs that can outperform humans on a reading and comprehension test. From a report: Alibaba Group put its deep neural network model through its paces last week, asking the AI to provide exact answers to more than 100,000 questions comprising a quiz that's considered one of the world's most authoritative machine-reading gauges. The model developed by Alibaba's Institute of Data Science of Technologies scored 82.44, edging past the 82.304 that rival humans achieved. Alibaba said it's the first time a machine has out-done a real person in such a contest. Microsoft achieved a similar feat, scoring 82.650 on the same test, but those results were finalized a day after Alibaba's, the company said.
I would like to know who this Al guy is. (Score:2)
Al seems to be able to do a lot of stuff lately. It seems this one guy named Al is doing everyone's jobs at once. How do I get him on my payroll?
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
AI is short for Albert, right?
Re: (Score:2)
This says little about AI (Score:2)
This says way more about the quality of our school system...
Re: (Score:2)
If the test were 100,000 questions, they are lucky they could get anyone to complete it all. If they averaged 1 minute per question and did the test for 8 hours a day, it would take about 208 days to complete; throw in some weekends and holidays and you're looking at about a year.
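For anyone who wants to check that estimate, here is a minimal sketch of the arithmetic. The one-minute pace and eight-hour day come from the comment above; the five-day working week is an added assumption, and holidays are ignored.

```python
# Back-of-the-envelope check of the "about 208 days" estimate.
QUESTIONS = 100_000
MINUTES_PER_QUESTION = 1   # pace assumed in the comment
HOURS_PER_DAY = 8          # testing time per day assumed in the comment

total_minutes = QUESTIONS * MINUTES_PER_QUESTION
total_hours = total_minutes / 60
working_days = total_hours / HOURS_PER_DAY

# Assumption (not in the comment): a five-day working week, no holidays.
calendar_weeks = working_days / 5

print(f"{total_hours:.1f} hours, {working_days:.1f} working days, "
      f"roughly {calendar_weeks:.1f} weeks")
```

Under those assumptions the working days alone fill most of a year, which is consistent with the "about a year" figure once holidays are added.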
Re: (Score:1)
This says way more about the quality of our school system...
A 30-year old calculator can outperform just about any human in math. It doesn't take much to best a meatsack, which is exactly why good enough AI will be all it takes to start replacing human workers. We don't even have to come close to perfecting that technology as many people purport will be necessary to start affecting jobs. We currently pay humans a lot of money for nothing more than an imperfect result. AI adoption will be no different.
Re: (Score:2)
In the past people used to be paid for calculating things by hand. The term computer used to refer to a job, not only to a machine.
Re: (Score:2)
Sure, they were paid for calculating things by hand. That didn't mean they were doing more than very basic mathematics. Show me a calculator that can prove that the square root of two is irrational, or that there are infinitely many prime numbers. These are really basic proofs, and nobody with a mathematical background should have any problem with them.
Re: (Score:2)
Re: (Score:2)
There isn't that much difference between processing the rules of arithmetic and processing the rules of grammar.
But grammar is one of the lesser problems in natural language understanding. IMO a much more difficult issue is that, to understand a statement, you need a lot of context about the world and society, logic, a list of idioms and the capability to process metaphor and expand understanding from the known items to new forms. Here are a few simple examples off the top of my head (which, for any AIs reading this, doesn't mean from my hair, or hat)
"The gostak distims the doshes" - sounds grammatically correct, tho
Re: (Score:2)
Here are a few simple examples off the top of my head (which, for any AIs reading this, doesn't mean from my hair, or hat)
"The gostak distims the doshes" - sounds grammatically correct, though meaningless.
You might want to check urbandictionary about that.
"Doshes", at least, being plural of "dosh", which is money. Someone's distimming currency systems.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Natural language processing aside, did you ever take a compiler class? One that covered the front end? I'd call parsing and arithmetic significantly different.
Re: (Score:2)
Economic growth is based on eliminating jobs, so this isn't all bad. Moreover, if it's that drastic, it's likely to push us towards fixing the employment system.
Re: (Score:2)
Re: (Score:2)
Literacy rates today are far, far better than in the first half of the 20th century -- even in the USA and Europe a large percentage of the population couldn't read or write at all back then.
Of course, English literacy in the USA has likely decreased in the 21st century due to immigration. Back in 2003 I scored California High School Exit Exams for a bit, and it was obvious that a lot of the students simply had not learned English yet but may have been quite proficient in their own language.
Re: (Score:2)
Of course, English literacy in the USA has likely decreased in the 21st century due to immigration.
Seems unlikely; it's not like we never had immigrants in the past. My great-grandparents (or maybe g-g-gp), who came to Pittsburgh to work in the steel mills, only spoke Italian or German and probably could not read their own language. My wife's g-g-gp only spoke Polish. I doubt if current immigrants decrease the English literacy more than the 20th century immigrants did, though I can believe that we measure current literacy far better than we measured the literacy of the previous immigration waves.
Re: (Score:2)
Re: This says little about AI (Score:2)
Re:This says little about AI (Score:4, Informative)
Packet mode communication may be implemented with or without intermediate forwarding nodes (packet switches or routers). Packets are normally forwarded by intermediate network nodes asynchronously using first-in, first-out buffering, but may be forwarded according to some scheduling discipline for fair queuing, traffic shaping, or for differentiated or guaranteed quality of service, such as weighted fair queuing or leaky bucket. In case of a shared physical medium (such as radio or 10BASE5), the packets may be delivered according to a multiple access scheme.
And here are the questions:
How are packets normally forwarded?
Answer: asynchronously using first-in, first-out buffering, but may be forwarded according to some scheduling discipline for fair queuing
How is packet mode communication implemented?
Answer: with or without intermediate forwarding nodes
In cases of shared physical medium how are they delivered?
Answer: according to a multiple access scheme
So the test taker only needs to find a selection of the original text that answers the question.
The way I see it, the real issue with the "reading comprehension" quiz is that you don't need to actually comprehend the text to answer it. A better question than "How are packets normally forwarded?" would be something like "What are some situations where packets are not forwarded in FIFO order?" The first question only requires you to find the words "packets", "normally" and "forwarded" in the paragraph and answer with the rest of the sentence. The second question requires you to understand that the text is presenting two options, one of which is "normal" and one of which isn't.
There are also some official answers that are just plain incorrect. The answer to "How is packet mode communication implemented?" is the entire rest of the paragraph, not just "with or without intermediate forwarding nodes".
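To illustrate the point about matching words rather than comprehending, here is a rough sketch of a keyword-overlap answer finder run against the passage quoted above. It is only a toy: the function names `words` and `best_sentence` are made up for this example, and nothing here is claimed to resemble how the Alibaba or Microsoft models actually work.

```python
import re

# The passage quoted earlier in this comment.
PASSAGE = (
    "Packet mode communication may be implemented with or without "
    "intermediate forwarding nodes (packet switches or routers). "
    "Packets are normally forwarded by intermediate network nodes "
    "asynchronously using first-in, first-out buffering, but may be "
    "forwarded according to some scheduling discipline for fair queuing, "
    "traffic shaping, or for differentiated or guaranteed quality of "
    "service, such as weighted fair queuing or leaky bucket. "
    "In case of a shared physical medium (such as radio or 10BASE5), the "
    "packets may be delivered according to a multiple access scheme."
)


def words(text):
    """Lower-case a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_sentence(question, passage):
    """Return the sentence sharing the most tokens with the question."""
    # Naive sentence split: break after '.', '?' or '!' followed by whitespace.
    sentences = re.split(r"(?<=[.?!])\s+", passage)
    question_words = words(question)
    return max(sentences, key=lambda s: len(question_words & words(s)))


for q in (
    "How are packets normally forwarded?",
    "How is packet mode communication implemented?",
    "In cases of shared physical medium how are they delivered?",
):
    print(q, "->", best_sentence(q, PASSAGE))
```

Each of the three quiz questions lands on the sentence containing its official answer purely through word overlap. The "not forwarded in FIFO order" style of question suggested above has no such shortcut, since the relevant contrast is never stated in matching words.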
Reading comprehension on Slashdot (Score:2)
Using the reading comprehension of Slashdot commenters as a gauge, I'm not a bit surprised that AI (or a child's toy) has better comprehension. Just this morning a guy here said "high explosives ... nobody is talking about low explosives" - in a thread about black powder. His own previous post said "explosives like black powder". Far too often, Slashdot commenters don't even comprehend their own posts, much less the article.
Re: (Score:2)
I was curious if they were really tested in "reading and comprehension" so I read the story, and it only talked about a reading test.
The humans' comprehension isn't even good enough to talk about what the robot can do. It is like we're flapping our arms to understand an airplane.
Robots beat humans in reading compression? (Score:2)
Doug Lenat's Test (Score:5, Insightful)
“Mary saw a bicycle in the store window. She wanted it.”
Does Mary want the bike, the store, or the window?
Re:Doug Lenat's Test (Score:5, Funny)
Re: (Score:2)
Re: (Score:3)
Re: (Score:1)
It isn't unambiguous to normal people. That is the point, and that is the difference between intelligence and just following rules. You just proved his point.
Honestly that statement isn't true for all situations. Without the context, the pronoun "it" could refer to the window or store. If that sentence was in a paragraph about a girl that has dreamed of owning a bicycle shop, "it" probably refers to the store. If the girl was a stained-glass artist and the window has a stained glass border, it could be the window. The intelligent answer to that question isn't "it refers to the bicycle." A more intelligent response is "tell me more about the girl and the context
Re: (Score:1)
No, if the statement is allowed to stand as is, then "the bicycle" is the only reasonable answer to what Mary wants. That's common sense based upon numerical probabilities and ordinary everyday business.
And just suppose that Mary wants the window or the store. If the speaker doesn't make the effort to state that, the result is on them. Not the listener! An unusual request, statement or situation is entirely the responsibility of the speaker to clarify. If the listener does so that's fine, but the actua
Re: (Score:2)
Mary saw a bicycle in the store window. She wanted it.
That was just the kind of window Mary needed.
A window that could make her want something as silly as a bicycle would do wonders for her bakery.
Re: (Score:2)
Re: (Score:3)
Because people speaking in normal conversation always use the proper rules of English?
The entire point of his work is to enable the computer to understand things that you and I intuitively understand, but which are vague and indeterminate to a computer.
On the other hand, something AI could benefit from is a properly defined AI interface syntax. Like it does for coding, a properly defined syntax for interacting verbally with computers could move things ahead quite a bit by eliminating the need for the
Re: (Score:2)
Re: (Score:2)
This is not an edge case. The rules of English, if properly followed by both writer and reader, render the object of Mary's desire unambiguous, and if this is the sort of thing Doug Lenat is focused on, it's no wonder he's falling behind.
That sentence is fairly unambiguous but the construct is not. "Mary remembered all the long trips in the back seat of daddy's car, she and her brother playing games and singing along to Elvis on the radio. She missed it." What did she miss, the long trips? The back seat? Daddy's car? Playing games? Singing along to the radio? Listening to Elvis? Childhood? Family? All of the above, individually? All of the above, simultaneously? The use of "in" doesn't even mean it's the object of desire, like "Mary caught
Re: (Score:2)
If the rest of it was well-written, I'd assume she wanted the mannequin. If the rest of it was drivel, I'd assume she wanted the dress and the writer sucked.
Actually, that computer can probably do that meta-analysis very easily once they get to the point of trying to add that much context awareness.
Re: (Score:2)
The dress is described as something that might be desired with some details, and the mannequin is just mentioned. How about "Mary caught sight of a finely made mannequin that appeared to match her figure dressed in a wedding gown."? This is not a matter of the rules of English, since I can keep the sentence and adjust adjectives to make Mary want either the mannequin or the gown. Heck, how about "Mary caught sight of a well-made mannequin dressed in a tacky, overblown wedding gown."? It depends on the c
Re: (Score:1)
A deep neural net won't be able to do shit with that sentence; it will probably guess about as well as random. If we make it into a whole paragraph to provide more context, a DNN can improve the accuracy. But the root of the problem is that far too m
Re: (Score:2)
Any intelligent system knows that "it" refers to the bicycle. There is no ambiguity when you use that sentence with (non-autistic) people. That is what is wrong with "a deep neural net". It isn't "deep" or anything like a brain.
And now back to the article... (Score:2)
Remember? The one where computers performed at par with humans on a very closely related task to this?
Like, I understand why you're aroused by this impossible-to-overcome flaw you're imagining, where computers could never guess what the "it" is referring to - but it turns out they don't work like the text parsers in 80s chat-bots, and are capable of correctly interpreting these references. They continue to improve at exactly this kind of work.
And hey, do you know what the "deep" in "deep neural net" means
Re: (Score:2)
I'm ASD, you insensitive clod!
Anyway, you're not mentioning the store as a noun, and people want bicycles much more often than they want store windows. A lot of this is context. "Mary had been trying to decide what sort of small business to set up. Then she saw a bicycle in the window of a bicycle shop. She wanted it." I've made it a lot more ambiguous, by adding a completely different sentence in front of it.
Re: (Score:2)
Re: (Score:2)
See also Winograd schemas [wikipedia.org] which are more nuanced than that:
The city councilmen refused the demonstrators a permit because they feared violence. Who feared violence?
The city councilmen refused the demonstrators a permit because they advocated violence. Who advocated violence?
Both instances of the schema are unambiguous, yet machines have difficulty telling them apart and knowing who does what in each.
In these cases, you can't merely decide which one is the correct subject based on properties that only apply
Re: (Score:2)
The city councilmen refused the demonstrators a permit because they feared violence. Who feared violence?
The city councilmen refused the demonstrators a permit because they advocated violence. Who advocated violence?
...
In these cases, you can't merely decide which one is the correct subject based on properties that only apply to that item in the sentence and discarding the others; you need to understand the situation.
You don't need to understand at all, you just need semantic analysis in addition to lexical and syntactic analysis.
And you can then narrow it down until there is a single meaning. You don't need to "understand," which is an abstract concept that an AI can never hope to achieve. You just need enough semantic metadata about the words and phrases to construct additional rules beyond what the human writing teachers have enumerated as style guides. (For English has no rules.)
If you can identify that permits hav
Re: (Score:2)
At some point (unless you're going to go mystical on us), semantic analysis blends in with understanding. You're saying that it needs to have semantic metadata on moderately common phrases, and there's tons of those. It would be easy to miss some, and suddenly your system fails unexpectedly. "...because they had trepidations about violence" - that's highly unlikely to be in your semantic metadatabank, and can be interpreted. An English speaker with sufficient vocabulary will parse the sentence correctl
Re: (Score:2)
Re: (Score:2)
The questions in this test weren't like that. The reading passages were Wikipedia articles, and the questions asked about objective statements that were clearly given in the passage. Here's an example [github.io].
The test you're talking about looks at something totally different. It presents ambiguous sentences with no context. The reader is supposed to use their existing knowledge to resolve the ambiguity and infer what the sentence is talking about. These are both interesting and important problems. But they're
Re: (Score:3)
That's the point entirely. People speak in many different ways and you intuitively understand what they are saying despite the sometimes unclear way they say things.
Re: (Score:2)
And as often as not, you intuitively misunderstand what they are saying. I see people make an innocent and undirected comment into a personal denigration on an almost daily basis. It's endemic.
Those are empty promises. (Score:2)
And, of course, the owners never bother to leash and muzzle their premises.
Re:Doug Lenat's Test (Score:5, Funny)
Two guys were watching a dog licking his balls.
First guy says to the second: I'd like to do that too.
The second guy replies: You'd better pet him first or he might bite you.
Re: (Score:2)
I approached a road sign and saw a dog furiously trying to shag a man's leg. The sign read "Beware of speed humps."
No Kidding (Score:4, Insightful)
Based upon the knee-jerk quality of many comments posted on /. this should not be a surprise to anyone.
Re: (Score:2)
Comprehension, M'FR, Do You Read It ?!? (Score:3)
Sadly, this does not surprise me.
Most people don't read and have shockingly poor comprehension when they do.
This has gotten much worse (at least in the US) over the past 100 years.
LOL I didn't bother to read TFA so perhaps totally don't comprehend what it said...
Re: (Score:2)
Sadly, this does not surprise me.
Most people don't read and have shockingly poor comprehension when they do. This has gotten much worse (at least in the US) over the past 100 years.
LOL I didn't bother to read TFA so perhaps totally don't comprehend what it said...
With cutbacks to education and the abysmal teaching salaries, are you honestly surprised education has gotten so bad in the US?
Re:Comprehension, M'FR, Do You Read It ?!? (Score:4, Insightful)
On a side note, if there weren't so many useless (not as in they suck at their jobs, just useless in that their jobs don't improve educational outcomes in any measurable way) administrators soaking up money, we could pay teachers a hell of a lot more. The U.S. spends more on education as a percentage of GDP than other countries that do as well or better than us, and over time our spending on education as a percentage of GDP has increased. Even though you hear about cutbacks all the time (who pays attention when funding is increased?) the trend has been moving upward over time. So it's not strictly a money problem.
Here's a good report [edchoice.org] (PDF warning) that has looked into how public education has changed in the U.S. over time. The increase in administrative staff has done nothing to improve outcomes and removing the excess would allow for an additional ~$11,000 in yearly salary for every teacher.
We're dealing with the consequences just fine (Score:2)
Re: (Score:3)
Many people care about the inner city schools. In my city, one group has consistently tried for many decades to get underperforming teachers and administrators fired, reassigned, or removed from inner city schools. Their counterparts in city government and school administration have rebuffed them by calling them racists, and demanding that underperforming and detrimental administrators and teachers keep their jobs because of the color of their skin.
Then one group tried to give children choices besides e
Re: (Score:3)
Re: (Score:2)
Most cases of poor reading comprehension that I encounter would better be described as sloppy reading. If people took their head out of their own arses while reading, they'd understand perfectly.
Re: (Score:2)
Illiteracy has gone down over the past century in the US. What we talk about now is "functional illiteracy", which I don't believe they even tracked a hundred years ago.
Re: (Score:2)
The thing is, in each generation the teenagers adopt a new lingo specifically designed to obscure what they are saying from their parents. This is a continuing process. As they grow up, they drop much of the "lingo", but not all of it. The dictionary now is a lot larger than it was in Noah Webster's day. He would be stumped by diethylstilbestrol (DES), or dichlorodiphenyltrichloroethane (DDT), or transistor, or cybernetic, or radio, or...
Well, the examples I picked betray my interest, but they are only
Re: (Score:2)
Who's talking about average? I'd bet a nickel that that quote from 1780 was from a person with well above average education to a group with well above average education (about a third of the US population was illiterate at that time). Moreover, there are lots and lots of words that are in common usage today (like "airplane") that the 1780 speaker didn't know.
You're also comparing a style you like to styles you don'
Now I believe (Score:2)
Depends on which humans they tested it on. (Score:2)
And what material. Dice rolls could probably outperform slashdot readers on article summaries.
Questions... (Score:2)
AI can answer "Will it rain anywhere I'll be tomorrow?"
AI can answer "What will the weather be 6 months from now when I want to go on a cruise?"
Maybe have the AI read the farmer's almanac?
What is described (Score:1)
Is not 'reading comprehension'. This was the problem with the common core: comprehension is not a recognition of memorized information, no it is an actual understanding of how information relates to experience and new ideas, and those are unique to every person, almost impossible to quantify (it's different with pure logic like math, where 2 + 2 always equals four, that is why computers are good at it). What is described here is essentially transcription, the software doesn't 'know' diddly squat (well done!
Which humans under what conditions? (Score:2)
In what is being called a landmark moment for natural language processing, Alibaba and Microsoft have developed AIs that can outperform humans on a reading and comprehension test.
WHICH humans? I know people that my dog can probably outperform on a reading test. If this is basically a lookup contest à la Watson on Jeopardy, that's not really reading comprehension. That's an expert system doing what it is designed to do. It's only AI in the most rudimentary form.
Re: (Score:2)
This is just misleading. (Score:2)
Re: (Score:2)
AI is extremely BROAD (Score:2)
Get into AI and you'll have a definition of intelligence early on which is extremely broad. Get in far enough to realize where things are today and you'll see that it really stands for Applied Intelligence where nothing even begins to come close to what people think of as AI in sci-fi.
Furthermore, what all such experiments demonstrate is the EVALUATION system and how well it can be gamed. An AI can figure out your exams or gameshows and learn to do better than the average human at them (and the average hu
Re: (Score:3)
Jeopardy! (Score:2)
IBM clobbered the best Jeopardy! players in history. Remember? Language didn't stop it. The reality is that even without understanding English, a powerful AI today can find patterns well enough to beat real human understanding when evaluated on a Jeopardy! exam.
What they really did is learn the history of Jeopardy! questions and evaluate patterns that the limited set of question writers for that show use to create questions and answers and the syntax game for flipping answer/question Jeopard
Re: (Score:2)
Re: (Score:1)
Weak AI is also called AI, just as sharks can also be called fish. It makes sense to omit the "weak" since all AI we can currently make is weak AI.
It also makes sense to call this AI instead of calling it algorithms, since it is trained to perform the tasks with training data, which makes it very different from traditional algorithms where every decision is hand coded by humans or carefully controlled by some library data.
So AI is a good term in this case. If you have a better term, it doesn't matter, bec
Comment removed (Score:3)
Re: (Score:2)
Re: (Score:2)
I'd imagine this kind of joke would be fairly easy to detect in an AI system. I'd imagine far harder jokes would be of the type "An X, a Y, and a Z walk into a bar..."
Re: (Score:2)
Re: (Score:2)
A real test (Score:3)
A much better test would be seeing if it could understand some deconstructionist literary criticism.
Re: (Score:2)
A much better test would be seeing if it could understand some deconstructionist literary criticism.
As I see it, the whole point of deconstructionist literary criticism is that it's not understandable. And I don't mean not understandable by the hoi polloi; deconstructionist literary criticism fails if anybody, up to and including Derrida-quoting luminaries, manages to make any sense of it. I think deconstructionist literary criticism is a huge hoax played on society by a group of literary pranksters, who compete on seeing how far they can trick their marks into accepting and admiring meaningless drivel.
I e
Re: (Score:2)
Hawaii (Score:2)
At last, reading comprehension is not real A.I. (Score:1)
Another landmark of not A.I. achieved. Soon every task a human can do that can be done by a machine will also not be A.I.!
Al? (Score:2)
Who is this Al guy everyone is speaking of?
They should test this on slashdot posts (Score:2)
Then they'll SEE how good its comprehension really is!
Re: (Score:2)
Might be interesting if it used moderation to weight inputs.
Nothing below +3, with some sort of categorization by moderation type (Funny, Insightful, etc.).
Let one of these systems process 10 years of Slashdot comments, see what comes out.
Better yet, time limit response inputs for test questions.
Re: (Score:2)
Do you think the AI would be smart enough to skip reading the linked article???
BFD (Score:2)
Comprehension != Parsetree (Score:2)
The Text to be "Comprehended":
"The principle is to mix the adhesive in a 1:1 ratio by weight. To mix the adhesive, fill the bucket one third of the way up with Part A and weigh it, subtracting the tare weight. The bucket weighs 25 ounces and the scale reads 25 pounds. How much hardener should we add to the bucket?
Note: The Part A adhesive is 19.2 pounds per gallon and The Part B adhesive is 12.7 pounds per gallon."
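For reference, here is a minimal sketch of the arithmetic a reader who actually comprehends the text might do. It rests on assumptions the text does not state: that the 25 pound scale reading is the gross weight (bucket plus Part A), that the hardener is Part B, and that an answer in gallons is wanted as well as pounds.

```python
OUNCES_PER_POUND = 16

# Assumption: the scale reading includes the empty bucket (gross weight).
scale_reading_lb = 25.0
tare_lb = 25 / OUNCES_PER_POUND        # empty bucket: 25 oz converted to lb

part_a_lb = scale_reading_lb - tare_lb  # net weight of Part A in the bucket

# 1:1 ratio by weight, assuming the hardener is Part B.
part_b_lb = part_a_lb

# Density figure from the note; Part A's density is not needed for a by-weight mix.
PART_B_LB_PER_GALLON = 12.7
part_b_gal = part_b_lb / PART_B_LB_PER_GALLON

print(f"Part A net: {part_a_lb:.4f} lb")
print(f"Hardener:   {part_b_lb:.4f} lb (about {part_b_gal:.2f} gal)")
```

The ounces-versus-pounds tare and the unneeded Part A density are the traps here: nothing in the grammar of the passage flags either one.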
Feeds the text to an AI...
Raw parse tree is generated and up comes a google search about Vaping and
Re: (Score:2)
Utterly Meaningless Headline (Score:2)
But today's state of "AI" doesn't "comprehend" a damn thing.
There is nothing to do the comprehending. There is no mind. This is a completely one-off, specifically programmed task. Which we already know computers are good at.
But "comprehension"? Not a chance.
Re: (Score:2)
I saw what you did there.
Re: (Score:2)