integralclosure - Slashdot User

Submission + - Google Brain researchers make significant progress on language modeling (arxiv.org)

Submitted by integralclosure on Friday February 12, 2016 @09:23AM

integralclosure writes: Using neural networks, Google Brain researchers have significantly improved a computer's ability to model English (achieving an extremely low perplexity score on a large dataset). Using the model, they were able to generate random sentences, such as the following: "Yuri Zhirkov was in attendance at the Stamford Bridge at the start of the second half but neither Drogba nor Malouda was able to push on through the Barcelona defence." The sentences are generally coherent and mostly grammatically correct. The advance looks like a replay of neural networks' dominance in the ImageNet competition.
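For anyone unfamiliar with the metric: perplexity is the exponential of the average negative log-probability a model assigns to each token, so lower is better. Here is a minimal sketch in Python; the function name and the probability lists are made up for illustration and are not taken from the paper.

```python
import math


def perplexity(token_probs):
    """Perplexity of a sequence, given the model's probability for each token.

    token_probs is a hypothetical list of per-token probabilities in (0, 1].
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token (natural log).
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# A model that spreads its guess evenly over 4 choices at every step.
uniform = perplexity([0.25] * 8)
# A model that is usually confident about the right token.
confident = perplexity([0.9, 0.8, 0.95, 0.85])
```

A model that always picks uniformly among 4 options has perplexity 4, so a perplexity of N can be read loosely as "about as uncertain as choosing among N equally likely words".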

Comment What it is, and what it isn't (Score 1) 230

by integralclosure on Sunday June 28, 2015 @08:39PM (#50008703) Attached to: WSJ Overstates the Case Of the Testy A.I.

Yes, the WSJ article is hyped.

On the other hand, this comment by the poster is not accurate, either: "At best, Google is programming (not teaching) a computer to mimic the conversation of humans under highly constrained circumstances. And the methods used have nothing to do with true cognition."

Google didn't program the AI. Rather, they took one meta-level step back and used a very simple training algorithm that did the "programming" for them from training data (the program is encoded as an LSTM neural net that processes word-vector encodings of tokens). Based on direct tests, it looks as though the model learned (or, with scare-quotes if you must, "learned") things like:

The "rules" of discourse;
How to leverage context;
How to do some amount of commonsense reasoning.

All these things are extremely hard to program into a computer using rules-based methods; but, as the authors show, a purely data-driven approach works fairly well.

And just to be clear, what they applied is not datamining; it is machine learning. Basically, machine learning is where you feed in a bunch of training data, and from that, an algorithm builds a program -- see, for instance, this lecture by John Platt (former Microsoft machine learning scientist, now at Google) on the difference between AI, machine learning, and datamining:

John Platt Gigaom lecture
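To make "an algorithm builds a program" concrete, here is a toy sketch of my own: a perceptron that is never told the rule for logical AND, only shown examples of it. It is nothing like the LSTM in the paper, and every name in it is made up for illustration.

```python
def train_perceptron(examples, epochs=20, lr=1.0):
    """Learn a two-input binary function from ((x1, x2), target) examples.

    Returns a callable: the "program" produced by the training algorithm.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in examples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - pred  # 0 when the guess was right
            # Classic perceptron update: nudge weights toward the target.
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0


# Training data only; the AND rule itself is never written down.
and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
learned_and = train_perceptron(and_examples)
```

Swap in the truth table for OR and the same training code yields a different "program", which is the whole point: the behavior comes from the data, not from hand-written rules.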

Using machine learning, it is possible to get a computer to "learn" a subset of the Python programming language, for example, such that you can feed into the model a little program + input, and it will produce for you the corresponding output. See:

Learning to Execute
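Roughly, the training data for that kind of setup is pairs of (program text, printed output). The sketch below is my own guess at how such pairs could be generated for trivial addition programs; it is not the paper's actual pipeline, and the helper name and program template are hypothetical.

```python
import contextlib
import io
import random


def make_example(rng):
    """Build one (program_text, expected_output) training pair.

    The model would see program_text as characters and be trained to
    emit expected_output; here the real interpreter supplies the target.
    """
    a = rng.randint(1, 99)
    b = rng.randint(1, 99)
    program = f"x = {a}\ny = {b}\nprint(x + y)"
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        exec(program, {})  # run the tiny program in a fresh namespace
    return program, captured.getvalue().strip()


rng = random.Random(0)  # fixed seed so the pairs are reproducible
pairs = [make_example(rng) for _ in range(3)]
```

The network never sees an interpreter; it only sees many such pairs and has to work out the mapping from source text to output on its own.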

What the authors of the conversation-generation paper wondered was whether they could get the computer to "learn" a whole dialog system (or "chatbot") from just conversation logs; and based on experiments, it looks like they succeeded (it's better than Cleverbot on the conversations they tested with). They note in the paper:

We find it encouraging that the model can remember facts, understand contexts, perform common sense reasoning without the complexity in traditional pipelines. What surprises us is that the model does so without any explicit knowledge representation component except for the parameters in the word vectors. Perhaps most practically significant is the fact that the model can generalize to new questions. In other words, it does not simply look up for an answer by matching the question with the existing database. In fact, most of the questions presented above, except for the first conversation, do not appear in the training set.

This is not simply doing phrase-substitution, or some simple statistical tricks; it is more complicated than that... but, yes, it's not "true AI". In addition to that article on "Learning to Execute", see this blog posting by Yoav Goldberg, and skip down to where it says "So why am I impressed with RNNs after all?":

The unreasonable effectiveness of Character-level Language Models
