Flamesplash's Journal: Reteaching the brain - aka human recognition

I am reading a paper for my AI course (which just happens to be written by the professor), and also reading Snow Crash. While reading the paper yesterday, my subconscious put something together from the paper and Snow Crash.

The paper covers some research on building a voice understanding system: speak into a microphone and your words appear on the screen, or cause something to happen, etc. While not all that amazing nowadays, the paper was published in 1980. And while it is mainly about figuring out what a person said verbally, not what they meant logically, it got me thinking about the whole idea of Human Computer Interaction.
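As a rough modern stand-in for the kind of dictation loop the paper describes (this sketch is not from the paper; it assumes the third-party Python speech_recognition package and its Google backend):

```python
# Minimal dictation sketch: capture one utterance from the microphone
# and print the transcript. The `speech_recognition` package is only
# a modern stand-in for the 1980 system discussed in the paper.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:          # requires PyAudio installed
    print("Say something...")
    audio = recognizer.listen(source)

try:
    # Hand the audio to a recognizer backend and print what it heard.
    print(recognizer.recognize_google(audio))
except sr.UnknownValueError:
    print("Could not understand the audio.")
```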

One of the main problems with voice understanding systems is that they lack a large amount of contextual information. For instance, if you are dictating a memo and then suddenly say something to your officemate, the software will transcribe that too; it doesn't know someone else is in the room and that you were actually talking to them. Additionally, there is a lot of subtlety in written English that helps convey what you actually meant; a good example is the grammatical use of commas, which isn't obvious in spoken language.
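To make the officemate problem concrete, here is a toy sketch of the crudest workaround: only treat utterances as dictation when they are explicitly addressed to the machine. The wake word and sample utterances are invented for illustration:

```python
# Toy wake-word gate: the recognizer hears everything, so we only
# dictate utterances that start with a wake word. Everything else is
# assumed to be aimed at someone in the room.
WAKE_WORD = "computer"

def addressed_to_machine(utterance: str) -> bool:
    """Treat an utterance as dictation only if it begins with the wake word."""
    return utterance.lower().startswith(WAKE_WORD)

heard = [
    "computer take a memo about the budget meeting",
    "hey, want to grab lunch?",   # aimed at the officemate
    "computer end memo",
]

for utterance in heard:
    if addressed_to_machine(utterance):
        print("DICTATE:", utterance[len(WAKE_WORD):].strip())
    else:
        print("IGNORE :", utterance)
```

Of course, this is exactly the kind of contextual knowledge a human gets for free and a machine has to be told about explicitly.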

Now, where Snow Crash comes in is a section of the book that talks about an ancient people whose language was radically different from all others.

"Is Sumerian really that good?"
"Not as far as modern-day linguists can tell," the Librarian says. "As I mentioned, it is largely impossible for us to grasp. Lagos suspected that words worked differently in those days. If one's native tongue influences the physical structure of the developing brain, then it is fair to say that the Sumerians - who spoke a language radically different from anything in existence today- had fundamentally different brains from yours.

So, basically, what it's saying is that language affects how our brains develop as we grow. Given this, two American brains would be more similar to each other than an American's and a Frenchman's, and both would be so different from a Sumerian's as to have hardly any similarity at all.

So, while it wasn't the main intent, when we were children we learned how to write and think in our native language, which then became the primary way we communicate with computers. Not only did learning English affect how our brains developed, but learning to write English did too, and so did learning to speak it. Two different forms of communication required a good bit of separate brain reconstruction.

Now, getting back to the point: learning to operate a computer with our hands doesn't require a huge amount of learning, and probably very little additional brain reconstruction. It also lets us direct the information we want at the computer and only the computer.

However, when it comes to controlling a computer by voice, all this made me think that doing it effectively may require a new interface, different from the one we are used to: one where ambiguity is removed, as in written language, and intention is more apparent. This may sound like simply allowing only a "small" set of specific commands, but I'm thinking more of an adaptation of our normal voice interface.
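For contrast, here is what the "small set of specific commands" approach might look like as code; the command names and actions are made up for the example:

```python
# A fixed command grammar: every accepted utterance maps unambiguously
# to exactly one action, and anything outside the grammar is rejected.
COMMANDS = {
    "open file": lambda arg: print(f"opening {arg}"),
    "save file": lambda arg: print(f"saving {arg}"),
    "close file": lambda arg: print(f"closing {arg}"),
}

def dispatch(utterance: str) -> None:
    for phrase, action in COMMANDS.items():
        if utterance.startswith(phrase):
            action(utterance[len(phrase):].strip() or "current file")
            return
    print(f"unrecognized command: {utterance!r}")

dispatch("open file notes.txt")   # -> opening notes.txt
dispatch("please open the file")  # rejected: ambiguity isn't allowed
```

The rigidity is the point: ambiguity is removed, but only by pushing all the adaptation onto the speaker, which is exactly what I'd want an adapted voice interface to avoid.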

Continuing on with this is the idea of mind-controlled computers. While the idea is nothing new, it really hasn't gone anywhere. I remember seeing a TV presentation a while back about a finger device that let physically handicapped children control a video game car through simple thought commands such as forward, back, left and right. While the commands are rather basic, the concept and the ability to do such a thing are amazing. Going back to my overall idea, it would seem to me that one problem with this method is that whatever sensors are used in the process would be flooded with enormous amounts of information they simply don't care about. One way to address this is to fine-tune the sensors, but that is only useful up to a point. What if we could teach ourselves how to think differently, how to change our internal communication mechanism to better interact with this new outside influence? Or better yet, find a way to integrate it into our childhood development so that talking to a computer was second nature, just like talking to each other.
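As a very loose illustration of the sensor-flooding idea (all channel names, numbers, and thresholds here are invented; real brain-computer interfaces are far messier):

```python
# Toy "thought channel" classifier: smooth noisy readings with a short
# moving average and only emit a command when one channel clearly
# dominates. Everything below the threshold is the "don't care" flood.
from collections import deque
from statistics import mean

CHANNELS = ["forward", "back", "left", "right"]
WINDOW = 5        # readings to average over
THRESHOLD = 0.6   # ignore anything weaker than this

history = {c: deque(maxlen=WINDOW) for c in CHANNELS}

def classify(reading: dict) -> str | None:
    """Return a command only when one smoothed channel stands out."""
    for channel, value in reading.items():
        history[channel].append(value)
    smoothed = {c: mean(h) for c, h in history.items() if h}
    best = max(smoothed, key=smoothed.get)
    return best if smoothed[best] >= THRESHOLD else None

# Simulated sensor frames: mostly noise, then a clear "forward" intent.
frames = [
    {"forward": 0.2, "back": 0.3, "left": 0.1, "right": 0.2},
    {"forward": 0.9, "back": 0.1, "left": 0.2, "right": 0.1},
    {"forward": 0.8, "back": 0.2, "left": 0.1, "right": 0.2},
]
for frame in frames:
    print(classify(frame))   # None, None, then "forward"
```

This is the "fine-tune the sensors" side of the trade-off; the other side, reteaching the person, is what I'm speculating about above.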

As an ending point, I'm not necessarily saying we should do any of this; I'm simply speculating on what the best way to perfect Human Computer Interaction would be.

Comments:
  • Interesting post. One thing I'd like to bring up, though, is that all of Human Factors, including Human-Computer Interaction, is based around the notion that you should change the device to fit the person, rather than change the person to fit the device. Back when psychology was still really young, the basis for forming the interfaces between humans and tasks was Scientific Management (or Taylorism): study the task, find the most efficient method, and make the person work exactly that way. Scientific Management was initially successful and did have an effect on Human Factors, but on the whole the Human Factors approach of changing the device has been far more successful. (Where would you work better: somewhere where your chair's armrests lock your arms into the optimum typing position so that you type as quickly as possible, or somewhere where your chair's armrests can detect when you're typing and when you're relaxing, and adjust accordingly?) There has been much wider success in making a device fit the strengths and account for the weaknesses of the human, rather than forcing the human to make up for their own shortcomings.

    As far as the cognitive aspects--the development of the brain as it processes knowledge--there are several theories on that too. Nativism, originally devised by Plato, basically says that we're born with all the knowledge already in us, and it doesn't become known to us until experience or contemplation draws it out. Empiricism, usually traced to Aristotle, basically said our brain is like a blank chalkboard that stays empty until our experiences/thoughts fill it. Neither one of these is 100% correct, but they both have some merit. Learning to speak English rather than French really wouldn't affect our mind much. Language constructs only really affect a small portion of our short-term memory; sections of it are visually based, and long-term memory is semantically based, which doesn't differ across cultures. Our brain does rewire itself over a LONG time (i.e., if you play chess for 15 years your brain will become somewhat better at thinking about chess), but the effect is VERY localized (after playing chess for 15 years you won't really have any extra benefit if you change to a similar-but-different game). Many things people do learn: the specifics of life (what is a tree? how do you dial a telephone number?) and various behaviors. Many other things develop about the same for everyone (how do you recognize and discern the tree? how do you recall the phone number?). Learning to do something is more a matter of placing certain semantic and procedural tidbits of info into the already-wired brain.

    Controlling a computer through voice is definitely going to require a new interface, as does pretty much every form of communication. My Human Factors professor was part of a group of psychologists that worked with Microsoft on developing the pen-based input for the Tablet PC. According to him, many of the engineers for the Tablet didn't really consider or even believe that a new interface would be needed, and originally they treated the pen as nothing more than a 3D mouse. This caused all sorts of problems when it was given to a user, since users already have a notion of how a pen works, and a completely different notion of how a mouse works. The proper interface for the Tablet would be neither of these: a mouse can do things that a pen simply isn't made for, while a pen is used in many more directions and has many more conventions than a mouse. The resulting interface recommended by the psychologists combined the most prototypical parts of each, as determined through user testing. Looking back it seems obvious that this would be a good solution, but in the design process these things are almost always overlooked. Designers form their own mental models based on how things work underneath, so that when they try to use the interface they have no problems: they already know what everything is supposed to do, so it makes sense to them. A user who has never experienced the device, however, has no knowledge of the interface and thus will usually find it frustrating and ultimately useless and nonfunctional.

    I think most attempts to create a voice interface for the computer have had similar problems: the designers usually try to use the sensor approach on audio ("we listen by hearing words, right? so just make the computer detect words and we're done!") but have very limited success because of the flood of useless information and the ambiguity of useful information. The other solution is to limit the commands, or to preface each command with a keyword and spell things out ("menu file; open; m-y-f-i-l-e-.-t-x-t; ok"), which puts all the effort of learning these commands and getting them right on the user's shoulders. This works great for the programmer: just tell everyone who doesn't like it to RTFM. I don't know what form of interface would be best for the user, and I don't think anybody has come up with it yet. I think it will be something so insightful, so innovative, and so useful that nobody will really notice it's there at all. Movie sound effects are like this: their job is to be so completely transparent that nobody realizes they're listening to sound effects. A voice interface should be so effortless and intuitive that I can just walk up to it, start talking, and not have to realize or care that I'm giving it voice commands.
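    As a toy illustration of how rigid that second approach is, here is a sketch of a parser for exactly that kind of spelled-out command (the grammar is invented for the example):

```python
# Toy parser for the spell-everything-out command style:
#   "menu file; open; m-y-f-i-l-e-.-t-x-t; ok"
# Deliberately rigid: getting the sequence right is the user's job.
def parse_spoken_command(utterance: str) -> tuple[str, str, str]:
    """Split 'menu <name>; <action>; <spelled name>; ok' into parts."""
    parts = [p.strip() for p in utterance.split(";")]
    if len(parts) != 4 or not parts[0].startswith("menu ") or parts[3] != "ok":
        raise ValueError(f"malformed command: {utterance!r}")
    menu = parts[0].removeprefix("menu ")
    action = parts[1]
    filename = parts[2].replace("-", "")   # un-spell the letters
    return menu, action, filename

print(parse_spoken_command("menu file; open; m-y-f-i-l-e-.-t-x-t; ok"))
# -> ('file', 'open', 'myfile.txt')
```

    One wrong pause or misplaced word and the whole command is rejected -- which is precisely the RTFM burden described above.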

    I don't think that making people learn a system or attempting to rewire them would be the best voice-computer interface, but it could be a workable one, better than what we have today. Although it would not get high marks from a Human-Computer Interaction standpoint, people are extremely adaptable and can, over time, learn to do pretty much anything in any manner. Our brain really doesn't need rewiring: it's already wired so well that it can conform to almost any task we need it to perform. If a not-unreasonable amount of effort or training leads to being able to talk to a computer transparently, then I'd be 100% for it. =) Now where are we going to find some test subjects? ;-)

"Who cares if it doesn't do anything? It was made with our new Triple-Iso-Bifurcated-Krypton-Gate-MOS process ..."

Working...