Human and Machine Readable Handwritten Language?
darrint writes "In some obscure corner of the Earth, has someone developed a human handwritten language which can be easily read by a machine? Why is the visual divide between what can be written by a human and what can be read by a machine so wide? At one extreme is the bar code, which I certainly cannot hand write. Machines can read it easily. Bank checks have a human readable account and routing numbers printed in special ink running along their bottom margins. These numbers can be read by a machine and are clearly legible to a human, but I doubt I could write them for input to a machine. My old Palm handheld could read something like handwriting in its little box. OCR exists but I've never thought of it as reliable. I would like to dash off little notes on stickies or in a tiny spiral notebook and be able to suck them into vim, a browser text-input box, and so forth. Perhaps I'd have to learn some kind of machine readable 'shorthand.' Has it been done?"
1994 called (Score:2, Funny)
PDAs cheat (Score:5, Insightful)
* pressure
* speed
* stroke order
* stroke direction
* pen-up and pen-down events
* timing
Not Palm. (Score:2)
Check out this representation of the alphabet in Graffiti [tomshardware.com].
You can do X as a reverse of a K in that alphabet; U and V were a bit different (V is easier to do right-to-left for the machine to recognize the stroke, but you can make the shape the same as a "real" v). I actually did some of my paper notes in Graffiti (in University) since they tend to be mono-strokes (rather than the polystrokes to make up them more complicated l
Re:Not Palm. (Score:3, Interesting)
Re:Not Palm. (Score:1)
Re:Not Palm. (Score:2)
Re:Not Palm. (Score:1)
Thanks for correcting me, and I'm sorry that I got that wrong... Usually, I'm very good with grammar. Oh, well.
Re:PDAs cheat (Score:4, Informative)
The problem now is that we're used to reading print. One of the main principles of palaeography is that you read the motions of the pen (or other writing tool) in the medium. Ink in particular is great for this sort of expression, because you can (especially with a flat nib) express all sorts of motions; and using a variety of analytical tools, you can reconstruct missed strokes, damage to the medium, overlapping words and the rest. Some of those analytical tools are, of course, analysis of the linguistic context. And that same context lets us get really fancy with our handwriting. For example, if something logically follows, I don't need to waste my time writing it out clearly.
To muddy the waters further, no two people use the same handwriting. Even in contexts where the formation of letters is strictly determined, everybody has their individual variations, especially in pressure, speed, stroke order, stroke direction, and lifting the pen. They also vary in how they form the letters.
So yeah, you can probably get decent success using handwriting OCR on things like addresses and bank account numbers -- because you've got a known context, and are basically looking for key numbers.
And I'm sure there's decent software recognition out there. But to get something that reads human script -- even a forced "machine-friendly" hand -- takes a lot of work, and a lot of training in areas that machines are not good at. You'd need a pretty big neural net.
Left Handed people retarted? (Score:1)
When I gave it to any of my right-handed friends, no problem. But another leftie and I just couldn't use it!
Anyone else have this problem?
I believe it has been done (Score:5, Funny)
Re:I believe it has been done (Score:3, Interesting)
For the original problem, I think the issue with computers recognizing handwriting is that the shapes in everyone's handwriting vary so much. I can't get my PDA to recognize my handwriting even after training it for several weeks; I just gave up and scribble notes as pictures instead.
Main issue to remember is that computers process in numbers, not letters, to completely solve this issue, we'd need a language that's completely
Re:I believe it has been done (Score:3, Insightful)
I don't know how you've reached that conclusion; there's actually not that much difference between numbers and letters to a computer - both have binary values. The only reason a computer might recognise digits 0-9 more easily than digits plus A-Z is that there are less glyphs to recognise than in the alphabet. All you'd be doing by writing do
Re:I believe it has been done (Score:3, Insightful)
If multithreaded vector processing sounds strange, maybe you're more familiar with the fuzzy logic buzzword.
Yes, I'm oversimplifying things, but I don't have a ready-made solution here; I'm just trying to explain the concept.
"there are less glyphs to recognise" - You got my point, it's far more accurate to recognize 10 different symbols than it is to recognize 34, or more when we have accents. If we have language that's based on 10 symbols on
Re:I believe it has been done (Score:2, Insightful)
Not necessarily. Trying to write a phonologically complex language like English is bad enough when the number of symbols is half the number of sounds, as currently; if the number were 1/5th, as you suggest, then words would have to be much longer, and reading would become more difficult for humans.
English already has to use more than one letter to represent many sounds: "ch",
Re:I believe it has been done (Score:2)
Also your previous post was a bit confusing. You started off seemingly correctly (that handwriting is hard because it's individual) and then veered off into:
Re:I believe it has been done (Score:4, Interesting)
Re:I believe it has been done (Score:2, Informative)
Punch cards are not coded with pencil, they are coded with physical holes "punched" out of the paper (becoming... can we say it.. Chads!)
Unless you're referring to MarkSense, which turned marked cards into punch cards using a machine that would sense the mark and punch it out.
Thanks to the last US Presidential election, the whole world knows the term chads, even if they don't all know what it means. :)
Uh.... (Score:1, Informative)
Re:Uh.... (Score:2, Insightful)
An alphabet based on entirely straight lines would be easy enough for a computer to read if letters never touched. The software would first detect the line of text, then along the row of letters, find the first black pixel, then find all th
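The comment above is cut off, but the step it starts to describe (find the first black pixel, then gather what belongs with it) can be sketched as a flood fill over a binary bitmap. This is only an illustration under assumptions not in the comment: the line of text is already isolated, '#' marks a black pixel, and letters never touch. The function name `segment_glyphs` is hypothetical.

```python
from collections import deque

def segment_glyphs(bitmap):
    """Split a one-line binary bitmap into glyphs (connected groups of black pixels).

    bitmap: list of equal-length strings; '#' is black, anything else is white.
    Returns a list of glyphs, each a sorted list of (row, col) pixels,
    in the order their leftmost pixel is met when scanning left to right.
    """
    rows, cols = len(bitmap), len(bitmap[0])
    seen = set()
    glyphs = []
    # Scan column by column so glyphs are discovered left to right.
    for c in range(cols):
        for r in range(rows):
            if bitmap[r][c] != '#' or (r, c) in seen:
                continue
            # Flood fill (8-connected) from the first unvisited black pixel.
            queue = deque([(r, c)])
            seen.add((r, c))
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and bitmap[ny][nx] == '#'
                                and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            glyphs.append(sorted(pixels))
    return glyphs

# Toy line with three non-touching straight-line "letters".
line = [
    "#...#.###",
    "#...#...#",
    "###.#...#",
]
glyphs = segment_glyphs(line)
print(len(glyphs), [len(g) for g in glyphs])
```

Each glyph comes back as its own pixel set, which a later step could compare against known letter shapes. Touching letters would merge into one glyph, which is exactly the case the comment rules out.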
Re:Uh.... (Score:3, Interesting)
You took us from having a human-readable, non-machine-readable alphabet to the exact opposite. I don't want to be a barco
Re:Uh.... (Score:2)
I think a lot of effort has been put into it, but it seems as though it's not average software; it's up there with writing a seriously optimizing compiler in that it's a lot of heuristics (or guesswork). But, having multiple handwriting profiles would be a start... making the software 'adaptive' so that the more you use it (and the more you have to correct it) it gets better and better at recognizing 'your' handwriting.
Also, the
Re:Uh.... (Score:1)
OK... "Anonymous Coward".
No problem.
Re:Uh.... (Score:2)
Re:Uh.... (Score:4, Informative)
Depends on what you have it set to. My TabletPC is set to read each individual character at a time. It provides little spaces to write each character in, so you don't have to worry about spacing or anything. That's been my favorite, honestly.
Re:Uh.... (Score:1)
Recognition (Score:3, Insightful)
Re:Recognition (Score:1)
Sure, it's... (Score:3, Insightful)
Learn to write it neatly and the computer will have no problem reading it. Or humans either, for that matter. Write it poorly and both will have a hard time.
There's a solution. (Score:2)
Re:There's a solution. (Score:1, Funny)
Ideal handwriting style (Score:4, Informative)
OCR/handwriting recognition folks: what would the ideal handwriting for machine readability look like? Could simple variations on standard English cursive or printing approach 100% recognizability, or would the ideal have to be synthesized, like shorthand [wikipedia.org], and if so, what characteristics would such a script have?
Re:Ideal handwriting style (Score:3, Interesting)
Braille is very hard to write. (Score:2)
Re:Braille is very hard to write. (Score:2)
Re:Braille is very hard to write. (Score:2)
Chordpad (Score:2)
[For braille entry,] Usually a typewriter is used, which might as well be replaced with a computer keyboard.
Braille can be entered into a handheld device with a six-key chordpad [wikipedia.org] on the back and a space key on the front. Would that be so hard?
Re:Chordpad (Score:2)
The Xevious font! (Score:2)
As far as systems that are a little easier to write, but still machine-readable, a fun alternative might be the Fardraut font from Xevious and Solvalou.
http://hg101.classicgaming.gamespy.com/xevious/xevious2.htm [gamespy.com]
I've got it as an actual font (called "CZP_Fardraut"), but I can't seem to find it anymore. I'm sure it's out there somewhere.
As alphabets go the letters aren't as distinctive as most, which would make
Re:Ideal handwriting style (Score:5, Insightful)
Okay. I'm attacking the point of the post.
There's no reason to reinvent the alphabet any more than there is to reinvent the wheel.
If we change the alphabet so machines can read it, other people stop being able to read it. It's the wrong solution for the problem.
If my handwriting is good enough that I can read it two weeks later, and my peers and friends and family can read it perfectly (I've been told I have particularly good handwriting), then why should I have to change it so that my PC can understand it, but nobody else can?
I could memorize a second alphabet, having one for me and one for my PC... but why?
If I could tell the software "This is how I write a 'k' and this is how I write an 'R'", that would improve things a lot IMO. My 'k' might look like someone else's 'R'; but my 'k' and 'R' look absolutely nothing alike. My ampersand kind of looks like a plus sign; but it's totally distinguishable from my plus sign. If only I could get this across to the software...
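The "this is how I write a 'k'" idea can be sketched as a per-writer template matcher: the writer supplies labelled samples, and new input is assigned the label of the closest stored sample. This is a toy under assumptions, not how any shipping recognizer works; the class `PersonalRecognizer`, the 3x3 bitmaps, and the Hamming distance are all illustrative choices.

```python
def hamming(a, b):
    """Count differing cells between two equal-sized glyph bitmaps."""
    return sum(ca != cb for ra, rb in zip(a, b) for ca, cb in zip(ra, rb))

class PersonalRecognizer:
    """Nearest-template classifier trained only on one writer's own samples."""

    def __init__(self):
        self.templates = []  # list of (label, bitmap) pairs

    def teach(self, label, bitmap):
        """Store 'this is how I write <label>'."""
        self.templates.append((label, bitmap))

    def recognize(self, bitmap):
        """Return the label of the stored sample closest to the input."""
        if not self.templates:
            raise ValueError("no samples taught yet")
        label, _ = min(self.templates, key=lambda t: hamming(t[1], bitmap))
        return label

# Hypothetical 3x3 samples of one person's 'k' and 'R'.
my_k = ["#.#", "##.", "#.#"]
my_R = ["##.", "##.", "#.#"]
sloppy_k = ["#.#", "##.", "#.."]  # a hurried 'k' with one pixel missing

rec = PersonalRecognizer()
rec.teach("k", my_k)
rec.teach("R", my_R)
print(rec.recognize(sloppy_k))
```

Because the templates come from a single writer, it does not matter that this 'k' might resemble someone else's 'R'; only the writer's own letters have to be distinguishable from each other.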
Re:Ideal handwriting style (Score:2)
True, but that's not what the article's about, or at least, not what I think it's about. The question, as I understand it, is to find a script that people can use that's equally understandable by humans and machines.
Re:Ideal handwriting style (Score:2, Funny)
As a programmer, it is my job to every day reinvent the wheel!
Re:Ideal handwriting style (Score:3, Interesting)
Re:Ideal handwriting style (Score:2)
What these alphabets do is remove as much of the ambiguity as possible from printed characters.
this page intentionally left blank (Score:3, Interesting)
If someone were to develop a language that was machine readable, human writable, it would probably consist of a series of straight lines. Letters would have to be larger, but lines are probably the way to go.
|_|__|-__-__-_||_|__
^like that.
Re:this page intentionally left blank (Score:1)
Re:this page intentionally left blank (Score:2)
It's not hard to make bar codes [google.com] and checks [google.com] if you are the least bit enterprising.
Re:this page intentionally left blank (Score:1)
Re:this page intentionally left blank (Score:1)
Re:this page intentionally left blank (Score:1)
() |\/| (, |_ () |_
That's just foul.
Re:this page intentionally left blank (Score:3, Interesting)
http://www.cdli.ca/CITE/r_alpha.gif [www.cdli.ca]
Re:this page intentionally left blank (Score:2)
Here is the Wikipedia article:
http://en.wikiped [wikipedia.org]
Morse Code (Score:3, Interesting)
I guess what would be interesting would be to have OCR look at 100 people's handwriting and see if there are any letters that are typically difficult to recognize, and then come up with a substitute that would be easy for the computer to read. Block capital letters should be fairly unambiguous, but I think many people don't write solely in that. I tend to mix my caps and non-caps within words, and I could see where the comp would mistake my F and P, O and Q, U and V.
Does anyone know how Palm came up with their Graffiti handwriting? - they must have done some studies.
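The study proposed above (run OCR over many people's handwriting and see which letters get mixed up) boils down to tallying confusions. A minimal sketch, assuming you already have pairs of (intended letter, recognized letter); the sample data below is invented for illustration and is not a real measurement.

```python
from collections import Counter

def most_confused_pairs(results, top=3):
    """Rank unordered letter pairs by how often one was read as the other.

    results: iterable of (intended, recognized) letters.
    """
    confusions = Counter()
    for intended, recognized in results:
        if intended != recognized:
            # Treat F->P and P->F as the same unordered pair.
            confusions[tuple(sorted((intended, recognized)))] += 1
    return confusions.most_common(top)

# Made-up recognition results, using the letter pairs named in the comment.
results = [
    ("F", "P"), ("P", "F"), ("F", "F"),
    ("O", "Q"),
    ("U", "V"), ("V", "U"), ("V", "U"),
    ("A", "A"),
]
worst = most_confused_pairs(results, top=2)
print(worst)
```

The pairs at the top of such a ranking are the candidates for replacement glyphs; letters that never show up can keep their ordinary shapes.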
Re:Morse Code (Score:2)
As someone else pointed out, the PDA has a WHOLE lot more info. BTW, I've heard that V2 of the Newton's HWR system works the best of all of them on PDAs.
I think an OCR could simply be tuned to one style of human handwriting, preferably block, and people write in that style. Then again, my block-like handwriting is hard for some HUMANS to read, let alone a computer...
Re:Morse Code (Score:1)
morse code (Score:2)
Re:morse code (Score:3, Informative)
Um... (Score:1, Funny)
Typing vs. Handwriting (Score:1, Interesting)
Given the speed differences between typing and handwriting (even in non-computational contexts), I consider attempts to do handwriting recognition as a kludge.
The real solutions will come in the form of portable/projectable/virtual keyboards or an entirely new input method-
OCR Reliability (Score:5, Interesting)
MICR fonts, which are those funny looking numbers printed in magnetic ink at the bottom of most checks, are designed to be human recognizable but machine readable, and have been around since the '60s. OCR-A typically beats MICR today, but a good MICR line is still readable over 95 percent of the time.
Handwritten fonts are the most difficult to read. The technology to read handwritten numbers and letters has been available for over 10 years, but typical read rates for something like a handwritten zip code or the numerical amount written on a check range from 60 to 80 percent, and are slowly getting better. Again, a lot depends on how much care is taken when writing out the text, and what kind of background clutter is present.
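To get a feel for what a 60 to 80 percent field read rate implies per character, here is a back-of-the-envelope sketch. It assumes a 5-digit zip code and that each digit is misread independently; neither assumption comes from the comment, and real recognizers reject or accept whole fields in more complicated ways.

```python
def per_char_accuracy(field_rate, n_chars):
    """Per-character accuracy p such that p ** n_chars == field_rate."""
    return field_rate ** (1.0 / n_chars)

def field_read_rate(char_accuracy, n_chars):
    """Chance that every one of n_chars independent characters is read correctly."""
    return char_accuracy ** n_chars

for rate in (0.60, 0.80):
    p = per_char_accuracy(rate, 5)
    print(f"field rate {rate:.0%} -> about {p:.1%} per digit")
```

Under these assumptions, even a roughly 90 percent per-digit accuracy only gets about 60 percent of 5-digit fields through cleanly, which is why longer handwritten fields are so much harder than single characters.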
As for me, I typed out school reports in 8th grade in 1973, when our family's word processing hardware consisted of a 1940's vintage Underwood typewriter. Even humans had difficulty decoding my handwriting!
Re:OCR and MICR Reliability - a minor correction (Score:3, Informative)
Re:OCR Reliability (Score:4, Informative)
Any postcodes that could not be read, dark paper and red ink etc, were scanned and transmitted to a postal worker drone in another part of the country who would type in the postcode from their terminal. The machine would receive the code back a few seconds later and the letter would carry on its journey.
I was impressed.
What is this? (Score:2)
United States Postal Service (Score:2, Informative)
Apple Newton tried. (Score:3, Informative)
Maybe if someone tried again now, Newton would do a better job.
Newton handwriting recognition technology lives on (Score:1)
OCR (Score:2, Funny)
Yes (Score:2)
Yeah, it's called Mathematics.
Re:Yes (Score:1)
Also the definitions used are designed to be "human-understandable" rather than "machine understandable". In proofs usually some of the things are written out, or left to the reader. I am not saying that a machine can't understand it, but mathematical writing is certainly not specifically written for the machines to understand. (of course you could make a system to do that)
IBM (Score:2)
Sera
Why in the hell... (Score:1)
Quit wasting your time trying to learn how to speak computer and spend it making the computer understand human.
(And yes, the fastest way to communicate with a computer currently is QWERTY)
Re:Why in the hell... (Score:1, Informative)
Yes, it has been done. (Score:2, Funny)
Obfuscated handwriting system (Score:4, Interesting)
1. It should NOT be easily readable by a casual observer (for notes I didn't want other people to read).
2. The most commonly used letters should be the simplest to draw, so it should be fairly fast to write, like cursive.
3. Letters should be as unambiguous as possible, so even the most scribbled/hurried writing would be distinctly recognizable.
4. Each letter should try to hint to the original latin letter to some degree, whenever possible. Although goal #2 usually would take priority over this one when in conflict.
5. A mid-height clear horizontal marked the beginning/end of a new letter.
6. (just for fun) It should look kinda weird and cool in a sci-fi sort of way, so if someone came across my notes they would be kind of baffled =)
While #2 and #3 might work towards making this an easy-to-OCR handwriting system, #1 and #6 probably make it moot, at least for the system I made. However, I imagine it wouldn't be too hard to make a less-obfuscated, more practical writing system that tries to accomplish goals similar to #2-4 above.
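Goal #2 in the list above (the most common letters get the simplest shapes) can be sketched as a frequency-ranked assignment: count letters in a sample of your own writing, then hand out glyphs from cheapest to most expensive. The glyph names and stroke costs below are hypothetical placeholders, not the shapes from the poster's font.

```python
from collections import Counter

def assign_glyphs(sample_text, glyph_costs):
    """Pair the most frequent letters with the cheapest glyphs.

    glyph_costs: dict mapping glyph name -> drawing cost (e.g. stroke count).
    Returns a dict mapping letter -> glyph name.
    """
    counts = Counter(ch for ch in sample_text.lower() if ch.isalpha())
    # Most frequent first; ties broken alphabetically so the result is stable.
    letters = sorted(counts, key=lambda ch: (-counts[ch], ch))
    # Cheapest first; ties broken by name.
    glyphs = sorted(glyph_costs, key=lambda g: (glyph_costs[g], g))
    return dict(zip(letters, glyphs))

costs = {"dot": 1, "bar": 1, "hook": 2, "loop": 3, "zigzag": 4}
mapping = assign_glyphs("sea tee tea", costs)
print(mapping)
```

Goal #4 (hinting at the Latin letter) would then be applied as an override on top of this ranking, which matches the stated rule that #2 usually wins when the two conflict.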
I made a font out of my handwriting system a few years ago. If anyone is curious, here is an image chart of the font [josef.org]. =)
I'm curious what other more "efficient" writing systems may exist out there (other than standard and cursive). Does anyone know of any others?
Re:Obfuscated handwriting system (Score:2)
Heh, I guess many of us did stuff like this in highschool. Well, at least, many of us nerds... ;)
Must say I'm impressed with your system though; far easier to write quickly with it than with the system I developed. Mine was more like writing everything in capitals, while yours has a nice flow to it... Would you care to share the font you created of it?
Re:Obfuscated handwriting system (Score:1, Informative)
Second - too much backtracking, try and avoid the 180 degree pen reverses. Lose them and you'll have a great system.
Re:Obfuscated handwriting system (Score:2)
TTF please (Score:2)
Re:Obfuscated handwriting system (Score:2)
Re: Obfuscated handwriting system (Score:2)
I had a quick look at both Pitman's [wikipedia.org] and Teeline [wikipedia.org], but neither seemed suitable. (For example, Pitman's needs you to distinguish light and firm strokes, and I tended to use a fountain pen which prevented that. It also needs a horizontal line, which I didn't want to rely on, and optimises for writing speed at the expense of paper used. Teeline looked better, but is alphabetic rather than phonetic, making it longer-winded than necessary. And neither seemed t
Braille? (Score:2)
Probably 100 times more legible once you get used to learning braille, and super machine readable so long as you'
Wouldn't Kana fit the bill here? (Score:2, Informative)
Re:Wouldn't Kana fit the bill here? (Score:2)
Re:Wouldn't Kana fit the bill here? (Score:2)
try a tekaki nyuu ryoku (Score:2)
Question of optimization (Score:2, Insightful)
So handwri
We have numbers down, at least (Score:3, Interesting)
different alphabets (Score:3, Interesting)
- It tries to guess Dutch words using a US English dictionary, which is so much of a PITA that I switch off the entire dictionary function.
- Dutch has a few characters that aren't in the standard US character set. This leaves me "international" as the only other option, but that also contains a lot of characters I will never use, which only cause confusion for the OCR system.
- Next to that I don't like that it forces you to learn its alphabet instead of it learning yours.
In short I am very disappointed with my PocketPC, also because of some other limitations I was unaware of when I bought it (remove the battery and it forgets everything, coupled with an ActiveSync backup that doesn't work; I'm left-handed, which makes the user interface very awkward). I now have a Nokia Series 60 phone and prefer that.
Good question (Score:2)
However, I would imagine all caps would be easiest to read, or else grid paper, which would help you keep discipline. Of course, this assumes you don't use something to help capture in advance, like Anoto's system.
I just had a flashback from all those movies where you see ICBM operators get a phone call and they whip out a noteb
Combinations of strokes (Score:3, Interesting)
---
|\
| X |
|/ \|
---
From the 6 strokes here you have 64 total possible combinations. Discard the 24 that are disjoint and you've still got plenty for 26 letters and 10 numbers.
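The counting in that paragraph is easy to check: six strokes that are each either drawn or not give 2^6 subsets. The sketch below enumerates them; the stroke names are hypothetical labels for the diagram, and the figure of 24 "disjoint" shapes is taken from the comment as given, since the comment does not say how disjointness is decided.

```python
from itertools import combinations

# Hypothetical names for the six strokes in the diagram.
STROKES = ["top", "bottom", "left", "right", "diag_down", "diag_up"]

def all_stroke_sets(strokes):
    """Every subset of the strokes, including the empty (blank) one."""
    return [frozenset(c)
            for k in range(len(strokes) + 1)
            for c in combinations(strokes, k)]

sets = all_stroke_sets(STROKES)
total = len(sets)        # 2 ** 6
usable = total - 24      # 24 disjoint shapes: the comment's figure, not recomputed here
needed = 26 + 10         # letters plus digits
print(total, usable, needed, usable >= needed)
```

Note that the 64 includes the blank cell with no strokes at all, so the margin over the 36 symbols needed is slim but real if the comment's 24 is right.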
As to an English-based alphabet, the problem is that so many letters are far too similar, especially b / h / k, i / j, rn / m, and that handwriting is too variable. Capital letters are an obsolete idea that only further complicates things.
The outdated nature of most written languages is mirrored in spoken alphabets. There is absolutely no reason for 'w' to have a 3 syllable name. I have encountered a number of people who say "www" as "dub dub dub", and I am considering spending a week or two training myself to permanently replace "double-u" with "dub" in my vocabulary (that is how long it took me to unlearn 20 years of tying my shoelaces wastefully and ingrain a better, faster way).
Re:Combinations of strokes (Score:1)
Okay, I'm dying to know... what's the better way?
(Please don't say it's velcro.)
Re:Combinations of strokes (Score:2)
Watch the little animated GIF at the top, then try to follow the directions. This cuts 25-50% off the total shoelace tying time, and up to 90% off the time for the actual 'bow' knot (not counting the base knot under the bow).
That site also has a bunch of other knots for different purposes. Boot lacing and tying methods, secure knots, even necktie knots
Re:Combinations of strokes (Score:2)
Square knot (left over right, right over left) resists slipping, stays tied.
Re:Combinations of strokes (Score:1)
in Dutch, we pronounce w as "Way", and it rhymes with v ("Vay")
Vee and Wee or Vee and Way would make more sense than 'dub'
Re:Combinations of strokes (Score:2)
In terms of English specifically, we have 26 letters to share about 10 vowel sounds (depending on dialect, accent
Gadgets (Score:1)
Most businesses that I know of... (Score:2)
Such devices are becoming quite popular because they're faster than handwriting! Hopefully we'll eventually find one in every office and home!
Re:Simple (Score:1)