Sunday, March 16, 2008

Communicating with Highly Physically Disabled People

I am not a linguist; I am a mathematician, computer programmer, amateur electrical engineer and all-around renaissance geek. I just finished reading the book The Diving Bell and the Butterfly. It is a profoundly moving memoir dictated one letter at a time by Jean-Dominique Bauby, a stroke victim who suffered from locked-in syndrome.

Jean-Dominique was only able to communicate by blinking his left eye. An alphabet was developed that was re-ordered from the usual alphabetical ordering, so that the most frequently used letters were at the beginning. To communicate, the speaker would recite each letter and Jean-Dominique would blink when the appropriate letter was said. Slowly, words would be built up and sentences would eventually form, conveying meaning. It was a tedious process and it had its drawbacks:
" "Want to play hangman?" asks Theophile, and I ache to tell him that I have enough on my plate playing quadriplegic. But my communication system disqualifies repartee: the keenest rapier grows dull and falls flat when it takes several minutes to thrust it home. By the time you strike, even you no longer understand what had seemed so witty before you started to dictate it letter by letter. So the rule is to avoid impulsive sallies. It deprives conversation of its sparkle, all those gems you bat back and forth like a ball - and I count this forced lack of humor one of the great drawbacks of my condition." (Pages 70 and 71, The Diving Bell and the Butterfly)

Since Jean-Dominique was a native French speaker, the letter frequencies in his special alphabet were based on the French language. The alphabet looked like this:


E S A R I N T U L O M D P C F B V H G J Q Z Y X K W


For the English language, the letter frequencies (which can be found here) would be:


E T A O I N S H R D L C U M W F G Y P B V K J X Q Z


As I was learning this system, it occurred to me that there had to be a more efficient way to do the same thing. The main problem appears to be that it takes a great deal of time to get to a letter deep in the alphabet. In addition, the deeper in the alphabet you go, the more likely it is that an error will happen, so that the intended letter is missed and the letter selector has to start again at the beginning.

In order to improve communications speed, I came up with this alternate "tabular" method:

E A N D W
T I R M B
O H U P X
S C Y J PH
L G K Z LY
F V Q QU RY

Note: Since the standard Latin alphabet has 26 letters, a six by five table left four empty spots. I chose letter pairs to fill those spots. The letter pairs were guessed at, as my brief searching was unable to turn up any letter pair frequency tables. I have no doubt that an exhaustive lexical analysis of the English language would turn up the true top four letter pairs. Some pairs like IE and EA are very common, but they are pretty cheap to create one letter at a time, so it's not worth using them in the far bottom corner where it takes more steps to get to them.

This alternative method works as a simple Cartesian coordinate system. That's fancy mathematics speak for "select the row and then select the column". To find the letter M, the letter selector starts at the top row and works down row by row until the patient blinks when the second row is chosen. This means the patient could be interested in the letters T, I, R, M or B. The selector then works across the columns until the patient blinks when the M character is chosen. In total, six stops were made to get to the M character: two rows, then four columns. The old system would have required fourteen, more than twice as many. In addition, by locking the selector into a given row, the possibility for error is greatly reduced. If the selector picks the wrong row or misses the target letter, they will know they've done so because the patient has not blinked by the end of the row. With the old system, the selector would have to go to the end of the entire alphabet to find that they've missed the letter.
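The row-then-column lookup described above can be sketched in a few lines of Python. This is only an illustration: the names TABLE, LINEAR, tabular_cost and linear_cost are mine, and "cost" simply counts stops (rows stepped through plus columns stepped through).

```python
# The 6 x 5 table from this post, one list per row.
TABLE = [
    ["E", "A", "N", "D", "W"],
    ["T", "I", "R", "M", "B"],
    ["O", "H", "U", "P", "X"],
    ["S", "C", "Y", "J", "PH"],
    ["L", "G", "K", "Z", "LY"],
    ["F", "V", "Q", "QU", "RY"],
]

# The English frequency ordering used by the one-letter-at-a-time system.
LINEAR = "ETAOINSHRDLCUMWFGYPBVKJXQZ"


def tabular_cost(symbol):
    """Stops needed in the table: rows stepped through plus columns stepped through."""
    for row_number, row in enumerate(TABLE, start=1):
        if symbol in row:
            return row_number + row.index(symbol) + 1
    raise ValueError(f"{symbol!r} is not in the table")


def linear_cost(letter):
    """Stops needed when reciting the frequency-ordered alphabet from the start."""
    return LINEAR.index(letter) + 1


print(tabular_cost("M"), linear_cost("M"))  # 6 14
```

The printed pair matches the worked example: six stops in the table against fourteen in the straight alphabet.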

To give you a better idea of the benefit of this system, here's the same table with the relative costs of getting to each letter added. The letter represents the target letter, the number represents the number of steps, or "cost", required to get to that letter and the number in parentheses is the cost the old system required to get to that letter (for a letter pair, the old cost is the sum of the costs of its two letters):


E - 2 (1)     A - 3 (3)     N - 4 (6)     D - 5 (10)     W - 6 (15)
T - 3 (2)     I - 4 (5)     R - 5 (9)     M - 6 (14)     B - 7 (20)
O - 4 (4)     H - 5 (8)     U - 6 (13)    P - 7 (19)     X - 8 (24)
S - 5 (7)     C - 6 (12)    Y - 7 (18)    J - 8 (23)     PH - 9 (27)
L - 6 (11)    G - 7 (17)    K - 8 (22)    Z - 9 (26)     LY - 10 (29)
F - 7 (16)    V - 8 (21)    Q - 9 (25)    QU - 10 (38)   RY - 11 (27)

Some immediate observations: the letters E and T actually require one more step in this system than in the old one, and O and A cost the same in both systems. However, the overall cost savings become dramatic when you start creating whole words. For example, take the following sentence (chosen from a random poster I saw at my son's elementary school):

SEE JANE RUN
The cost breakdown is as follows:


Word    Old Cost    New Cost
See     9           9
Jane    33          17
Run     28          15
 
Thus with the old system, it costs 70 letter stops to spell the test sentence. With the system I am proposing, it only takes 41. It is also important to note that the word SEE costs the same in both systems, which is an example of how the extra step needed to find the letters E and T is quickly absorbed by economies elsewhere.
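For anyone who wants to check the arithmetic or try other sentences, here is a small Python sketch that adds up both costs. The helper names (TABLE_COST, word_costs) are made up for this example. It spells everything one letter at a time, so the letter-pair cells are never used, and spaces between words are not counted, just as in the breakdown above.

```python
TABLE = [
    ["E", "A", "N", "D", "W"],
    ["T", "I", "R", "M", "B"],
    ["O", "H", "U", "P", "X"],
    ["S", "C", "Y", "J", "PH"],
    ["L", "G", "K", "Z", "LY"],
    ["F", "V", "Q", "QU", "RY"],
]
LINEAR = "ETAOINSHRDLCUMWFGYPBVKJXQZ"

# Map every cell to its cost: row number plus column number, both counted from 1.
TABLE_COST = {
    cell: r + c
    for r, row in enumerate(TABLE, start=1)
    for c, cell in enumerate(row, start=1)
}


def word_costs(word):
    """Return (old_cost, new_cost) for a word spelled one letter at a time."""
    letters = word.upper()
    old = sum(LINEAR.index(ch) + 1 for ch in letters)
    new = sum(TABLE_COST[ch] for ch in letters)
    return old, new


sentence = "SEE JANE RUN"
words = sentence.split()
costs = [word_costs(w) for w in words]
for word, (old, new) in zip(words, costs):
    print(word, old, new)
print("TOTAL", sum(old for old, _ in costs), sum(new for _, new in costs))
```

Run as written, the last line prints TOTAL 70 41, the same totals quoted above.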

Potential improvements on this would be to re-arrange the alphabet on a per-patient basis. Since everyone uses a slightly different subset of their native language's words, their letter frequencies would likely be slightly different. If available, recordings and writings created by the patient before becoming disabled could be analyzed to alter the table layout. However, after the patient starts using the table, I would suggest that it not be altered unless absolutely necessary, as a familiarity will have been built up that will be difficult to overcome to take advantage of newer efficiencies. It would be interesting to study whether or not patients adapt their vocabulary to the table, thus removing any need to alter the table once they have been introduced to it.
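As a rough sketch of how a per-patient table might be generated, the Python below counts letters in a sample of the patient's writing and then fills the grid so that the most frequent symbols land in the cheapest cells. The function names and the tie-breaking rules are my own assumptions for illustration; cells of equal cost are filled from the lower row upward, which happens to reproduce the table in this post when given the English ordering and the four guessed letter pairs.

```python
from collections import Counter

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def frequency_order(sample_text):
    """Letters A-Z from most to least frequent in the sample.

    Ties, including letters that never appear, fall back to alphabetical order.
    """
    counts = Counter(ch for ch in sample_text.upper() if ch in ALPHABET)
    return sorted(ALPHABET, key=lambda ch: (-counts[ch], ch))


def build_table(symbols, rows=6, cols=5):
    """Place symbols, in the order given, into cells of increasing row + column cost."""
    cells = sorted(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: (rc[0] + rc[1], -rc[0]),  # cheapest first, lower rows first on ties
    )
    table = [[""] * cols for _ in range(rows)]
    for (r, c), symbol in zip(cells, symbols):
        table[r][c] = symbol
    return table


# The English ordering from this post, followed by the four letter pairs.
english = list("ETAOINSHRDLCUMWFGYPBVKJXQZ") + ["PH", "QU", "LY", "RY"]
for row in build_table(english):
    print(" ".join(row))
```

For a real patient, the list passed to build_table would come from frequency_order(sample) plus whichever letter pairs turn out to be most useful for that person.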

A potential objection to altering the table on a per-patient basis, before the patient is introduced to it, would be that each patient should use the same letter table to keep communications uniform. I would overcome this objection with the idea that patients will not be using this system to talk directly to each other. This system would only be meant to facilitate communication between a disabled patient and an able-bodied person who can work the board. The able-bodied person working the board should be able to adapt to different boards for different patients, especially considering that the incremental improvements in communications speed will far outweigh any inconvenience to the board operator. In addition, there is no reason why the intermediary could not be a computer, thus allowing similarly disabled patients to communicate with each other in real time. I wonder if it would be a positive thing for a patient to share their feelings with someone in the same situation?

It is important to note that this system is only useful for persons who already have the ability to read and can process information relatively normally. It is also only useful to those who have the ability to consistently gesture in a single fashion, such as an eye blink, or some other "single bit" manner. If multiple gestures can be clearly and consistently mastered, there are much faster ways of communicating than the system that I am proposing. It would be very interesting to be able to study systems that apply to various numbers of feedback bits from the patient. As a general rule, the greater the number of feedback bits available from the patient, the more robust and efficient the communication. I should coin the term CFB - Consistent Feedback Bits. A basic eye blink would be one CFB. An eye blink and a finger twitch would be two CFBs, and so on. The various systems of communicating could be indexed by CFBs. A specialist could assess the patient's CFBs and perhaps use therapies to expand the number of CFBs, and then a system of communication could be chosen that best fits their unique situation. Again, many of these systems of communication would fall apart if the patient is simply cognitively unable to process information.

I believe that this system requires the ability to see out of at least one eye, but could possibly be used with a blind patient as long as they could hear well enough to memorize the table and give "single bit" feedback as they were learning. If the patient were blind and deaf, it may still be possible to communicate as long as they had relatively normal information processing abilities and could give feedback to indicate to their teacher where they were in the learning process.

With an advanced enough computer, this could all be done automatically. It would not be that difficult to train a computer to analyze when an eye has blinked or a similar "single bit" gesture has occurred. Lights could be used on the selection table (or sounds if the patient were blind) to work through the table. Even further, if recordings of the patient's voice could be found, the patient's words could be synthesized in their own voice! The only drawback would be that the table would have to be enhanced to include commands like "turn on/off synthesizer", numbers and some punctuation.
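To make the automated version a little more concrete, here is a minimal Python sketch of the scanning loop a computer might run. Everything about the hardware is assumed away: blinked is a hypothetical callback that would, in a real system, light up (or speak) the highlighted row or cell, wait a moment, and report whether the patient signalled. Here it is replaced by a simulated patient who never makes a mistake.

```python
TABLE = [
    ["E", "A", "N", "D", "W"],
    ["T", "I", "R", "M", "B"],
    ["O", "H", "U", "P", "X"],
    ["S", "C", "Y", "J", "PH"],
    ["L", "G", "K", "Z", "LY"],
    ["F", "V", "Q", "QU", "RY"],
]


def scan_select(table, blinked, max_passes=3):
    """Step through rows, then across the chosen row, until the patient signals.

    Returns (symbol, stops), or (None, stops) if nothing was chosen
    within max_passes trips through the table.
    """
    stops = 0
    for _ in range(max_passes):
        for row in table:
            stops += 1
            if not blinked(row):
                continue
            for symbol in row:
                stops += 1
                if blinked(symbol):
                    return symbol, stops
            break  # reached the end of the row with no blink: start over at the top
    return None, stops


def simulated_patient(target):
    """Build a blink callback for a patient who wants `target` and never errs."""
    def blinked(highlighted):
        # `highlighted` is either a whole row (a list) or a single cell (a string).
        if isinstance(highlighted, list):
            return target in highlighted
        return highlighted == target
    return blinked


print(scan_select(TABLE, simulated_patient("M")))  # ('M', 6)
```

The stop count returned for M is the same six stops counted by hand earlier, and the early exit at the end of a row is the error check described above: a missed letter costs at most one row, not the whole alphabet.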

Going even further, a computer could learn to discriminate multiple gestures from what appears to be a "single bit" gesture to a casual human observer. For example, to most people an eye blink is an eye blink. A finger flex is a finger flex. To a computer, which can analyze millions of pixels of video data per second, the subtlest differences in movements can be discerned. Given enough training, a computer could learn to pick up on the many different shades of meaning a patient could build into a simple "single bit" gesture. For example, a patient could decide that a fast blink means one thing and a slow blink means another. Or perhaps a fast half blink means something different from a slow half blink. Or perhaps the patient can actually move their eye, so a blink and a move to the right could mean something different from a slow half blink with a move to the left.

To be sure, it would be a long and arduous process for the computer and patient (and technician(s)) to get this language straight. Once it was learned though, it seems entirely likely that a completely paralyzed patient who only had the use of one eye (or some other single bit gesture) could communicate using a computer as quickly as you and I can with our voices.

I considered several other systems for organizing the alphabet in a manner that would require the fewest steps to get to a given letter. All of those that I could come up with were either too complex or not efficient enough. If you're interested, I'd be happy to talk over some of the ideas I've abandoned.

Oh, and if you've got a pile of money and want to see something tangible developed along the lines of what I have just described, feel free to throw it at me. I would love nothing more than to be able to work on this full time.

2 comments:

ydna said...

Is the speaker/author a "patient" in this context? The word strikes me as out of place here. (I realize I'm picking a nit.)

chuckwolber said...

Yes, the speaker/author is the patient.

(don't worry, picking nits is good, precise communications are essential when there are a lot of shades of meaning)