
A USER-DEPENDENT EASILY-ADJUSTED STATIC FINGER LANGUAGE RECOGNITION SYSTEM FOR HANDICAPPED APHASIACS

Pages 932-944 | Published online: 13 Nov 2009

Abstract

Unlike sign language, which usually involves large-scale movements to form a gesture, finger language, which suits handicapped aphasiacs, is expressed through relatively small-scale hand gestures produced by merely changing how the patient's fingers bend. A practical system must therefore accommodate the specific condition of each handicapped aphasiac. We propose a system that fulfills this requirement by employing a programmable data glove to capture tiny finger movements, an optical-signal-parameterized function to calculate finger bending degrees, and an automatic regression module to extract the most adequate finger features for a specific patient. The selected features are fed into a neural network, which learns to build a finger language recognition model for that patient; the system is then ready for use by that specific user. At the time of this writing, the average success rate achieved in unbiased field experiments was 100%.

Human beings are born to communicate with the society around them. Oral and body languages are proven, effective tools for expressing thoughts and feelings and, in return, for understanding what others express. It is through these means that experience sharing, opinion exchange, and delicate private interactions, the major elements of social life, become possible. Suffering from both physical disability and speech impairment (aphasia) is therefore essentially equivalent to becoming socially handicapped or even socially paralyzed. To this end, various systems have emerged over the decades in attempts to relieve sufferers of this undesirable situation.

Loss of speech capability may stem from congenital disabilities (dysprosody or dysphasia) or from accidental damage such as trauma or stroke. In either case, the patient cannot communicate normally and tends to remain in a state of loneliness and discomfort. It is therefore very important to have a system (with proper hardware and software) that can assist such a person in performing basic communication.

Finger Language vs. Sign Language

A great majority of orally disabled people belong to the general deaf-mute category and do not simultaneously suffer from serious physical disability. Sign language is considered appropriate for their communication purposes. However, so that ordinary people can understand sign language, industrial and academic circles have put forth a great deal of effort to develop assisting tools for language translation. For example, a deaf-mute person may put on a data glove to perform sign language; a sign language recognition system then converts the signals from the glove to text and finally to speech. Through such a system, deaf-mute sufferers can express themselves to ordinary people.

The situation is utterly different for handicapped aphasiacs. For them, the problem with sign language is that it involves too many large-scale movements, including combined expressions of the fingers, palms, and even arms. For severely handicapped sufferers, this language system is simply too demanding for their needs. A proper communication system should therefore be made specifically for this particular group of people.

Compared with the well-developed sign language scheme, the finger language approach is still at a burgeoning research stage. It originated from a 1993 concept of the Southwest Research Institute in Texas, USA (Bessonny Gilden and Smallridge 1993), in which a mechanical hand was proposed for the deaf-and-blind that would convert the English alphabet into the proper hand gestures. Based on this concept, the Stanford Mechanical Department team led by Dr. Gilden, sponsored by the Rehabilitation Engineering Center of the Smith-Kettlewell Eye Research Foundation, pioneered the first communicating mechanical hand, DEXTER, which later evolved from the DEXTER-I to the DEXTER-IV model. The DEXTER series made it practical to convert the English alphabet into hand gestures from keyboards, computers, TV, and telecommunications devices. With these systems, even deaf-and-blind people became capable of "reading" by touching the gesture-varying mechanical hand (Meade 1987).

Some progress on sign language (rather than finger language) systems for Chinese has been made at several institutes in China. Researchers at these institutions use CAS-gloves, Cyber-gloves, or colored gloves to translate 30 basic hand gestures into the simplified form of Chinese characters through the English alphabet plus four special consonants: ZH, CH, SH, and NG (Jiangqin et al. 2001; Yaxin et al. 2001; Wei et al. 2003). They have produced certain results, but their studies still rely on the positions and movement trajectories of both hands in three-dimensional space. Moreover, owing to the great differences in culture, local language habits, and even pronunciation, their systems are not directly applicable in Taiwan, which uses the traditional form of Chinese characters instead of the simplified form used in China. A finger language system built specifically for Taiwan, and for all traditional-Chinese users, is therefore definitely needed.

Finger Language and Data Glove

The fact that finger language designed for disabled aphasiacs involves only very tiny finger movements imposes stringent requirements on both the software and the hardware of any prospective system. As past experience shows, the major problems with most finger language systems are two-fold: one is system reliability and accuracy, and the other is the variation in movements across successive attempts by each individual patient. To overcome these problems, besides developing data gloves capable of sensing the slightest movements, a delicate software system able to discern tiny movements is indispensable. Although several imported data glove systems of the optical fiber sensor type are already available, they are overly complicated and costly for the intended finger language applications. This fact motivated the authors to develop the reliable but low-cost (under US $150) data glove described below.

The first-generation, optical fiber-based five-sensor data glove, shown in Figure 1, was manufactured by the authors (Fu and Ho 2007). With each finger wound with a crawling optical fiber, bending a finger changes the emitted optical signal. An 8-bit scale (0 to 255) is used to quantify the optical signal strength measured at each finger: 255 for maximal strength, meaning a straight finger, and 0 for a totally bent finger. However, it did not take long to discover that this data glove was not a reliable instrument, nor did the measured signals remain the same every time the individual put the glove on. Factors such as differences in bending style and residual stress in the optical fiber also led to uncertain signal measurements. As a matter of fact, plastic optical fibers routinely suffer from fatigue, distortion, and breakage. Consequently, the signal change during finger bending was not discernible from the above noise, making it difficult to distinguish hand gestures solely from the optical signal output. In short, the first-generation (plastic optical fiber) data glove suffered from nonlinearity, nonrepeatability, deformation, rupture, and signal drift.

FIGURE 1 First-generation optical fiber-based data glove. LED: light-emitting diode; PD: photodiode.

Learning from this experience, the authors proceeded to develop a new generation of data glove that uses only light-emitting diodes (LEDs) and photodetectors (PDs) to measure the bending degrees of the fingers (i.e., without optical fibers; light from the LEDs reaches the PDs along the line of sight), as illustrated in Figure 2 (Fu and Ho 2008). The developed glove is a simple construct, which ultimately renders it lightweight, highly reliable, and easy to adjust and reprogram for a specific user. This new data glove is employed in the development of our finger language recognition system. Note that even though the new hardware possesses better reliability and repeatability than the first-generation glove, resolving the signal uncertainty caused by undesirable coupling between finger movements still relies on the discerning capability of proper software.

FIGURE 2 Second-generation data glove.

APPROACH FOR FINGER LANGUAGE RECOGNITION

In order to develop a robust recognition system capable of discerning static, small-scale hand gestures, a survey of communication needs within the handicapped-aphasiac community was performed, and the most frequently needed requests were compiled into the 10 finger language sentences listed in Table 1. For instance, Sentence 1 is represented by a straight little finger, meaning "I am thirsty and need water"; Sentence 5 is given by a straight thumb alone, meaning "I don't feel well. Please call the doctor." In general, Equation (1) relates sentence Y to the states y_i (i = 1, 2,…, 5) of the five glove fingers:

Y = d_0 + d_1 y_1 + d_2 y_2 + d_3 y_3 + d_4 y_4 + d_5 y_5, (1)

with y_i being 0 (when finger i is bent) or 1 (when finger i is straight), and d_i, i = 0, 1, 2,…, 5, being the corresponding coefficients.

TABLE 1 Ten Predefined Finger Language Sentences
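To make the sentence encoding concrete, the following minimal Python sketch maps a five-finger state vector to a sentence code under a linear encoding of the form of Equation (1). The coefficient choice d_0 = 0, d_i = 2^(i-1) (a plain binary code) is our illustrative assumption, not the paper's published values, and only the two sentences quoted above are filled in.

```python
# Illustrative sketch of Equation (1) with assumed coefficients
# d_0 = 0 and d_i = 2**(i-1), i.e., a plain binary code over the five
# finger states (thumb first). The paper's actual d_i are not given.

# The two sentences quoted in the text (the rest come from Table 1).
SENTENCES = {
    16: "I am thirsty and need water.",                 # S1: straight little finger
    1:  "I don't feel well. Please call the doctor.",   # S5: straight thumb
}

def encode(states):
    """states: five 0/1 finger states, thumb first; returns the code Y."""
    return sum((2 ** i) * y_i for i, y_i in enumerate(states))

print(SENTENCES[encode((1, 0, 0, 0, 0))])  # -> the "call the doctor" sentence
```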

Given that the finger conditions of handicapped aphasiacs vary far more and are more complicated than those of normal people, traditional dichotomy models that distinguish merely bent and stretched states are inadequate. To meet practical needs, gray-box modeling is adopted instead, incorporating the concept of fuzzy sets to address imprecision, just as one must face the ubiquitous vague or imprecise statements in real-world linguistic information. The important techniques in our approach, shown in Figure 3, include feature values calculation, feature selection, bending states decision, and rule-based classification; each is elaborated below.

FIGURE 3 Techniques used in the system.

Feature Values Calculation

This module calculates the values of the input features. The features that may play important roles in finger language recognition are analyzed below, along with the methods for calculating their values:

  1. O_i: Original optical signal strength of each finger

    O_i takes on values on a scale of 0 to 255.

  2. F_i: Bending degree of each finger

    Equation (2) defines F_i as a function of O_i that calculates the bending degree of finger i. In the equation, a_i and b_i denote the signal values for the least bent and the least straight situations of finger i, respectively. The function is illustrated in Figure 4, and Table 2 lists the respective a_i and b_i values for the five fingers of User 1 (Row 1 of Table 2). Taking F_1 (the bending degree function for the thumb) as an example: at a given O_1, Equation (3) yields the corresponding bending degree F_1.

  3. N_i: Normalized value of the original optical signal strength of each finger

    Equation (4) defines N_i in terms of O_i. (A hedged code sketch of these feature calculations follows this list.)
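The bodies of Equations (2) and (4) did not survive extraction, so the sketch below is a hedged reconstruction: it assumes a common piecewise-linear form for the bending degree F_i, clamped by the per-finger calibration values a_i and b_i of Table 2, and a plain 0-255 rescaling for N_i. Both functional forms are our assumptions, not the paper's confirmed definitions.

```python
# Hedged sketch of the feature calculations. The exact forms of
# Equations (2) and (4) are not recoverable here; a piecewise-linear
# bending degree and a plain 0-255 normalization are assumed.

def bending_degree(o, a, b):
    """F_i: map the raw optical signal o (0-255) of one finger to a
    bending degree in [0, 1]. a and b are the per-finger calibration
    signals (least bent / least straight, Table 2); the function is
    assumed linear in between and clamped outside."""
    if o >= a:      # signal at or above the 'least bent' level -> straight
        return 0.0
    if o <= b:      # signal at or below the 'least straight' level -> fully bent
        return 1.0
    return (a - o) / (a - b)

def normalized_signal(o):
    """N_i: rescale the raw 8-bit optical signal to [0, 1] (assumed form)."""
    return o / 255.0

print(bending_degree(200, a=240, b=60), normalized_signal(200))
```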

FIGURE 4 Bending degree function F_i.

TABLE 2 Values of a and b for the Parameterized Function

TABLE 3 Feature Selection

Feature Selection

We use the regression technique to select the features. Suppose an output finger language sentence is represented by the value Y. Taking all of the features described above, i.e., O_i, F_i, and N_i, into account, Y can be expressed as Equation (5):

Y = C_0 + Σ_{i=1..5} (C_i O_i + C_{i+5} F_i + C_{i+10} N_i), (5)

where the C_i, i = 0, 1, 2,…, 15, are coefficients to be determined.

By applying the backward stepwise procedure of multiple regression analysis to the data produced by User 1 (Row 1 of Table 3), it is found that the O_i and some of the N_i are statistically insignificant and can be neglected. With this, the formula for this specific user reduces to Equation (6), which retains only F_1 through F_5, N_1, and N_2, with coefficients fitted to User 1's data.

The goodness of fit (R²) of Equation (6) is calculated to be 0.99 (0 ≤ R² ≤ 1), which means that the regressed equation explains 99% of the variance of Y; that is, Equation (6) has a high goodness of fit.

By the equivalence of Equations (1) and (6), it is clear that we can use F_1, F_2, F_3, F_4, F_5, N_1, and N_2 as the features describing the finger language sentence Y for User 1.
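As a rough sketch of the backward stepwise procedure described above (not the authors' implementation), the following Python fragment eliminates features by their regression p-values using statsmodels; the threshold alpha = 0.05 and the feature-matrix layout are our assumptions.

```python
# Sketch of backward stepwise feature selection over the 15 candidate
# features O_1..O_5, F_1..F_5, N_1..N_5 of Equation (5). The p-value
# threshold alpha = 0.05 is an assumed convention, not from the paper.
import numpy as np
import statsmodels.api as sm

def backward_select(X, y, names, alpha=0.05):
    """Repeatedly drop the least significant feature until every
    remaining feature has a p-value below alpha."""
    names = list(names)
    while True:
        fit = sm.OLS(y, sm.add_constant(X)).fit()
        pvals = np.asarray(fit.pvalues)[1:]    # skip the intercept term
        worst = int(np.argmax(pvals))
        if pvals[worst] < alpha:
            return names, fit                  # all survivors significant
        X = np.delete(X, worst, axis=1)        # eliminate the worst feature
        names.pop(worst)

# Usage sketch: X is the 1000 x 15 feature matrix for one user and y
# the 1000 sentence codes; for User 1 this procedure would retain
# F_1..F_5, N_1, N_2, with fit.rsquared around 0.99 as reported above.
```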

Bending States Decision

We employ a neural network, with the architecture shown in Figure 5, to learn to decide whether each finger is bent from the corresponding selected input features. The selected feature values calculated above enter at the input layer and, after the manipulations of the hidden layer, five state values emerge at the output end, representing the bending states of the five fingers. Note that, weighing computing efficiency against reliability of results, the number of hidden-layer nodes was eventually set to nine after field experiments with values from 7 to 12. Learning proceeds by the back-propagation algorithm, and the model is built using the stratified tenfold cross-validation procedure on 1000 records, with the system error rate required to fall below 0.01 within 100 epochs.
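The fragment below is a rough modern reconstruction of this module, not the authors' code: a single-hidden-layer MLP with nine nodes trained by backpropagation and scored with stratified tenfold cross-validation via scikit-learn. The mapping of "100 epochs" and "error rate < 0.01" onto max_iter and tol is our assumption.

```python
# Rough reconstruction (not the authors' code) of the bending-state
# module. X: 1000 x k selected-feature matrix; Y: 1000 x 5 binary
# finger states; sentences: 1000 sentence labels, used only to
# stratify the tenfold splits.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier

def cross_validate(X, Y, sentences):
    skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    accuracies = []
    for train, test in skf.split(X, sentences):
        net = MLPClassifier(hidden_layer_sizes=(9,),     # nine hidden nodes
                            solver="sgd", max_iter=100,  # "100 epochs" (assumed mapping)
                            tol=0.01)                    # "error < 0.01" (assumed mapping)
        net.fit(X[train], Y[train])        # multilabel: five binary outputs
        pred = net.predict(X[test])
        # A record counts as correct only if all five finger states match.
        accuracies.append(np.mean(np.all(pred == Y[test], axis=1)))
    return float(np.mean(accuracies))
```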

FIGURE 5 Neural network.

Rule-Based Classification

Table 4 lists the decision rules that classify the 10 predefined sentences for finger communication; each rule number matches its corresponding sentence number. For example, rule R5 is associated with sentence S5 of Table 1. Depending on the five output values from the neural network module, different combinations of the five finger states trigger different rules in the table.

TABLE 4 Rules to Classify Finger Language Sentences

Finally, our system can speak the desired predefined sentences for the patient through a generic text-to-speech (TTS) module. More advanced word or sentence construction can be arranged, e.g., as in Fu and Ho (2008), after a patient becomes used to the basic 10-sentence system.
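A minimal sketch of the rule lookup and TTS hand-off follows. Only the two state-sentence pairs given in the text are filled in (the other eight would come from Table 4), and pyttsx3 is merely one illustrative off-the-shelf TTS engine, not necessarily the one the authors used.

```python
# Sketch of the rule-based classifier (Table 4) plus a TTS hand-off.
# pyttsx3 is one illustrative off-the-shelf TTS choice.
import pyttsx3

RULES = {
    (0, 0, 0, 0, 1): "I am thirsty and need water.",                # R1 / S1
    (1, 0, 0, 0, 0): "I don't feel well. Please call the doctor.",  # R5 / S5
}

def speak(states):
    """states: five 0/1 finger states from the neural network, thumb first."""
    sentence = RULES.get(tuple(states))
    if sentence is None:
        return  # unrecognized gesture: no rule fires
    engine = pyttsx3.init()
    engine.say(sentence)
    engine.runAndWait()
```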

SYSTEM DEVELOPMENT

The system works in two modes, namely, “model building” and “model application.” In the former mode, we build the recognition model for a specific user, while in the latter mode, the system is ready for daily use by the user. A typical scenario is illustrated as follows:

A. Model Building.

Step 1: Adjustment of the glove: The user adjusts the PDs and LEDs associated with the five fingers so that each finger secures a dynamic grayscale range covering 0 to 255 (or the maximum achievable).

Step 2: Tailoring of values a and b: The values of a and b are tailored specifically for the user by Equation (7), where i = 1, 2, 3, 4, 5 stands for finger i of the user.
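Equation (7) itself is not recoverable from the source. One plausible reading, sketched below purely as an assumption, is that a_i and b_i are taken from the extremes of a short calibration session in which the user holds finger i straight and then fully bent.

```python
# Hedged sketch of Step 2. Equation (7) is not recoverable, so this
# assumes a_i and b_i are read off the extremes of a short calibration
# session (finger held straight, then fully bent).
def calibrate(straight_samples, bent_samples):
    """Return (a, b) for one finger from two lists of raw 0-255 signal
    readings taken while the finger is held straight and fully bent."""
    a = min(straight_samples)   # least bent: weakest 'straight' signal
    b = max(bent_samples)       # least straight: strongest 'bent' signal
    return a, b
```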

Step 3: Creation of training data: The user is instructed to make the static finger gestures designating the numbers 0 through 9, 100 times each, so as to cover the widest variation in finger signal detection. In the end, each user produces 10 × 100 = 1000 records.

Step 4: Feature selection: The features that work best for the specific user are determined by multiple regression analysis applied to the above training data.

Step 5: Model training: The training data from Step 3 corresponding to the selected features are singled out as the training dataset and fed into the neural network. Through back-propagation learning, coupled with stratified tenfold cross-validation, the neural network learns the recognition model and is ready for use.

B. Model Application.

Step 1: Recognition of finger sentences: The user puts on the glove, makes a gesture that expresses his intention, and the system recognizes the corresponding finger language sentence.

Step 2: Speech production: The intended finger language sentence is sent to the text-to-speech module for speech production.

SUMMARY AND DISCUSSION

This article describes a user-dependent static finger language recognition system. It incorporates a data glove that captures the signals of 10 predefined finger language sentences expressed as hand gestures by a specific handicapped aphasiac. The system then converts the signals into the proper meanings, to be spoken by a commercial TTS system. The system is customizable: the photo devices attached to the data glove can be easily adjusted, making the glove programmable. The system employs a function whose shape is easily adjusted for each finger to calculate the bending degrees of the fingers. Finally, the system automatically extracts proper finger features by multiple regression analysis so that the recognition model can be correctly constructed by a neural network.

Table 3 shows the customized finger features for 10 users, and Table 5 shows the results of stratified tenfold cross-validation for those users: 100% correctness, which demonstrates that the customization works well. As a matter of fact, customization is required because we find that each handicapped aphasiac has unique finger positions, sizes, and shapes, as well as unique cross-finger influence while bending fingers; this poses great difficulty for developing a general finger recognition system. In general, the bending degree feature (F_i) works for most patients, as revealed in Table 3 (Users 2, 4, 6, 7, 8). Some patients, however, did need more features to obtain a better result. Taking a closer look at those users, we discovered that most of them have rather peculiar finger shapes or movements. This further justifies our decision to develop a user-dependent finger language recognition system.

TABLE 5 Results of Stratified Tenfold Cross-Validation

In the future, we plan to make the customization process even faster by applying slight adjustments to a "pseudo-general" finger language recognition system, supported by a database that accommodates the finger gestures of most handicapped aphasiacs. We also plan to extend our research to the recognition of dynamic hand gestures, which would broaden the practical applications of the system.

DECLARATION OF INTEREST

The authors certify that the above organizations hold no interests which might give rise to a conflict of interest or the perception of a conflict of interest in regard to the publication of this work.

The authors gratefully acknowledge the financial support provided by the National Science Council, Taiwan, ROC under grants NSC 93-2622-E-163-001-CC3 and NSC 94-2213-E-163-003. We would also like to thank the generous assistance of Dr. Lih-Horng Shyu, Department of Electro-Optics Engineering, National Formosa University with the construction of the data glove.

REFERENCES

  • Bessonny Gilden, D., and B. Smallridge. 1993. Touching reality: A robotic fingerspelling hand for deaf-blind persons. In Proceedings of the Virtual Reality Conference. http://www.csun.edu/cod/conf/1993/proceedings/TR~1.htm (accessed 11 September 2004).
  • Fu, Y.-F., and C.-S. Ho. 2007. Static finger language recognition for handicapped aphasiacs. In Proceedings of the Second International Conference on Innovative Computing, Information and Control, Kumamoto, Japan.
  • Fu, Y.-F., and C.-S. Ho. 2008. Development of a programmable digital glove. Journal of Smart Materials and Structures 17(6): 1–8.
  • Jiangqin, W., G. Wen, P. Bo, and H. Jingping. 2001. A fast sign word recognition technique for Chinese sign language. Journal of Communication of Hi-Tech 11(6): 23–27 (in Chinese).
  • Meade, A. D. 1987. A finger-spelling hand for the deaf-blind. In Proceedings of the IEEE International Conference on Robotics and Automation, March, vol. 4, 1192–1195.
  • Wei, Z., Y. Kui, Z. Ai-yun, and Z. Hai-bo. 2003. A recognition system of single-hand words in CSL. Journal of System Simulation 15: 290–293.
  • Yaxin, Z., Y. Kui, D. Qingxiu, and Z. Wet. 2001. A classification method for Chinese sign language recognition. Journal of the University of Science and Technology Beijing 23(3): 284–286.
