Arabic Dynamic Gesture Recognition Using Classifier Fusion

Sign language is a visual language and the primary means by which hearing-impaired people communicate with each other and with their communities. Several studies have addressed Arabic sign language (ArSL) recognition, but a practically deployable real-time system remains a challenge. The main objective of this paper is to develop a novel model that recognizes ArSL using Microsoft’s Kinect V2. The paper focuses on dynamic gestures performed with both hands and other body parts, and introduces an effective way of capturing and detecting the hand and skeleton joints from the depth image provided by the Kinect. The model uses two supervised machine learning algorithms, support vector machine (SVM) and K-nearest neighbors (KNN), and then applies Dezert–Smarandache theory (DSmT) as a fusion technique to combine their results. We compared the proposed model with the Ada-Boosting technique, and also applied two of the most widely used methods for dynamic gesture recognition, dynamic time warping (DTW) and hidden Markov model (HMM), to compare their results with those of the classifier fusion. Finally, we applied the model to an ArSL dataset of 40 Arabic medical signs, intended to ease communication between hearing-impaired patients and their doctors. Classifier fusion improves accuracy over either classifier used separately: the overall accuracies of SVM, KNN, DSmT fusion, and Ada-Boosting are 79%, 89%, 91.5%, and 90.2%, respectively. DTW and HMM achieved overall accuracies of 82.6% and 79.5%, respectively.
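To make the fusion step concrete, the following is a minimal sketch of how two classifiers' per-class scores could be combined under DSmT. It assumes the PCR5 proportional conflict redistribution rule and Bayesian masses over mutually exclusive sign classes; the abstract does not state which DSmT combination rule the paper uses, so this may differ from the authors' method. The function name `pcr5_fuse`, the sign labels, and the score values are hypothetical and chosen only for illustration.

```python
def pcr5_fuse(m1, m2):
    """Fuse two mass assignments over the same mutually exclusive classes.

    m1 and m2 map each class label to a mass (for example, a classifier's
    normalized class probability). Each must sum to 1.
    Assumption: PCR5 rule for two sources; not confirmed by the paper.
    """
    classes = list(m1)
    fused = {}
    for x in classes:
        # Conjunctive agreement: both sources support class x.
        mass = m1[x] * m2[x]
        for y in classes:
            if y == x:
                continue
            # Conflict m1(x) * m2(y): x receives a share proportional to m1(x).
            if m1[x] + m2[y] > 0:
                mass += m1[x] ** 2 * m2[y] / (m1[x] + m2[y])
            # Conflict m2(x) * m1(y): x receives a share proportional to m2(x).
            if m2[x] + m1[y] > 0:
                mass += m2[x] ** 2 * m1[y] / (m2[x] + m1[y])
        fused[x] = mass
    return fused


# Hypothetical per-class scores from an SVM and a KNN for one gesture.
svm_scores = {"fever": 0.6, "headache": 0.3, "cough": 0.1}
knn_scores = {"fever": 0.2, "headache": 0.7, "cough": 0.1}

fused = pcr5_fuse(svm_scores, knn_scores)
predicted = max(fused, key=fused.get)  # class with the largest fused mass
print(fused, predicted)
```

Unlike simple averaging, this rule returns each conflicting product of masses only to the two classes involved in the conflict, in proportion to the mass each source gave them, so the fused masses still sum to 1.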
