Turkish Journal of Electrical Engineering and Computer Sciences

DOI

10.55730/1300-0632.3952

Abstract

Sign language is a language produced by gestures of body parts and facial expressions. The aim of an automatic sign language recognition system is to assign a meaning to each sign gesture. Recently, several computer vision systems have been proposed for sign language recognition using a variety of recognition techniques, sign languages, and gesture modalities. However, one of the challenging problems involves the preprocessing, segmentation, extraction, and tracking of the relevant static and dynamic features of manual and nonmanual gestures across image sequences. In this paper, we study the efficiency, scalability, and computation time of three cascaded architectures combining a convolutional neural network (CNN) and long short-term memory (LSTM) for the recognition of dynamic sign language gestures. The spatial features of dynamic signs are captured by a CNN and fed into a multilayer stacked LSTM that learns the temporal information. To track motion across video frames, the absolute temporal differences between consecutive frames are computed and fed into the recognition system. Several experiments were conducted on three benchmark datasets covering two sign languages to evaluate the proposed models, which were also compared with other techniques. The results show that our models capture spatio-temporal features well suited to the recognition of various sign language gestures and consistently outperform the other techniques, with over 99% accuracy.
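The abstract outlines two mechanisms: absolute frame differencing to expose motion, and a per-frame CNN whose features feed a multilayer stacked LSTM. A minimal sketch of such a pipeline, assuming TensorFlow/Keras and hypothetical dimensions (30 frames of 64x64 grayscale video, 40 gesture classes; the paper's exact layer configuration is not given here), might look like:

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_FRAMES, H, W, NUM_CLASSES = 30, 64, 64, 40  # hypothetical values

def absolute_frame_differences(frames: np.ndarray) -> np.ndarray:
    # Absolute temporal differences between consecutive frames:
    # a (T, H, W) uint8 video becomes (T-1, H, W) motion images.
    return np.abs(np.diff(frames.astype(np.int16), axis=0)).astype(np.uint8)

model = models.Sequential([
    # The same small CNN is applied to every frame to extract spatial features.
    layers.TimeDistributed(layers.Conv2D(32, 3, activation="relu"),
                           input_shape=(NUM_FRAMES - 1, H, W, 1)),
    layers.TimeDistributed(layers.MaxPooling2D(2)),
    layers.TimeDistributed(layers.Conv2D(64, 3, activation="relu")),
    layers.TimeDistributed(layers.MaxPooling2D(2)),
    layers.TimeDistributed(layers.Flatten()),
    # A multilayer stacked LSTM learns temporal dependencies across frames.
    layers.LSTM(128, return_sequences=True),
    layers.LSTM(128),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

In this sketch, each raw video of NUM_FRAMES frames would first be converted to NUM_FRAMES - 1 difference images, which is why the network's time dimension is one frame shorter than the input clip.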

Keywords

Sign language recognition, gesture recognition, sign language translation, action recognition, Arabic sign language recognition, CNN-LSTM

First Page

2508

Last Page

2525
