Turkish Journal of Electrical Engineering and Computer Sciences
DOI
10.55730/1300-0632.4078
Abstract
Learning a representation that is robust and invariant to various unwanted factors is essential in sign language recognition (SLR) applications. One factor that can degrade sign recognition performance is the lack of signer diversity in training datasets, which causes representation learning to depend on the signer’s identity. Consequently, capturing signer-specific features hinders the generalizability of SLR systems. This study proposes a feature disentanglement framework comprising a convolutional neural network (CNN) and a long short-term memory (LSTM) network, trained adversarially, to learn a signer-independent sign language representation that might enhance the recognition of signs. We aim to improve the feature representations by incorporating various regularization techniques to facilitate feature disentanglement. In particular, the Kullback-Leibler divergence between the uniform distribution and the output of a signer classifier is employed to reduce the effect of signer identity on spatial embeddings. Similarly, the optimal transport (OT) distance and mean squared error are investigated to minimize the disparity between the spatial and temporal representations of the same signs performed by different signers. The proposed framework is evaluated on two Turkish isolated sign language datasets with varying characteristics and challenges. The qualitative results show that the proposed feature disentanglement framework helps reduce the influence of the signer’s identity on the sign representations. According to the quantitative analyses, the best classification accuracies of 94% and 89% are obtained on two Turkish sign language benchmark datasets, BosphorusSign22k and Ankara University Turkish Sign Language (AUTSL), respectively.
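The Kullback-Leibler regularizer mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the example logits, and the direction of the divergence, taken here as KL(U || p) with p the softmax output of a signer classifier, are all assumptions made for illustration. The term is zero when the classifier cannot tell the signers apart and grows as it becomes confident about one signer.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of signer-classifier logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_uniform_to_prediction(logits):
    """KL(U || p), with U uniform over signers and p = softmax(logits).

    Assumption: the abstract does not state the direction of the
    divergence; KL(U || p) is chosen here only for illustration.
    """
    probs = softmax(logits)
    uniform = 1.0 / len(probs)
    return sum(uniform * math.log(uniform / p) for p in probs)


# Hypothetical logits for four signers.
confused = kl_uniform_to_prediction([0.0, 0.0, 0.0, 0.0])   # classifier is undecided
confident = kl_uniform_to_prediction([8.0, 0.0, 0.0, 0.0])  # classifier picks signer 0

print(confused, confident)
```

Under these assumptions, minimizing this term with respect to the feature extractor would push the spatial embeddings toward carrying no signer-identifying information, which is the effect the abstract describes.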
Keywords
Sign language recognition, adversarial learning, disentangled representation learning, deep neural networks
First Page
420
Last Page
435
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Recommended Citation
BAYTAŞ, İnci Meliha and ERDOĞAN, İpek (2024) "Signer-independent sign language recognition with feature disentanglement," Turkish Journal of Electrical Engineering and Computer Sciences: Vol. 32: No. 3, Article 5.
https://doi.org/10.55730/1300-0632.4078
Available at:
https://journals.tubitak.gov.tr/elektrik/vol32/iss3/5
Included in
Computer Engineering Commons, Computer Sciences Commons, Electrical and Computer Engineering Commons