DOI

10.3906/elk-1408-77

Abstract

This paper explores and evaluates a diverse set of features that influence word-sense disambiguation (WSD) performance. WSD is one of the most crucial steps in natural language processing (NLP) and has the potential to improve many NLP tasks. It is known that exploiting effective features and removing redundant ones helps improve the results. Two groups of features are used to disambiguate senses, that is, to select the most appropriate sense from a set of candidates: collocational features and bag-of-words (BoW) features. We examine the effects of using these two feature sets on the Turkish Lexical Sample Dataset (TLSD), which comprises the most ambiguous verb and noun samples. In addition, the two feature groups were combined in a joint setting to measure any further improvement. Our results suggest that the joint setting of features improves accuracy by up to 7%. The effective window size around the ambiguous words was determined for the noun and verb sets. Additionally, the suggested feature set was evaluated on a different corpus that was used in previous studies on Turkish WSD. Experiments on diverse morphological groups show that the word root and the case marker are significant features for disambiguating senses.

Keywords

Bag-of-words features, collocational features, feature selection, supervised methods, word-sense disambiguation

First Page

4391

Last Page

4405

Recommended Citation

İLGEN, BAHAR; ADALI, EŞREF; and TANTUĞ, AHMET CÜNEYD (2016) "Exploring feature sets for Turkish word sense disambiguation," Turkish Journal of Electrical Engineering and Computer Sciences: Vol. 24: No. 5, Article 79. https://doi.org/10.3906/elk-1408-77
Available at: https://journals.tubitak.gov.tr/elektrik/vol24/iss5/79

Included in

Computer Engineering Commons, Computer Sciences Commons, Electrical and Computer Engineering Commons

Turkish Journal of Electrical Engineering and Computer Sciences

Exploring feature sets for Turkish word sense disambiguation
