Turkish Journal of Electrical Engineering and Computer Sciences
DOI
10.3906/elk-1305-112
Abstract
Over the last couple of decades, web classification has gradually transitioned from a syntax-centered to a semantics-centered approach that classifies text based on domain ontologies. These ontologies are either built manually or populated automatically using machine learning techniques. A prerequisite for building such systems is the availability of an ontology, which may be either a full-fledged domain ontology or a seed ontology that can be enriched automatically. Any semantics-based text classification system depends on this condition. We present a proof of concept of a web classification system that is self-governed in terms of ontology population and does not require any prebuilt ontology, either full-fledged or seed. It starts from a user query, builds a seed ontology from it, and automatically enriches that ontology by extracting concepts solely from the downloaded documents. The evaluated measures, namely precision (85%), accuracy (86%), AUC (convex), and MCC (high positive), indicate that the proposed system performs better than similar automated text classification systems.
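The evaluation measures named in the abstract can be computed from a binary confusion matrix. The sketch below uses the standard definitions of precision, accuracy, and the Matthews correlation coefficient (MCC). The function name `classification_metrics` and the counts passed to it are hypothetical and are not taken from the paper's experiments.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return precision, accuracy, and MCC for binary confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives, and false negatives. Undefined ratios (zero
    denominators) are reported as 0.0, which is a convention assumed here.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"precision": precision, "accuracy": accuracy, "mcc": mcc}


# Illustrative counts only (assumed for this sketch, not the paper's data).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

MCC ranges from -1 to +1, so a "high positive" value means the predictions agree strongly with the true labels, even when the classes are imbalanced.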
Keywords
Ontology, support vector machine, resource description framework, text classification
First Page
1393
Last Page
1404
Recommended Citation
MANUJA, MANOJ and GARG, DEEPAK (2015) "Intelligent text classification system based on self-administered ontology," Turkish Journal of Electrical Engineering and Computer Sciences: Vol. 23: No. 5, Article 15.
https://doi.org/10.3906/elk-1305-112
Available at:
https://journals.tubitak.gov.tr/elektrik/vol23/iss5/15
Included in
Computer Engineering Commons, Computer Sciences Commons, Electrical and Computer Engineering Commons