Turkish Journal of Electrical Engineering and Computer Sciences




Over the last couple of decades, web classification has gradually shifted from a syntax-centered to a semantics-centered approach that classifies text based on domain ontologies. These ontologies are either built manually or populated automatically using machine learning techniques. A prerequisite for building such systems is the availability of an ontology, which may be either a full-fledged domain ontology or a seed ontology that can be enriched automatically. Every semantics-based text classification system therefore depends on such an ontology. We present a proof of concept of a web classification system that is self-governed in terms of ontology population and requires no prebuilt ontology, whether full-fledged or seed. It starts from a user query, builds a seed ontology from it, and automatically enriches that ontology by extracting concepts solely from the downloaded documents. The evaluated parameters, namely precision (85%), accuracy (86%), AUC (convex), and MCC (high positive), indicate that the proposed system performs better than similar automated text classification systems.
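For readers unfamiliar with the reported measures, the following minimal sketch shows how precision, accuracy, and the Matthews correlation coefficient (MCC) are conventionally computed from a binary confusion matrix. The function name and the counts in the usage line are hypothetical and chosen only for illustration; they are not the data or the code behind the figures reported above.

```python
from math import sqrt


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives,
    true negatives, false negatives.
    """
    total = tp + fp + tn + fn
    # Precision: share of documents labeled relevant that really are relevant.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Accuracy: share of all documents that were labeled correctly.
    accuracy = (tp + tn) / total if total else 0.0
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "accuracy": accuracy, "mcc": mcc}


# Hypothetical counts, for illustration only (not the paper's data).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

An MCC close to +1 indicates strong agreement between predicted and true labels, which is the sense in which a "high positive" MCC is read as favorable.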


Ontology, support vector machine, resource description framework, text classification
