Turkish Journal of Electrical Engineering and Computer Sciences

Author ORCID Identifier

RIDHO KUSUMA: 0000-0003-3992-0018

ERUM ASHRAF: 0000-0003-1933-0336

SELVAKUMAR MANICKAM: 0000-0003-4378-1954

SHANKAR KARUPPAYAH: 0000-0003-4801-6370

Abstract

This work presents SENTISEC, a hybrid LLM-based threat detection framework designed to classify security logs by integrating keyword heuristics, domain-adapted sentiment scoring, and Retrieval-Augmented Generation (RAG). The system achieves an overall accuracy of 93.67%, with 91.46% macro recall, 89.07% macro F1, and 95.15% threat recall, while maintaining a low false-positive rate of 1.68%. Its methodology incorporates strict keyword and indicator-of-compromise (IOC) matching, a domain-tuned DistilBERT sentiment module, hybrid BM25–MiniLM retrieval enhanced with BGE reranking, adaptive quantile-based threshold calibration, and SHAP-based explainability. Comparative evaluations against keyword-only, sentiment-only, classical machine-learning, and DistilBERT-only baselines show that SENTISEC consistently improves both true-positive and true-negative discrimination, particularly in semantically noisy and imbalanced log environments. The primary error pattern, benign logs incorrectly flagged as threats, is reduced through calibrated FP/TN thresholds, contextual co-occurrence filtering, and weighted fusion of rule-based, sentiment-based, and semantic signals. SHAP analysis further indicates that RAG-derived semantic embeddings exert the strongest influence on final threat decisions, complemented by sentiment logits that refine ambiguous cues and rule-based indicators that anchor explicit IOC matches. By integrating symbolic evidence and deep semantic reasoning within a transparent calibration and explanation framework, SENTISEC provides an adaptive, interpretable, and operationally practical solution for analyzing system logs in spyware-oriented threat scenarios.
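The abstract names weighted fusion of rule-based, sentiment-based, and semantic signals together with quantile-based threshold calibration, but gives no formulas. The following is a minimal illustrative sketch of how such a step could look, not the authors' implementation. The function names (`fuse`, `quantile_threshold`, `classify`), the weight values, and the assumption that each signal is a score in [0, 1] are all hypothetical.

```python
from typing import Sequence


def fuse(rule: float, sentiment: float, semantic: float,
         weights: Sequence[float] = (0.3, 0.2, 0.5)) -> float:
    """Weighted average of three per-log scores, each assumed to lie in [0, 1].

    The default weights are placeholders chosen for illustration only;
    the abstract does not report the actual values.
    """
    w_rule, w_sent, w_sem = weights
    total = w_rule + w_sent + w_sem
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (w_rule * rule + w_sent * sentiment + w_sem * semantic) / total


def quantile_threshold(benign_scores: Sequence[float], target_fpr: float) -> float:
    """Choose a decision threshold from fused scores of known-benign logs.

    The threshold is an upper quantile of the benign validation scores, so
    that at most a `target_fpr` fraction of them falls strictly above it.
    """
    if not benign_scores:
        raise ValueError("benign_scores must not be empty")
    if not 0.0 <= target_fpr < 1.0:
        raise ValueError("target_fpr must be in [0, 1)")
    ordered = sorted(benign_scores)
    n = len(ordered)
    allowed = int(target_fpr * n)  # benign logs tolerated above the threshold
    return ordered[n - allowed - 1]


def classify(score: float, threshold: float) -> str:
    """Flag a log as a threat only if its fused score exceeds the threshold."""
    return "threat" if score > threshold else "benign"
```

In this sketch the threshold would be recomputed whenever a fresh batch of benign validation logs is scored, which is one simple way to make the calibration adaptive to changing log distributions.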

DOI

10.55730/1300-0632.4169

Keywords

Sentiment analysis, explainable AI (XAI), cyber threat detection, keyword heuristics, hybrid LLM, system log classification

First Page

164

Last Page

184

Publisher

The Scientific and Technological Research Council of Türkiye (TÜBİTAK)

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
