Turkish Journal of Electrical Engineering and Computer Sciences
Author ORCID Identifier
ŞERAFETTİN ŞENTÜRK: 0000-0001-8330-6774
VAHİD GAROUSI: 0000-0001-6590-7576
NEJAT YUMUŞAK: 0000-0001-5005-8604
Abstract
Fuzzing is an automated process for detecting crashes and vulnerabilities in software systems, and it is classified as grammar-based or mutation-based according to how inputs are generated. Grammar-based fuzzing generates highly structured inputs from a specification, whereas mutation-based fuzzing generates inputs by randomly modifying input files and abstract syntax trees. Few case studies compare crash detection capabilities within the scope of mutation-based fuzzing. To add to the body of empirical evidence in this area, this case study compares fuzzing with different mutation strategies to evaluate their effectiveness in three aspects: fault detection effectiveness, fault detection performance, and types of faults detected. Additionally, we evaluate the effects of seed generation techniques on fuzzing effectiveness. We perform fuzzing on three XML parsers: libxml2, Apache Xerces, and Expat. To perform fuzzing, we use well-known mutation-based fuzzers that apply mutation strategies at different levels. We investigate the effects of seed generation on fuzzing by using publicly available seeds and seeds based on a PCSG (Probabilistic Context Sensitive Grammar). In terms of fault detection effectiveness and performance, we measure the number of crashes and the number of test cases generated. With respect to mutation strategies, our results demonstrate that the bit/byte-level mutation strategy detects more crashes than the tree-level mutation strategy. According to the fuzzing results, PCSG-based seeds can help detect a higher number of crashes than publicly available ones. In terms of generated test cases, fewer test cases are generated for PCSG-based seeds than for publicly available ones, while bit/byte-level mutations result in more test cases than tree-level mutations. Empirical results show that the crash detection capabilities of fuzzing differ substantially depending on the mutation strategy used.
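To make the distinction between the two mutation levels concrete, the following is a minimal sketch in Python. It is not the implementation used by the fuzzers in the study; the function names `mutate_bytes` and `mutate_tree`, the single-bit flip, and the delete-or-duplicate subtree operation are all assumptions chosen for illustration.

```python
import copy
import random
import xml.etree.ElementTree as ET


def mutate_bytes(data: bytes, rng: random.Random) -> bytes:
    """Bit/byte-level mutation (illustrative): flip one randomly chosen bit."""
    if not data:
        return data
    buf = bytearray(data)
    pos = rng.randrange(len(buf))       # which byte to touch
    buf[pos] ^= 1 << rng.randrange(8)   # which bit inside that byte
    return bytes(buf)


def mutate_tree(data: bytes, rng: random.Random) -> bytes:
    """Tree-level mutation (illustrative): delete or duplicate one subtree."""
    root = ET.fromstring(data)
    # Collect (parent, child) pairs so a subtree can be edited through its parent.
    pairs = [(parent, child) for parent in root.iter() for child in parent]
    if not pairs:
        return data                     # a lone root element has no subtree to edit
    parent, child = rng.choice(pairs)
    if rng.random() < 0.5:
        parent.remove(child)            # drop the subtree
    else:
        parent.append(copy.deepcopy(child))  # append a copy of the subtree
    return ET.tostring(root)


if __name__ == "__main__":
    seed = b"<a><b>1</b><c/></a>"       # hypothetical seed input
    rng = random.Random(0)
    print(mutate_bytes(seed, rng))
    print(mutate_tree(seed, rng))
```

In this sketch the byte-level mutant is free to break XML well-formedness, since it ignores the document structure, while the tree-level mutant is re-serialized from a parsed tree and therefore stays well-formed.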
DOI
10.55730/1300-0632.4116
Keywords
case study, Grammar-based fuzzing, intelligent fuzz testing, mutation-based fuzzing, smart seed generation
First Page
86
Last Page
105
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Recommended Citation
ŞENTÜRK, ŞERAFETTİN; GAROUSI, VAHİD; and YUMUŞAK, NEJAT (2025) "A case study of gray-box fuzzing with byte- and tree-level mutation strategies in XML-based applications for exposing security vulnerabilities," Turkish Journal of Electrical Engineering and Computer Sciences: Vol. 33: No. 2, Article 2.
https://doi.org/10.55730/1300-0632.4116
Available at:
https://journals.tubitak.gov.tr/elektrik/vol33/iss2/2
Included in
Computer Engineering Commons, Computer Sciences Commons, Electrical and Computer Engineering Commons