Over the last few years, exploit kits (EKs) have become the de facto medium for the large-scale spread of malware. Drive-by download is the leading method used by EK flavors to exploit web-based client-side vulnerabilities, with the principal goal of infecting the victim's system with malware. EK families also evolve quickly, incorporating zero-day exploits for previously unseen vulnerabilities for which no patch exists. In this paper, we propose a novel approach for categorizing malware infection incidents conducted through EKs by leveraging the inherent "overall URL patterns" in the HTTP traffic chain. The approach is based on the key finding that EKs infect victim systems using a specially designed chain, in which the EK leads the web browser to download a malicious payload by issuing several HTTP requests to more than one malicious domain address. This behavior enables the development of a system capable of clustering the responsible EK instances. The method has been evaluated on a popular, publicly available dataset that contains 240 real-world infection cases involving over 2250 URLs, linked to the 4 major EK flavors observed throughout 2016. The system achieves up to 93.7% clustering accuracy with the estimators tested.
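The idea of clustering infection chains by their overall URL patterns can be illustrated with a minimal sketch. The abstract does not name the features or estimators used, so everything below is an assumption made for illustration: the hand-picked URL shape features, the averaging over a chain, the plain k-means routine with unscaled features, and the synthetic `.example` URLs are all hypothetical and are not taken from the paper or its dataset.

```python
import math
from urllib.parse import urlsplit, parse_qsl


def url_features(url):
    """Map one URL to a small numeric vector describing its shape.

    The four features are hypothetical choices, not the paper's feature set.
    """
    parts = urlsplit(url)
    path_segments = [s for s in parts.path.split("/") if s]
    query_params = parse_qsl(parts.query, keep_blank_values=True)
    text = parts.path + parts.query
    digits = sum(c.isdigit() for c in text)
    return [
        float(len(path_segments)),             # path depth
        float(len(query_params)),              # number of query parameters
        float(len(text)),                      # length of path plus query
        digits / len(text) if text else 0.0,   # share of digit characters
    ]


def chain_features(chain):
    """Summarize a whole HTTP chain: mean URL features plus distinct host count."""
    vectors = [url_features(u) for u in chain]
    means = [sum(col) / len(vectors) for col in zip(*vectors)]
    hosts = {urlsplit(u).hostname for u in chain}
    return means + [float(len(hosts))]


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic init (the first k points are the seeds)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Synthetic chains in two made-up styles: query-heavy multi-host chains
# and deep-path single-host chains.
chains = [
    ["http://landing.example/index.php?id=12345&ref=abc",
     "http://gate.example/view.php?token=98765&sid=4321",
     "http://payload.example/get.php?file=777&key=55555"],
    ["http://a.example/x/y/z/w/page",
     "http://a.example/q/r/s/t/load"],
    ["http://one.example/main.php?id=54321&ref=xyz",
     "http://two.example/show.php?token=11111&sid=2222",
     "http://three.example/dl.php?file=888&key=44444"],
    ["http://b.example/m/n/o/p/home",
     "http://b.example/e/f/g/h/item"],
]

labels = kmeans([chain_features(c) for c in chains], k=2)
print(labels)
```

In this toy input the first and third chains share one URL style and the second and fourth share another, so they should fall into two separate clusters. A realistic pipeline would at least scale the features and choose the number of clusters from the data.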
Exploit kit, web malware, drive-by download, URL analysis, unsupervised machine learning, cybercrime
"ZEKI: unsupervised zero-day exploit kit intelligence,"
Turkish Journal of Electrical Engineering and Computer Sciences: Vol. 28: No. 4, Article 4.
Available at: https://journals.tubitak.gov.tr/elektrik/vol28/iss4/4