Turkish Journal of Electrical Engineering and Computer Sciences

Author ORCID Identifier

Shan CHEN: 0009-0009-9219-1898

Na MENG: 0000-0002-1573-9786

Haoyuan LI: 0009-0003-2782-7745

Weiwei FANG: 0000-0002-6407-7467

DOI

10.55730/1300-0632.4084

Abstract

Environmental sound classification (ESC) is one of the important research topics within the non-speech audio classification field. While deep neural networks (DNNs) have recently achieved significant advances in ESC, their high computational and memory demands make them unsuitable for direct deployment on resource-constrained Internet of Things (IoT) devices based on microcontroller units (MCUs). To address this challenge, we propose a novel DNN compression framework specifically designed for such devices. On the one hand, we leverage pruning techniques to substantially reduce the large number of model parameters in DNNs. To mitigate the accuracy loss caused by pruning, we propose a knowledge distillation scheme based on feature information from multiple intermediate layers. On the other hand, we design a two-stage quantization-aware knowledge distillation scheme to alleviate the accuracy degradation introduced by the mandatory quantization required by MCU hardware. We evaluate our framework on benchmark ESC datasets (UrbanSound8K, ESC-50) using the STM32F746ZG device. The experimental results demonstrate that our framework can achieve compression rates of up to 97% while maintaining competitive inference performance compared to the uncompressed baseline.
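The abstract describes distilling knowledge from the uncompressed teacher into the pruned student using feature information from multiple intermediate layers. The sketch below is a minimal, hedged illustration of that general idea, not the authors' published code: the loss combines hard-label cross-entropy, soft-label distillation, and mean-squared matching of intermediate feature maps. The layer pairing, temperature, and loss weights are assumptions for illustration.

```python
# Illustrative sketch (not the paper's implementation): a distillation loss
# that mixes hard labels, soft teacher labels, and multi-layer feature matching.
import torch
import torch.nn.functional as F

def multi_layer_kd_loss(student_logits, teacher_logits,
                        student_feats, teacher_feats,
                        labels, temperature=4.0, alpha=0.5, beta=0.1):
    """Cross-entropy + soft-label KD + intermediate-feature matching (assumed weights)."""
    # Hard-label supervision on the pruned student.
    ce = F.cross_entropy(student_logits, labels)

    # Soft-label distillation from the uncompressed teacher.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    # Feature matching over several intermediate layers; assumes student and
    # teacher feature maps have already been projected to matching shapes.
    feat = sum(F.mse_loss(s, t) for s, t in zip(student_feats, teacher_feats))
    feat = feat / max(len(student_feats), 1)

    return (1 - alpha) * ce + alpha * kd + beta * feat
```

In the two-stage quantization-aware variant described in the abstract, a loss of this kind could be reapplied while the student is trained with simulated (fake) quantization before export to the MCU; the exact staging used in the paper is detailed in the full text.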

Keywords

deep neural networks, environmental sound classification, knowledge distillation, microcontroller units

First Page

501

Last Page

515

Creative Commons License

Creative Commons Attribution 4.0 International License
This work is licensed under a Creative Commons Attribution 4.0 International License.
