XAI-based Comparison of Audio Event Classifiers with different Input Representations

20th International Conference on Content-Based Multimedia Indexing (CBMI 2023)

Abstract
Deep neural networks are a promising tool for Audio Event Classification. In contrast to other data such as natural images, there are many sensible and non-obvious representations for audio data that could serve as input to these models. Due to their black-box nature, the effect of different input representations has so far mostly been investigated by measuring classification performance. In this work, we leverage eXplainable AI (XAI) to understand the underlying classification strategies of models trained on different input representations. Specifically, we compare two model architectures with regard to the input features relevant for Audio Event Detection: one directly processes the signal as the raw waveform, and the other takes its time-frequency spectrogram representation as input. We show how relevance heatmaps obtained via Layer-wise Relevance Propagation uncover representation-dependent decision strategies. With these insights, we can make a well-informed decision about the best model and input representation in terms of robustness and representativity. Further, we can test whether the model's classification strategies align with human requirements.
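Below is a minimal, hypothetical sketch of the setup the abstract describes. The paper does not name a specific toolkit, so torchaudio is assumed for deriving the two input representations and Captum's LRP implementation is used as a stand-in for Layer-wise Relevance Propagation; the sample rate, spectrogram parameters, and classifier names are illustrative assumptions, not the authors' actual configuration.

```python
import torch
import torchaudio
from captum.attr import LRP

SAMPLE_RATE = 16_000  # assumed sampling rate; not specified in the abstract

# Raw-waveform input: a (batch, channels, samples) tensor fed directly to a 1D model.
waveform = torch.randn(1, 1, SAMPLE_RATE)  # placeholder 1-second clip

# Time-frequency input: log-mel spectrogram of the same clip for a 2D model.
to_mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=SAMPLE_RATE, n_fft=1024, hop_length=160, n_mels=64
)
spectrogram = torch.log(to_mel(waveform) + 1e-6)  # shape (batch, 1, n_mels, frames)


def relevance_heatmap(model, inputs, target_class):
    """Propagate relevance for `target_class` back onto the input via LRP;
    the returned heatmap has the same shape as `inputs`."""
    model.eval()
    return LRP(model).attribute(inputs, target=target_class)


# Usage with two hypothetical classifiers, one per input representation:
# heat_wave = relevance_heatmap(waveform_model, waveform, target_class=3)
# heat_spec = relevance_heatmap(spectrogram_model, spectrogram, target_class=3)
```

Comparing the waveform heatmap (relevance per time-domain sample) with the spectrogram heatmap (relevance per time-frequency bin) is what makes the representation-dependent decision strategies discussed in the abstract directly inspectable.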