Exploring the Interplay of Interpretability and Robustness in Deep Neural Networks: A Saliency-guided Approach

Abstract

Adversarial attacks pose a significant challenge to deploying deep learning models in safety-critical applications. Maintaining model robustness while ensuring interpretability is vital for fostering trust in, and comprehension of, these models. This study investigates how Saliency-guided Training (SGT), a technique aimed at improving the clarity of saliency maps to deepen understanding of the model's decision-making process, affects model robustness. Experiments were conducted on standard benchmark datasets using various deep learning architectures trained with and without SGT. The findings demonstrate that SGT enhances both model robustness and interpretability. Additionally, we propose a novel approach that combines SGT with standard adversarial training to achieve even greater robustness while preserving saliency map quality. Our strategy is grounded in the assumption that preserving the salient features crucial for correctly classifying adversarial examples enhances model robustness, while masking non-relevant features improves interpretability. Our technique yields significant gains, improving robustness against the PGD attack by 35% on MNIST (noise magnitude 0.2) and by 20% on CIFAR-10 (noise magnitude 0.02), while producing high-quality saliency maps.
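
The two ingredients named in the abstract, an L-infinity PGD attack and the masking of low-saliency input features, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the logistic-regression model, the function names (`loss_and_input_grad`, `pgd_attack`, `mask_least_salient`), the use of the absolute input gradient as the saliency score, and the parameters `step_size`, `num_steps`, `k`, and `fill_value` are all assumptions chosen for illustration. Here `epsilon` plays the role of the noise magnitude.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss_and_input_grad(w, b, x, y):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. the input x.

    The logistic model is a stand-in (assumption); the paper uses deep networks.
    """
    p = sigmoid(x @ w + b)
    tiny = 1e-12  # avoids log(0)
    loss = -(y * np.log(p + tiny) + (1 - y) * np.log(1 - p + tiny))
    grad_x = (p - y) * w  # closed-form d(loss)/dx for logistic regression
    return loss, grad_x


def pgd_attack(w, b, x, y, epsilon, step_size, num_steps):
    """L-infinity PGD: signed-gradient ascent on the loss, projected after each
    step back into the epsilon-ball around x and into the valid range [0, 1]."""
    x_adv = x.copy()
    for _ in range(num_steps):
        _, grad = loss_and_input_grad(w, b, x_adv, y)
        x_adv = x_adv + step_size * np.sign(grad)
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv


def mask_least_salient(x, grad_x, k, fill_value=0.0):
    """Replace the k features with the smallest |gradient| by fill_value.

    Using |gradient| as the saliency score is an assumption of this sketch.
    """
    idx = np.argsort(np.abs(grad_x))[:k]
    x_masked = x.copy()
    x_masked[idx] = fill_value
    return x_masked


# Toy example with hypothetical weights and a 4-feature input.
w = np.array([2.0, -1.0, 0.1, 0.0])
b = 0.0
x = np.array([0.5, 0.5, 0.5, 0.5])
y = 1.0

clean_loss, clean_grad = loss_and_input_grad(w, b, x, y)
x_adv = pgd_attack(w, b, x, y, epsilon=0.2, step_size=0.05, num_steps=10)
adv_loss, adv_grad = loss_and_input_grad(w, b, x_adv, y)
x_adv_masked = mask_least_salient(x_adv, adv_grad, k=2)
```

In a combined training loop of the kind the abstract describes, a model would presumably be trained on adversarial inputs such as `x_adv` while being encouraged to give similar outputs for the masked version `x_adv_masked`; the exact loss used by the authors is not stated in this abstract.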

Author and article information

Publication date (created): 10 May 2024

Article

arXiv ID: 2405.06278

SO-VID: f4614944-8c9e-4d79-a46b-8b2372db24c1

License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Custom metadata

Categories: cs.CV, cs.CR

ScienceOpen disciplines: Computer vision & Pattern recognition, Security & Cryptology
