On the explainable properties of 1-Lipschitz Neural Networks: An Optimal Transport Perspective. (arXiv:2206.06854v2 [cs.AI] UPDATED)
By: Mathieu Serrurier (IRIT, UT), Franck Mamalet (UT), Thomas Fel (UT), Louis Béthune (UT3, UT, IRIT), Thibaut Boissin (UT). Posted: June 23, 2023
Input gradients play a pivotal role in a variety of applications, including
adversarial attack algorithms for evaluating model robustness, explainable AI
techniques for generating Saliency Maps, and counterfactual explanations.
However, Saliency Maps generated by traditional neural networks are often noisy
and provide limited insights. In this paper, we demonstrate that, on the
contrary, the Saliency Maps of 1-Lipschitz neural networks, learnt with the
dual loss of an optimal transportation problem, exhibit desirable XAI
properties: They are highly concentrated on the essential parts of the image
with low noise, significantly outperforming state-of-the-art explanation
approaches across various models and metrics. We also show that these maps
align with human explanations on ImageNet to an unprecedented degree. To explain
the particularly beneficial properties of the Saliency Map for such models, we
prove that this gradient encodes both the direction of the transportation plan
and the direction towards the nearest adversarial attack. Following the gradient
down to the decision boundary is therefore no longer an adversarial attack,
but rather a counterfactual explanation that explicitly transports the input
from one class to another. Thus, learning with such a loss jointly optimizes
the classification objective and the alignment of the gradient, i.e. the
Saliency Map, with the direction of the transportation plan. These networks were
previously known to be certifiably robust by design; we further demonstrate that
they scale well to large problems and models, and are well suited to
explainability via a fast and straightforward method.
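
To make the abstract's claim concrete, here is a minimal sketch (not the authors' implementation) of how the input gradient of a 1-Lipschitz classifier can serve both as a Saliency Map and as the direction of a counterfactual. It assumes a hypothetical PyTorch `model` with a scalar output whose sign gives the binary class, whose gradient norm is close to 1 almost everywhere (as for gradient-norm-preserving 1-Lipschitz networks), so that |model(x)| lower-bounds the distance to the decision boundary; the function names are illustrative only.

```python
import torch

def saliency_map(model, x):
    """Input gradient d model(x) / d x, read here as the Saliency Map."""
    x = x.clone().detach().requires_grad_(True)
    score = model(x).squeeze()        # scalar score for a single (batched) input
    score.backward()
    return x.grad.detach()

def counterfactual(model, x, max_steps=20):
    """Follow the gradient towards the decision boundary until the score flips sign."""
    x_cf = x.clone().detach()
    original_sign = torch.sign(model(x_cf).squeeze()).item()
    for _ in range(max_steps):
        score = model(x_cf).squeeze()
        if torch.sign(score).item() != original_sign:
            break                      # crossed the decision boundary
        grad = saliency_map(model, x_cf)
        # With a unit gradient norm, |score| lower-bounds the distance to the
        # boundary, so step that far along the (signed) gradient direction.
        x_cf = x_cf - score.item() * grad / (grad.norm() + 1e-12)
    return x_cf
```

In this reading, the same vector that an attacker would use to perturb the input is interpreted as a transport direction: moving along it until the score changes sign yields an input of the opposite class, i.e. the counterfactual explanation described above.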