Quantifying vulnerabilities of deep learning computing systems to adversarial perturbations
Abstract:
Mechanisms are provided for generating an adversarial perturbation attack sensitivity (APAS) visualization. The mechanisms receive a natural input dataset and a corresponding adversarial attack input dataset, where the adversarial attack input dataset comprises perturbations intended to cause a misclassification by a computer model. The mechanisms determine a sensitivity measure of the computer model to the perturbations in the adversarial attack input dataset based on a processing of the natural input dataset and corresponding adversarial attack input dataset by the computer model. The mechanisms generate a classification activation map (CAM) for the computer model based on results of the processing and a sensitivity overlay based on the sensitivity measure. The sensitivity overlay graphically represents different classifications of perturbation sensitivities. The mechanisms apply the sensitivity overlay to the CAM to generate and output a graphical visualization output of the computer model sensitivity to perturbations of adversarial attacks.
Information query
Patent Agency Ranking
0/0