Publication Type
Journal Article
Version
acceptedVersion
Publication Date
4-2021
Abstract
The ability to read, reason, and infer lies at the heart of neural reasoning architectures. After all, the ability to perform logical reasoning over language remains a coveted goal of Artificial Intelligence. To this end, models such as the Turing-complete differentiable neural computer (DNC) boast of real logical reasoning capabilities, along with the ability to reason beyond simple surface-level matching. In this brief, we propose the first probe into DNC's logical reasoning capabilities with a focus on text-based question answering (QA). More concretely, we propose a conceptually simple but effective adversarial attack based on metamorphic relations. Our proposed adversarial attack reduces DNCs' state-of-the-art accuracy from 100% to 1.5% in the worst case, exposing weaknesses and susceptibilities in modern neural reasoning architectures. We further empirically explore possibilities to defend against such attacks and demonstrate the utility of our adversarial framework as a simple scalable method to improve model adversarial robustness.
Keywords
Task analysis, Cognition, Plugs, Perturbation methods, Memory modules, Computer architecture, Computational modeling, Adversarial examples, deep learning, differentiable neural computer (DNC), supervised learning
Discipline
OS and Networks | Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
IEEE Transactions on Neural Networks and Learning Systems
First Page
1
Last Page
7
ISSN
2162-2388
Identifier
10.1109/TNNLS.2021.3072166
Publisher
Institute of Electrical and Electronics Engineers
Citation
CHAN, Alvin; MA, Lei; JUEFEI-XU, Felix; ONG, Yew-Soon; XIE, Xiaofei; XUE, Minhui; and LIU, Yang.
Breaking neural reasoning architectures with metamorphic relation-based adversarial examples. (2021). IEEE Transactions on Neural Networks and Learning Systems. 1-7.
Available at: https://ink.library.smu.edu.sg/sis_research/7050
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.