Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
1-2020
Abstract
Solving math word problems (MWPs) is challenging because the training data provide only one “standard” solution per problem. MWP models therefore often simply fit this solution rather than truly understanding or solving the problem, which limits their generalization to diverse word scenarios. To address this problem, this paper proposes a novel approach, TSN-MD, which uses a teacher network to integrate the knowledge of equivalent solution expressions and then regularize the learning behavior of the student network. In addition, we introduce a multiple-decoder student network that generates multiple candidate solution expressions, from which the final answer is voted. In experiments, we conduct extensive comparisons and ablation studies on two large-scale MWP benchmarks and show that TSN-MD surpasses state-of-the-art methods by a large margin. More intriguingly, the visualization results demonstrate that TSN-MD not only produces correct final answers but also generates diverse equivalent expressions of the solution.
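The abstract states that the final answer is voted from the candidate expressions of several decoders, but it does not specify the voting rule. The following is a minimal sketch, assuming a simple majority vote over the numeric values of the candidates; the helper names `evaluate` and `vote` are hypothetical and are not taken from the paper.

```python
import ast
import operator
from collections import Counter
from fractions import Fraction

# Binary operators allowed in a candidate expression (an assumption of this sketch).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expr):
    """Evaluate an arithmetic expression exactly; return None if it is invalid."""

    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            # Going through str() keeps decimals such as 0.1 exact.
            return Fraction(str(node.value))
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")

    try:
        return walk(ast.parse(expr, mode="eval").body)
    except (ValueError, SyntaxError, ZeroDivisionError):
        return None


def vote(candidates):
    """Return the most frequent answer among the valid candidate expressions."""
    answers = [a for a in map(evaluate, candidates) if a is not None]
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


# Hypothetical decoder outputs: two equivalent expressions, one wrong, one malformed.
candidates = ["3 * (4 + 2)", "3 * 4 + 3 * 2", "3 * 4 + 2", "3 * ("]
final_answer = vote(candidates)
```

In this sketch, equivalent expressions such as `3 * (4 + 2)` and `3 * 4 + 3 * 2` support the same answer, so diverse but equivalent decoder outputs reinforce each other, while malformed candidates are ignored.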
Discipline
Artificial Intelligence and Robotics | Databases and Information Systems | Mathematics | Numerical Analysis and Scientific Computing
Research Areas
Intelligent Systems and Optimization
Publication
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence 2020: Yokohama
First Page
4011
Last Page
4017
Identifier
10.24963/ijcai.2020/555
Publisher
IJCAI
City or Country
Menlo Park, CA
Citation
ZHANG, Jipeng; LEE, Roy Ka-Wei; LIM, Ee-peng; QIN, Wei; WANG, Lei; SHAO, Jie; and SUN, Qianru.
Teacher-student networks with multiple decoders for solving math word problem. (2020). Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence 2020: Yokohama. 4011-4017.
Available at: https://ink.library.smu.edu.sg/sis_research/5320
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.24963/ijcai.2020/555
Included in
Artificial Intelligence and Robotics Commons, Databases and Information Systems Commons, Mathematics Commons, Numerical Analysis and Scientific Computing Commons