Publication Type

Journal Article

Version

publishedVersion

Publication Date

3-2022

Abstract

Backtracking search algorithms are often used to solve the Constraint Satisfaction Problem (CSP), which is widely applied in domains such as automated planning and scheduling. The efficiency of backtracking search depends greatly on the variable ordering heuristic. Currently, the most commonly used heuristics are hand-crafted based on expert knowledge. In this paper, we propose an approach based on deep reinforcement learning to automatically discover new variable ordering heuristics that are better adapted to a given class of CSP instances, without relying on hand-crafted features and heuristics. We show that directly optimizing the search tree size is not convenient for learning, and propose instead to optimize the expected cost of reaching a leaf node in the search tree. To capture the complex relations among the variables and constraints, we design a representation scheme based on a Graph Neural Network that can process CSP instances with different sizes and constraint arities. Experimental results on random CSP instances show that on small and medium-sized instances, the learned policies outperform classical hand-crafted heuristics, producing smaller search trees (up to 10.36% reduction). Moreover, without further training, our policies generalize directly to instances that are larger and much harder to solve than those seen in training, with an even larger reduction in search tree size (up to 18.74%).
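To illustrate where a variable ordering heuristic plugs into backtracking search, here is a minimal sketch in Python. It is not the paper's method: the learned policy and the Graph Neural Network representation are not reproduced. All names (`backtrack`, `choose_var`, `min_domain`, `lexicographic`) and the toy instance are hypothetical and chosen only for illustration. The sketch assumes binary constraints and does no constraint propagation, so `min_domain` ranks variables by their initial domain sizes.

```python
from typing import Callable, Dict, List, Optional, Tuple

Domains = Dict[str, List[int]]
# A binary constraint (x, y, pred) holds when pred(value_of_x, value_of_y) is True.
Constraint = Tuple[str, str, Callable[[int, int], bool]]
Heuristic = Callable[[List[str], Domains], str]


def consistent(var: str, value: int, assignment: Dict[str, int],
               constraints: List[Constraint]) -> bool:
    """Check var=value against every constraint whose other variable is assigned."""
    for x, y, pred in constraints:
        if x == var and y in assignment and not pred(value, assignment[y]):
            return False
        if y == var and x in assignment and not pred(assignment[x], value):
            return False
    return True


def min_domain(unassigned: List[str], domains: Domains) -> str:
    """Classical hand-crafted heuristic: pick the variable with the smallest domain."""
    return min(unassigned, key=lambda v: len(domains[v]))


def lexicographic(unassigned: List[str], domains: Domains) -> str:
    """Baseline: pick variables in name order."""
    return min(unassigned)


def backtrack(domains: Domains, constraints: List[Constraint], choose_var: Heuristic,
              assignment: Optional[Dict[str, int]] = None,
              stats: Optional[Dict[str, int]] = None):
    """Depth-first backtracking search; stats["nodes"] counts visited search nodes."""
    assignment = {} if assignment is None else assignment
    stats = {"nodes": 0} if stats is None else stats
    stats["nodes"] += 1
    if len(assignment) == len(domains):
        return dict(assignment), stats
    unassigned = [v for v in domains if v not in assignment]
    var = choose_var(unassigned, domains)  # the variable ordering decision
    for value in domains[var]:
        if consistent(var, value, assignment, constraints):
            assignment[var] = value
            solution, _ = backtrack(domains, constraints, choose_var, assignment, stats)
            if solution is not None:
                return solution, stats
            del assignment[var]  # undo the assignment and try the next value
    return None, stats


# Toy instance (hypothetical): three variables that must take pairwise different values.
def not_equal(u: int, v: int) -> bool:
    return u != v


domains = {"a": [1, 2, 3], "b": [1, 2], "c": [3]}
constraints = [("a", "b", not_equal), ("a", "c", not_equal), ("b", "c", not_equal)]

solution, stats = backtrack(domains, constraints, min_domain)
baseline_solution, baseline_stats = backtrack(domains, constraints, lexicographic)
print(solution, stats["nodes"], baseline_stats["nodes"])
```

The `choose_var` argument is the only point where the heuristic enters the search, so a learned policy could in principle be swapped in at that call, and the node counter gives a simple proxy for search tree size when comparing orderings.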

Keywords

Constraint Satisfaction Problem; Variable ordering; Deep reinforcement learning; Graph Neural Network

Discipline

Artificial Intelligence and Robotics

Research Areas

Intelligent Systems and Optimization

Publication

Engineering Applications of Artificial Intelligence

Volume

109

First Page

1

Last Page

12

ISSN

0952-1976

Identifier

10.1016/j.engappai.2021.104603

Publisher

Elsevier

Additional URL

https://doi.org/10.1016/j.engappai.2021.104603
