Unveiling the Limits of Learned Local Search Heuristics: Are You the Mightiest of the Meek?

Authors:

(1) Ankur Nath, Department of Computer Science and Engineering, Texas A&M University;

(2) Alan Kuhnle, Department of Computer Science and Engineering, Texas A&M University.

Table of Links

Abstract & Introduction

Related work

Evaluation for Max-Cut

Evaluation for SAT

Summary and Outlook, References

Supplementary Materials

Abstract

In recent years, combining neural networks with local search heuristics has become popular in the field of combinatorial optimization. Despite its considerable computational demands, this approach has exhibited promising outcomes with minimal manual engineering. However, we have identified three critical limitations in the empirical evaluation of these integration attempts. Firstly, instances with moderate complexity and weak baselines pose a challenge in accurately evaluating the effectiveness of learning-based approaches. Secondly, the absence of an ablation study makes it difficult to quantify and attribute improvements accurately to the deep learning architecture. Lastly, the generalization of learned heuristics across diverse distributions remains underexplored. In this study, we conduct a comprehensive investigation into these identified limitations. Surprisingly, we demonstrate that a simple learned heuristic based on Tabu Search surpasses state-of-the-art (SOTA) learned heuristics in terms of performance and generalizability. Our findings challenge prevailing assumptions and open up exciting avenues for future research and innovation in combinatorial optimization.

1 INTRODUCTION

Designing effective heuristics or approximation algorithms for NP-hard combinatorial optimization (CO) problems is a challenging task, often requiring domain knowledge and extensive trial-and-error. Thus, the idea of automating this demanding and tedious design process through machine learning to learn algorithms that exploit the inherent structure of a problem has gained significant interest among researchers (Bello et al., 2016; Khalil et al., 2017; Bengio et al., 2021; Dong et al., 2021). Particularly, a considerable portion of these works (Khalil et al., 2017; Barrett et al., 2020; Yolcu and P´oczos, 2019) have concentrated on employing Graph Neural Networks (GNNs) for CO problems. Despite the computational demands, these GNN-based approaches have demonstrated competitive performance compared to SOTA heuristics tailored to specific problems.

While these potential advancements offer great promise, some concerns remain. The superior performance of the learned heuristics can be attributed to the selection of specific instances and baselines. Specifically, if the baselines are weak, the learned heuristics can easily outperform them. Without hard instances and proper baseline selection, the learned heuristics can easily show comparable performance with the SOTA heuristics, and this can lead to an overestimation of the actual capabilities of the learned heuristics. Moreover, comparisons with SOTA heuristics are occasionally left out due to scalability challenges with deep learning architectures.

A subset of learning-based approaches (Khalil et al., 2017; Yolcu and P´oczos, 2019; Barrett et al., 2020, 2022) incorporate the functionality or behavior of traditional heuristics, potentially offering improved or enhanced performance by integrating heuristics principles with machine learning components. If a comprehensive comparison with the integrated heuristics and an ablation study of the deep learning architecture are lacking, it becomes challenging to determine the specific contribution of the deep learning architecture. Consequently, if the integrated heuristics are robust, the learned heuristics can seamlessly outperform the baseline heuristics, while the deep learning architecture plays a little role.

Another great achievement of learned heuristics (Khalil et al., 2017; Barrett et al., 2020; Toenshof et al., 2019), initially trained on smaller and specific instances from a particular distribution, exhibits impressive performance when tested on larger instances from different distributions. This achievement stands as a significant accomplishment, aligning with the core objective of employing learning-based approaches: to mitigate the necessity for extensive customization and domain-specific knowledge often required by heuristics. Although classical heuristics with hyperparameters may also encounter challenges if they are fine-tuned for a specific distribution, they may also generalize across different distributions. The primary inquiry revolves around whether the learned heuristics indeed demonstrate superior generalizability compared to classical heuristics. A thorough and insightful comparison of learned heuristics against classical heuristics provides valuable insights into the generalizability of learned heuristics.

Learned heuristics often lack theoretical guarantees, making empirical evaluation the sole method to comprehend the strengths and limitations of proposed methodologies. We believe that several fundamental, yet pivotal questions remain unexplored in the empirical evaluation of these works. While these inquiries are pertinent to all types of learned heuristics, we ask and answer them by analyzing highly cited and recent peer-reviewed publications focused on learning local search heuristics. Our goal is not to provide exhaustive benchmarks for the CO problems discussed in our work but rather to assist future researchers in evaluating their research.

Concretely, we ask and answer the following questions:

1. Can learned heuristics be over-parameterized? Absolutely. Replacing the GNN in ECO-DQN with linear regression and utilizing a subset of features from ECO-DQN (Barrett et al., 2020) that links to Tabu Search (Glover, 1990), we introduce a pruned version of ECO-DQN termed SoftTabu. Our study demonstrates that SoftTabu showcases superior performance and generalizability in comparison to ECO-DQN for the NPhard Maximum-Cut (Max-Cut) problem.

2. Can baseline bias be attributed to the superior performance of learned heuristics? Yes, we demonstrate that SoftTabu, a vanilla learned heuristic, can outperform S2V-DQN (Khalil et al., 2017) for the Max-Cut problem and GNNSAT (Yolcu and P´oczos, 2019) for Boolean Satisfiability (SAT).

*3. Can learned heuristics demonstrate superior generalizability owing to instance selection bias?*Yes, we demonstrate that ECO-DQN shows poor generalization on harder instances and easily gets trapped in the search space.

This paper is available on arxiv under CC 4.0 DEED license.