학술논문

On the difficulty of benchmarking inductive program synthesis methods
Document Type
Conference
Source
Proceedings of the Genetic and Evolutionary Computation Conference Companion. :1589-1596
Subject
benchmarking
genetic programming
inductive program synthesis
machine learning
Language
English
Abstract
A variety of inductive program synthesis (IPS) techniques have recently been developed, emerging from different areas of computer science. However, these techniques have not been adequately compared on general program synthesis problems. In this paper we compare several methods on problems requiring solution programs to handle various data types, control structures, and numbers of outputs. The problem set also spans levels of abstraction; some would ordinarily be approached using machine code or assembly language, while others would ordinarily be approached using high-level languages. The presented comparisons are focused on the possibility of success; that is, on whether the system can produce a program that passes all tests, for all training and unseen testing inputs. The compared systems are Flash Fill, MagicHaskeller, TerpreT, and two forms of genetic programming. The two genetic programming methods chosen were PushGP and Grammar Guided Genetic Programming. The results suggest that PushGP and, to an extent, TerpreT and Grammar Guided Genetic Programming are more capable of finding solutions than the others, albeit at a higher computational cost. A more salient observation is the difficulty of comparing these methods due to drastically different intended applications, despite the common goal of program synthesis.

Online Access