Prioritize Program Diversity: Enumerative Synthesis with Entropy Ordering
Perry Alexander
Drew Davidson
Jennifer Lohoefener
Program synthesis is a popular way to create a correct-by-construction program from a user-provided specification.
Term enumeration is a leading technique to systematically explore the space of programs by generating terms from a formal grammar.
These terms are treated as candidate programs which are tested/verified against the specification for correctness.
In order to prioritize candidates more likely to satisfy the specification, enumeration is often ordered by program size or other domain-specific heuristics.
However, domain-specific heuristics require expert knowledge, and enumeration by size often leads to terms comprised of frequently
repeating symbols that are less likely to satisfy a specification.
In this thesis, we build a heuristic that prioritizes term enumeration based on variability of individual symbols in the program, i.e.,
information entropy of the program. We use this heuristic to order programs in both top-down and bottom-up enumeration.
We evaluated our work on a subset of the PBE-String track of the 2017 SyGuS competition benchmarks and compared against size-based enumeration.
Top-down enumeration guided by entropy expands upon fewer partial expressions than naive in 77\% of benchmarks,
and tests fewer complete expressions in 54\%, resulting in improved synthesis time in 40\% of benchmarks.
However, 71\% of benchmarks in bottom-up enumeration using entropy tests fewer expressions than naive enumeration, without any improvements to the running time.
We conclude entropy is a promising direction to prioritize candidates during program search in enumerative synthesis,
and propose a future directions for improving performance of our proposed techniques.