LibAFLGo: Evaluating and Advancing Directed Graybox Fuzzing.
Elia Geretto*, Andrea Jemmett*, Cristiano Giuffrida, Herbert Bos.
10th IEEE European Symposium on Security and Privacy (EuroS&P), 2025.
Abstract
While greybox fuzzing is routinely applied in production environments with great
success, directed greybox fuzzing has struggled to gain real-world
adoption, despite the great (intuitive) promise and the many optimizations
proposed in the literature. In practice, directed fuzzers struggle because of three critical
issues. First, popular implementations build on and compare to ancient baselines,
often derived from AFLGo. Unfortunately, none of the optimizations that are
essential for performance in modern greybox fuzzers are available in these
baselines. As a result, we find reported improvements in directed fuzzing are often
only “imaginary” and do not lead to better performance on a modern
baseline. Second, directed fuzzing evaluations commonly ignore or misinterpret
important factors affecting fuzzing overhead—such as build times and timeouts.
As design decisions now build on unreliable data, we find that directed fuzzers
perform worse than expected in practice. Third, while almost all directed fuzzers
rely on (expensive) analysis stacks, such as points-to and reachability analysis
components, they often opt for very different implementations. Since these
implementations have their own unique benefits and drawbacks, we find performance
differences of directed fuzzers are frequently due to these components rather than
the proposed directed fuzzing optimization.
In this paper, we investigate the practical impact of these issues by means of an
analysis and evaluation of a representative set of popular directed greybox fuzzers.
As a way forward, we then present LibAFLGo, a modular directed fuzzing framework
that addresses all three issues and allows one to directly compare different
directed fuzzing policies on top of a modern fuzzing stack. Our experimental results
on state-of-the-art directed fuzzing policies provide two main insights. First, the
original AFLGo policies outperform more recent directed fuzzing policies when
testing on a modern fuzzing stack. Second, none of the directed fuzzing policies can
favorably compete with (nondirected) LibAFL, which achieved better overall performance
across benchmarks. As such, the quest for efficient directed fuzzing policies must
continue.