libFuzzer: Coverage-Guided Fuzzing Done Right

libFuzzer's Edge Coverage Engine Maximizes Path Exploration

libFuzzer runs in-process, feeding mutated byte arrays to a user-defined fuzz target while LLVM's SanitizerCoverage tracks executed edges and blocks. It prioritizes mutations that expand coverage, saving them to a corpus for future seeding. This evolutionary approach beats random input generation by focusing on undiscovered code paths.

The tradeoffs are clear. Each fuzzer process is single-threaded (parallelism comes from running several processes), targets must be deterministic or cycles are wasted on noise, and it shines on structured inputs like parsers but struggles on complex formats without a seed corpus. The original authors have shifted new development to Centipede, but libFuzzer remains mature and supported, with important bugs still being fixed.

"LibFuzzer is an in-process, coverage-guided, evolutionary fuzzing engine." This defines its core: no external processes, direct library linkage via a simple entrypoint.

Key decision: pair with sanitizers. AddressSanitizer (ASan) catches memory errors, UndefinedBehaviorSanitizer (UBSan) flags undefined behavior such as signed integer overflow, and MemorySanitizer (MSan) hunts uninitialized reads; MSan support is experimental but potent. Without them, you miss many of the bugs fuzzing can reveal.

Fuzz Targets: Narrow, Fast, and Forgiving by Design

Start with extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size): call your API with the bytes and return 0. The target has no dependency on libFuzzer, so it can be reused with other engines such as AFL or Radamsa.
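
A minimal sketch of such a target. CountLines is a hypothetical stand-in for the API under test (an assumption for illustration, not part of libFuzzer or any real library); only the LLVMFuzzerTestOneInput signature comes from libFuzzer itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical API under test (an assumption for illustration): counts
// newline bytes in a buffer.
size_t CountLines(const uint8_t *data, size_t size) {
  size_t lines = 0;
  for (size_t i = 0; i < size; ++i)
    if (data[i] == '\n') ++lines;
  return lines;
}

// Entry point the fuzzing engine calls once per input.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  size_t lines = CountLines(Data, Size);
  // Optional invariant check: a failed assert aborts the process, which the
  // fuzzing engine reports as a crash.
  assert(lines <= Size);
  return 0;  // Values other than 0 and -1 are reserved for future use.
}
```

Built with clang -g -O1 -fsanitize=fuzzer,address, this file needs no main() of its own because the fuzzer runtime supplies one.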

Critical constraints ensure efficiency:

Tolerate any input: empty, huge, malformed.
No exit() on any input; failures should surface as crashes, signals, or sanitizer reports.
Join any threads before returning.
Deterministic: base any random decisions on the input bytes.
Fast: avoid cubic or greater complexity, logging, and excessive memory use.
Avoid modifying global state; keep scope narrow, one format per target.
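
The determinism and empty-input constraints can be sketched as follows. SeedFromInput is a hypothetical helper (not a libFuzzer API), and the "code under test" here is just a placeholder that reads one pseudo-randomly chosen byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical helper: builds a PRNG seed from up to the first four input
// bytes, so the same input always leads to the same "random" decisions.
uint32_t SeedFromInput(const uint8_t *data, size_t size) {
  uint32_t seed = 0;
  for (size_t i = 0; i < size && i < 4; ++i)
    seed |= static_cast<uint32_t>(data[i]) << (8 * i);
  return seed;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  if (Size == 0) return 0;  // Empty input is valid input; nothing to do.
  // Seeded from the input, not from std::random_device or the clock.
  std::mt19937 rng(SeedFromInput(Data, Size));
  size_t index = rng() % Size;
  assert(index < Size);
  volatile uint8_t sink = Data[index];  // Placeholder for real work.
  (void)sink;
  return 0;
}
```

Because the seed depends only on the bytes, replaying a saved input reproduces exactly the same execution.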

Splitting targets (e.g., PNG vs. JPG) isolates formats, speeding coverage per run. Wide targets dilute focus, slowing discovery.

"The fuzzing engine will execute the fuzz target many times with different inputs in the same process." This demands resilience—design for billions of calls.

"Usually, the narrower the target the better. E.g. if your target can parse several data formats, split it into several targets, one per format." Narrow wins by concentrating mutations.

Corpus-Driven Mutation: Seed Smart, Merge Often

Seeds are king: copy valid/invalid samples (e.g., PNGs for image libs) to CORPUS_DIR. Empty starts work but crawl for structured data.

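Run ./my_fuzzer CORPUS_DIR: inputs that reach new coverage are written back to the corpus directory automatically (the first one, if several are given). Minimize a bloated corpus with ./my_fuzzer -merge=1 NEW_DIR FULL_DIR, which copies into NEW_DIR only the inputs from FULL_DIR that add coverage.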

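Merges are resumable: pass -merge_control_file=PATH, stop the run gracefully with SIGUSR1 (killall -SIGUSR1 /path/to/fuzzer/binary), then rerun the same command to continue. This matters on preemptible cloud VMs. Corpora double as regression suites: pass the files as arguments and the target runs once on each, with no fuzzing.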
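
The regression-suite idea amounts to calling the target once per saved file. A rough sketch, assuming a hypothetical helper RunCorpusFiles and a stand-in target so the block is self-contained; this illustrates the idea and is not libFuzzer's own driver code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Stand-in fuzz target; in a real build this symbol comes from your own
// fuzz target source file.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  assert(Data != nullptr || Size == 0);  // Placeholder for real checks.
  return 0;
}

// Hypothetical helper: runs the target once on each file, with no mutation.
// Returns the number of files that were read and executed.
size_t RunCorpusFiles(const std::vector<std::string> &paths) {
  size_t executed = 0;
  for (const std::string &path : paths) {
    std::ifstream in(path, std::ios::binary);
    if (!in) continue;  // Skip unreadable files.
    std::vector<uint8_t> bytes((std::istreambuf_iterator<char>(in)),
                               std::istreambuf_iterator<char>());
    LLVMFuzzerTestOneInput(bytes.data(), bytes.size());
    ++executed;
  }
  return executed;
}
```

A small main() that forwards its argv to RunCorpusFiles would turn this into a standalone regression binary; with libFuzzer linked in, passing file paths on the command line has the same effect.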

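Mutations include bit flips, crossovers, and byte inserts; each NEW line logs the sequence that produced the input, e.g. MS:3 CrossOver-ChangeBit-InsertByte-. The -reduce_inputs flag (on by default) shrinks inputs while preserving their full feature sets.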
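
To make the logged names concrete, here is a simplified sketch of three such mutations. The function names mirror the log line, but the bodies are illustrative assumptions and not libFuzzer's implementation, which chooses positions and values randomly and has many more strategies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Flips one bit (0..7) of the byte at `pos`.
Bytes ChangeBit(Bytes in, size_t pos, unsigned bit) {
  assert(pos < in.size() && bit < 8);
  in[pos] ^= static_cast<uint8_t>(1u << bit);
  return in;
}

// Inserts `value` before position `pos`.
Bytes InsertByte(Bytes in, size_t pos, uint8_t value) {
  assert(pos <= in.size());
  in.insert(in.begin() + static_cast<std::ptrdiff_t>(pos), value);
  return in;
}

// Splices the first `cut_a` bytes of `a` onto the bytes of `b` from `cut_b`.
Bytes CrossOver(const Bytes &a, const Bytes &b, size_t cut_a, size_t cut_b) {
  assert(cut_a <= a.size() && cut_b <= b.size());
  Bytes out(a.begin(), a.begin() + static_cast<std::ptrdiff_t>(cut_a));
  out.insert(out.end(), b.begin() + static_cast<std::ptrdiff_t>(cut_b),
             b.end());
  return out;
}
```

Chaining the three in the logged order (CrossOver, then ChangeBit, then InsertByte) yields one new candidate input from two corpus entries.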

Parallelism: Jobs Scale Workers, Fork Adds Resilience

Each fuzzer process is single-threaded, but -jobs=N runs N fuzzing jobs across parallel worker processes (-workers defaults to min(jobs, cores/2)). Workers share the corpus through periodic reloads (-reload=1, the default), and each job logs to fuzz-<JOB>.log.

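Experimental -fork=N spawns up to N concurrent child processes, hands each a small random subset of the corpus, and merges each child's findings back into the main corpus when it exits. The top process only orchestrates. OOMs and timeouts are ignored by default (-ignore_ooms=1, -ignore_timeouts=1): the reproducer is saved and fuzzing continues. Other crashes stop the run unless -ignore_crashes=1 is set. The plan is for -fork to eventually replace -jobs and -workers.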

On a 12-core machine, -jobs=30 runs 6 workers by default, each averaging 5 jobs over the whole run. The shared corpus accelerates collective coverage.
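
The arithmetic behind that example, as a small sketch. The libFuzzer documentation gives the default worker count as min(jobs, cores/2) when -workers is left at 0; the function names below are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Default worker count when -workers is 0: min(jobs, cores / 2).
unsigned DefaultWorkers(unsigned jobs, unsigned cores) {
  return std::min(jobs, cores / 2);
}

// Average number of jobs each worker handles over the whole run.
unsigned JobsPerWorker(unsigned jobs, unsigned workers) {
  assert(workers > 0);
  return jobs / workers;
}
```

For 30 jobs on 12 cores this gives 6 workers and 5 jobs per worker; with fewer jobs than half the cores, the job count itself becomes the limit.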

"This has the advantage that any new inputs found by one fuzzer process will be available to the other fuzzer processes." Parallel sharing beats isolated runs.

Flags: Tune for Speed, Depth, and Constraints

Clang 6.0 and later: clang -g -O1 -fsanitize=fuzzer,address mytarget.cc links libFuzzer's main() automatically. -O1 balances speed against debuggability; -g gives symbolized stack traces.

Core flags:

Flag	Effect	Default
`-runs=N`	Stop after N iterations	-1 (indefinite)
`-max_len=N`	Max input size in bytes	0 (guessed from corpus)
`-timeout=S`	Per-input limit in seconds	1200
`-rss_limit_mb=M`	RSS cap in MB	2048
`-max_total_time=S`	Total run time in seconds	0 (indefinite)
`-workers=N`	Parallel worker processes	0 (min(jobs, cores/2))

Dictionaries (-dict=FILE) supply keywords, one per line, such as kw1="blah" or "\xF7\xF8"; they help most on protocols and formats with magic values. -use_value_profile=1, together with trace-cmp instrumentation, treats some new comparison-argument values as new coverage. -only_ascii=1 restricts generated inputs to ASCII (printable and whitespace) bytes.

"If a mutation triggers execution of a previously-uncovered path in the code under test, then that mutation is saved to the corpus." New coverage is the retention criterion.

Output Signals Progress and Bugs

stderr logs:

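INITED: Initialization done; every seed has been run through the target.
NEW: An input reached new coverage and was saved.
REDUCE: A smaller input that triggers already-known features was found.
RELOAD: Periodic re-read of the corpus directory to pick up other processes' finds.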

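Stats: cov:42 ft:50 corp:100/1kb lim:4096 exec/s:10k rss:2Gb L:50/100 MS:2 ChangeByte-CrossOver-

Here cov counts covered blocks or edges, ft counts finer-grained features, corp is corpus entries/total size, lim is the current input-length limit, exec/s is throughput, and rss is memory use. On NEW and REDUCE lines, L gives the new input's size and MS the mutation sequence that produced it.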

Crashes are written to crash-<sha1> and timeouts to timeout-<sha1>. -artifact_prefix=./ sets a custom path prefix for these files.

Toy example: a target that traps on input starting with "HI!" is hit within seconds from an empty corpus, and libFuzzer writes a crash-... file containing HI!.
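
The toy target, essentially as it appears in the libFuzzer documentation. Each nested check is a separate branch, so coverage feedback lets the fuzzer discover the three bytes one at a time.

```cpp
#include <stddef.h>
#include <stdint.h>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  if (size > 0 && data[0] == 'H')
    if (size > 1 && data[1] == 'I')
      if (size > 2 && data[2] == '!')
        __builtin_trap();  // Deliberate crash for the fuzzer to find.
  return 0;
}
```

Any input that does not start with all three bytes returns 0 normally; only the full prefix reaches the trap.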

Advanced: CMP Tracing and Value Profiles

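-fsanitize-coverage=trace-cmp (on by default as part of -fsanitize=fuzzer) intercepts CMP instructions and guides mutations using their arguments, which helps crack parsers that compare against magic values. Value profile (-use_value_profile=1) goes further: it sets a bit in a bitset keyed on the caller PC and popcount(Arg1 XOR Arg2), and treats each newly set bit as new coverage. The extra signal can cost up to 2x in speed and grow the corpus several times over.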
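
The libFuzzer documentation describes the value-profile computation as roughly (caller_pc & 4095) | (popcnt(Arg1 ^ Arg2) << 12), with the result used to set a bit in a bitset. The sketch below mirrors that formula; the function names, the 64-bit argument width, and the bitset size are illustrative assumptions, not libFuzzer internals.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bit index for one observed comparison: the low 12 bits come from the caller
// PC, the bits above them from the number of differing argument bits.
uint64_t ValueProfileIndex(uint64_t caller_pc, uint64_t arg1, uint64_t arg2) {
  uint64_t differing_bits = std::bitset<64>(arg1 ^ arg2).count();
  return (caller_pc & 4095) | (differing_bits << 12);
}

// Large enough for every index: popcount is at most 64, so the maximum index
// is (64 << 12) | 4095, which is below 65 * 4096.
std::bitset<65 * 4096> g_seen;

// Records one comparison. Returns true if it set a bit never seen before,
// which a value-profile-style engine would treat as new coverage.
bool ObserveCompare(uint64_t caller_pc, uint64_t arg1, uint64_t arg2) {
  size_t index = static_cast<size_t>(ValueProfileIndex(caller_pc, arg1, arg2));
  assert(index < g_seen.size());
  bool is_new = !g_seen.test(index);
  g_seen.set(index);
  return is_new;
}
```

An input that gets one bit closer to matching a compared constant changes the popcount, lands on a new bit, and is kept, which is how the fuzzer can approach a magic value step by step.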

Real bugs: tutorial.libfuzzer.info walks through real-life fuzz targets and the bugs they find, including detecting Heartbleed in one second.

Key Takeaways

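Write narrow, deterministic fuzz targets: one format per target, no global-state changes, no cubic-or-worse complexity, one target per binary.
Always compile with -fsanitize=fuzzer,address for memory-error detection; add UBSan checks (e.g. signed-integer-overflow) for undefined behavior.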
Seed corpora with 10-100 diverse samples; merge/minimize regularly for efficiency.
Scale via -jobs=N (workers default to min(jobs, cores/2)); try -fork=N for resilient cloud runs.
Monitor cov:, ft:, and exec/s: (aim for 1k+/s); tune -max_len and -timeout if progress stalls.
Use dictionaries for domain bytes (e.g., HTTP headers); enable value profiles for compares.
Regression test corpora: ./fuzzer file1 file2—no mutations, just validate.
-print_final_stats=1 for totals; -help=1 lists all (~50 flags).