Fuzzing Terminology | Max Ammann

Research in Fuzzing has gained a lot of traction in the last decade. A lot of open source fuzzers have been implemented and are available on Github. Everyone who already developed any application knows the pain of naming things. It is very difficult to have a common understanding of the terms used in a project. A standard software engineering practice is to use a glossary. This usually only scaled to a team or a small company but not to thousands of fuzzers on Github. Also, documentation is more of a gimmick than a comprehensive guide in most projects.

Therefore, the paper “The Art, Science, and Engineering of Fuzzing” tries to create a common terminology of fuzzing. It also gives a simple common view on how fuzzers work and includes a major survey of fuzzers and their relations to each other. In this post I’ll try to give an overview of the terminology.

The Term “Fuzzing”

The term “to fuzz” describes a program which “generates a stream of random characters to be consumed by a target program” ¹. And this is actually what a fuzzer is doing. It sends random input to a program to test it for reliability. Miller et al. wrote a tool called fuzz to test Unix utilities. The source code of it can be found here. One could argue that this is the first fuzzer!

A recent publication by Miller asks whether fuzz testing is solved by now ². I want to read this paper in the future.

Terminology

Now lets dive into the terminology. I want to do this in a table and give short explanations.

Term	Description
PUT	Program Under Test
Fuzz Input Space	Space of inputs for the PUT
Fuzzing	Execution of a PUT with input from the fuzz input space
Security Policy	e.g. “PUT does not write outside of indented memory”, “PUT follows the TLS state machine” ³
Fuzz Testing	Use of fuzzing to test whether a PUT violates a security policy
Fuzzer	The program which fuzz tests the PUT
Fuzz Campaign	Specific execution of the PUT
Bug Oracle	Decides if an execution violates the security policy
Fuzz Configuration	Parameters of the Fuzzer program
Seed	(Well-)structured input of the PUT which is mutated while fuzzing
Seed Pool	Collection of seeds
Seed Trimming	Reduce size of a seed without reducing quality like keeping same execution coverage
Driver Application	Helper program around the PUT to execute it. For example a wrapper about a library. This is also called a test harness

One may ask what the difference between fuzzing and software testing is. From a technical point of view there isn’t any. Both try to find bugs in software. But there is a significant difference in the intention of fuzzing. The term fuzzing is mostly used for security related testing ⁴. Often source code is not available when testing for security issues. Usually when testing software as a software engineer you have the source code which allows you to write unit tests.

Usual implementation of a Fuzzer

Manes et al. give a good overview the algorithm behind fuzzers ⁴. Of course this is very simplified. LibAFL uses a distributed fuzzing architecture with a broker and multiple clients for example.

 1bug_oracle = ...
 2
 3def fuzz(fuzz_configs):
 4  bugs = []
 5
 6  while continue(fuzz_configs):
 7    config = schedule(fuzz_configs)
 8    concrete_test_cases = input_gen(config)
 9    
10    new_bugs, exec_feedback = input_eval(config, concrete_test_cases, bug_oracle)
11    
12    fuzz_configs = conf_update(fuzz_configs, config, exec_feedback)
13    bugs = bugs.union(new_bugs)
14    
15  return bugs

I used pseudo-python code here and descriptive names to avoid writing what it does. It should be quite easy to understand.

Model-based vs Model-less Fuzzers

The goal of my master thesis will be to fuzz a TLS library. The approach we want to go is model guided. But what does that mean?

According to the literature research by Manes et al. there are model-based fuzzers and model-less fuzzers ⁴. The two approaches differ in the way how the inputs to the PUT are generated. Inputs can be generated using a model like a grammar or they are randomly generated by flipping bits of a seed.

Model-based fuzzers can further be divided into fuzzers which use a predefined or inferred models. Predefined models are for example grammars for text-based network protocols or descriptions of binary protocols. Inferred models are derived from datasets. That means the burden of designing the model is no longer on the fuzzer developers. Manes et al. mention that only a few fuzzers use this technique ⁴.

Maybe this could be beneficial for my model-guided approach to fuzzing. There is already related work by PULSAR ⁵, who infer a model from network captures and Ruiter et. al who used this technique to fuzz TLS ⁶.

Model-less fuzzers use the already mentioned seeds. A seed can be a text file, binary file or a network packet. Then bits are flipped in the seed to yield new inputs.

Optimization Opportunities

There are multiple points at which fuzzers can be optimized. Notably the important points are the function in the pseudo-code above:

Fn	Opportunity
schedule	Choosing configurations based on good algorithms (AFL uses an evolutionary algorithm)
input_gen	Generate good inputs which trigger bugs
input_eval	Faster execution by using in-memory fuzzing or avoiding stdout
conf_update	For evolutionary algorithms: choose a good fitness indicator

An other way to improve fuzzing coverage or speed is to skip checksum checks. For example it is notoriously difficult to generate an input which passes the checksum AND triggers a bug. This is also called PUT mutation.

Guided Fuzzing

Guided fuzzing includes an additional analysis to guide the fuzzer. According to Manes et al. there are two phases:

initial (static) program analysis
generate concrete test cases with the guidance of the analysis

An example for this is the hot bytes analysis in TaintScope. They used a taint analysis to find input bytes which “flow into critical systems”. For my thesis this could mean that I first analyze the TLS engine and then use these results to guide the fuzzer. By analyzing a lightly-annotated state machine of a TLS engine I could find out which input lead to certain states.

LibAFL and Google terms

LibAFL is a framework for fuzzing. You could compare it to TensorFlow, just for Fuzzing. Their terminology is similar, but of course adjusted to the AFL world.

The cool thing is that you can just read about the used terms in their documentation. There is also the AFL glossary which includes some of the terms used in LibAFL.

Term	Description
Target	Same as a PUT
Corpus	Same as a seed pool
Executor	Executes the PUT
Generator	Generates random numbers and characters
Observers	Observe information about the PUT during execution
Feedback	Information retrieved about an execution, like execution time, crashes. Also determines whether the execution was interesting.
Mutators	Mutate seeds and input the PUT
Stats	Statistics about the fuzzing progress, speed etc.
Cross-pollination	Using input generated for a target for another target.
Dictionary	Set of tokens which should be processed together while mutating
Reproducer/Test Case	Test input which can be used to trigger a bug

An empirical study of the reliability of UNIX utilities p. 34 “The Tools” ↩︎
The Relevance of Classic Fuzz Testing :Have We Solved This One? ↩︎
The security policy should be observable from the actual execution. If the execution leaks for example the reached lines of code, then this can be included in the security policy. For example a security policy could be. “Never reach line 42 in state_machine.c”. In LibAFL, everything that is observable can be found in the Feedback API. That means you give the feedback of an execution to a Bug Oracle and let it determine whether a security policy has been violated. ↩︎
The Art, Science, and Engineering of Fuzzing ↩︎ ↩︎ ↩︎ ↩︎
Pulsar: Stateful Black-Box Fuzzing of Proprietary Network Protocols ↩︎
ProtocolstatefuzzingofTLSimplementations ↩︎