
Evolutionary Fuzzing Techniques for Smart Application Testing
Explore the innovative VUzzer evolutionary fuzzing tool developed by Sanjay Rawat, Vivek Jain, and team, enabling smarter bug detection and improved testing efficiency across various applications.
Presentation Transcript
VUzzer: Application-aware Evolutionary Fuzzing Sanjay Rawat, Vivek Jain, Ashish Kumar, Lucian Cojocar, Cristiano Giuffrida, Herbert Bos
What we achieved A really smart fuzzer that understands the application and formulates its fuzzing strategies by learning: important offsets in the input; important values at certain offsets (magic bytes); path prioritization. A fuzzer that outperforms fuzzers based on advanced techniques, e.g., Symbex, triggering bugs with an order of magnitude fewer inputs. A fuzzer that shows consistent performance across various applications (DARPA CGC, LAVA, other applications).
Introduction Fuzzing is a simple, yet powerful testing technique. There have been very effective fuzzers, like AFL. Useful in discovering low-hanging bugs (though!) Why?
Problem with Traditional Fuzzing Blackbox fuzzing: Aiming with luck!
Problem with Traditional Fuzzing Smart fuzzing: Aiming with an educated guess!
Problem Exemplified. Where is a? What values? a == \xffd8. Hard-to-reach paths (deeply buried bugs). Easy paths (superficial paths), error code.
Issues identified For a smart, code-coverage-based fuzzer, it is important to have some knowledge about: where (which offsets in the input) to apply mutation; what values to replace with; how to avoid traps (paths leading to error-handling code).
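The three issues on the slide above can be made concrete with a toy target. This is only an illustrative sketch, not code from the presentation: `toy_parser`, the input layout, and the placement of the magic value at offset 0 are assumptions; only the value `\xff\xd8` comes from the slides.

```python
import random

MAGIC = b"\xff\xd8"  # value from the slide; placing it at offset 0 is an assumption


def toy_parser(data: bytes) -> str:
    """Hypothetical target with shallow error paths and a deeper, guarded path."""
    if len(data) < 4:
        return "error: too short"   # trap: error-handling code
    if data[0:2] != MAGIC:
        return "error: bad magic"   # trap: most random inputs end here
    if data[2] > 0x7F:              # nested condition guarding deeper code
        return "deep path"
    return "shallow path"


def blind_mutation(data: bytes, rng: random.Random) -> bytes:
    """Blackbox-style mutation: overwrite one random byte with a random value."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def count_past_magic(trials: int, seed: int = 0) -> int:
    """Count blind single-byte mutants of an all-zero input that pass the magic check."""
    rng = random.Random(seed)
    start = bytes(8)
    return sum(
        not toy_parser(blind_mutation(start, rng)).startswith("error")
        for _ in range(trials)
    )
```

In this sketch a single blind byte mutation of an all-zero input can never satisfy the two-byte magic check, so every mutant falls into the error path. A fuzzer that knows the offset (0) and the value (`\xff\xd8`) gets past the check with one targeted replacement.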
Fuzzing+Symbex Symbolic/concolic execution can answer such questions. But... Scalability?
Recent Observations on Fuzzing LAVA: Large-scale Automated Vulnerability Addition, in Proc. IEEE S&P'16, IEEE Press, 2016: quickly and automatically injecting large numbers of realistic bugs into program source code. Results are not very encouraging for fuzzing!
Recent Observations on fuzzing+Symbex Experience Report: How is Dynamic Symbolic Execution Different from Manual Testing? A Study on KLEE, in ISSTA'15. Manually developed test suites perform better than KLEE-based test suites on covering hard-to-cover code. KLEE-based test suites are less effective at exploring some meaningful paths and at generating valid, structured string inputs that get through the input parser.
Evolving Our Solution- VUzzer Let's start with something we know: AFL. Take inputs from the queue Q. Mutate at offset X (bitflip, replace, arithmetic). Execute and monitor edges (BB). New edge? Yes: add input to Q. No: (perhaps) try more mutation.
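The AFL-style loop on the slide above can be sketched roughly as follows. This is a simplified illustration, not AFL's implementation: the `run` callback (which executes the target and returns the set of edges it covered) is hypothetical, and inputs are assumed to be non-empty.

```python
import random
from typing import Callable, Iterable, List, Set, Tuple

Edge = Tuple[int, int]  # (source basic block, destination basic block)


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one blind mutation (bitflip, replace, or arithmetic) at a random offset X."""
    buf = bytearray(data)
    x = rng.randrange(len(buf))
    op = rng.choice(("bitflip", "replace", "arithmetic"))
    if op == "bitflip":
        buf[x] ^= 1 << rng.randrange(8)
    elif op == "replace":
        buf[x] = rng.randrange(256)
    else:
        buf[x] = (buf[x] + rng.choice((-1, 1))) % 256
    return bytes(buf)


def coverage_guided_fuzz(
    seeds: Iterable[bytes],
    run: Callable[[bytes], Set[Edge]],  # hypothetical: executes target, returns covered edges
    iterations: int,
    seed: int = 0,
) -> List[bytes]:
    """Keep every mutated input that exercises an edge not seen before."""
    rng = random.Random(seed)
    queue = list(seeds)
    seen: Set[Edge] = set()
    for s in queue:
        seen |= run(s)
    for _ in range(iterations):
        child = mutate(rng.choice(queue), rng)
        edges = run(child)
        if not edges <= seen:  # new edge? yes: add input to Q
            seen |= edges
            queue.append(child)
        # no new edge: drop the child and (perhaps) try more mutation
    return queue
```

Note that the mutation step has no idea which offset matters or which value to write there, which is the gap the next slide addresses.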
Evolving Our Solution- VUzzer Moving to VUzzer: mutate only interesting offsets and with interesting values (magic bytes). Take inputs from the queue Q, with input preference by path prioritization (static analysis). Mutate at offset X (bitflip, replacement, arithmetic). Execute and monitor edges (BB); also perform taintflow to determine interesting offsets/values (O/V). New edge? No: (perhaps) try more mutation. Yes: add input to Q. Is it an error-handling BB? If so, not interesting.
VUzzer: main insights Leverage the application's control- and data-flow features to infer input properties: the application is designed to work with that input! Prioritize and deprioritize paths: certain paths are difficult to execute as they are guarded by constraints (nested conditions)! VUzzer puts emphasis on learning these properties.
Control-flow features Used for path preference. Basic block weights (static analysis): CFG as a Markov chain (enumerating all paths is infeasible!). Nested blocks are hard to reach -> lower probabilities -> higher weights. These weights are used in the fitness function to raise/lower the input score.
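A minimal sketch of the weighting idea on the slide above, under stated assumptions: the CFG is acyclic, every outgoing edge of a block is equally likely, the weight is the reciprocal of the reach probability, and the fitness is a plain sum of weights over executed blocks. The slides do not give these details, so the function names and the example CFG are illustrative only.

```python
from typing import Dict, Iterable, List


def block_probabilities(cfg: Dict[str, List[str]], entry: str) -> Dict[str, float]:
    """Propagate reach probabilities through an acyclic CFG (uniform edge choice assumed)."""
    # Count incoming edges so each block is processed after all its predecessors.
    indegree = {block: 0 for block in cfg}
    for succs in cfg.values():
        for s in succs:
            indegree[s] += 1
    prob = {block: 0.0 for block in cfg}
    prob[entry] = 1.0
    ready = [block for block, d in indegree.items() if d == 0]
    while ready:
        block = ready.pop()
        succs = cfg[block]
        for s in succs:
            prob[s] += prob[block] / len(succs)
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)
    return prob


def block_weights(prob: Dict[str, float]) -> Dict[str, float]:
    """Lower probability -> higher weight (reciprocal is an assumption)."""
    return {block: 1.0 / p for block, p in prob.items() if p > 0}


def fitness(trace: Iterable[str], weights: Dict[str, float]) -> float:
    """Simplified input score: sum of the weights of the distinct blocks executed."""
    return sum(weights.get(block, 0.0) for block in set(trace))


# Hypothetical CFG: three nested checks, each with an early exit to error code.
CFG = {
    "entry": ["check1", "error"],
    "check1": ["check2", "error"],
    "check2": ["bug", "error"],
    "bug": ["exit"],
    "error": ["exit"],
    "exit": [],
}
```

With this example CFG, the deeply nested `bug` block gets probability 0.125 and weight 8, so an input that reaches it scores higher than one that exits through `error`.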
Control-flow features Error code detection (dynamic analysis): fuzzing often results in invalid inputs, thereby driving execution towards error-handling code. Deprioritizing such paths improves fuzzing efforts. VUzzer detects them by comparing execution traces of valid and invalid inputs.
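One way to read the trace comparison on the slide above is as a set difference. The sketch below assumes a block counts as error-handling when every invalid input executes it and no valid input does; the slides do not state the exact rule, and the block names are made up.

```python
from typing import Iterable, Set


def error_handling_blocks(
    valid_traces: Iterable[Iterable[str]],
    invalid_traces: Iterable[Iterable[str]],
) -> Set[str]:
    """Blocks executed by all invalid inputs but by no valid input (assumed heuristic)."""
    valid_blocks: Set[str] = set()
    for trace in valid_traces:
        valid_blocks |= set(trace)
    invalid_sets = [set(trace) for trace in invalid_traces]
    if not invalid_sets:
        return set()
    common_invalid = set.intersection(*invalid_sets)
    return common_invalid - valid_blocks


def touches_error_code(trace: Iterable[str], error_blocks: Set[str]) -> bool:
    """True if an execution reached any block classified as error handling."""
    return not error_blocks.isdisjoint(trace)
```

A fuzzer could use `touches_error_code` to lower the score of an input, or to skip queueing it, rather than treating its new coverage as progress.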
Data-flow features Used for inferring input properties that control the execution. Dynamic taintflow analysis: important offsets (cmp, lea) -> mutation to focus upon; values (branch constraints) (cmp) -> magic-byte detection. Static analysis: constant bytes (branch constraint?).
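To illustrate how taint results could steer mutation, the sketch below takes a hand-written list of `cmp` observations and uses it in two ways: writing the compared constant at the tainted offsets (magic bytes), and restricting random mutation to tainted offsets. `CmpRecord` and both helper functions are hypothetical; a real fuzzer would obtain the records from a dynamic taint analysis tool.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class CmpRecord:
    """One observed cmp: which input offsets taint an operand, and the constant compared."""
    offsets: Tuple[int, ...]
    constant: bytes


def apply_magic_bytes(data: bytes, records: Sequence[CmpRecord]) -> bytes:
    """Write each compared constant at the offsets that taint the comparison."""
    buf = bytearray(data)
    for rec in records:
        if len(rec.offsets) != len(rec.constant):
            continue  # skip records this sketch cannot map byte-for-byte
        for off, byte in zip(rec.offsets, rec.constant):
            if off < len(buf):
                buf[off] = byte
    return bytes(buf)


def interesting_offsets(records: Sequence[CmpRecord]) -> List[int]:
    """All input offsets that influenced some comparison."""
    return sorted({off for rec in records for off in rec.offsets})


def focused_mutation(data: bytes, records: Sequence[CmpRecord], rng: random.Random) -> bytes:
    """Mutate one byte, chosen only among tainted offsets inside the input."""
    candidates = [off for off in interesting_offsets(records) if off < len(data)]
    buf = bytearray(data)
    if candidates:
        buf[rng.choice(candidates)] = rng.randrange(256)
    return bytes(buf)
```

For example, a record `CmpRecord((0, 1), b"\xff\xd8")` turns an all-zero input into one that begins with `\xff\xd8`, which a blind byte-level mutation is unlikely to produce.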
Evaluation DARPA CGC binaries. Various applications with binary input formats, as used in other work (VA). A set of buggy binaries recently proposed in LAVA.
Results DARPA CGC binaries 29/23
Results LAVA
Results VA dataset
Crash Triage !exploitable (not very conclusive). Our heuristics based on library calls. Manual analysis: tcpdump (out-of-bounds read, fixed); mpg321 (SIGSEGV, not fixed; double free, not fixed); tcptrace (out-of-bounds read, not fixed); gif2png (out-of-bounds read, not fixed).
Conclusions Evolutionary fuzzing is promising. It is worth spending time on analysis to create a new generation of inputs rather than executing millions of inputs per second. Symbex-based analysis is promising, but scalability is an issue. We show that taintflow analysis is a viable option for intelligent fuzzing, along with other lightweight analyses. We developed a fully functional fuzzer which is able to fuzz a variety of applications. VUzzer site: https://www.vusec.net/projects/fuzzing/