PosFuzz: augmenting greybox fuzzing with effective position distribution
Abstract
Mutation-based greybox fuzzing is one of the most prevalent techniques for discovering security vulnerabilities, and a great deal of research has sought to improve both its efficiency and its effectiveness. A mutation-based greybox fuzzer generates test inputs by mutating a seed input, i.e., by applying a sequence of mutation operators at randomly selected positions in the seed. However, existing research focuses on scheduling mutation operators and largely overlooks the scheduling of mutation positions, even though it also affects fuzzing efficiency. This paper proposes a novel greybox fuzzing method, PosFuzz, that statistically schedules mutation positions based on their historical performance. PosFuzz uses the concept of an effective position distribution to represent the semantics of the input and to guide mutation. It first applies Good-Turing frequency estimation to calculate an effective position distribution for each mutation operator, and then uses two sampling methods, in different mutation stages, to select positions from that distribution. We implemented PosFuzz on top of AFL, AFLFast, and MOPT, yielding Pos-AFL, Pos-AFLFast, and Pos-MOPT respectively, and evaluated them on the UNIFUZZ benchmark (20 widely used open-source programs) and the LAVA-M dataset. The results show that, under the same testing time budget, Pos-AFL, Pos-AFLFast, and Pos-MOPT outperform their counterparts in code coverage and vulnerability discovery. Compared with AFL, AFLFast, and MOPT, PosFuzz achieves 21% more edge coverage and finds 133% more paths on average. It also triggers 275% more unique bugs on average.
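To make the idea of an effective position distribution more concrete, the following is a minimal Python sketch, not the PosFuzz implementation. It assumes a hypothetical bookkeeping table `hit_counts` (position to number of past mutations at that position that produced an interesting input) kept per mutation operator, uses the simplest form of the Good-Turing estimate (the probability mass reserved for never-successful positions is N1/N, where N1 is the number of positions with exactly one hit and N is the total number of hits), and caps that mass with an arbitrary `max_unseen_mass` parameter. The function names, the cap, and the use of plain weighted sampling are all illustrative assumptions; the abstract does not specify which two sampling methods PosFuzz uses.

```python
import random


def effective_position_distribution(hit_counts, input_len, max_unseen_mass=0.5):
    """Estimate a per-position selection probability for one mutation operator.

    hit_counts: dict mapping position -> number of successful mutations there
                (hypothetical bookkeeping, not taken from the paper).
    input_len:  number of mutable positions in the seed.
    max_unseen_mass: arbitrary cap on the mass given to positions with no hits.
    """
    if input_len <= 0:
        raise ValueError("input_len must be positive")

    counts = [hit_counts.get(pos, 0) for pos in range(input_len)]
    total = sum(counts)
    if total == 0:
        # No history yet: every position is equally likely.
        return [1.0 / input_len] * input_len

    unseen = sum(1 for c in counts if c == 0)
    singletons = sum(1 for c in counts if c == 1)

    # Simple Good-Turing estimate of the total probability of unseen events.
    unseen_mass = min(singletons / total, max_unseen_mass) if unseen else 0.0

    probs = []
    for c in counts:
        if c == 0:
            # Unseen positions share the reserved mass equally.
            probs.append(unseen_mass / unseen)
        else:
            # Seen positions share the remaining mass in proportion to hits.
            probs.append((1.0 - unseen_mass) * c / total)
    return probs


def sample_position(probs, rng=random):
    """Draw one mutation position according to the distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    history = {0: 3, 1: 1}  # positions 2 and 3 have never produced a hit
    dist = effective_position_distribution(history, input_len=4)
    print(dist)
    print(sample_position(dist))
```

In this sketch, positions that have repeatedly led to interesting inputs are favored, while the Good-Turing term keeps a nonzero chance of exploring positions with no recorded success.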