PosFuzz: augmenting greybox fuzzing with effective position distribution

Cybersecurity - Tập 6 - Trang 1-21 - 2023
Yanyan Zou1,2,3,4, Wei Zou1,2,3,4, JiaCheng Zhao5,6, Nanyu Zhong1,2,3,4, Yu Zhang2,3,4, Ji Shi1,2,3,4, Wei Huo1,2,3,4
1Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
2School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
3Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences, Beijing, China
4Beijing Key Laboratory of Network Security and Protection Technology, Beijing, China
5State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
6Zhongguancun Laboratory, Beijing, China

Tóm tắt

Mutation-based greybox fuzzing has been one of the most prevalent techniques for security vulnerability discovery and a great deal of research work has been proposed to improve both its efficiency and effectiveness. Mutation-based greybox fuzzing generates input cases by mutating the input seed, i.e., applying a sequence of mutation operators to randomly selected mutation positions of the seed. However, existing fruitful research work focuses on scheduling mutation operators, leaving the schedule of mutation positions as an overlooked aspect of fuzzing efficiency. This paper proposes a novel greybox fuzzing method, PosFuzz, that statistically schedules mutation positions based on their historical performance. PosFuzz makes use of a concept of effective position distribution to represent the semantics of the input and to guide the mutations. PosFuzz first utilizes Good-Turing frequency estimation to calculate an effective position distribution for each mutation operator. It then leverages two sampling methods in different mutating stages to select the positions from the distribution. We have implemented PosFuzz on top of AFL, AFLFast and MOPT, called Pos-AFL, -AFLFast and -MOPT respectively, and evaluated them on the UNIFUZZ benchmark (20 widely used open source programs) and LAVA-M dataset. The result shows that, under the same testing time budget, the Pos-AFL, -AFLFast and -MOPT outperform their counterparts in code coverage and vulnerability discovery ability. Compared with AFL, AFLFast, and MOPT, PosFuzz gets 21% more edge coverage and finds 133% more paths on average. It also triggers 275% more unique bugs on average.

Tài liệu tham khảo

A Security Oriented, Feedback-driven, Evolutionary, Easy-to-use Fuzzer with Interesting Analysis Options. https://honggfuzz.dev/ American Fuzzy Lop. https://lcamtuf.coredump.cx/afl/ Andronidis A, Cadar C (2022) Snapfuzz: high-throughput fuzzing of network applications Aschermann C, Schumilo S, Blazytko T, Gawlik R, Holz T (2019) Redqueen: fuzzing with input-to-state correspondence. In: NDSS, vol 19, pp 1–15 Böhme M, Pham V-T, Roychoudhury A (2016) Coverage-based greybox fuzzing as markov chain. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp 1032–1043 Böhme M, Pham V-T, Nguyen M-D, Roychoudhury A (2017) Directed greybox fuzzing. In: Proceedings of the 2017 ACM SIGSAC conference on computer and communications security, pp 2329–2344 Chen P, Chen H (2018) Angora: Efficient fuzzing by principled search. In: 2018 IEEE symposium on security and privacy (SP). IEEE, pp 711–725 Chen Y, Jiang Y, Ma F, Liang J, Wang M, Zhou C, Jiao X, Su Z (2019) Enfuzz: Ensemble fuzzing with seed synchronization among diverse fuzzers. In: 28th USENIX Security Symposium (USENIX Security 19), pp 1967–1983 Chen H, Xue Y, Li Y, Chen B, Xie X, Wu X, Liu Y (2018) Hawkeye: towards a desired directed grey-box fuzzer. In: Proceedings of the 2018 ACM SIGSAC conference on computer and communications security, pp 2095–2108 Flury BD (1990) Acceptance-rejection sampling made easy. SIAM Rev 32(3):474–476 Gale WA, Sampson G (1995) Good-turing frequency estimation without tears. J Quant Linguist 2(3):217–237 Gan S, Zhang C, Chen P, Zhao B, Qin X, Wu D, Chen Z (2020) GREYONE: Data flow sensitive fuzzing. In: 29th USENIX security symposium (USENIX Security 20), pp 2577–2594 Gan S, Zhang C, Qin X, Tu X, Li K, Pei Z, Chen Z (2018) Collafl: Path sensitive fuzzing. In: 2018 IEEE symposium on security and privacy (SP). IEEE, pp 679–696 Herrera A, Gunadi H, Magrath S, Norrish M, Payer M, Hosking AL (2021) Seed selection for successful fuzzing. In: Proceedings of the 30th ACM SIGSOFT international symposium on software testing and analysis, pp 230–243 Lemieux C, Sen K (2018) Fairfuzz: a targeted mutation strategy for increasing greybox fuzz testing coverage. In: Proceedings of the 33rd ACM/IEEE international conference on automated software engineering, pp 475–485 Li J, Zhao B, Zhang C (2018) Fuzzing: a survey. Cybersecurity 1(1):1–13 Liang H, Pei X, Jia X, Shen W, Zhang J (2018) Fuzzing: state of the art. IEEE Trans Reliab 67(3):1199–1218 Liang J, Jiang Y, Chen Y, Wang M, Zhou C, Sun J (2018) Pafl: extend fuzzing optimizations of single mode to industrial parallel mode. In: Proceedings of the 2018 26th ACM joint meeting on European software engineering conference and symposium on the foundations of software engineering, pp 809–814 Liang J, Wang M, Zhou C, Wu Z, Jiang Y, Liu J, Liu Z, Sun J (2022) Pata: Fuzzing with path aware taint analysis. In: 2022 2022 IEEE symposium on security and privacy (SP). IEEE Computer Society, Los Alamitos, CA, USA, pp 154–170 LibFuzzer - a Library for Coverage-guided Fuzz Testing. https://llvm.org/docs/LibFuzzer.html Li Y, Chen B, Chandramohan M, Lin S-W, Liu Y, Tiu A (2017) Steelix: program-state based binary fuzzing. In: Proceedings of the 2017 11th joint meeting on foundations of software engineering, pp 627–637 Li Y, Ji S, Chen Y, Liang S, Lee W-H, Chen Y, Lyu C, Wu C, Beyah R, Cheng P et al. (2021) Unifuzz: A holistic and pragmatic metrics-driven platform for evaluating fuzzers. In: 30th USENIX security symposium (USENIX Security 21). USENIX Association Lyu C, Ji S, Zhang C, Li Y, Lee W-H, Song Y, Beyah R (2019) MOPT: Optimized mutation scheduling for fuzzers. In: 28th USENIX security symposium (USENIX security 19), pp 1949–1966 Manès VJM, Han H, Han C, Cha SK, Egele M, Schwartz EJ, Woo M (2019) The art, science, and engineering of fuzzing: a survey. IEEE Trans Softw Eng Nagy S, Hicks M (2019) Full-speed fuzzing: reducing fuzzing overhead through coverage-guided tracing. In: 2019 IEEE symposium on security and privacy (SP). IEEE, pp 787–802 Petsios T, Zhao J, Keromytis AD, Jana S (2017) Slowfuzz: automated domain-independent detection of algorithmic complexity vulnerabilities. In: Proceedings of the 2017 ACM SIGSAC conference on computer and communications security, pp 2155–2168 Rajpal M, Blum W, Singh R (2017) Not all bytes are equal: neural byte sieve for fuzzing. arXiv preprint arXiv:1711.04596 Rawat S, Jain V, Kumar A, Cojocar L, Giuffrida C, Bos H (2017) Vuzzer: Application-aware evolutionary fuzzing. In: NDSS, vol 17, pp 1–14 Schumilo S, Aschermann C, Gawlik R, Schinzel S, Holz T (2017) kafl: Hardware-assisted feedback fuzzing for OS kernels. In: 26th USENIX security symposium (USENIX Security 17), pp 167–182 Serebryany K (2017) Oss-fuzz-google’s continuous fuzzing service for open source software Serebryany K, Bruening D, Potapenko A, Vyukov D (2012) AddressSanitizer: a fast address sanity checker. In: 2012 USENIX annual technical conference (USENIX ATC 12), pp 309–318 Shapiro HS, Silverman RA (1960) Alias-free sampling of random noise. J Soc Ind Appl Math 8(2):225–248 She D, Shah A, Jana S (2022) Effective seed scheduling for fuzzing with graph centrality analysis. In: 2022 2022 IEEE symposium on security and privacy (SP) (SP). IEEE Computer Society, Los Alamitos, CA, USA, pp 1558–1558 Sundermeyer M, Schlüter R, Ney H (2012) LSTM neural networks for language modeling. In: Thirteenth annual conference of the international speech communication association Wang J, Chen B, Wei L, Liu Y (2017) Skyfire: Data-driven seed generation for fuzzing. In: 2017 IEEE symposium on security and privacy (SP). IEEE, pp 579–594 Xu W, Kashyap S, Min C, Kim T (2017) Designing new operating primitives to improve fuzzing performance. In: Proceedings of the 2017 ACM SIGSAC conference on computer and communications security, pp 2313–2328 You W, Wang X, Ma S, Huang J, Zhang X, Wang X, Liang B (2019) Profuzzer: On-the-fly input type probing for better zero-day vulnerability discovery. In: 2019 IEEE symposium on security and privacy (SP). IEEE, pp 769–786 Yun I, Lee S, Xu M, Jang Y, Kim T (2018) Qsym: a practical concolic execution engine tailored for hybrid fuzzing. In: 27th USENIX Security Symposium (USENIX Security 18). USENIX Association, Baltimore, MD, pp 745–761 Zong P, Lv T, Wang D, Deng Z, Liang R, Chen K (2020) Fuzzguard: filtering out unreachable inputs in directed grey-box fuzzing through deep learning