Detecting DeFi securities violations from token smart contract code

Arianna Trozze (1, 2), Bennett Kleinberg (1, 3), Toby Davies (1, 4)
(1) Department of Security and Crime Science, University College London, London, UK
(2) Department of Computer Science, University College London, London, UK
(3) Department of Methodology & Statistics, Tilburg University, Tilburg, Netherlands
(4) School of Law, The Liberty Building, University of Leeds, Leeds, UK

Abstract

Decentralized Finance (DeFi) is a system of financial products and services built and delivered through smart contracts on various blockchains. In recent years, DeFi has grown in both popularity and market capitalization. However, it has also been linked to crime, particularly various types of securities violations. The lack of Know Your Customer requirements in DeFi poses challenges for governments trying to mitigate potential offenses. This study aims to determine whether this problem is suited to a machine learning approach, namely, whether we can identify DeFi projects potentially engaging in securities violations based on their tokens’ smart contract code. We adapted prior work on detecting specific types of securities violations across Ethereum by building classifiers on features extracted from the smart contract code of DeFi projects’ tokens (specifically, opcode-based features). Our final model was a random forest that achieved an 80% F1 score against a baseline of 50%. We then examined the code-based features most important to our model’s performance by analyzing tokens’ Solidity code and conducting cosine similarity analyses. We found that one element of the code that our opcode-based features can capture is the implementation of the SafeMath library, although this does not account for the entirety of our features. Another contribution of our study is a new dataset, comprising (a) a verified ground truth dataset of tokens involved in securities violations and (b) a set of legitimate tokens from a reputable DeFi aggregator. This paper further discusses how prosecutors could use a model like ours in enforcement efforts and connects it to a wider legal context.
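To make the idea of opcode-based features and cosine similarity concrete, the following is a minimal sketch and not the authors' pipeline. It assumes each token contract has already been disassembled into a list of opcode mnemonics. The three sequences and the helper names `opcode_counts` and `cosine_similarity` are hypothetical and invented for illustration.

```python
from collections import Counter
from math import sqrt


def opcode_counts(opcodes):
    """Count how often each opcode mnemonic appears in a disassembled contract."""
    return Counter(opcodes)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    # Only opcodes present in both vectors contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat an empty contract as dissimilar to everything.
    return dot / (norm_a * norm_b)


# Hypothetical opcode sequences, invented for illustration only.
token_a = ["PUSH1", "PUSH1", "MSTORE", "CALLVALUE", "DUP1", "ISZERO", "JUMPI"]
token_b = ["PUSH1", "PUSH1", "MSTORE", "CALLVALUE", "DUP1", "ISZERO", "REVERT"]
token_c = ["SLOAD", "ADD", "SSTORE", "STOP"]

vec_a, vec_b, vec_c = map(opcode_counts, (token_a, token_b, token_c))

# Near-identical opcode profiles score close to 1; disjoint ones score 0.
print(round(cosine_similarity(vec_a, vec_b), 3))
print(round(cosine_similarity(vec_a, vec_c), 3))
```

Count vectors of this kind could also serve as input rows for a classifier such as a random forest. The abstract does not specify the exact feature construction, so treat the representation here as one plausible choice.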
