Detecting DeFi securities violations from token smart contract code
Financial Innovation - 2024
Abstract
Decentralized Finance (DeFi) is a system of financial products and services built and delivered through smart contracts on various blockchains. In recent years, DeFi has grown in both popularity and market capitalization. However, it has also been linked to crime, particularly various types of securities violations. The lack of Know Your Customer requirements in DeFi poses challenges for governments trying to mitigate potential offenses. This study aims to determine whether this problem is suited to a machine learning approach: specifically, whether DeFi projects potentially engaging in securities violations can be identified from their tokens’ smart contract code. We adapted prior work on detecting specific types of securities violations across Ethereum by building classifiers on features extracted from the smart contract code of DeFi projects’ tokens (specifically, opcode-based features). Our final model was a random forest that achieved an 80% F1 score against a baseline of 50%. We then examined the code-based features most important to the model’s performance in more detail by analyzing tokens’ Solidity code and conducting cosine similarity analyses. We found that one element of the code captured by our opcode-based features is the implementation of the SafeMath library, although this does not account for all of our features. Another contribution of our study is a new dataset, comprising (a) a verified ground truth dataset of tokens involved in securities violations and (b) a set of legitimate tokens from a reputable DeFi aggregator. The paper further discusses how prosecutors might use a model like ours in enforcement efforts and connects it to a wider legal context.
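To make the idea of opcode-based features and cosine similarity concrete, the following is a minimal sketch and not the study's actual pipeline. It assumes a small hand-written opcode table, hypothetical helper names (`opcode_counts`, `cosine_similarity`), and toy bytecode strings. It counts how often each EVM opcode appears in a contract's bytecode and compares two contracts by the cosine of the angle between their count vectors.

```python
import math
from collections import Counter

# Illustrative subset of EVM opcodes (assumption: a real pipeline would use a
# full opcode table or a disassembler). Unlisted bytes are named by hex value.
OPCODES = {
    0x00: "STOP", 0x01: "ADD", 0x02: "MUL", 0x03: "SUB", 0x04: "DIV",
    0x10: "LT", 0x11: "GT", 0x14: "EQ", 0x15: "ISZERO",
    0x50: "POP", 0x51: "MLOAD", 0x52: "MSTORE", 0x54: "SLOAD", 0x55: "SSTORE",
    0x56: "JUMP", 0x57: "JUMPI", 0x5B: "JUMPDEST", 0xF3: "RETURN", 0xFD: "REVERT",
}


def opcode_counts(bytecode_hex):
    """Return a Counter of opcode mnemonics found in hex-encoded EVM bytecode."""
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)
    counts = Counter()
    i = 0
    while i < len(code):
        op = code[i]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 carry 1..32 bytes of immediate data, which must be
            # skipped so that data bytes are not miscounted as opcodes.
            n = op - 0x5F
            counts[f"PUSH{n}"] += 1
            i += 1 + n
        elif 0x80 <= op <= 0x8F:
            counts[f"DUP{op - 0x7F}"] += 1
            i += 1
        elif 0x90 <= op <= 0x9F:
            counts[f"SWAP{op - 0x8F}"] += 1
            i += 1
        else:
            counts[OPCODES.get(op, f"0x{op:02x}")] += 1
            i += 1
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two opcode-count vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in set(a) | set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy bytecode (not real token contracts): PUSH1 1, PUSH1 2, then ADD or MUL.
contract_a = opcode_counts("0x6001600201")
contract_b = opcode_counts("0x6001600202")
print(contract_a)
print(round(cosine_similarity(contract_a, contract_b), 3))
```

Count vectors like these could be fed to a classifier such as a random forest. Cosine similarity ignores overall contract length and compares only the relative mix of opcodes, which is one reason it suits comparing contracts of different sizes.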