Comparison and evaluation of state-of-the-art LSM merge policies
Tóm tắt
Modern NoSQL database systems use log-structured merge (LSM) storage architectures to support high write throughput. LSM architectures aggregate writes in a mutable MemTable (stored in memory), which is regularly flushed to disk, creating a new immutable file called an SSTable. Some of the SSTables are chosen to be periodically merged—replaced with a single SSTable containing their union. A merge policy (a.k.a. compaction policy) specifies when to do merges and which SSTables to combine. A bounded depth merge policy is one that guarantees that the number of SSTables never exceeds a given parameter k, typically in the range 3–10. Bounded depth policies are useful in applications where low read latency is crucial, but they and their underlying combinatorics are not yet well understood. This paper compares several bounded depth policies, including representative policies from industrial NoSQL databases and two new ones based on recent theoretical modeling, as well as the standard Tiered policy and Leveled policy. The results validate the proposed theoretical model and show that, compared to the existing policies, the newly proposed policies can have substantially lower write amplification with comparable read amplification.
Tài liệu tham khảo
Ahmad, M.Y., Kemme, B.: Compaction management in distributed key-value datastores. Proc. VLDB Endow. 8(8), 850–861 (2015)
Alsubaiee, S., Altowim, Y., Altwaijry, H., Behm, A., Borkar, V., Bu, Y., Carey, M., Cetindil, I., Cheelangi, M., Faraaz, K., Gabrielova, E., Grover, R., Heilbron, Z., Kim, Y.S., Li, C., Li, G., Ok, J.M., Onose, N., Pirzadeh, P., Tsotras, V., Vernica, R., Wen, J., Westmann, T.: AsterixDB: a scalable, open source BDMS. Proc. VLDB Endow. 7(14), 1905–1916 (2014a)
Alsubaiee, S., Behm, A., Borkar, V., Heilbron, Z., Kim, Y.S., Carey, M.J., Dreseler, M., Li, C.: Storage management in AsterixDB. Proc. VLDB Endow. 7(10), 841–852 (2014b)
Apache Software Foundation: Apache Cassandra. (2019a). http://cassandra.apache.org
Apache Software Foundation: Apache HBase. (2019b). https://hbase.apache.org
Apache Software Foundation: Apache AsterixDB. (2020). https://asterixdb.apache.org
Balmau, O., Dinu, F., Zwaenepoel, W., Gupta, K., Chandhiramoorthi, R., Didona, D.: SILK: Preventing latency spikes in log-structured merge key-value stores. In: 2019 USENIX Annual Technical Conference (USENIX ATC 19), pp 753–766 (2019)
Cattell, R.: Scalable SQL and NoSQL data stores. SIGMOD Rec. 39(4), 12–27 (2011)
Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra, T., Fikes, A., Gruber, R.E.: Bigtable: a distributed storage system for structured data. ACM Trans. Comput. Syst. 26(2), 4:1–4:26 (2008)
Chen, F., Lee, R., Zhang, X.: Essential roles of exploiting internal parallelism of flash memory based solid state drives in high-speed data processing. In: 2011 IEEE 17th International Symposium on High Performance Computer Architecture, pp. 266–277. IEEE (2011)
Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.: Benchmarking cloud serving systems with YCSB. In: Proceedings of the 1st ACM Symposium on Cloud Computing, ACM, SoCC ’10, pp. 143–154 (2010)
Corbett, J.C., Dean, J., Epstein, M., Fikes, A., Frost, C., Furman, J.J., Ghemawat, S., Gubarev, A., Heiser, C., Hochschild, P., Hsieh, W., Kanthak, S., Kogan, E., Li, H., Lloyd, A., Melnik, S., Mwaura, D., Nagle, D., Quinlan, S., Rao, R., Rolig, L., Saito, Y., Szymaniak, M., Taylor, C., Wang, R., Woodford, D.: Spanner: Google’s globally distributed database. ACM Trans. Comput. Syst. 31(3), 8:1–8:22 (2013)
Dayan, N., Idreos, S.: Dostoevsky: Better space-time trade-offs for LSM-Tree based key-value stores via adaptive removal of superfluous merging. In: Proceedings of the 2018 International Conference on Management of Data, ACM, SIGMOD ’18, pp. 505–520 (2018)
Dayan, N., Athanassoulis, M., Idreos, S.: Monkey: Optimal navigable key-value store. In: Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD, pp. 79–94 (2017)
Dent, A.: Getting started with LevelDB. Packt Publishing Ltd, London (2013)
Dong, S., Callaghan, M., Galanis, L., Borthakur, D., Savor, T., Strum, M.: Optimizing space amplification in RocksDB. In: CIDR, CIDR, vol 3, p. 3 (2017)
Facebook, Inc: RocksDB. (2020a). https://rocksdb.org
Facebook, Inc: RocksDB Wiki: Compaction. https://github.com/facebook/rocksdb/wiki/Compaction (2020b)
George, L.: HBase: The Definitive Guide: Random Access to Your Planet-Size Data. O’Reilly Media Inc, Newton (2011)
Google LLC: Bigtable. (2019a). https://cloud.google.com/bigtable
Google LLC: LevelDB. (2019b). https://github.com/google/leveldb
Graefe, G., et al.: Modern B-tree techniques. Found. Trends Databases 3(4), 203–402 (2011)
Grover, R., Carey, M.J.: Data ingestion in AsterixDB. In: EDBT, OpenProceedings.org, pp. 605–616 (2015)
Jermaine, C., Omiecinski, E., Yee, W.G.: The partitioned exponential file for database storage management. VLDB J. 16(4), 417–437 (2007)
Judd, D.: Scale out with HyperTable. Linux magazine, August 7th 1, (2008). http://www.linux-mag.com/id/6645
Kepner, J., Arcand, W., Bestor, D., Bergeron, B., Byun, C., Gadepally, V., Hubbell, M., Michaleas, P., Mullen, J., Prout, A., et al.: Achieving 100,000,000 database inserts per second using Accumulo and D4M. In: 2014 IEEE High Performance Extreme Computing Conference (HPEC), pp. 1–6. IEEE, IEEE (2014)
Khetrapal, A., Ganesh, V.: HBase and Hypertable for large scale distributed storage systems. Department of Computer Science, Purdue University 10(1376616.1376726) (2006)
Kuszmaul, B.C.: A comparison of fractal trees to log-structured merge (LSM) trees. Tokutek White Paper (2014)
Lakshman, A., Malik, P.: Cassandra: a decentralized structured storage system. SIGOPS Oper. Syst. Rev. 44(2), 35–40 (2010)
Lim, H., Andersen, D.G., Kaminsky, M.: Towards accurate and fast evaluation of multi-stage log-structured designs. In: 14th USENIX Conference on File and Storage Technologies (FAST 16), USENIX Association, pp. 149–166 (2016)
Lu, L., Pillai, T.S., Gopalakrishnan, H., Arpaci-Dusseau, A.C., Arpaci-Dusseau, R.H.: WiscKey: separating keys from values in SSD-conscious storage. ACM Trans. Storage 13(1), 5:1–5:28 (2017)
Luo, C., Carey, M.J.: Breaking down memory walls: adaptive memory management in LSM-based storage systems (extended version). arXiv:2004.10360 (2020a)
Luo, C., Carey, M.J.: LSM-based storage techniques: a survey. VLDB J. 29(1), 393–418 (2020b)
Mao, Q., Qader, M.A., Hristidis, V.: Comprehensive comparison of LSM architectures for spatial data. In: 2020 IEEE International Conference on Big Data (Big Data), pp. 455–460. IEEE (2020)
Mathieu, C., Staelin, C., Young, N.E., Yousefi, A.: Bigtable merge compaction. arXiv:1407.3008 (2014)
O’Neil, P., Cheng, E., Gawlick, D., O’Neil, E.: The log-structured merge-tree (LSM-tree). Acta Inf. 33(4), 351–385 (1996)
Patil, S., Polte, M., Ren, K., Tantisiriroj, W., Xiao, L., López, J., Gibson, G., Fuchs, A., Rinaldi, B.: YCSB++: Benchmarking and performance debugging advanced features in scalable table stores. In: Proceedings of the 2Nd ACM Symposium on Cloud Computing, ACM, SOCC ’11, pp. 9:1–9:14 (2011)
Raju, P., Kadekodi, R., Chidambaram, V., Abraham, I.: PebblesDB: Building key-value stores using fragmented log-structured merge trees. In: Proceedings of the 26th Symposium on Operating Systems Principles, pp 497–514 (2017)
Ren, K., Zheng, Q., Arulraj, J., Gibson, G.: SlimDB: a space-efficient key-value storage engine for semi-sorted data. Proc. VLDB Endow. 10(13), 2037–2048 (2017)
Sears, R., Ramakrishnan, R.: bLSM: a general purpose log structured merge tree. In: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, ACM, SIGMOD ’12. pp. 217–228 (2012)
Teng, D., Guo, L., Lee, R., Chen, F., Ma, S., Zhang, Y., Zhang, X.: LSbM-tree: Re-enabling buffer caching in data management for mixed reads and writes. In: Distributed Computing Systems (ICDCS), 2017 IEEE 37th International Conference on, pp. 68–79. IEEE (2017)
UCR DBLab: Merge Simulator. (2019). https://github.com/UC-Riverside-DatabaseLab/MergeSimulator
Wang, P., Sun, G., Jiang, S., Ouyang, J., Lin, S., Zhang, C., Cong, J.: An efficient design and implementation of LSM-tree based key-value store on open-channel SSD. In: Proceedings of the Ninth European Conference on Computer Systems, ACM, EuroSys ’14, pp. 16:1–16:14 (2014)
Yahoo!: Yahoo! cloud serving benchmark. (2019). https://github.com/brianfrankcooper/YCSB