Reclaiming space from duplicate files in a serverless distributed file system
Tóm tắt
The Farsite distributed file system provides availability by replicating each file onto multiple desktop computers. Since this replication consumes significant storage space, it is important to reclaim used space where possible. Measurement of over 500 desktop file systems shows that nearly half of all consumed space is occupied by duplicate files. We present a mechanism to reclaim space from this incidental duplication to make it available for controlled file replication. Our mechanism includes: (1) convergent encryption, which enables duplicate files to be coalesced into the space of a single file, even if the files are encrypted with different users' keys; and (2) SALAD, a Self-Arranging Lossy Associative Database for aggregating file content and location information in a decentralized, scalable, fault-tolerant manner. Large-scale simulation experiments show that the duplicate-file coalescing system is scalable, highly effective, and fault-tolerant.
Từ khóa
#File systems #Cryptography #Extraterrestrial measurements #Databases #Large-scale systems #File servers #Availability #Fault diagnosis #Secure storage #Distributed computingTài liệu tham khảo
wolf, 1989, The Placement Optimization Program: A Practical Solution to the Disk File Assignment Problem, ACM SIGMETRICS 89, 10.1145/75108.75373
waldman, 2000, Publius: A Robust, Tamper-Evident Censorship-Resistant Web Publishing System, 9th USENIX Security Symposium, 59
10.1109/MRD.1990.138237
saroiu, 2002, A Measurement Study of Peer-to-Peer File Sharing Systems, SPIE Multimedia Computing and Networking 2002
rowstron, 0, Pastry: Scalable, Distributed Object Location and Routing for Large-Scale Peer-to-Peer Systems, SIGCOMM 2001 submission
10.1145/964723.383072
10.1145/268998.266694
10.1145/964723.383071
song, 2000, Practical Techniques for Searches on Encrypted Data, IEEE Symposium on Security and Privacy, 44
solomon, 1998, Inside Windows NT
cabri, 1996, Experience of Adaptive Replication in Distributed File Systems, 22nd EUROMICRO IEEE, 459
10.1145/510726.510755
castro, 1999, Practical Byzantine Fault Tolerance, 3rd OSDI USENIX, 173
clarke, 2000, Freenet: A Distributed Anonymous Information Storage and Retrieval System, Proc ICSI Workshop Design Issues in Anonymity and Unobservability
10.1145/301453.301480
10.1109/RELDIS.2001.969727
10.1109/ICDCS.2002.1022312
droms, 1997, Dynamic Host Configuration Protocol, RFC 2131
10.1145/75577.75583
10.1109/ADL.1998.670389
guy, 1990, Implementation of the Ficus Replicated File System, 1990 USENIX Conference Usenix, 63
10.1145/502034.502052
10.1145/168588.168596
menezes, 1997, Handbook of Applied Cryptography
anderson, 1996, Two Practical and Provably Secure Block Ciphers: BEAR and LION, Proc 3rd Fast Software Encryption Workshop, 113, 10.1007/3-540-60865-6_48
10.1145/362686.362692
10.1145/258492.258523
biham, 1991, Differential Fault Analysis of Secret Key Cryptosystems, CRYPTO'91, 156
10.1145/339331.339345
bolosky, 2000, Single Instance Storage in Windows® 2000, Proc 4th Usenix Windows Systems Symp, 13
10.1145/224056.224066
brinkmann, 2000, Efficient, Distributed Data Placement Strategies for Storage Area Networks, 12th SPAA ACM
alexander, 1997, DHCP Options and BOOTP Vendor Extensions, RFC 2132
10.1145/35037.35059
kelsey, 2000, Side Channel Cryptanalysis of Product Ciphers, Journal of Computer Security, 8, 141, 10.3233/JCS-2000-82-304
jain, 1991, The Art of Computer Systems Performance Analysis, John Wiley & Sons
kocher, 1999, Differential Power Analysis, Journal of Cryptology'99, 388, 10.1007/3-540-48405-1_25
kocher, 1996, Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS and Other Systems, Crypto, 104
10.1109/MRD.1992.242617
kure, 1988, Optimization of File Migration in Distributed Systems, Technical Report UCB/CSD 88/413 University of California at Berkeley
