MemEFS: A network-aware elastic in-memory runtime distributed file system

Future Generation Computer Systems, Volume 82, Pages 631-646, 2018
Alexandru Uta (1), Ove Danner (1), Cas van der Weegen (1), Ana-Maria Oprescu (1, 2), Andreea Sandu (1), Stefania Costache (1), Thilo Kielmann (1)
(1) Department of Computer Science, Vrije Universiteit, Amsterdam, The Netherlands
(2) Informatics Institute, Universiteit van Amsterdam, The Netherlands

References

Jacob, 2009, Montage: a grid portal and software toolkit for science-grade astronomical image mosaicking, Int. J. Comput. Sci. Eng., 4, 73
Altschul, 1990, Basic local alignment search tool, J. Mol. Biol., 215, 403, 10.1016/S0022-2836(05)80360-2
Gropp, 1999
Zhang, 2013, Parallelizing the execution of sequential scripts
Uta, 2016, Overcoming data locality: An in-memory runtime file system with symmetrical data distribution, Future Gener. Comput. Syst., 54, 144, 10.1016/j.future.2015.01.013
A. Uta, A. Sandu, T. Kielmann, MemFS: an In-memory runtime file system with symmetrical data distribution, in: IEEE Cluster, 2014, pp. 272–273 (poster paper).
Uta, 2015, Scalable in-memory computing, 805
Hey, 2009
Schad, 2010, Runtime measurements in the cloud: observing, analyzing, and reducing variance, Proc. VLDB Endow., 3, 460, 10.14778/1920841.1920902
Ballani, 2011, Towards predictable datacenter networks, 242
Kandula, 2009, The nature of data center traffic: measurements & analysis, 202
Benson, 2010, Network traffic characteristics of data centers in the wild, 267
Uta, 2015, MemEFS: an elastic in-memory runtime file system for escience applications, 465
M. Szeredi, et al. FUSE: Filesystem in userspace. http://fuse.sourceforge.net/.
B. Aker, Libmemcached, 2016. http://libmemcached.org/libMemcached.html.
Fitzpatrick, 2004, Distributed caching with memcached, Linux J., 2004, 5
Deelman, 2015, Pegasus: a workflow management system for science automation, J. Future Gener. Comput. Syst., 10.1016/j.future.2014.10.008
Karger, 1997, Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the world wide web, 654
xxhash, 2016. https://code.google.com/p/xxhash/.
D. Eastlake, P. Jones, US secure hash algorithm 1 (SHA1), 2001.
R. Rivest, The MD5 message-digest algorithm, 1992.
Godfrey, 2005, Heterogeneity and load balance in distributed hash tables, 596
Stoica, 2001, Chord: A scalable peer-to-peer lookup service for internet applications, 149
Apache Libcloud, 2016. https://libcloud.apache.org.
S. Sanfilippo, P. Noordhuis, Redis, 2014. http://redis.io.
hiredis, 2016. https://github.com/redis/hiredis.
P. Hunt, M. Konar, F.P. Junqueira, B. Reed, ZooKeeper: wait-free coordination for internet-scale systems, in: USENIX Annual Technical Conference, Vol. 8, 2010, pp. 11–11.
SCEC project, Southern California Earthquake Center, 2015. http://www.scec.org/.
Pegasus workflow generator, 2016. https://confluence.pegasus.isi.edu/display/pegasus/WorkflowGenerator.
Juve, 2013, Characterizing and profiling scientific workflows, Future Gener. Comput. Syst., 29, 682, 10.1016/j.future.2012.08.015
Z. Zhang, D. Katz, Using application skeletons to improve escience infrastructure, in: 2014 IEEE 10th International Conference on e-Science (e-Science), Vol. 1, 2014, pp. 111–118.
DAS-4, The distributed ASCI supercomputer, 2016. http://www.cs.vu.nl/das4/.
Open Nebula, 2016. http://www.opennebula.org.
C. Guo, G. Lu, H.J. Wang, S. Yang, C. Kong, P. Sun, W. Wu, Y. Zhang, Secondnet: A data center network virtualization architecture with bandwidth guarantees, in: Proceedings of the 6th International Conference, Co-NEXT ’10, 2010, pp. 15:1–15:12.
R.B. Ross, R. Thakur, et al. PVFS: A parallel file system for linux clusters, in: 4th Annual Linux Showcase and Conference, 2000, pp. 391–430.
GlusterFS, 2016. http://www.gluster.org/.
F. Hupfeld, T. Cortes, B. Kolbeck, J. Stender, E. Focht, M. Hess, J. Malo, J. Marti, E. Cesario, The XtreemFS Architecture - a case for object-based file systems in grids, Concurr. Comput. Pract. Exp.
Shvachko, 2010, The Hadoop distributed file system, 1
Weil, 2006, Ceph: A scalable, high-performance distributed file system, 307
B. Nicolae, P. Riteau, K. Keahey, Bursting the Cloud Data Bubble: Towards transparent storage elasticity in IaaS clouds, in: IEEE 28th International Parallel and Distributed Processing Symposium, IPDPS ’14, 2014, pp. 135–144.
H.C. Lim, S. Babu, J.S. Chase, Automated control for elastic storage, in: 7th International Conference on Autonomic Computing, ICAC ’10, 2010, pp. 1–10.
Ousterhout, 2011, The case for RAMCloud, Commun. ACM, 54, 121, 10.1145/1965724.1965751
A. Dragojevic, D. Narayanan, M. Castro, O. Hodson, FaRM: Fast remote memory, in: 11th USENIX Symposium on Networked Systems Design and Implementation, 2014, pp. 401–414.
Islam, 2014, In-memory i/o and replication for hdfs with memcached: Early experiences, 213
Duro, 2013, A hierarchical parallel storage system based on distributed memory for large scale systems
Amazon ElastiCache, 2016. http://aws.amazon.com/elasticache/.
Hazelcast, 2016. http://hazelcast.com/.
T. Li, X. Zhou, K. Brandstatter, D. Zhao, K. Wang, A. Rajendran, Z. Zhang, I. Raicu, ZHT: A light-weight reliable persistent dynamic zero-hop distributed hash table, in: Parallel & Distributed Processing Symposium (IPDPS), 2013.
Brinkmann, 2002, Compact, adaptive placement schemes for non-uniform requirements, 53
Schindelhauer, 2005, Weighted Distributed Hash Tables, 218
P. Qin, B. Dai, B. Huang, G. Xu, Bandwidth-aware scheduling with sdn in hadoop: A new trend for big data, arXiv preprint arXiv:1403.2800.
Kondikoppa, 2012, Network-aware scheduling of mapreduce framework on distributed clusters over high speed networks, 39
Yazdanov, 2015, Ehadoop: network i/o aware scheduler for elastic mapreduce cluster, 821
Lin, 2014, Bandwidth-aware divisible task scheduling for cloud computing, Softw. - Pract. Exp., 44, 163, 10.1002/spe.2163
Chaves, 2013, Scheduling cloud applications under uncertain available bandwidth, 3781