DeStager: feature guided in-situ data management in distributed deep memory hierarchies

Xuechen Zhang1, Fang Zheng2, Bao Quoc Nguyen1
1School of Engineering and Computer Science, Washington State University-Vancouver, Vancouver, USA
2IBM T.J. Watson Research Center, New York, USA

Abstract

Keywords


References

Abbasi, H., Wolf, M., Eisenhauer, G., Klasky, S., Schwan, K., Zheng, F.: Datastager: scalable data staging services for petascale applications. In: HPDC (2009)

ADIOS.: Adaptive I/O system. http://www.olcf.ornl.gov/center-projects/adios/ (2012)

Al-Furaih, I., Aluru, S., Goil, S., Ranka, S.: Parallel construction of multidimensional binary search trees. In: ICS (1996)

Caulfield, A.M., Grupp, L.M., Swanson, S.: Gordon: using flash memory to build fast, power-efficient clusters for data-intensive applications. In: ASPLOS (2009)

Center-wide Scratch Filesystem Atlas.: https://www.olcf.ornl.gov/kb_articles/atlas-transition/

Chen, F., Koufaty, D.A., Zhang, X.: Hystor: making the best use of solid state drives in high performance storage systems. In: ICS (2011)

Chen, G., Vo, H.T., Wu, S., Ooi, B.C., Özsu, M.T.: A framework for supporting DBMS-like indexes in the cloud. PVLDB 4(11), 702–713 (2011)

Dayal, J., Bratcher, D., Eisenhauer, G., Schwan, K., Wolf, M., Zhang, X., Abbasi, H., Klasky, S., Podhorszki, N.: Flexpath: type-based publish/subscribe system for large-scale science analytics. In: CCGrid (2014)

Evpath.: An event transport middleware layer. http://www.cc.gatech.edu/systems/projects/EVPath/

Hawkes, E.R., Sankaran, R., Sutherland, J.C., Chen, J.H.: Direct numerical simulation of turbulent combustion: fundamental insights towards predictive models. J. Phys. Conf. Ser. 16, 65–79 (2005)

Eisenhauer, G., Wolf, M., Abbasi, H., Schwan, K.: Event-based systems: opportunities and challenges at exascale. In: DEBS (2009)

Guttman, A.: R-trees: A dynamic index structure for spatial searching. In: Yormark, B. (ed) SIGMOD (1984)

He, J., Bennett, J., Snavely, A.: Dash-IO: an empirical study of flash-based IO for HPC. In: TG (2010)

He, J., Jagatheesan, A., Gupta, S., Bennett, J., Snavely, A.: Dash: a recipe for a flash-based data intensive supercomputer. In: SC (2010)

Heikkinen, J.A., Janhunen, S.J., Kiviniemi, T.P., Ogando, F.: Full f gyrokinetic method for particle simulation of tokamak transport. J. Comput. Phys. 227(11), 5582–5609 (2008)

Jin, T., Zhang, F., Sun, Q., Bui, H., Parashar, M., Yu, H., Klasky, S., Podhorszki, N., Abbasi, H.: Using cross-layer adaptations for dynamic data management in large scale coupled scientific workflows. In: SC, p. 74 (2013)

Jin, T., Zhang, F., Sun, Q., Bui, H., Romanus, M., Podhorszki, N., Klasky, S., Kolla, H., Chen, J., Hager, R., Chang, C.S., Parashar, M.: Exploring data staging across deep memory hierarchies for coupled data intensive simulation workflows. In: IPDPS (2015)

Jung, M., Wilson III, E.H., Choi, W., Shalf, J., Aktulga, H.M., Yang, C., Saule, E., Catalyurek, U.V., Kandemir, M.: Exploring the future of out-of-core computing with compute-local non-volatile memory. In: SC (2013)

Kim, J., Abbasi, H., Chacón, L., Docan, C., Klasky, S., Liu, Q., Podhorszki, N., Shoshani, A., Wu, K.: Parallel in situ indexing for data-intensive computing. In: LDAV, pp. 65–72 (2011)

Klasky, S., Ethier, S., Lin, Z., Martins, K., McCune, D., Samtaney, R.: Grid-based parallel data streaming implemented for the gyrokinetic toroidal code. In: SC (2003)

Lakshminarasimhan, S., Boyuka, D.A., Pendse, S.V., Zou, X., Jenkins, J., Vishwanath, V., Papka, M.E., Samatova, N.F.: Scalable in situ scientific data encoding for analytical query processing. In: HPDC (2013)

Lashuk, I., Chandramowlishwaran, A., Langston, H., Nguyen, T.-A., Sampath, R., Shringarpure, A., Vuduc, R., Ying, L., Zorin, D., Biros, G.: A massively parallel adaptive fast multipole method on heterogeneous architectures. In: SC (2009)

Lee, D., Vuduc, R., Gray, A.G.: A distributed kernel summation framework for general-dimension machine learning. In: SDM (2012)

Lee, T., Moon, B., Lee, S.: Bulk insertion for R-trees by seeded clustering. Data Knowl. Eng. 59(1), 86–106 (2006)

Liu, N., Cope, J., Carns, P.H., Carothers, C.D., Ross, R.B., Grider, G., Crume, A., Maltzahn, C.: On the role of burst buffers in leadership-class storage systems. In: MSST, pp. 1–11 (2012)

Lorensen, W.E., Cline, H.E.: Marching cubes: a high resolution 3D surface construction algorithm. In: SIGGRAPH (1987)

Mehta, D.P., Sahni, S.: Handbook of Algorithms and Data Structures. Chapman and Hall, London (2004)

Moon, B., Jagadish, H.V., Faloutsos, C., Saltz, J.H.: Analysis of the clustering properties of the Hilbert space-filling curve. IEEE Trans. Knowl. Data Eng. 13(1), 124–141 (2001)

Nam, B., Sussman, A.: Spatial indexing of distributed multidimensional datasets. In: CCGRID, pp. 743–750 (2005)

Nam, B., Sussman, A.: Dist: fully decentralized indexing for querying distributed multidimensional datasets. In: IPDPS (2006)

Nguyen, B., Tan, H., Zhang, X.: Large-scale adaptive mesh simulations through non-volatile byte-addressable memory. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017, Denver, CO (2017)

Plimpton, S.: Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 117(1), 1–19 (1995)

Prabhakar, R., Vazhkudai, S.S., Kim, Y., Butt, A.R., Li, M., Kandemir, M.: Provisioning a multi-tiered data staging area for extreme-scale machines. In: The 31st International Conference on Distributed Computing Systems (2011)

Rajachandrasekar, R., Ouyang, X., Besseron, X., Meshram, V., Panda, D.K.: Can checkpoint/restart mechanisms benefit from hierarchical data staging? Euro-Par Workshops 2, 312–321 (2011)

Reliable UDP networking library.: http://enet.bespin.org/

Schnitzer, B., Leutenegger, S.T.: Master-client R-trees: a new parallel R-tree architecture. In: SSDBM (1999)

Shekhar, R., Fayyad, E., Yagel, R., Cornhill, J.F.: Octree-based decimation of marching cubes surfaces. In: VIS (1996)

Su, Y., Wang, Y., Agrawal, G.: In-situ bitmaps generation and efficient data analysis based on bitmaps. In: HPDC (2015)

The sith cluster.: https://www.olcf.ornl.gov/computing-resources/sith/

The architecture of burst buffer.: http://www.nersc.gov/users/computational-systems/cori/burst-buffer/burst-buffer/

Vetter, J.S., Mittal, S.: Opportunities for nonvolatile memory systems in extreme-scale high-performance computing. Comput. Sci. Eng. 17(2), 73–82 (2015)

Wang, C., Vazhkudai, S.S., Ma, X., Meng, F., Kim, Y., Engelmann, C.: Nvmalloc: Exposing an aggregate SSD store as a memory partition in extreme-scale machines. In: IPDPS, pp. 957–968 (2012)

Wolf, M., Cai, Z., Huang, W., Schwan, K.: Smartpointers: personalized scientific data portals in your hand. In: SC, pp. 1–16 (2002)

Yang, Q., Ren, J.: I-cash: Intelligently coupled array of SSD and HDD. In: HPCA (2011)

Yu, H., Wang, C., Grout, R.W., Chen, J.H., Ma, K.-L.: In situ visualization for large-scale combustion simulations. IEEE Comput. Graph. Appl. 30(3), 45–57 (2010)

Zhang, W., Tang, H., Ranshous, S., Byna, S., Martin, D.F., Wu, K., Dong, B., Klasky, S., Samatova, N.F.: Exploring memory hierarchy and network topology for runtime AMR data sharing across scientific applications. In: Big Data (2016)

Zhang, X., Zheng, F., Schwan, K., Wolf, M.: Flashstager: improving the performance of SSD-based data staging systems via write redirection. In: CLUSTER (2016)