Heterogeneous resource federation with a centralized security model for information extraction
Abstract
With the continuous growth of data generated in scientific and commercial endeavors, and the rising need for interdisciplinary studies and applications in e-Science, the easy exchange of information and of computational resources capable of processing large amounts of data is becoming ever more important for enabling ad-hoc cooperation. Unfortunately, different communities often use incompatible resource management systems. In this work we try to alleviate the difficulties that arise when bridging the gap between different research ecosystems by federating resources and thus unifying resource access. To this end, we outline a secure, simple, yet highly interoperable and flexible architecture based on RESTful Web services and WebDAV. While standards and solutions addressing related problems already exist, first and foremost in the Grid computing domain, our solution differs from those approaches in that it can federate data storage systems that are not aware of being federated. Access to these systems is provided by our federation layer through storage-system-specific connectors. Our federation approach is therefore intended as an abstraction layer on top of existing storage or middleware solutions, offering a more uniform access mechanism. In addition, our solution supports the submission and management of computational jobs on the federated data, thereby federating not only data but also computational resources. Once resource access is unified, information held in different data formats can be semantically unified by information extraction methods. We believe that this work can complement existing Grid computing efforts by facilitating access to data storage systems that are not inherently available via commonly used Grid computing standards.
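The connector idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `StorageConnector`, `InMemoryConnector`, `FederationLayer`, and the mount-point URI scheme are all hypothetical, and an in-memory dictionary stands in for a real backend that would be reached over WebDAV or a Grid protocol. The sketch only shows how a federation layer might route uniform paths to storage-specific connectors while the backends remain unaware of being federated.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class StorageConnector(ABC):
    """Hypothetical interface implemented once per storage system."""

    @abstractmethod
    def list(self, path: str) -> List[str]:
        """Return the names stored under the given backend path."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        """Return the content stored at the given backend path."""


class InMemoryConnector(StorageConnector):
    """Stand-in backend (assumption); a real connector would talk to a server."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self._files = dict(files)

    def list(self, path: str) -> List[str]:
        prefix = path.rstrip("/") + "/"
        return sorted(name for name in self._files if name.startswith(prefix))

    def read(self, path: str) -> bytes:
        return self._files[path]


class FederationLayer:
    """Routes uniform '/<mount>/<path>' URIs to the matching connector."""

    def __init__(self) -> None:
        self._mounts: Dict[str, StorageConnector] = {}

    def mount(self, name: str, connector: StorageConnector) -> None:
        self._mounts[name] = connector

    def _resolve(self, uri: str) -> Tuple[StorageConnector, str]:
        # Split '/<mount>/<rest>' into the mount name and the backend path.
        name, _, rest = uri.lstrip("/").partition("/")
        if name not in self._mounts:
            raise KeyError("no storage system mounted as '%s'" % name)
        return self._mounts[name], "/" + rest

    def list(self, uri: str) -> List[str]:
        connector, path = self._resolve(uri)
        return connector.list(path)

    def read(self, uri: str) -> bytes:
        connector, path = self._resolve(uri)
        return connector.read(path)


# Two independent backends exposed through one uniform namespace.
federation = FederationLayer()
federation.mount("bio", InMemoryConnector({"/papers/a.txt": b"protein data"}))
federation.mount("geo", InMemoryConnector({"/maps/b.txt": b"terrain data"}))
```

A client then addresses both systems in the same way, for example `federation.read("/bio/papers/a.txt")` and `federation.list("/geo/maps")`, without knowing which storage technology sits behind each mount point.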