On the utility of identification schemes for digital earth science data: an assessment and recommendations

Springer Science and Business Media LLC - Volume 4 - Pages 139-160 - 2011
Ruth E. Duerr1, Robert R. Downs2, Curt Tilmes3, Bruce Barkstrom4, W. Christopher Lenhardt5, Joseph Glassy6, Luis E. Bermudez7, Peter Slaughter8
1National Snow and Ice Data Center, University of Colorado at Boulder, Boulder, USA
2Center for International Earth Science Information Network (CIESIN), Columbia University, Palisades, USA
3NASA Goddard Space Flight Center, Greenbelt, USA
4NASA/NOAA, Asheville, USA
5Oak Ridge National Laboratory, Oak Ridge, USA
6R&D, Lupine Logic Inc., Missoula, USA
7Open Geospatial Consortium (OGC), Herndon, USA
8Earth Research Institute, University of California at Santa Barbara, Santa Barbara, USA

Abstract

In recent years, a number of data identification technologies have been developed that purport to permanently identify digital objects. In this paper, nine technologies and systems for assigning persistent identifiers are assessed for their applicability to Earth science data: ARKs, DOIs, XRIs, Handles, LSIDs, OIDs, PURLs, URIs/URNs/URLs, and UUIDs. The evaluation used four use cases that focused on the suitability of each scheme to provide Unique Identifiers for Earth science data objects, to provide Unique Locators for the objects, to serve as Citable Locators, and to uniquely identify the scientific contents of data objects if the data were reformatted. Of all the identifier schemes assessed, the one that most closely meets all of the requirements for a Unique Identifier is the UUID scheme. Any of the URL/URI/IRI-based identifier schemes assessed could be used for Unique Locators. Since there are currently no strong market leaders to help make the choice among them, the decision must be based on secondary criteria. Most publications now allow the use of URLs in citations, so all of the URL/URI/IRI-based identification schemes discussed in this paper could potentially be used as Citable Locators; however, DOIs are the identification scheme currently adopted by most commercial publishers. None of the identifier schemes assessed here even minimally addresses the identification of scientifically identical numerical data sets under reformatting.
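The first and fourth use cases can be illustrated with a minimal Python sketch. It is not taken from the paper: the sample values, the field name `temperature_k`, and the helper functions `file_digest` and `content_digest` are hypothetical, and the canonical encoding (big-endian IEEE 754 doubles) is only one assumed way to compare scientific contents. The sketch mints a random UUID as a Unique Identifier, then shows that a hash of the file bytes changes when the same numbers are reformatted, whereas a hash of the canonicalized values does not.

```python
import hashlib
import json
import struct
import uuid

# Hypothetical values standing in for a small numerical data set.
values = [271.15, 272.4, 269.9]

# Unique Identifier use case: a random (version 4) UUID can be minted
# locally, without a central registry or resolver.
object_id = uuid.uuid4()

# Two serializations of the same numbers, as might result from reformatting.
as_csv = ",".join(str(v) for v in values).encode("utf-8")
as_json = json.dumps({"temperature_k": values}).encode("utf-8")


def file_digest(payload: bytes) -> str:
    """Hash the serialized bytes; this is sensitive to the file format."""
    return hashlib.sha256(payload).hexdigest()


def content_digest(numbers) -> str:
    """Hash an assumed canonical encoding: big-endian IEEE 754 doubles."""
    canonical = b"".join(struct.pack(">d", float(n)) for n in numbers)
    return hashlib.sha256(canonical).hexdigest()


# Read the numbers back from each format before comparing contents.
parsed_csv = [float(x) for x in as_csv.decode("utf-8").split(",")]
parsed_json = json.loads(as_json)["temperature_k"]

print("identifier:", object_id)
print("file digests match:", file_digest(as_csv) == file_digest(as_json))
print("content digests match:",
      content_digest(parsed_csv) == content_digest(parsed_json))
```

A content-level digest of this kind depends on agreed conventions for value ordering, precision, and missing data, which may be part of why format-independent identification is harder than assigning an identifier to a file.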
