Overlap in bibliographic databases

Wiley - Volume 54, Issue 12 - Pages 1091-1103 - 2003
William W. Hood (1), Concepción S. Wilson (1)
(1) School of Information Systems, Technology and Management, The University of New South Wales, Sydney, 2052 Australia

Abstract

Bibliographic databases contain surrogates for a particular subset of the complete set of literature; some databases are very narrow in their scope, while others are multidisciplinary. These databases overlap in their coverage of the literature to a greater or lesser extent. The topic of Fuzzy Set Theory is examined to determine the overlap of coverage in the databases that index this topic. It was found that about 63% of records in the data set are unique to a single database, and the remaining 37% are duplicated in two to 12 different databases. The overlap distribution is found to conform to a Lotka-type plot. The records with maximum overlap are identified; however, further work is needed to determine the significance of the high level of overlap in these records. The unique records are plotted using a Bradford-type form of data presentation and are found to conform (visually) to a hyperbolic distribution. The extent and causes of intra-database duplication (records duplicated within a single database) are also examined. Finally, the overlap among the top databases in the data set was examined, and a high correlation was found between overlapping records and overlapping DIALOG OneSearch categories.
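The overlap measure described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes duplicate records have already been matched to a shared identifier, and the database names, record identifiers, and function names (`overlap_distribution`, `unique_share`) are hypothetical. It counts how many databases index each record, then tabulates how many records fall at each overlap level, which is the frequency table a Lotka-type plot would be drawn from.

```python
from collections import Counter


def overlap_distribution(holdings):
    """Return a Counter mapping k -> number of records indexed by exactly k databases.

    `holdings` maps a database name to the collection of record identifiers
    it indexes. Identifiers are assumed to be already de-duplicated across
    databases (the matching step itself is not shown here).
    """
    databases_per_record = Counter()
    for records in holdings.values():
        # set() guards against intra-database duplicates inflating the count
        databases_per_record.update(set(records))
    return Counter(databases_per_record.values())


def unique_share(distribution):
    """Fraction of records that appear in exactly one database."""
    total = sum(distribution.values())
    return distribution[1] / total if total else 0.0


# Hypothetical toy data, not taken from the study
holdings = {
    "DB_A": {"r1", "r2", "r3", "r4"},
    "DB_B": {"r3", "r4", "r5"},
    "DB_C": {"r4", "r6", "r7", "r8"},
}

distribution = overlap_distribution(holdings)
share = unique_share(distribution)

for k in sorted(distribution):
    print(f"records in exactly {k} database(s): {distribution[k]}")
print(f"share of unique records: {share:.2f}")
```

In this toy data six of the eight records are unique to one database, one appears in two, and one appears in all three. The study reports the analogous figures for its own data set as about 63% unique and 37% duplicated.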
