Authoritative sources in a hyperlinked environment

Journal of the ACM - Volume 46, Issue 5 - Pages 604-632 - 1999
Jon Kleinberg
Cornell University, Ithaca, NY

Abstract

The network structure of a hyperlinked environment can be a rich source of information about the content of the environment, provided we have effective means for understanding it. We develop a set of algorithmic tools for extracting information from the link structures of such environments, and report on experiments that demonstrate their effectiveness in a variety of contexts on the World Wide Web. The central issue we address within our framework is the distillation of broad search topics, through the discovery of “authoritative” information sources on such topics. We propose and test an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of “hub pages” that join them together in the link structure. Our formulation has connections to the eigenvectors of certain matrices associated with the link graph; these connections in turn motivate additional heuristics for link-based analysis.
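The abstract does not spell out the update rule, so the following is only a minimal sketch of the mutually reinforcing hub/authority iteration as it is commonly presented: a page's authority score is the sum of the hub scores of the pages linking to it, and its hub score is the sum of the authority scores of the pages it links to, with both vectors rescaled to unit length after each round. The names `hits`, `normalize`, and `example_links`, the toy graph, and the fixed iteration count are all assumptions made for illustration. With adjacency matrix A, this kind of iteration is a power method, so under mild conditions the authority scores approach the principal eigenvector of AᵀA and the hub scores that of AAᵀ, which is the eigenvector connection the abstract alludes to.

```python
import math


def normalize(scores):
    """Rescale a score dict to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Illustrative hub/authority iteration.

    links: dict mapping each page to a list of pages it links to.
    Returns (hub, auth), two dicts of unit-length scores.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority update: sum hub scores over incoming links.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # Hub update: sum the fresh authority scores over outgoing links.
        new_hub = {p: sum(new_auth[d] for d in links.get(p, ())) for p in pages}

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


# Hypothetical toy graph: three hub-like pages pointing at two candidate authorities.
example_links = {
    "h1": ["a1", "a2"],
    "h2": ["a1", "a2"],
    "h3": ["a1"],
}

hub, auth = hits(example_links)
for page in sorted(auth, key=auth.get, reverse=True):
    print(page, round(auth[page], 3), round(hub[page], 3))
```

In this toy graph, `a1` receives links from all three hub-like pages and `a2` from only two, so `a1` ends with the larger authority score; `h1` and `h2` point at both authorities and end with the largest hub scores.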
