Một chiến lược xử lý truy vấn đa từ khóa hiệu quả trên tìm kiếm Web dựa trên P2P

EDP Sciences - Tập 12 - Trang 881-886
Derong Shen1, Ge Yu1, Hongkai Zhu2, Meifang Li1
1College of Information Science and Engineering, Northeastern University, Shenyang, China
2Department of New Froduct Research, Baidu.com, Inc., Beijing, China

Tóm tắt

Bài báo trình bày một chiến lược xử lý truy vấn dựa trên lợi ích mới mẻ nhằm tối ưu hóa việc định tuyến truy vấn. Dựa trên DHT như là một mạng chồng, trước tiên, nó áp dụng điểm cân bằng Nash để xây dựng nhóm đồng đẳng tối ưu dựa trên mối tương quan của các từ khóa và sự bao phủ cũng như chồng chéo của các đồng đẳng nhằm giảm thiểu chi phí thời gian, sau đó đưa ra một kiến trúc hai lớp cho việc xử lý truy vấn, sử dụng bộ lọc Bloom như một đại diện gọn để giảm thiểu mức tiêu thụ băng thông. Các thí nghiệm rộng rãi được thực hiện trên một tập dữ liệu thực tế đã chứng minh rằng phương pháp của chúng tôi rõ ràng giảm thiểu thời gian xử lý, đồng thời cải thiện độ chính xác và độ thu hồi.

Từ khóa

#Xử lý truy vấn #DHT #nhóm đồng đẳng #điểm cân bằng Nash #bộ lọc Bloom

Tài liệu tham khảo

Stoica I, Morris R, Karger D, et al. Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications [C]// Proc of SIGCOMM’01. San Diego, California, USA, August 27–31, 2001: 149–160. Druschel P, Rowstron A. Pastry: Scalable, Distributed Object Location and Routing for Large-Scale Peer-to-Peer Systems [C]//Proc of IFIP/ACM Middleware 2001, Heidelberg, Germany, November 23–27, 2001: 329–350. Aberer K, Punceva M, Hauswirth M, et al Improving Data Access in P2P Systems [J]. IEEE Internet Computing, 2002, 6(1):58–67. Wang Y, Galanis L, Witt D J. Galanx: An Efficient Peer-to-Peer Search Engine System. Proseminar Peer-to-Peer Information Systems [EB/OL]. [2004-08-01]. http://www.cs.wisc.edu/yuanwang . Cuenca-Acuna F M, Peery C, Richard P M et al. PlanetP: Using Gossiping to Build Content Addressable Peer-to-Peer Information Sharing Communities [R]. New Brunswick, USA: Dept of Computer Science, Rutgers University, 2002. Suel T, Mathur C, Wu J, et al. Odissea: A Peer-to-Peer Architecture for Scalable Web Search and Information Retrieval [R]. Hong Kong: Polytechnic Univ, 2003. Callan J, Lu Z, Croft W. Searching Distributed Collections with Inference Networks[C]//Proc of the 18th Annual Int ACM SIGIR Conf. on Research and Development in Information Retrieval. Seattle, USA, August 9–13, 1995: 21–28. Fuhr N. A Decision-Theoretic Approach to Database Selection in Networked IR [J]. ACM Transactions on Information Systems, 1999, 17(3):229–249. Gravano L, Garcia-Molina H, Tomasic A. Gloss: Text-Source Discovery over the Internet [J]. ACM Trans Database Syst, 1999, 24(2):229–264. Si L, Jin R, Callan J, et al. A Language Modeling Framework for Resource Selection and Results Merging [C]//Proc of CIKM’ 02. McLean, Virginia, USA, November 8–12, 2002: 94–100. Xu J, Croft B. Cluster-Based Language Models for Distributed Retrieval [C]// Proc of ACM SIGIR Conference. Berkeley, California, USA, April 12–16, 1999: 254–261. Crespo A, Garcia-Molina H. Semantic Overlay Networks for P2P Systems[C]// Proc of the 3rd Int’l Workshop on Agents and Peer-to-Peer Computing. Berlin, Germany: Springer-Verlag, 2004: 1–13. Triantafillou P, Xiruhaki C, Koubarakis M, et al. Towards High Performance Peer-to-Peer Content and Resource Sharing Systems [C]// Proc of the First Biennial Conference on Innovative Data Systems Research (CIDR 2003). Pacific Grove, California, USA, January 13–16, 2003: 341–355. Cohen E, Fiat A, Kaplan H. Associative Search in Peer to Peer Networks: Harnessing Latent Semantics [C]// Proc of the 22nd Annual Joint Conf of the IEEE Computer and Communications Societies (INFOCOM 2003). California, USA. April 14–17, 2003:1261–1271. Voulgaris S, Kermarrec A M, Massoulié L, et al. Exploiting Semantic Proximity in Peer-to-Peer Content Searching[C]// Proc of the10th International Workshop on Future Trends in Distributed Computing Systems (FTDCS 2004). Suzhou, China, May 26–28, 2004:238–243. Sripanidkulchai K, Maggs B, Zhang H. Efficient Content Location Using Interest-Based Locality in Peer-to-Peer Systems[C]// Proc of the 22nd Annual Joint Conf of the IEEE Computer and Communications Societies (INFOCOM 2003). California, USA, April 23–26, 2003:175–186. Comito C, Patarin S, Talia D. A Semantic Overlay Network for P2P Schema-Based Data [C]//Proc of the 11th IEEE Symposium(ISCC’06). Pula-Cagliari, Sardinia, Italy. June 15–17, 2006:88–94. Bender M, Michel S, Triantafillou P, et al. Improving Collection Selection with Overlap Awareness in P2P Search [C]//Proc of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR2005), Salvador, Brazil, August 16–20, 2005:15–19. Bender M, Michel S, Triantafillou P, et al. P2P Content Search: Give the Web Back to the People[C]//Proc of 5th International Workshop on Peer-to-Peer Systems (IPTPS 2006), Santa Barbara, USA. February 12–16, 2006:113–119. Aberer K, Klemm F, Luu T, et al. Building a Peer-to-Peer Full-Text Web Search Engine with Highly Discriminative Keys[EB/OL].[2007-02-01]. http://infoscience.epfl.ch/getfile.py?recid=63674&mode=best . Michel S, Bender M, Ntarmos N, et al Discovering and Exploiting Keyword and Attribute-Value Co-occurrences to Improve P2P Routing Indices[C]//Proc of ACM Fifteenth Conference on Information and Knowledge Management (CIKM2006), Arlington, USA. May 21–24, 2006:123–131. Ratnasamy S, Karp B, Govindan R, et al. A Scalable Content-Addressable Network [J]. ACM SIGCOMM, 2001, 31(4): 161–172. Bloom B H. Space/time Trade-offs in Hash Coding with Allowable Errors [J]. Communications of the ACM, 1970, 13(7):422–426. Nash J. Equilibrium Points in N-Person Games [J]. National Academy of Sciences, 1950, (36):48–49. Boulogne T, Altman Z, Pourtallier O. On the Convergence to Nash Equilibrium in Problems of Distributed Computing [J]. Annals of Operation Research, 2002, 109(1): 279–291. Palmer C R, Steffan J G. Generating Network Topologies that Obey Power Law[C]// Proc of the GLOBECOM 2000, San Francisco, USA, November 24–27, 2000: 434–438.