A Method for Improving Periodic Thematic Search on the Web

Maksakov, A. V.¹
¹ Department of Automation of the Systems of Computational Complexes, Faculty of Computational Mathematics and Cybernetics, Moscow State University, Leninskie Gory, Moscow, Russia

Abstract

This paper describes a method for periodic thematic search that combines keyword search with thematic filtering by text classifiers. We examine several classification algorithms in terms of how effectively they solve the problem under study.
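
The combination described above can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not say which classifier or search backend is used, so this example assumes a small multinomial Naive Bayes filter and an in-memory list of pages. All names here (`NaiveBayesTopicFilter`, `keyword_search`, `periodic_thematic_search`, the sample data) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would also normalize punctuation.
    return text.lower().split()


class NaiveBayesTopicFilter:
    """Tiny multinomial Naive Bayes with Laplace smoothing (assumed classifier)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = set()
        for c in self.classes:
            self.vocab.update(self.word_counts[c])
        self.total_docs = len(docs)
        return self

    def predict(self, doc):
        best, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(self.doc_counts[c] / self.total_docs)
            total = sum(self.word_counts[c].values())
            for w in tokenize(doc):
                if w in self.vocab:
                    p = (self.word_counts[c][w] + 1) / (total + len(self.vocab))
                    score += math.log(p)
            if score > best_score:
                best, best_score = c, score
        return best


def keyword_search(pages, query):
    # Stage 1: keep pages that share at least one term with the query.
    terms = set(tokenize(query))
    return [p for p in pages if terms & set(tokenize(p))]


def periodic_thematic_search(pages, query, clf, topic, seen):
    # Stage 2: keep only on-topic hits not reported in an earlier run.
    hits = keyword_search(pages, query)
    fresh = [p for p in hits if p not in seen and clf.predict(p) == topic]
    seen.update(fresh)
    return fresh


# Hypothetical training data for two topics.
train_docs = [
    "jaguar car engine speed",
    "car dealer price engine",
    "jaguar cat jungle prey",
    "cat habitat jungle animal",
]
train_labels = ["cars", "cars", "animals", "animals"]
clf = NaiveBayesTopicFilter().fit(train_docs, train_labels)

pages = ["new jaguar engine review", "jaguar spotted in jungle", "stock market news"]
seen = set()

first_run = periodic_thematic_search(pages, "jaguar", clf, "cars", seen)
pages.append("jaguar car price drop")  # a page that appears before the next run
second_run = periodic_thematic_search(pages, "jaguar", clf, "cars", seen)

print(first_run)   # ['new jaguar engine review']
print(second_run)  # ['jaguar car price drop']
```

The keyword stage alone would return both "jaguar" pages. The classifier stage removes the off-topic one, and the `seen` set keeps a repeated run from reporting the same page twice.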

Keywords

#thematic search #periodic search #text classification #classification algorithms
