Classifying documents with link-based bibliometric measures
Springer Science and Business Media LLC - Volume 13 - Pages 315-345 - 2009
T. Couto, N. Ziviani, P. Calado, M. Cristo, M. Gonçalves, E. S. de Moura, W. Brandão
Automatic document classification can be used to organize documents in a digital
library, construct on-line directories, improve the precision of web searching,
or help the interactions between user and search engines. In this paper we
explore how linkage information inherent to different document collections can
be used to enhance the effectiveness of classification algorithms. We have
experiment...
ReBoost: a retrieval-boosted sequence-to-sequence model for neural response generation
Springer Science and Business Media LLC - Volume 23 - Pages 27-48 - 2019
Yutao Zhu, Zhicheng Dou, Jian-Yun Nie, Ji-Rong Wen
Human–computer conversation is an active research topic in natural language
processing. One of the representative methods to build conversation systems uses
the sequence-to-sequence (Seq2seq) model through neural networks. However, with
limited input information, the Seq2seq model tends to generate meaningless and
trivial responses. It can be greatly enhanced if more supplementary information
is p...
An analysis of NP-completeness in novelty and diversity ranking
Springer Science and Business Media LLC - Volume 14 - Pages 89-106 - 2010
Ben Carterette
A useful ability for search engines is to be able to rank objects with novelty
and diversity: the top k documents retrieved should cover possible intents of a
query with some distribution, or should contain a diverse set of subtopics
related to the user’s information need, or contain nuggets of information with
little redundancy. Evaluation measures have been introduced to measure the
effectivenes...
Using the Web as corpus for self-training text categorization
Springer Science and Business Media LLC - Volume 12 - Pages 400-415 - 2008
Rafael Guzmán-Cabrera, Manuel Montes-y-Gómez, Paolo Rosso, Luis Villaseñor-Pineda
Most current methods for automatic text categorization are based on supervised
learning techniques and, therefore, they face the problem of requiring a great
number of training instances to construct an accurate classifier. In order to
tackle this problem, this paper proposes a new semi-supervised method for text
categorization, which considers the automatic extraction of unlabeled examples
from t...
Distance matters! Cumulative proximity expansions for ranking documents
Springer Science and Business Media LLC - Volume 17 - Pages 380-406 - 2014
Jeroen B. P. Vuurens, Arjen P. de Vries
In the information retrieval process, functions that rank documents according to
their estimated relevance to a query typically regard query terms as being
independent. However, it is often the joint presence of query terms that is of
interest to the user, which is overlooked when matching independent terms. One
feature that can be used to express the relatedness of co-occurring terms is
their pro...
Probabilistic models in IR and their relationships
Springer Science and Business Media LLC - Volume 17 - Pages 177-201 - 2013
Robin Aly, Thomas Demeester, Stephen Robertson
A solid research path towards new information retrieval models is to further
develop the theory behind existing models. A profound understanding of these
models is therefore essential. In this paper, we revisit probability ranking
principle (PRP)-based models, probability of relevance (PR) models, and language
models, finding conceptual differences in their definition and
interrelationships. The p...
Balancing exploration and exploitation in listwise and pairwise online learning to rank for information retrieval
Springer Science and Business Media LLC - Volume 16 - Pages 63-90 - 2012
Katja Hofmann, Shimon Whiteson, Maarten de Rijke
As retrieval systems become more complex, learning to rank approaches are being
developed to automatically tune their parameters. Using online learning to rank,
retrieval systems can learn directly from implicit feedback inferred from user
interactions. In such an online setting, algorithms must obtain feedback for
effective learning while simultaneously utilizing what has already been learned
to ...
A relatedness analysis of government regulations using domain knowledge and structural organization
Springer Science and Business Media LLC - Volume 9 - Pages 657-680 - 2006
Gloria T. Lau, Kincho H. Law, Gio Wiederhold
The complexity and diversity of government regulations make understanding and
retrieval of regulations a non-trivial task. One of the issues is the existence
of multiple sources of regulations and interpretive guides with differences in
format, terminology and context. This paper describes a comparative analysis
scheme developed to help retrieval of related provisions from different
regulatory doc...