Implementing and evaluating phrasal query suggestions for proximity search

Cited by: 4
Authors
Feuer, Alan [1]
Savev, Stefan [1]
Aslam, Javed A. [1]
Affiliations
[1] Northeastern University, College of Computer and Information Science, Boston, MA 02115, USA
Funding
U.S. National Science Foundation
Keywords
Proximity search; Proximal subphrases; Unordered super phrases; Query log analysis; User study; Web search; ALGORITHM;
DOI
10.1016/j.is.2009.03.012
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper describes and evaluates a unified approach to phrasal query suggestions in the context of a high-precision search engine. The search engine performs ranked extended-Boolean searches, with the proximity operator NEAR as the default operation. Suggestions are offered to the searcher when the length of the result list falls outside predefined bounds. If the list is too long, the engine specializes the query through the use of super phrases; if the list is too short, the engine generalizes the query through the use of proximal subphrases. We describe methods for generating both types of suggestions and present algorithms for ranking the suggestions. Specifically, we present the problem of counting proximal subphrases for generalization and the problem of counting unordered super phrases for specialization. The uptake of our approach was evaluated by analyzing search log data from before and after the suggestion feature was added to a commercial version of the search engine. We looked at approximately 1.5 million queries and found that, after the suggestion feature was added, suggestions represented nearly 30% of the total queries. Efficacy was evaluated through a controlled study of 24 participants performing nine searches using three different search engines. We found that the engine with phrasal query suggestions had better high-precision recall than both the same search engine without suggestions and a search engine with a similar interface but using an Okapi BM25 ranking algorithm. © 2009 Elsevier B.V. All rights reserved.
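
The trigger rule in the abstract (suggest only when the result list length falls outside predefined bounds) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the bound values `lower` and `upper`, and the use of contiguous term runs as stand-in candidates for generalization are all assumptions made for illustration. The paper's proximal subphrases are derived from the indexed documents, which this sketch does not attempt.

```python
from typing import List, Optional, Tuple


def suggestion_mode(result_count: int, lower: int = 1, upper: int = 50) -> Optional[str]:
    """Decide which kind of suggestion to offer, if any.

    `lower` and `upper` are hypothetical bounds; the abstract only says
    that the bounds are predefined.
    """
    if result_count > upper:
        return "specialize"   # list too long: narrow the query (super phrases)
    if result_count < lower:
        return "generalize"   # list too short: broaden the query (subphrases)
    return None               # list length is within bounds: no suggestion


def candidate_subphrases(terms: List[str], min_len: int = 2) -> List[Tuple[str, ...]]:
    """Enumerate contiguous runs of query terms shorter than the full query.

    This is a simplified stand-in for generalization candidates, longest
    first; it does not consult a document index.
    """
    n = len(terms)
    return [
        tuple(terms[i:i + k])
        for k in range(n - 1, min_len - 1, -1)
        for i in range(n - k + 1)
    ]


if __name__ == "__main__":
    query = ["phrasal", "query", "suggestions"]
    if suggestion_mode(result_count=0) == "generalize":
        for phrase in candidate_subphrases(query):
            print(" ".join(phrase))
```

In a real engine the candidates would then be ranked, for example by how often each phrase occurs in the collection, which is the counting problem the abstract refers to.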
Pages: 711-723
Page count: 13