Matrices, vector spaces, and information retrieval

Cited by: 375
Authors
Berry, MW [1]
Drmac, Z
Jessup, ER
Affiliations
[1] Univ Tennessee, Dept Comp Sci, Knoxville, TN 37996 USA
[2] Univ Colorado, Dept Comp Sci, Boulder, CO 80309 USA
Keywords
information retrieval; linear algebra; QR factorization; singular value decomposition; vector spaces
DOI
10.1137/S0036144598347035
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
The evolution of digital libraries and the Internet has dramatically transformed the processing, storage, and retrieval of information. Efforts to digitize text, images, video, and audio now consume a substantial portion of both academic and industrial activity. Even when there is no shortage of textual materials on a particular topic, procedures for indexing or extracting the knowledge or conceptual information contained in them can be lacking. Recently developed information retrieval technologies are based on the concept of a vector space. Data are modeled as a matrix, and a user's query of the database is represented as a vector. Relevant documents in the database are then identified via simple vector operations. Orthogonal factorizations of the matrix provide mechanisms for handling uncertainty in the database itself. The purpose of this paper is to show how such fundamental mathematical concepts from linear algebra can be used to manage and index large text collections.
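The vector space idea described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the term list, the counts in the matrix, the query, and the helper names (column, cosine, rank_documents) are all hypothetical, and cosine similarity is assumed here as one common choice of "simple vector operation" for comparing a query with document columns.

```python
import math

# Hypothetical toy collection: rows are terms, columns are documents.
terms = ["matrix", "vector", "retrieval", "baking"]

# A[i][j] = number of times term i occurs in document j (made-up counts).
A = [
    [2, 0, 1],
    [1, 0, 2],
    [0, 1, 3],
    [0, 4, 0],
]


def column(matrix, j):
    """Return column j of a row-major matrix as a list (one document vector)."""
    return [row[j] for row in matrix]


def cosine(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(matrix, query):
    """Score every document column against the query, best match first."""
    n_docs = len(matrix[0])
    scores = [(j, cosine(column(matrix, j), query)) for j in range(n_docs)]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Query vector over the same terms: asks about "vector" and "retrieval".
query = [0, 1, 1, 0]
ranking = rank_documents(A, query)

for doc_index, score in ranking:
    print(f"document {doc_index}: cosine = {score:.3f}")
```

In this toy data the third document (index 2) ranks first because it contains both query terms, while the document dominated by "baking" ranks last. The orthogonal factorizations mentioned in the abstract (QR, SVD) would replace A with a lower-rank approximation before scoring; that step is not shown here.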
Pages: 335-362
Number of pages: 28