Effective summarization method of text documents

Cited by: 26
Authors
Alguliev, RM [1]
Aliguliyev, RM [1]
Affiliation
[1] Azerbaijan Natl Acad Sci, Inst Informat Technol, Baku, Azerbaijan
Source
2005 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE, PROCEEDINGS | 2005
DOI
10.1109/WI.2005.57
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a text summarization method that creates a summary by computing a relevance score for each sentence and extracting sentences from the original document. During summarization, the method takes into account the weight of each sentence in the document. The essence of the proposed method is to first represent every sentence in the document by a characteristic vector of the words that appear in the document, and then to calculate a relevance score for each sentence. The relevance score of a sentence is determined by comparing it, using the cosine measure, with all the other sentences in the document and with the document title. Before the method is applied, a set of features is defined, and the weight of each word in a sentence is calculated with those features taken into account. The weights of the features that influence the relevance of words are determined using genetic algorithms.
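The sentence-scoring step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes plain term counts as word weights (the paper instead derives word weights from features tuned by a genetic algorithm), and it assumes the two similarities are combined by a hypothetical mixing parameter `alpha`. The names `tokenize`, `cosine`, `relevance_scores`, and `summarize` are illustrative only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def relevance_scores(title, sentences, alpha=0.5):
    """Score each sentence against the other sentences and the title.

    Assumption: the score is a weighted mix (alpha) of the mean cosine
    similarity to all other sentences and the cosine similarity to the
    title. Word weights are raw term counts, not GA-tuned weights.
    """
    title_vec = Counter(tokenize(title))
    vecs = [Counter(tokenize(s)) for s in sentences]
    scores = []
    for i, vec in enumerate(vecs):
        others = [cosine(vec, w) for j, w in enumerate(vecs) if j != i]
        to_others = sum(others) / len(others) if others else 0.0
        to_title = cosine(vec, title_vec)
        scores.append(alpha * to_others + (1 - alpha) * to_title)
    return scores


def summarize(title, sentences, k=1):
    """Return the k highest-scoring sentences in their original order."""
    scores = relevance_scores(title, sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(title, sentences, k=3)` on a tokenized document would return the three sentences closest to both the title and the rest of the text. Replacing the raw counts in `Counter` with feature-based word weights is where a genetic algorithm, as the abstract describes, would come in.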
Pages: 264-271
Page count: 8