Research on Similarity Computation Methods for Linked Data Resource Sets (关联数据资源集相似度计算方法研究)

Cited by: 6

Authors:

Deng Lanlan (邓兰兰) [1,2]

Li Chunwang (李春旺) [1]

Affiliations:

[1] National Science Library, Chinese Academy of Sciences

[2] Graduate University of Chinese Academy of Sciences

Source:

Information Studies: Theory & Application (情报理论与实践) | 2012 / Vol. 35 / No. 05

Keywords:

linked data; resource set; similarity; algorithm

DOI:

10.16353/j.cnki.1000-7490.2012.05.009

Chinese Library Classification (CLC) number:

G202 [Information processing technology]

Subject classification code:

050302

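Abstract:

This article proposes a comprehensive descriptive information model suited to computing the similarity of linked data resource sets. The model describes a resource set through three modules: basic description, content description, and external links. Based on the characteristics of each information item, it selects among string similarity, set similarity, the vector space model, and statistics- and semantics-based similarity algorithms to compute resource set similarity. To some extent, this addresses the current problem of having to configure related resource sets manually when creating links.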


Pages: 112-116

Page count: 5

References: 16 in total

[1] Two approaches to matching in example-based machine translation. Sergei Nirenburg, et al. Proc. of TMI-93. 1993

[2] Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language. Philip Resnik. Journal of Artificial Intelligence Research. 1999

[3] Relevant document distribution estimation method for resource selection. L. Si and J. Callan. Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2003

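[4] Resource description techniques in Web integration (Web整合中的资源描述技术) [J]. Zhang Li (张丽); Wang Yuyu (汪语宇). Library and Information Service (图书情报工作), 2005, (10): 25-28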

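[5] A string similarity computation model based on multi-level features (基于多层特征的字符串相似度计算模型) [J]. Zhang Chengzhi (章成志). Journal of the China Society for Scientific and Technical Information (情报学报). 2005 (06)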

[6] Searching distributed collections with inference networks. Callan J, Lu Z, Croft W. Proceedings of the 18th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1995

[7]

[8] Query-based sampling of text databases [J]. Callan J; Connell M. ACM Transactions on Information Systems, 2001, 19 (02): 97-130

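[9] Resource selection techniques and algorithms in integrated retrieval systems (集成检索系统中资源选择技术及算法) [J]. Wang Yuyu (汪语宇); Zhang Li (张丽). Library and Information Service (图书情报工作), 2005, (10): 29-32+66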

[10] Extraction of information in large graphs: Automatic search for synonyms. Senellart P P. 2001
