Chinese comments sentiment classification based on word2vec and SVMperf

Cited by: 374
Authors
Zhang, Dongwen [1, 2]
Xu, Hua [1]
Su, Zengcai [1, 2]
Xu, Yunfeng [1, 2]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl Lab Informat Sci & Technol, State Key Lab Intelligent Technol & Syst, Beijing 100084, Peoples R China
[2] Hebei Univ Sci & Technol, Sch Informat Sci & Engn, Shijiazhuang 050018, Hebei, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Sentiment classification; Word2vec; SVMperf; Semantic features; Online reviews
DOI
10.1016/j.eswa.2014.09.011
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the rapid growth of e-commerce over the last decade, researchers have begun to pay more attention to extracting valuable information from consumers' comments. Sentiment classification, which classifies comments as positive or negative according to their sentiment polarity, is one such line of study. Machine learning-based methods for sentiment classification have become mainstream owing to their outstanding performance. Most existing research centers on the extraction of lexical and syntactic features, while the semantic relationships between words are ignored. In this paper, in order to capture semantic features, we propose a method for sentiment classification based on word2vec and SVMperf. Our research consists of two parts. First, we use word2vec to cluster similar features, in order to show that word2vec can capture semantic features in the selected domain and in Chinese. Second, we train on and classify the comment texts using word2vec again together with SVMperf. In this process, lexicon-based and part-of-speech-based feature selection methods are each adopted to generate the training file. We conduct experiments on a data set of Chinese comments on clothing products. The experimental results show the superior performance of our method in sentiment classification. (C) 2014 Elsevier Ltd. All rights reserved.
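As a rough illustration of the two steps named in the abstract, the sketch below compares word vectors by cosine similarity (the usual basis for grouping similar words) and then writes one comment as a line in the SVM-light file format that SVMperf reads. Everything concrete here is an assumption made for illustration: the three-dimensional toy vectors, the example words, the averaging of word vectors into a comment vector, and the helper names. The abstract does not say how the authors build features or generate the training file, so this should not be read as their implementation.

```python
import math

# Hypothetical 3-dimensional word vectors. Real word2vec vectors are learned
# from a corpus and usually have 100 or more dimensions.
TOY_VECTORS = {
    "好看": [0.9, 0.1, 0.0],   # "good-looking"
    "漂亮": [0.8, 0.2, 0.1],   # "pretty"
    "褪色": [-0.7, 0.6, 0.2],  # "fades"
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def comment_vector(tokens, vectors):
    """Average the vectors of known tokens (an assumed pooling choice)."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return []
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def to_svmlight_line(label, features):
    """Format one example as '<label> <index>:<value> ...'.

    Indices are 1-based and ascending; zero-valued features are omitted,
    since the format is sparse.
    """
    parts = [f"{i}:{x:.6f}" for i, x in enumerate(features, start=1) if x != 0.0]
    return " ".join([f"{label:+d}"] + parts)


if __name__ == "__main__":
    # Semantically close words should score higher than unrelated ones.
    print(cosine(TOY_VECTORS["好看"], TOY_VECTORS["漂亮"]))
    print(cosine(TOY_VECTORS["好看"], TOY_VECTORS["褪色"]))
    # A positive comment (label +1) written as one training-file line.
    tokens = ["好看", "漂亮", "未知"]  # the last token has no vector and is skipped
    print(to_svmlight_line(1, comment_vector(tokens, TOY_VECTORS)))
```

In practice the vectors would come from a trained word2vec model, and the lines produced by `to_svmlight_line` would be written to a file and passed to the SVMperf learner; lexicon-based or part-of-speech-based selection, as mentioned in the abstract, would decide which tokens are kept before pooling.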
Pages: 1857-1863
Page count: 7