SEMANTIC-STATISTICAL ALGORITHM FOR DETERMINING THE CATEGORIES OF ASPECTS IN THE PROBLEMS OF SENTIMENT ANALYSIS
Abstract
In the modern world, the Internet is one of the most important communication channels. Trade and
the promotion of services are carried out through electronic platforms. Social networks and instant
messengers are becoming a leading communication channel and a powerful tool for
influencing public opinion. A significant share of all published content consists of texts written in
natural language. Therefore, the problems of natural language processing (NLP) and natural
language understanding (NLU) are among the key ones today. Under the influence of commercial
interests, the field of automatic aspect-based sentiment analysis is actively developing. This task
depends significantly on the specific subject area, and therefore the question of quickly and effectively adapting
existing models to new domains is particularly pressing. The paper proposes a hybrid method of
aspect-oriented analysis, based on data extracted from common dictionaries and domain-oriented
texts. A novel method for constructing a condensed semantic graph from unstructured domain-dependent
texts is proposed. Numerical metrics for assessing the significance of individual terms within the entire domain are introduced. An algorithm for text categorization based on the
selection of semantic clusters within a condensed domain-specific graph is proposed. A method for
assessing the sentiment of domain-oriented texts based on statistical data, including the joint use
of a sentiment lexicon and a condensed domain-specialized graph, is proposed. The results of experiments
are presented, allowing for evaluation of the quality of the algorithms.