STATISTICAL AND MACHINE METHODS FOR AUTOMATICALLY EXTRACTING CAUSAL RELATIONSHIPS FROM TEXT (REVIEW)

K.B. Shtanchaev, Dagestan State Technical University

Keywords: Causality, causal knowledge, natural language processing, machine learning, computational linguistics, hidden causality

Abstract

Until the 2000s, automatic extraction of causal relationships (CR) from text relied on
non-statistical methods built around manually constructed linguistic templates. CR that did
not fit the templates could not be detected. Non-statistical methods also required constant
manual control by experts, up to and including the evaluation of results. Almost all of them
were aimed at extracting explicit CR, and only some attempted to decouple the extraction
system from a specific subject area. To eliminate these disadvantages, later methods shifted
towards statistical data processing and machine learning. This article reviews statistical
and machine learning methods of CR extraction. Several valuable papers related to the new
paradigm of CR extraction were analyzed. The aim of the research was to evaluate the new
methods and identify their advantages and disadvantages. The main advantage of machine
learning and statistical methods is independence from the subject area while largely
maintaining extraction accuracy. Such methods are somewhat less accurate than non-statistical
ones, but they are not tied to a specific problem area. Unlike non-statistical methods, which
matched text against manually built linguistic and syntactic templates, the new methods focus
on finding such templates automatically. Although machine learning and statistical methods
are mostly independent of the subject area and use large text corpora for training, they are
intended mainly for the English language. There is also no standardized data set that would
allow the methods to be compared with each other. None of the reviewed works addressed the
extraction of implicit CR.

References

1. Shtanchaev Kh.B. Nestatisticheskie metody avtomaticheskogo izvlecheniya prichinno-sledstvennykh
svyazey iz teksta [Non-statistical methods for automatically extracting cause-and-effect
relationships from text], Izvestiya YuFU. Tekhnicheskie nauki [Izvestiya SFedU. Engineering
Sciences], 2023, No. 2, pp. 273-280.
2. Girju R. Automatic detection of causal relations for question answering, Proceedings of the
ACL 2003 workshop on Multilingual summarization and question answering, 2003, Vol. 12,
pp. 76-83.
3. Girju R., Moldovan D. Text mining for causal relations, FLAIRS Conference, 2002, pp. 360-364.
4. Quinlan J.R. C4.5: programs for machine learning. Elsevier, 2014.
5. Marcu D., Echihabi A. An unsupervised approach to recognizing discourse relations, Proceedings of
the 40th Annual Meeting on Association for Computational Linguistics, 2002, pp. 368-375.
6. Dauni A.B. Bayesovskie modeli [Bayesian models]: transl. from English
V.A. Yarockogo. Moscow: DMK Press, 2018, 182 p.
7. Chang D.-S., Choi K.-S. Causal relation extraction using cue phrase and lexical pair probabilities,
Natural Language Processing – IJCNLP 2004. Springer, 2004, pp. 61-70.
8. Tapanainen P., Järvinen T. A non-projective dependency parser, Proceedings of the fifth
conference on Applied natural language processing. Association for Computational Linguistics,
1997, pp. 64-71.
9. Blanco E., Castell N., Moldovan D.I. Causal relation extraction, LREC, 2008.
10. Sil A., Huang F., Yates A. Extracting action and event semantics from web text, AAAI Fall
Symposium: Commonsense Knowledge, 2010.
11. Church K.W., Hanks P. Word association norms, mutual information, and lexicography,
Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics,
1989, pp. 76-83.
12. Gordon A.S., Bejan C.A., Sagae K. Commonsense causal reasoning using millions of personal
stories, AAAI, 2011.
13. Bethard S., Martin J.H. Learning semantic links from a corpus of parallel temporal and causal
relations, Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics
on Human Language Technologies: Short Papers. Association for Computational Linguistics,
2008, pp. 177-180.
14. Rink B., Bejan C.A., Harabagiu S.M. Learning textual graph patterns to detect causal event
relations, FLAIRS Conference, 2010.
15. Yan X., Han J. Graph-based substructure pattern mining, Proceedings of the 2002
IEEE International Conference on Data Mining. IEEE, 2002, pp. 721-724.
16. Sorgente A., Vettigli G., Mele F. Automatic extraction of cause effect relations in natural language
text, DART@AI*IA, 2013, pp. 37-48.
17. Yang X., Mao K. Multi level causal relation identification using extended features, Expert
Systems with Applications, 2014, Vol. 41, No. 16, pp. 7171-7181.
18. Pakray P., Gelbukh A. An open domain causal relation detection from paired nominal, 13th
Mexican international conference on artificial intelligence (MICAI-2014). Nature-Inspired
Computation and Machine Learning, 2014, Vol. 8857, pp. 261-271.
19. Gurulingappa H., Rajput A.M., Roberts A., Fluck J., Hofmann-Apitius M., Toldo L. Development
of a benchmark corpus to support the automatic extraction of drug-related adverse effects
from medical case reports, J Biomed Inform, 2012, Vol. 45 (5), pp. 885-892.
20. Rutherford A., Xue N. Discovering implicit discourse relations through Brown cluster pair representation
and coreference patterns, Proceedings of the 14th Conference of the European
Chapter of the Association for Computational Linguistics. Association for Computational Linguistics,
2014, pp. 645-654.
