DEVELOPMENT OF ALGORITHMS OF INTELLIGENT SERVICE FOR INFORMATION SEARCH AND MONITORING

  • M. S. Anferova, Moscow Aviation Institute
  • A. M. Belevtsev, Moscow Aviation Institute
Keywords: Technological trends, monitoring, search robot, artificial intelligence, Big Data, algorithm, text recognition, clustering

Abstract

This paper addresses the problem of strategic analysis and the choice of development directions
for an innovative enterprise during the transition to the sixth technological order and
Industry 4.0. Under these conditions, the search and analytical processing of information cannot be fully performed
without the use of automated information and analytical systems, including those based on artificial
intelligence. The analysis identified the priority functions that the services under development must
provide, as well as the main difficulties in developing them, such as
data pre-processing and automated checking of whether databases are up to date. To solve these
tasks effectively, the intelligent monitoring and information retrieval service should use an integrated approach
that takes into account the effectiveness of the methods applied to individual subtasks and ensures high efficiency
at every stage of the intelligent monitoring procedure. Accordingly, this paper describes
not only the development of a general intelligent search algorithm, but also individual block
algorithms necessary to ensure the priority functions of the service being developed. The paper presents
the following algorithms: an information search algorithm for full-text
search of documents within the database of information resources of the information and analytical
complex; an algorithm for entering new documents; an algorithm for pre-processing
data that includes stemming and punctuation removal for subsequent text analysis; an algorithm
for evaluating the ranking and relevance of information, including vectorization of documents; an algorithm
for clustering information search results based on the Kohonen neural network; and an algorithm for
checking that information is up to date, which verifies whether the local copy of a document corresponds to
the current version on the source's web resource. The Python programming language is proposed for implementing
the presented algorithms, and this choice is justified. The system provides automated continuous
monitoring, sending requests at a high frequency without operator involvement, which
will improve the quality and efficiency of information search when the volume of unstructured
information is large.
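
As an illustration only, three of the blocks listed above (pre-processing, vectorization with relevance ranking, and the up-to-date check) can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the suffix list, the TF-IDF weighting, the cosine score, the SHA-256 comparison, and all function names (stem, preprocess, tfidf_vectors, rank, is_current) are hypothetical choices that the abstract does not specify.

```python
import hashlib
import math
import string
from collections import Counter

# Hypothetical toy suffix list; a real service would use a proper stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, remove ASCII punctuation, split on whitespace, and stem."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [stem(tok) for tok in cleaned.split()]


def tfidf_vectors(token_lists):
    """Return one sparse TF-IDF vector (dict) per document and the IDF table."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    # Smoothed IDF (an assumption): never zero, never divides by zero.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for tokens in token_lists:
        total = len(tokens) or 1
        counts = Counter(tokens)
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, documents):
    """Return indices of documents with a non-zero score, best match first."""
    vectors, idf = tfidf_vectors([preprocess(d) for d in documents])
    q_counts = Counter(t for t in preprocess(query) if t in idf)
    q_total = sum(q_counts.values()) or 1
    q_vec = {t: (c / q_total) * idf[t] for t, c in q_counts.items()}
    scored = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    return [i for score, i in sorted(scored, key=lambda p: (-p[0], p[1])) if score > 0]


def is_current(local_text, remote_text):
    """Compare content hashes of the local copy and the fetched source version."""
    local_hash = hashlib.sha256(local_text.encode("utf-8")).hexdigest()
    remote_hash = hashlib.sha256(remote_text.encode("utf-8")).hexdigest()
    return local_hash == remote_hash
```

In this sketch, rank(query, documents) pre-processes the query in the same way as the stored documents, and is_current(local_text, remote_text) expects the caller to have already fetched the source text. Fetching, Kohonen-network clustering, and document entry are left out.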

Published
2021-08-11
Section
SECTION I. INFORMATION PROCESSING ALGORITHMS