ALGORITHM FOR TRAINING DATA PREPARATION OF CONVOLUTIONAL NEURAL NETWORKS FOR LETTER AND CHARACTER RECOGNITION

Abstract

The accuracy of text image recognition remains limited in practice. One reason is that the alphabet may contain lowercase and uppercase letters of similar shape, as well as composite characters formed from several simpler ones. To address this, character recognition systems are often supplemented with semantic or structural analysis, which significantly complicates the overall text recognition system. Convolutional neural networks are now widely used to recognize single characters and are trained on a database of images of the characters to be recognized. This paper proposes an algorithm in which the training image of a single character also includes fragments of the characters that may appear next to it in a line. This enlarges the set of training images and adds information to each image about the position of the character in the line, its relative size, and whether it is composite. Training images are formed by simulating the brightness-based segmentation that is usually used to isolate a character before recognition. The size of the character is estimated, images of neighboring characters are added, and the size of the image area to be placed in the training sample is then estimated. The resulting image is scaled and cropped so that images of a fixed size arrive at the input of the neural network. To recognize an alphabet that includes uppercase and lowercase letters of the Russian and English alphabets, digits, symbols, and punctuation marks, the paper proposes a set of convolutional neural networks, each trained to recognize one character. The recognized character is chosen by comparing the responses of all the networks and selecting the one with the maximum response.
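The selection rule in the last sentence can be illustrated with a minimal sketch. It assumes that each per-character network can be treated as a function that maps an image to a scalar response. The names `recognize` and `make_stub_scorer` are hypothetical, and the stub scorers merely stand in for trained convolutional networks, whose architecture the abstract does not specify.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[float]]          # grayscale rows, values in [0, 1]
Scorer = Callable[[Image], float]  # stands in for one trained network


def recognize(image: Image, scorers: Dict[str, Scorer]) -> Tuple[str, float]:
    """Return the character whose dedicated scorer responds most strongly."""
    if not scorers:
        raise ValueError("at least one per-character scorer is required")
    # Query every per-character scorer on the same input image.
    responses = {char: float(score(image)) for char, score in scorers.items()}
    # The recognized character is the one with the maximum response.
    best_char = max(responses, key=responses.get)
    return best_char, responses[best_char]


def make_stub_scorer(target_mean: float) -> Scorer:
    """Hypothetical placeholder: responds to mean brightness, not to shape."""
    def score(image: Image) -> float:
        pixels = [value for row in image for value in row]
        return 1.0 - abs(sum(pixels) / len(pixels) - target_mean)
    return score


# Two stub "networks", one per character of a toy two-character alphabet.
scorers = {"a": make_stub_scorer(0.2), "B": make_stub_scorer(0.8)}
image = [[0.75, 0.75], [0.75, 0.75]]
print(recognize(image, scorers))
```

With real networks, each entry of `scorers` would wrap a forward pass of the network trained for that character. The one-network-per-character layout means the alphabet can be extended by adding an entry, at the cost of one forward pass per character.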
The proposed training data preparation algorithm is compared with a known algorithm that uses images of single characters. The proposed algorithm is found to more than double the recognition accuracy for an alphabet of 138 characters.
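The formation of a training image described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the line image is a list of grayscale rows with dark ink on a light background, the target character is given by a column span, the window is a square centered on the character, and scaling uses nearest-neighbor sampling. The function names and the `context`, `threshold`, and `background` parameters are hypothetical.

```python
import math
from typing import List, Tuple

Image = List[List[float]]  # grayscale rows, 0.0 = ink, 1.0 = background


def symbol_bbox(line: Image, col_start: int, col_end: int,
                threshold: float = 0.5) -> Tuple[int, int, int, int]:
    """Brightness segmentation: inclusive (top, bottom, left, right) of the
    ink pixels found in columns [col_start, col_end) of the line image."""
    rows = [r for r, row in enumerate(line)
            if any(v < threshold for v in row[col_start:col_end])]
    cols = [c for c in range(col_start, col_end)
            if any(row[c] < threshold for row in line)]
    if not rows:
        raise ValueError("no ink pixels in the given column span")
    return rows[0], rows[-1], cols[0], cols[-1]


def make_training_image(line: Image, col_start: int, col_end: int,
                        out_size: int = 6, context: float = 1.0,
                        threshold: float = 0.5,
                        background: float = 1.0) -> Image:
    """Crop a square window around the character, wide enough to catch
    fragments of its neighbors, and rescale it to out_size x out_size."""
    top, bottom, left, right = symbol_bbox(line, col_start, col_end, threshold)
    height, width = bottom - top + 1, right - left + 1
    # Assumed rule: window side grows with the character size, so neighbors
    # appear at a scale relative to the character itself.
    side = max(height, width) * (1 + 2 * context)
    y0 = (top + bottom + 1) / 2 - side / 2   # window origin, may be negative
    x0 = (left + right + 1) / 2 - side / 2
    n_rows, n_cols = len(line), len(line[0])
    out: Image = []
    for i in range(out_size):
        src_r = math.floor(y0 + (i + 0.5) * side / out_size)
        row = []
        for j in range(out_size):
            src_c = math.floor(x0 + (j + 0.5) * side / out_size)
            inside = 0 <= src_r < n_rows and 0 <= src_c < n_cols
            # Pixels outside the line image are filled with background.
            row.append(line[src_r][src_c] if inside else background)
        out.append(row)
    return out


# Toy line: a 2x2 target character and a taller neighbor to its right.
line = [[1.0] * 12 for _ in range(6)]
for r in (2, 3):
    for c in (4, 5):
        line[r][c] = 0.0
for r in (1, 2, 3, 4):
    line[r][7] = 0.0

for row in make_training_image(line, 4, 6):
    print("".join("#" if v < 0.5 else "." for v in row))
```

In this toy example the output window contains both the target character and the neighboring stroke, so the relative height of the two is visible to the network. How the paper actually estimates the window size is not stated in the abstract.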

Published:

2025-07-24

Section:

SECTION IV. MACHINE LEARNING AND DATA PROCESSING

Keywords:

Algorithm, alphabet, symbol, recognition, convolutional neural network, training sample

For citation:

D.A. Bezuglov, M.S. Mishchenko, S.E. Mishchenko. ALGORITHM FOR TRAINING DATA PREPARATION OF CONVOLUTIONAL NEURAL NETWORKS FOR LETTER AND CHARACTER RECOGNITION. Izvestiya SFedU. Engineering Sciences, 2025, No. 3, pp. 134-144.