• V.V. Kovalev, Southern Federal University
• N.E. Sergeev, Southern Federal University
Keywords: pattern, pattern recognition, convolutional neural networks, image processing


Certain patterns in video images captured by a camera are recognized using
training methods based on convolutional neural networks. The larger the number of images with
multiple features and the more diverse the training sample of video images, the better convolutional
neural networks extract features from sequences of video images that were not included in
the training sample. As a result, the accuracy of detecting visual images in
video images containing features of target images increases. However, detection performance is
difficult to improve when the image to be detected is much smaller than the background
area, or when the image is described with little information. To solve problems of this kind, the authors
of the article have developed an algorithm for the spatio-temporal integration of information
about the motion of dynamic images. The algorithm processes a fixed number of video images taken at
certain points in time and extracts new, independent motion features of dynamic images through
spatio-temporal processing of the video images. It then combines the new local motion features with the original
video image features. This adds a motion feature for dynamic images while preserving
the original image features that describe static images. Areas of the video image that characterize
the motion feature are displayed as a "color" cluster. This pre-processing is intended to improve
the accuracy of pattern detection when dynamic visual images appear against a static background.
If the camera is in scan mode, a static background can be provided by a video stabilizer.
Experimental estimates of integral accuracy criteria for neural network detection algorithms
show that the algorithm for spatio-temporal integration of motion information increases
the accuracy of detecting visual images.
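
The pre-processing idea described above can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are not given here. It assumes that motion information is integrated by averaging thresholded absolute differences between consecutive frames in a fixed window, and that the resulting motion map is added to a single color channel of the latest frame so that the other channels keep the original image features. The function names motion_map and fuse_motion, the threshold and gain parameters, and the choice of the red channel are all hypothetical.

```python
import numpy as np


def motion_map(frames, threshold=0.1):
    """Integrate inter-frame differences over a fixed window of frames.

    frames: grayscale frames of shape (T, H, W) with values in [0, 1], T >= 2.
    Returns an (H, W) map; larger values mark pixels that changed more often
    or more strongly within the window (assumed stand-in for a motion feature).
    """
    frames = np.asarray(frames, dtype=np.float32)
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("expected at least two frames of shape (T, H, W)")
    diffs = np.abs(np.diff(frames, axis=0))  # (T-1, H, W) absolute changes
    diffs[diffs < threshold] = 0.0           # suppress small fluctuations
    return diffs.mean(axis=0)                # temporal integration


def fuse_motion(frame_rgb, motion, gain=1.0):
    """Add the motion map to the red channel of an RGB frame.

    frame_rgb: (H, W, 3) image with values in [0, 1].
    The green and blue channels are left unchanged, so the original
    (static) image features are preserved in them.
    """
    fused = np.array(frame_rgb, dtype=np.float32, copy=True)
    fused[..., 0] = np.clip(fused[..., 0] + gain * motion, 0.0, 1.0)
    return fused
```

The fused image could then be passed to a detection network in place of the raw frame. The sketch relies on a static background, which matches the condition stated in the abstract: with a moving camera, frame differences would respond to the background as well.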


