ACCELERATING THE FORWARD PASS OF A CNN IMPLEMENTATION ON A LIMITED COMPUTING RESOURCE

  • A.E. Shchelkunov Joint Stock Company «Scientific Design Bureau of Computing Systems»
  • V.V. Kovalev Joint Stock Company «Scientific Design Bureau of Computing Systems»
  • I.V. Sidko Joint Stock Company «Scientific Design Bureau of Computing Systems»
  • N.E. Sergeev Joint Stock Company «Scientific Design Bureau of Computing Systems»
Keywords: CNN forward-pass optimization, tracking

Abstract

The work is devoted to optimizing a neural network architecture for deployment on a limited
computing resource. Several optimization approaches are considered, and estimates of the
complexity and forward-pass execution time of the neural network are given, along with
comparative complexity estimates for the different approaches. The paper analyzes the selected
network architecture and obtains estimates of the computational complexity of its individual
components (modules). Possible optimization methods are analyzed for each module, and the
parameters of the considered modules and the sizes of their input and output tensors are
described. Several backbones were tested for the feature extraction module: ResNet-50,
ResNet-18, MobileNetV3-Small, and MobileNetV3-Large. A comparative analysis of the
computational complexity and forward-pass execution time of each architecture is presented.
Forward-pass times were measured on the NVIDIA Jetson AGX Xavier embedded computing device,
and per-module execution-time estimates are given for each of the considered networks. The
paper also compares neural network accuracy estimates before and after architecture
optimization. The test data set consists of 100 video recordings; 5 typical object classes
appear in the test videos, with 10 different scenarios recorded for each class. Accuracy
estimates were obtained for each of the developed architectures, and a comparative analysis
was performed.
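The per-module complexity estimates described above are conventionally derived by counting multiply-accumulate operations (MACs) per layer, and forward-pass times by timing repeated inference runs after warm-up. The sketch below illustrates both ideas in plain Python; the function names, the example layer shape, and the timing protocol are illustrative assumptions, not the paper's actual code.

```python
import time

def conv2d_macs(h_out, w_out, c_in, c_out, k):
    """MAC count for one standard 2-D convolution layer:
    each output element costs c_in * k * k multiply-accumulates."""
    return h_out * w_out * c_out * c_in * k * k

# Example (assumed shape): the first 7x7 conv of a ResNet-style backbone
# on a 224x224 RGB input, stride 2 -> 112x112 output, 64 output channels.
macs = conv2d_macs(112, 112, 3, 64, 7)
print(macs)  # 118013952 MACs (about 0.24 GFLOPs at 2 ops per MAC)

def time_forward(model, x, warmup=3, runs=10):
    """Median wall-clock time of model(x), discarding warm-up runs
    so that one-time initialization costs do not skew the estimate."""
    for _ in range(warmup):
        model(x)
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        model(x)
        samples.append(time.perf_counter() - t0)
    samples.sort()
    return samples[len(samples) // 2]
```

On an embedded device such as the Jetson AGX Xavier, the warm-up runs matter because the first inference typically triggers memory allocation and kernel compilation; the median is more robust to scheduling jitter than the mean.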

Published: 2022-04-21
Section: SECTION V. TECHNICAL VISION