ANALYSIS OF ENCRYPTED NETWORK TRAFFIC BASED ON ENTROPY CALCULATION AND APPLICATION OF NEURAL NETWORK CLASSIFIERS
Abstract
Network traffic analysis addresses many problems: determining the pattern of data
transmission over a network, collecting statistics on the use of web applications,
monitoring and studying network load, and identifying potential malicious software and
network attacks. 40% of Internet traffic belongs to unknown applications, which makes
application classification a particularly important task in network traffic analysis.
Advances in networking software have led to the discovery of serious vulnerabilities in
the implementations of some network protocols, namely TCP and HTTP. Using network traffic
analyzers, an attacker could gain access to the contents of data packets transmitted over
the network. However, as the community's expertise in computer security has grown and
network technology standards have developed, network traffic analysis has become
noticeably more difficult.
The growing use of mathematical methods of information protection, such as symmetric and
asymmetric cryptographic protocols, has made most approaches to network traffic analysis
ineffective, and they are no longer used. The search for new solutions to the problem of
classifying network traffic that take possible encryption into account is therefore
relevant. This article describes a new hybrid approach to network traffic analysis based
on the combined use of information theory and machine learning algorithms. It also
compares the proposed method with existing approaches based on information theory and on
machine learning. The aim of the research is to develop an algorithm based on an
intelligent approach to network traffic analysis. The proposed algorithm is based on
calculating entropy and using neural network classifiers. The research objectives are: to
substantiate the proposed approach theoretically in terms of information theory and
machine learning algorithms; to give a structural description of the implemented
algorithms for calculating entropy and classifying applications that generate encrypted
traffic; and to compare the proposed algorithm with existing approaches to the analysis of
encrypted network traffic. The result of the research is a new algorithm that classifies
various types of encrypted traffic with a high degree of reliability.
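
The abstract does not specify how entropy is computed, so the following is only a minimal sketch of one common choice: byte-level Shannon entropy of a packet payload, measured in bits per byte. The function names `shannon_entropy` and `windowed_entropy`, the 256-byte window, and the idea of feeding per-window values to a classifier as features are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def shannon_entropy(payload: bytes) -> float:
    """Return the Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not payload:
        return 0.0
    total = len(payload)
    counts = Counter(payload)  # frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def windowed_entropy(payload: bytes, window: int = 256) -> list[float]:
    """Entropy of consecutive fixed-size windows (hypothetical classifier features)."""
    return [
        shannon_entropy(payload[i:i + window])
        for i in range(0, len(payload), window)
    ]
```

A payload consisting of one repeated byte yields 0 bits per byte, while a payload in which all 256 byte values occur equally often yields the maximum of 8 bits per byte. Encrypted or compressed data typically falls near the upper end of that range, which is why entropy values are a plausible input feature for a neural network classifier.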