DATA CLUSTERING ALGORITHM FOR PROTECTING CONFIDENTIAL INFORMATION ON THE INTERNET
Keywords:
Information security, confidential information, clustering, cloud model, heuristic algorithm
Abstract
This article addresses the problem of protecting confidential information on the Internet
by means of an algorithm for clustering large volumes of data. Protecting confidential
information in computer networks is an active research topic, especially given the growing
use of information technology and the increasing volume of valuable information
stored on the Internet. As responsibility for information grows, the need for effective methods
of securing computer networks has become critical. The authors
propose a solution to the problem of protecting confidential information in computer networks
based on a big data clustering algorithm. Traditional intrusion detection methods have limitations:
they can handle only one- or two-dimensional data and rely heavily
on prior knowledge. To overcome these limitations, the authors propose a heuristic intrusion
detection algorithm that uses clustering based on a cloud model. The proposed algorithm
takes advantage of both labeled and unlabeled samples for data clustering, thereby reducing reliance
on a priori knowledge. In a computational experiment, the proposed
algorithm was compared with several canonical intrusion detection algorithms. The results
showed that the proposed algorithm improved the performance of the intrusion detection system,
increased the accuracy of detection, reduced the false alarm rate, and enhanced the reliability of
the system. The dynamic weighting method used in the algorithm removed the complexity of high-level
data processing and allowed the algorithm to learn on its own, resulting in a relatively stable
cloud model. Despite the significant improvement in the performance of the proposed algorithm
compared to the canonical clustering algorithms, the results of the study also showed that the
algorithm has some limitations, such as a high false positive rate and sensitivity to data with certain
types of distribution. To eliminate these shortcomings, further improvement of the algorithm is
required. In general, the proposed heuristic clustering intrusion detection algorithm based on the
cloud model is a promising solution for protecting confidential information in computer networks.
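
The abstract does not give the authors' formulas, so the following is only a minimal sketch of the general idea of cloud-model clustering with labeled and unlabeled samples. It assumes the standard normal cloud model, in which each feature of a class is summarized by an expectation Ex, an entropy En, and a hyper-entropy He estimated by a backward cloud generator. The function names, the certainty threshold, and the toy data are hypothetical, and the dynamic weighting method mentioned above is not reproduced.

```python
import math
from statistics import fmean


def backward_cloud(samples):
    """Estimate (Ex, En, He) of a one-dimensional normal cloud from samples."""
    ex = fmean(samples)
    # Entropy from the first-order absolute central moment.
    en = math.sqrt(math.pi / 2) * fmean(abs(x - ex) for x in samples)
    # Hyper-entropy from the gap between sample variance and En^2.
    var = sum((x - ex) ** 2 for x in samples) / (len(samples) - 1)
    he = math.sqrt(abs(var - en ** 2))
    return ex, en, he


def fit_clouds(groups):
    """Map each label to one (Ex, En, He) triple per feature column."""
    return {
        label: [backward_cloud(column) for column in zip(*rows)]
        for label, rows in groups.items()
    }


def certainty(x, clouds, eps=1e-9):
    """Expected certainty degree of vector x for one class (He is ignored
    here to keep the sketch deterministic; this is a simplification)."""
    exponent = sum(
        (xi - ex) ** 2 / (2 * max(en, eps) ** 2)
        for xi, (ex, en, _he) in zip(x, clouds)
    )
    return math.exp(-exponent)


def cluster(labeled, unlabeled, threshold=0.1):
    """Assign each unlabeled sample to the class with the highest certainty.
    Confident assignments are added to the class so its cloud is re-estimated,
    a simple form of self-learning (hypothetical threshold)."""
    groups = {label: list(rows) for label, rows in labeled.items()}
    assigned = []
    for x in unlabeled:
        clouds = fit_clouds(groups)
        best = max(clouds, key=lambda label: certainty(x, clouds[label]))
        assigned.append(best)
        if certainty(x, clouds[best]) >= threshold:
            groups[best].append(x)
    return assigned


if __name__ == "__main__":
    # Toy two-feature data; the labels are illustrative only.
    labeled = {
        "normal": [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)],
        "attack": [(5.0, 5.0), (5.2, 4.8), (4.8, 5.2)],
    }
    print(cluster(labeled, [(1.1, 1.0), (4.9, 5.1)]))
```

In this sketch the labeled samples seed one cloud per class, and unlabeled samples refine those clouds only when their certainty is high enough, which is one plausible way to reduce reliance on a priori knowledge.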








