Izvestiya SFedU
Engineering sciences
ISSN 1999-9429 print
ISSN 2311-3103 online
Search Results

Found 6 items.
  • ANALYTICAL REVIEW OF THE DECISION TREE ALGORITHM IN DATA INTELLIGENCE TECHNOLOGY

    E.V. Kuliev, V.A. Semenov, A.V. Kotelva, S.V. Ignateva
    2022-05-26

    The decision tree algorithm is a preferred classification algorithm in data mining, and
    its results are usually expressed as "if-then" rules. C4.5 is a decision tree algorithm that
    is easy to understand and of growing importance, and it improves on its predecessor, ID3,
    by selecting attributes with the information gain ratio. After a theoretical analysis, C4.5 is
    selected to analyze performance appraisal results and to support enterprise performance
    appraisal decisions by collecting data, preprocessing data, calculating information gain and
    determining selection parameters. The system is developed in a B/S (browser/server) architecture
    as an R&D project management platform that can perform evaluation analysis with decision
    analysis results, evaluation tools and web coverage. The system includes information storage,
    task management, reporting, receipt and presentation control, audit control, information
    visualization and other management information system functions. It supports project
    management functions such as creating and managing a project, task flow, filling in and
    managing information about functions, creating a performance evaluation system, creating
    reports of various sizes, and building management. With the decision tree algorithm as the
    core technology, the system obtains scientifically significant project management information
    with high data accuracy and provides visualization, which can help the enterprise maintain a
    good management system across large areas.
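
The attribute-selection step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the appraisal records, attribute names and labels below are hypothetical, and only the standard definitions of entropy, information gain and gain ratio are assumed.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def gain_ratio(rows, labels, attr):
    """Information gain of splitting on `attr`, divided by the split information."""
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Weighted entropy that remains after the split
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the attribute's own value distribution
    split_info = entropy([row[attr] for row in rows])
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical appraisal records (illustrative only)
rows = [
    {"attendance": "high", "team": "A"},
    {"attendance": "high", "team": "B"},
    {"attendance": "low", "team": "A"},
    {"attendance": "low", "team": "B"},
]
labels = ["good", "good", "poor", "poor"]

# Pick the attribute with the highest gain ratio as the next tree node
best = max(["attendance", "team"], key=lambda a: gain_ratio(rows, labels, a))
```

In this toy data `attendance` separates the two classes completely while `team` carries no information, so a tree built this way would split on `attendance` first.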

  • AGGLOMERATIVE CLUSTERIZATION ALGORITHMS FOR THE PROBLEMS OF ANALYSIS OF LINGUISTIC EXPERT INFORMATION

    F.S. Bulyga, V.M. Kureichik
    2022-01-31

    This article discusses and presents the main problems and principles of the data clustering
    process, in particular, the principles and tasks of clustering text arrays of linguistic expert information.
    In the course of this work, the main difficulties arising in the design of such systems were
    identified, for example: the need for preprocessing data, reducing the size of the initial sample,
    etc. To effectively perform the presented tasks, the implemented solution must have an integrated
    approach that takes into account the efficiency indicators of methods aimed at solving individual
    subtasks, as well as the ability to provide high efficiency indicators for the implementation of each
    stage of the clustering process. This work considers various groups of hierarchical clustering
    algorithms, in particular the subgroup of agglomerative clustering algorithms as applied to
    the problems of clustering linguistic expert information. A formal statement of the text
    clustering problem is given, and the main group of implemented
    solutions based on the principles of agglomerative clustering is identified: ROCK, CURE and
    CHAMELEON. Each of these algorithms is reviewed in detail, and its main
    advantages and disadvantages are formulated. The contribution of this work is
    the body of data collected on the algorithms, together with the results of a comparative
    analysis, which make it possible to further assess the feasibility and likelihood of
    using these agglomerative clustering algorithms. The novelty
    of this work lies in the formation of an overview analysis of existing approaches in the field of
    hierarchical clustering for solving the problems of cluster analysis of linguistic expert information,
    as well as the formation of the results of the comparative analysis of the considered algorithms.
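
The bottom-up merging scheme shared by agglomerative methods can be sketched as follows. This is a plain single-linkage illustration on hypothetical 2-D points, not an implementation of ROCK, CURE or CHAMELEON, each of which uses its own similarity measure and merge criterion.

```python
import math

def agglomerative(points, k):
    """Merge the two closest clusters (single linkage) until k clusters remain."""
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters

# Hypothetical document vectors reduced to two dimensions
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (9.0, 0.0)]
result = agglomerative(points, 3)
```

For text data the Euclidean distance would typically be replaced by a similarity defined over document representations; that choice is outside what the abstract specifies.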

  • SOLUTION OF THE PROBLEM OF INTELLECTUAL DATA ANALYSIS BASED ON BIOINSPIRED ALGORITHM

    E.V. Kuliev, D.Y. Zaporozhets, Y.A. Kravchenko, M.M. Semenova
    2022-01-31

    The article discusses a bioinspired algorithm for solving the problems of intellectual analysis.
    The integration of bioinspired algorithms for solving data mining problems is a promising
    area of research. As a bioinspired algorithm, an algorithm based on the adaptive behavior of an
    ant colony is considered. The ant colony algorithm allows for a high-quality search for promising
    solutions to obtain optimal and quasi-optimal solutions. The algorithm has the ability to search for
    suitable logical conditions. The ant colony algorithm is modeled on the behavior of
    living ants in nature. Ants are able to find the shortest path by adapting to changes in the environment.
    The authors propose a modified ant colony algorithm for solving a data
    mining problem. Clustering was chosen as the data mining task. Clustering, the grouping
    of similar objects, is one of the fundamental tasks in data analysis and
    data mining. Its application areas are broad: image segmentation, marketing,
    anti-fraud, forecasting, text analysis and many others. The problem is particularly
    relevant given the constantly growing volume of generated, transmitted and
    processed data. Classical clustering methods are optimized by combining them with the proposed
    bioinspired optimization algorithm, the ant algorithm. The proposed method is a model in which
    ants are represented as agents that move randomly in the solution space subject to some restrictions
    (for example, obstacles in their path). To determine the effectiveness of the developed modified ant
    algorithm (ALA) combined with the clustering algorithm, the authors carried out a series of computational
    experiments. The genetic algorithm, the monkey algorithm and the wolf
    algorithm were used for comparison. The simulation results indicate that the clustering-based ant algorithm gives better results
    than the other algorithms considered.
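
As a rough illustration of how pheromone-guided agents can search a clustering solution space, the sketch below lets each ant sample a point-to-cluster assignment and reinforces the best assignment found so far. It is a generic toy scheme under assumed parameters (`ants`, `iters`, `rho`), not the authors' modified algorithm, and it uses one-dimensional data for brevity.

```python
import random

def sse(values, assign, k):
    """Within-cluster sum of squared errors for a given assignment."""
    total = 0.0
    for c in range(k):
        members = [v for v, a in zip(values, assign) if a == c]
        if members:
            mean = sum(members) / len(members)
            total += sum((v - mean) ** 2 for v in members)
    return total

def ant_cluster(values, k, ants=10, iters=30, rho=0.1, seed=0):
    """Pheromone-guided random search over point-to-cluster assignments."""
    rng = random.Random(seed)
    n = len(values)
    tau = [[1.0] * k for _ in range(n)]  # pheromone for (point, cluster) pairs
    best, best_cost = None, float("inf")
    for _ in range(iters):
        for _ in range(ants):
            # Each ant picks a cluster per point, biased by pheromone
            assign = [rng.choices(range(k), weights=tau[i])[0] for i in range(n)]
            cost = sse(values, assign, k)
            if cost < best_cost:
                best, best_cost = assign, cost
        # Evaporate all trails, then reinforce the best assignment so far
        for i in range(n):
            for c in range(k):
                tau[i][c] *= 1 - rho
            tau[i][best[i]] += 1.0 / (1.0 + best_cost)
    return best, best_cost

# Hypothetical one-dimensional data with two visible groups
values = [1.0, 1.2, 0.9, 8.0, 8.3, 7.9]
assignment, cost = ant_cluster(values, 2)
```

The returned cost can never exceed the cost of placing all points in a single cluster, but the quality of the result depends on the random seed and the assumed parameters.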

  • TEXT VECTORIZATION USING DATA MINING METHODS

    Ali Mahmoud Mansour, Juman Hussain Mohammad, Y. A. Kravchenko
    2021-07-18

    In the text mining tasks, textual representation should be not only efficient but also interpretable,
    as this enables an understanding of the operational logic underlying the data mining
    models. Traditional text vectorization methods such as TF-IDF and bag-of-words are effective and
    characterized by intuitive interpretability, but suffer from the "curse of dimensionality", and they
    are unable to capture the meanings of words. On the other hand, modern distributed methods effectively
    capture the hidden semantics, but they are computationally intensive, time-consuming,
    and uninterpretable. This article proposes a new text vectorization method called Bag of weighted
    Concepts (BoWC), which represents a document according to the concept information it contains. The
    proposed method creates concepts by clustering word vectors (i.e. word embeddings), then uses the
    frequencies of these concept clusters to build document vectors. To enrich the resulting document
    representation, a modified weighting function is proposed that weights concepts based
    on statistics extracted from the word embedding information. The generated vectors are characterized
    by interpretability, low dimensionality, high accuracy, and low computational cost when used in
    data mining tasks. The proposed method has been tested on five different benchmark datasets in
    two data mining tasks, document clustering and classification, and compared with several baselines,
    including Bag-of-words, TF-IDF, Averaged GloVe, Bag-of-Concepts, and VLAC. The results
    indicate that BoWC outperforms most baselines and gives 7 % better accuracy on average.
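
The idea of representing a document by the frequencies of concept clusters can be sketched as below. The embeddings and concept centroids are hypothetical, the clustering step is assumed to have been done already, and plain normalized frequencies stand in for the paper's weighting function, which the abstract does not specify.

```python
import math
from collections import Counter

# Hypothetical 2-D word embeddings (real embeddings have hundreds of dimensions)
embeddings = {
    "cat": (0.9, 0.1), "dog": (0.8, 0.2),
    "car": (0.1, 0.9), "bus": (0.2, 0.8),
}
# Hypothetical concept centroids, e.g. obtained by clustering the embeddings
centroids = [(0.85, 0.15), (0.15, 0.85)]

def concept_of(word):
    """Index of the concept centroid nearest to the word's embedding."""
    vec = embeddings[word]
    return min(range(len(centroids)), key=lambda c: math.dist(vec, centroids[c]))

def document_vector(tokens):
    """Normalized concept frequencies; out-of-vocabulary tokens are skipped."""
    counts = Counter(concept_of(t) for t in tokens if t in embeddings)
    total = sum(counts.values()) or 1
    return [counts[c] / total for c in range(len(centroids))]

vec = document_vector(["cat", "dog", "bus", "the"])
```

Each dimension of the resulting vector corresponds to one concept, which is what makes this kind of representation both low-dimensional and interpretable.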

  • INTELLIGENT DATA ANALYSIS IN ENTERPRISE MANAGEMENT BASED ON THE ANNEALING SIMULATION ALGORITHM

    E.V. Kuliev, A.V. Kotelva, M.M. Semenova, S.V. Ignateva, A.P. Kukharenko
    2022-11-01

    The article presents an analytical review of the simulated annealing algorithm for the
    problem of efficient enterprise management. The simulated annealing algorithm has been optimized
    for this problem. As a case study, optimization of the work schedule of an organization's
    workers is used. A worker scheduling model with strong and weak constraints is established. The simulated annealing algorithm is
    used to optimize the strategy for solving the staff scheduling model. Simulated annealing
    is suitable for solving large-scale combinatorial optimization problems. It also
    evaluates and obtains the optimal scheduling strategy. The simulated annealing algorithm has a
    good effect on the data mining of human resource management. Big data mining can help companies
    conduct dynamic analysis in talent recruitment, and the talent recruitment plan is carried out
    in a quality and standard way to analyze the characteristics of various talents from many angles
    and improve the level of human resource management. An implementation of the simulated
    annealing algorithm has been developed. The simulated annealing algorithm
    accepts new solutions according to the Metropolis criterion, so in addition to accepting an improved
    solution, it also accepts a worse solution within a limited range. The Metropolis algorithm is a sampling
    algorithm mainly used for complex distribution functions. It is somewhat similar to the rejection
    sampling algorithm, but here the auxiliary distribution function changes over time. Experimental
    studies have been carried out that show that a worker scheduling model based on strong
    and weak constraints is significantly better than a manual scheduling model, achieving an effective
    balance between controlling wage costs in an organization and increasing employee satisfaction.
    The successful application of a workforce scheduling model based on a simulated annealing
    algorithm brings new insights into solving large-scale worker scheduling problems.
    The results presented can serve as a starting point for studying personnel management systems
    based on data mining technology.
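
The Metropolis acceptance rule and its use in a scheduling search can be sketched as follows. The cost model is an assumption made for illustration: a strong constraint (a worker assigned to a shift they cannot take) is penalized heavily, and a weak constraint (uneven workload) is penalized lightly. The penalty values, cooling schedule and availability data are hypothetical and are not taken from the paper.

```python
import math
import random

def cost(schedule, unavailable, workers):
    """Strong-constraint penalties plus a weak-constraint workload imbalance."""
    hard = sum(100 for shift, w in enumerate(schedule) if (w, shift) in unavailable)
    loads = [schedule.count(w) for w in range(workers)]
    soft = max(loads) - min(loads)
    return hard + soft

def metropolis_accept(delta, temperature, rng):
    """Always accept improvements; accept worse moves with prob. exp(-delta/T)."""
    return delta <= 0 or rng.random() < math.exp(-delta / temperature)

def anneal(shifts, workers, unavailable, t0=10.0, cooling=0.95, steps=500, seed=0):
    rng = random.Random(seed)
    current = [0] * shifts  # start with every shift given to worker 0
    best = list(current)
    t = t0
    for _ in range(steps):
        # Neighbor: reassign one randomly chosen shift to a random worker
        candidate = list(current)
        candidate[rng.randrange(shifts)] = rng.randrange(workers)
        delta = cost(candidate, unavailable, workers) - cost(current, unavailable, workers)
        if metropolis_accept(delta, t, rng):
            current = candidate
            if cost(current, unavailable, workers) < cost(best, unavailable, workers):
                best = list(current)
        t = max(t * cooling, 1e-3)  # geometric cooling with a small floor
    return best

# Hypothetical availability: worker 0 cannot take shift 0, worker 1 cannot take shift 3
unavailable = {(0, 0), (1, 3)}
best = anneal(6, 3, unavailable)
```

Accepting occasional worse schedules at high temperature is what lets the search leave local minima; as the temperature falls, the rule approaches pure hill climbing.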

  • MONITORING OF THE EDUCATION QUALITY AND IMPLEMENTING OF INDIVIDUAL LEARNING: DEMONSTRATION OF APPROACHES AND EDUCATIONAL DATA MINING ALGORITHMS

    Yass Khudheir Salal, S. M. Abdullaev
    2020-10-11

    The quality monitoring system for traditional and distance education requires the development
    of machine learning classification and quantification techniques necessary to predict individual
    and collective student performance. This article theoretically and experimentally shows that
    the most promising approach that simultaneously solves both forecast tasks is to create heterogeneous
    ensembles consisting of an odd number of different base classifiers, such as decision trees,
    simple neural networks, naive Bayesian classifiers and others. By training and testing 11 different
    binary classifiers on six different samples of educational data, we show that the individual deterministic
    forecast of such ensembles exceeds the accuracy of forecasts of both individual base classifiers
    and homogeneous ensembles created by bagging and boosting. The advantage of
    heterogeneous ensembles is decisive when we deal with the sample imbalance characteristic of
    educational data. In these cases, only forecasts with accuracies exceeding the relative frequency
    of the dominant class in the data sample can be considered useful.
    The main advantage of the heterogeneous ensemble is the ability to transform the deterministic
    forecast into a probabilistic forecast, when instead of referring the object to a particular class, the
    probability of its belonging to individual classes is given. On this basis, we have proposed a new
    method of binary quantification, where individual probabilities of belonging to each of the classes
    of objects are summed up separately, and the resulting total probabilities are interpreted as relative
    frequencies of objects in the sample. As a result of experiments, it is shown that such ensemble
    binary quantification is significantly superior to the traditional "classify and count" method.
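
The difference between "classify and count" and summing per-object probabilities can be shown with a small sketch. It assumes that an object's probability is the share of base classifiers voting for the positive class; the abstract does not state how the probability is derived, and the votes below are hypothetical.

```python
def ensemble_probability(votes):
    """Share of base classifiers voting for the positive class (votes are 0 or 1)."""
    return sum(votes) / len(votes)

def classify_and_count(all_votes):
    """Traditional approach: hard-classify each object, then count positives."""
    positives = sum(1 for votes in all_votes if ensemble_probability(votes) > 0.5)
    return positives / len(all_votes)

def probabilistic_quantification(all_votes):
    """Sum per-object probabilities and read the total as a class frequency."""
    return sum(ensemble_probability(votes) for votes in all_votes) / len(all_votes)

# Hypothetical votes of three base classifiers for four students
all_votes = [[1, 1, 0], [1, 1, 0], [1, 1, 0], [0, 0, 0]]

cc = classify_and_count(all_votes)            # majority vote per student
pq = probabilistic_quantification(all_votes)  # averaged vote shares
```

With an odd number of base classifiers the majority vote is never tied, which is why the hard classification step is well defined. In this toy data the two estimates differ (0.75 versus 0.5) because classify-and-count discards the dissenting votes.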


Address: 347900, Taganrog, Chekhov St., 22, A-211 Phone: +7 (8634) 37-19-80 E-mail: iborodyanskiy@sfedu.ru
Publication is free