Article

Article title THE SOLUTION OF THE SYMBOLIC REGRESSION PROBLEM BY GENETIC SEARCH METHODS
Authors B.K. Lebedev, V.B. Lebedev, O.B. Lebedev
Section SECTION III. MODELING AND DESIGN
Month, Year 02, 2015 @en
Index UDC 681.325
DOI
Abstract The paper discusses new approaches to solving the problem of multiple nonlinear symbolic regression based on the ideas of genetic programming. The solution presented in the form of three chromosomes. Suggests ways to represent trees with arbitrary local degree of the vertices in the form of linear recording. Developed the structure and principles of encoding and decoding of chromosomes, which carry information about the tree structure and having a homologous structure. The structure of a binary tree can be set using Polish expression on the basis of the alphabet A = {O, }. Defined the basic properties of Polish expressions, which are required to be consistent with a binary tree. We propose a linear algorithm to reconstruct a tree in Polish expression. The structure and principles of encoding and decoding a chromosome to represent the Polish expression. The structure and principles of linear recording for the hierarchical tree without constraints on the local degree of internal vertices. Based on the analysis of defined properties such records. The structure and encoding of a chromosome to represent a tree record developed taking into account the above properties. The main genetic operators are crossover and mutation. In the above-described structure of chromosomes genes located in the same loci are homologous. Mutation is the adoption of the genome random values from a specified range of values for the gene in this locus. Implementation of the crossover is performed by exchanging homologous pairs of genes. The layout of the terminal and functional set of nodes in the tree is determined by two additional chromosomes. A distinctive feature of the ways to represent trees in a linear recording exclude the possibility of the loss of the elements of the terminal set, but the model can be arbitrary superposition of functions from some set. Multiple chromosome presenting solutions has allowed us to create a hierarchical structure of genetic operators that allows you to organize targeted searches and expands the possibilities and range of tasks symbolic regression. For large dimensions the temporary performance of the developed algorithm are higher than those of the compared algorithms with the best values of the objective function. For large dimensions the temporary performance of the developed algorithm are higher than those of the compared algorithms with the best values of the objective function. Program solving the problem of multiple nonlinear symbolic regression based on the ideas of genetic programming was implemented in C++ for IBM/PC. Experimental time complexity of the algorithm for one iteration with fixed values of control parameters is O(nlgn), and time complexity of the existing algorithms is O(n2), where n is a power terminal set.

Download PDF

Keywords Regression analysis; multiple nonlinear symbolic regression; method of least squares genetic programming; Polish expression; the hierarchical tree; the structure and coding of the chromosomes; the genetic operators.
References 1. Han J., Kamber M. Data mining: Concepts and Techniques. Morgan Kaufmann Publishers, 2001.
2. Ian H. Witten, Eibe Frank and Mark A. Hall Data Mining: Practical Machine Learning Tools and Techniques. 3rd ed. Morgan Kaufmann, 2011.
3. Radchenko S.G. Metodologiya regressionnogo analiza: Monografiya [Methodology regression analysis: Monograph]. K.: Korniychuk, 2011, 376 p.
4. Dreyper N., Smit G. Prikladnoy regressionnyy analiz [Applied regression analysis]. Moscow: Izdatel'skiy dom «Vil'yams», 2007, 912 p.
5. Strizhov V.V., Krymova E.A. Metody vybora regressionnykh modeley [Methods selection of regression models]. Moscow: VTs RAN, 2010, 60 p.
6. Barsegyan A.A., Kupriyanov M.S., Stepanenko V.V., Kholod I.I. Metody i modeli analiza dannykh: OLAP i Data Mining [Methods and models of data analysis: OLAP and Data Mining]. St. Petersburg: BKhV-Peterburg, 2004, 336 p.
7. Sergienko V.I., Bondareva I.B. Matematicheskaya statistika v klinicheskikh issledovaniyakh [Mathematical statistics in clinical research]. 2nd ed. Moscow: GEOTAR-Media, 2006, 304 p.
8. Konar A. Artificial intelligence and soft computing: behavioral and cognitive modeling of the human brain. CRC Press LLC. Boca Raton, Florida, 2000.
9. Lavan'ini I., Man'o F., Seral'ya R., Tral'di P. Kolichestvennye metody v mass-spektrometrii [Quantitative methods in mass spectrometry]. Moscow: Tekhnosfera, 2008, 176 p.
10. Lebedev V.B., Lebedev O.B. Roevoy intellekt na osnove integratsii modeley adaptivnogo povedeniya murav'inoy i pchelinoy koloniy [Swarm intelligence on the basis of the adaptive behaviour models integration of the ant and beer colonies], Izvestiya YuFU. Tekhnicheskie nauki [Izvestiya SFedU. Engineering Sciences], 2013, No. 7 (144), pp. 41-47.
11. Lebedev B.K., Lebedev O.B. Modelirovanie adaptivnogo povedeniya murav'inoy kolonii pri poiske resheniy, interpretiruemykh derev'yami [Modelling of an ant colony adaptive behaviour by search of the decisions interpreted by trees], Izvestiya YuFU. Tekhnicheskie nauki [Izvestiya SFedU. Engineering Sciences], 2012, No. 7 (132), pp. 27-35.
12. Alpert C.J., Mehta D.P., and Sapatnekar S.S. Handbook of Algorithms for Physical Design Automation. Boston, MA: Auerbach, 2009.
13. Koza J.R. Hierarchical genetic algorithms operating on populations of computer programs. In N.S. Sridharan (Ed.), Eleventh International Joint Conference on Artificial Intelligence. Morgan Kaufmann, 1989, pp. 768-774.
14. Koza J.R. Genetic Programming: On the Programming of Computers by means of Natural Selection. Cambridge MA, MIT Press, 1992.
15. Kureichik V.M., Lebedev B.K., Lebedev O.B. A hybrid partitioning algorithm based on natural mechansms of decision making, Scientific and Technical Information Processing, 2012, No. 39 (6), pp. 317-327.
16. Lebedev B.K. and. Lebedev V.B. Synthesis of Mathematical Expressions by Methods of Genetic Search, Proceedings of the International Scientific Conferences “Intelligent Systems (IEEE AIS’06)”and “Intelligent CAD’s (CAD- 2006)”. Scientific publication in 3 volumes. Moscow: Physmathlit, 2006, Vol. 3, pp. 29-34.
17. Lebedev B.K., Lebedev V.B. Evolyutsionnaya protsedura obucheniya pri raspo-znavanii obrazov derev'yami [Evolutionary procedure learning in pattern recognition trees], Izvestiya TRTU [Izvestiya TSURe], 2004, No. 8 (43), pp. 83-88.
18. Kureichik V.M., Lebedev B.K. and Lebedev V.B. VLSI Floorplanning Based on the Integration of Adaptive Search Models. ISSN 1064_2307, Journal of Computer and Systems Sciences International, 2013, Vol. 52, No. 1, pp. 80-96.

Comments are closed.