CALCULATION OF THE NUMBER OF SOLUTIONS TO THE EQUATION OF THE FIRST MULTIPLICITY OF TYPES UNDER RESTRICTIONS ON THE FREQUENCY OF OCCURRENCE OF ALPHABET CHARACTERS

  • A.K. Melnikov
Keywords: Probability, statistics, exact distribution, an accurate approximation of the vector of multiplicity of types, linear equation algorithmic complexity

Abstract

The article considers the number of solutions to the equation of the first multiplicity of types,
composed of vectors of multiplicity of types, each element of which is the number of occurrences of
elements of a certain type (any sign of the alphabet) in the sample under consideration. The equation
of the first multiplicity of types relates the number of occurrences of elements of all types in
the sample under consideration and the volume of this sample. The main attention is paid to the
conclusion and proof of the correctness of the expression that determines the number of nonnegative
integer solutions of the equation of the first multiplicity of types under conditions of restrictions
on the frequency of occurrence of alphabet characters. The solution of the equation ofthe first multiplicity of types is the basis for calculating exact approximations of the probabilities
of statistical values by the first multiplicity method, where the exact approximations are Δexact
distributions that differ from the exact distributions by no more than a predetermined, arbitrarily
small value Δ. The value that expresses the number of solutions to the equation of the first multiplicity
of types is one of the values that determine the algorithmic complexity of the method of the
first multiplicity, without knowing the value of which it is impossible to determine the parameters
of samples for which, under restrictions on the computational resource, exact approximations of
distributions can be calculated. Also, the value expressing the number of solutions to the equation
of the first multiplicity of types is used in the method of the first multiplicity to limit the search area
for solutions to the equation. The number of solutions to the equation of the first multiplicity is
considered under conditions of restriction on the maximum value of the elements of the multiplicity
vector, and the case is considered when one or more elements of the alphabet may be missing in
the sample. First obtained the expression that defines the number of nonnegative integer solutions
to equations of the first multiplicity of types in terms of restrictions on the values of the frequencies
of occurrence of signs and the possibility of absence of one or more characters of the alphabet in
the sample reviewed. Analytical expressions are obtained that allow calculating the number of
integer nonnegative solutions of the equation of the first multiplicity of types for any values of the
alphabet power, the sample size, and the limit on the maximum frequency of occurrence of alphabet
characters. The form of the obtained expression allows you to use it when studying the algorithmic
complexity of calculating exact approximations of probability distributions of statistical
values with a pre-specified accuracy Δ.

References

1. Mel'nikov A.K. Slozhnost' rascheta tochnykh raspredeleniy veroyatnosti simmetrichnykh
additivno razdelyaemykh statistik i oblast' primeneniya predel'nykh raspredeleniy [The complexity
of calculating the exact probability distributions of symmetric additive-separated statistics
and the application of limit distributions], Doklady TUSUR [Proceedings of Tomsk State
University]. Tomsk, 2017, Vol. 20, No. 4, pp. 126-130. ISSN 1818-0442.
2. Mel'nikov A.K., Ronzhin A.F. Obobshchennyy statisticheskiy metod analiza tekstov,
osnovannyy na raschete raspredeleniy veroyatnosti znacheniy statistik [Generalized statistical
method of text analysis based on calculation of probability distribution of statistical values]
Informatika i ee primeneniya [Informatics and its applications], 2016, Vol. 10, Issue 4,
pp. 91-97. ISSN 1992-2264.
3. Melnikov A.K. Application of the calculation method of near-exact statistics probability distributions,
Obozrenie prikladnoy i promyshlennoy matematiki [Review of Applied and Industrial
Mathematics], 2018, Vol. 25, Issue 2, pp. 153-154. ISSN 0869-8325. Available at:
https://tvp.ru/conferen/vsppmXIX/repso050.pdf (accessed 24 January 2019).
4. Yudin D.B., Yudin A.D. Matematiki izmeryayut slozhnost' [Mathematicians measure complexity].
Moscow: Knizhnyy dom "LIBROKOM", 2018, 192 p. ISBN 978-5-397-06042-4.
5. Borovkov A.A. Matematicheskaya statistika [Mathematical statistics]. Novosibirsk: Izd-vo IM
SORAN, Nauka, 1997, 772 p.
6. Kolchin V.F., Sevast'yanov B.A., Chistyakov V.P. Sluchaynye razmeshcheniya [Random
placements]. M.: Nauka, Glavnaya redaktsiya fiziko-matematicheskoy literatury, 1976, 224 p.
7. Sachkov V.N. Vvedenie v kombinatornye metody diskretnoy matematiki [Introduction to combinatorial
methods of discrete mathematics]. Moscow: Nauka. Gl. red. fiz.-mat. lit., 1982, 384 p.
8. Zuev Yu.A. Sovremennaya diskretnaya matematika: Ot perechislitel'noy kombinatoriki do
kriptografii XXI veka [Modern discrete mathematics: From enumerative combinatorics to
cryptography of the XXI century]. Moscow: LENARD, 2019, 720 p. (Fundamentals of
Information security No. 17). ISBN 978-5-9710-5660-7.
9. Zuev Yu.A. Sovremennaya diskretnaya matematika: Ot perechislitel'noy kombinatoriki do
kriptografii XXI veka. Bolee 700 zadach s resheniyami [Sovremennaya diskretnaya
matematika: Ot perechislitel'noy kombinatoriki do kriptografii XXI veka]. Moscow:
LENARD, 2019, 304 p. (Osnovy zashchity informatsii No. 18). ISBN 978-5-9710-5662-1.
10. Kholl M. Kombinatorika [Combinatorial theory]. Moscow: Mir, 1970, 424 p.
11. http://www.problems.ru/view_problem_details_new.php?id=30719.
12. Endryus G. Teoriya razbieniy [The theory of partitions]: transl. from engl. Moscow: Nauka.
Gl. red. fiz.-mat. lit., 1982, 256 p.
13. Dzh. Riordan. Vvedenie v kombinatornyy analiz [Introduction to combinatorial analysis].
Moscow.: Izd-vo inostrannoy lit., 1963, 288 p.
14. Dzh. Riordan. Kombinatornye tozhdestva [Combinatorial identities]. Moscow: Nauka. Gl. red.
fiz.-mat. lit., 1982, 255 p.
15. Fisher R.A. Statisticheskie metody dlya issledovateley [Statistical methods for research workers]:
transl. from engl. Moscow: Gosstatizdat, 1958, 73 p.
16. Mel'nikov A.K. Otnositel'naya algoritmicheskaya slozhnost' rascheta tochnykh priblizheniy
raspredeleniy veroyatnostey znacheniy statistik [Relative algorithmic complexity of exact approximations
for probability distributions of statistics values], Modeli, metody i tekhnologii
intellektual'nogo upravleniya (IU-2019): Mater. 12-y Mul'tikonferentsii po problemam
upravleniya (MKPU-2019) [Models, methods and technologies of intelligent management (IU-
2019): Materials of the 12th Multi-Conference on Management Problems (ICPU-2019)]: i
n 4 vol. Vol. 1. Rostov-on-Don; Taganrog: Izd-vo YuFU, 2019, pp. 108-112. ISBN 978-5-
9275-3189-9.
17. Mel'nikov A.K. Otnositel'naya algoritmicheskaya slozhnost' rascheta tochnykh priblizheniy
raspredeleniy veroyatnostey znacheniy statistik [Relative algorithmic complexity of exact approximations
for probability distributions of statistics values], Izvestiya YuFU. Tekhnicheskie
nauki [Izvestiya SFedU. Engineering Sciences], 2019, No. 7 (209), pp. 35-45.
18. Svidetel'stvo o gosudarstvennoy registratsii programmy dlya EVM № 2018664980. Raschet
veroyatnostey znacheniy statistiki maksimal'noy chastoty. Pravoobladatel' Mel'nikov A.K.
Avtor: Mel'nikov A.K. i Zelyukin N.B. Zayavka № 2018662123. Data postupleniya 01
noyabrya 2018 g. Data gosudarstvennoy registratsii v Reestre programm dlya EVM 27noyabrya 2018 g. [Certificate of state registration of computer software № 2018664980. Calculation
of probabilities of maximum frequency statistics. Proprietor – A.K. Melnikov. Authors: A.K.
Melnikov and N.B. Zeliukin. Application № 2018662123. Date of filing – November 01, 2018.
Date of state registration in the Register of computer software – November 27, 2018].
19. Mel'nikov A.K. Metodika rascheta raspredeleniya veroyatnostey znacheniy simmetrichnykh
additivno razdelyaemykh statistik, priblizhennykh k ikh tochnomu raspredeleniyu [A method
for calculating the probability distribution of values of symmetric additively separated statistics
that are close to their exact distribution], Nauchnyy vestnik NGTU [Science bulle-tin of the
Novosibirsk state technical university], 2018, No. 1 (70), pp. 153-166. ISBN 1814-1196. Doi:
10.17212/1814-1196-2018-1-153-166.
20. Hutchinson T.P. 1979. The validity of the chi-squared test when expected frequencis are small:
A list of recent research references, Commun. Stat. A Theor., Vol. 8, No. 4, pp. 327-335.
Published
2021-02-25
Section
SECTION II. MATHEMATICAL AND SYSTEM SOFTWARE FOR SUPERCOMPUTERS