METHODOLOGY OF TOPOLOGICAL RESTRICTIONS FOR INTENSIVELY USED FPGA RESOURCE

  • K.N. Alekseev “Supercomputers and Neurocomputers Research Center” Co Ltd
  • D.A. Sorokin “Supercomputers and Neurocomputers Research Center” Co Ltd
  • A.L. Leont'ev “Supercomputers and Neurocomputers Research Center” Co Ltd
Keywords: Reconfigurable computer systems, FPGA, CAD, Physical Constraints, Placement Constraints, Timing Closure

Abstract

This paper addresses the problem of achieving high real performance on reconfigurable
computer systems when implementing computationally expensive tasks from various problem areas.
The real performance of such a system is determined by the parameters of the programs it executes.
The main component of these programs is the computing structure for data processing, implemented
as FPGA configuration files. One of the key parameters of any computing structure is its clock
frequency, which directly affects its performance. However, several problems hinder reaching
high clock rates, and modern CAD tools cannot solve them. The reason is the non-optimal
topological placement of the functional blocks of the computing structure within the field of
FPGA primitives, especially at high resource utilization. As a result, the load on the FPGA
switching matrix increases, and the connections between functionally dependent FPGA primitives
become much longer than is acceptable. Excessive connection length also arises when routing
connections between primitives that are placed on different FPGA chips or are physically
separated by on-chip peripherals.
We describe a methodology that optimizes the placement of computing structure elements on FPGA
primitives, minimizing both the length of traces between primitives and the number of traces
between physically separated topological sections of the FPGA.
To validate the proposed methodology, we implemented a test task, an FIR filter, on the
reconfigurable computer Tertius. We demonstrate the main problems in reaching the target clock
rate and describe a method for solving them. The methodology makes it possible to raise the
clock rate and thereby increase the performance of Tertius by 25% without revising the
functional circuit of the task's computing structure. Based on our current study of the
methodology and its efficiency, we claim that CAD tools that create topological restrictions
according to this methodology will significantly reduce the time needed to develop programs
with the required characteristics for reconfigurable computer systems.
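
The two quantities the abstract says the methodology minimizes, trace length between primitives and the number of traces between physically separated sections, can be illustrated with a small sketch. This is not the authors' tool or algorithm: it assumes half-perimeter wirelength (HPWL) as a stand-in estimate of trace length, a hypothetical grid of (column, row) primitive sites, and a hypothetical horizontal split of the fabric into sections of REGION_HEIGHT rows. The four-cell chain stands in for neighbouring stages of an FIR filter; all names and coordinates are illustrative.

```python
from typing import Dict, List, Tuple

Site = Tuple[int, int]  # hypothetical (column, row) coordinate of a primitive site


def hpwl(net: List[str], placement: Dict[str, Site]) -> int:
    """Half-perimeter wirelength of one net: bounding-box width plus height."""
    xs = [placement[cell][0] for cell in net]
    ys = [placement[cell][1] for cell in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def region_of(site: Site, region_height: int) -> int:
    """Index of the horizontal section that contains a site (assumed layout)."""
    return site[1] // region_height


def evaluate(nets: List[List[str]], placement: Dict[str, Site],
             region_height: int) -> Tuple[int, int]:
    """Return (total HPWL, number of nets spanning more than one section)."""
    total = sum(hpwl(net, placement) for net in nets)
    crossings = sum(
        1 for net in nets
        if len({region_of(placement[cell], region_height) for cell in net}) > 1
    )
    return total, crossings


# Four chained stages (hypothetical names) and the nets that connect them.
NETS = [["t0", "t1"], ["t1", "t2"], ["t2", "t3"]]
REGION_HEIGHT = 4

# Placement left to scatter across two sections.
SCATTERED = {"t0": (0, 0), "t1": (9, 7), "t2": (1, 1), "t3": (8, 6)}
# Placement restricted to adjacent sites inside one section.
CONSTRAINED = {"t0": (0, 0), "t1": (1, 0), "t2": (2, 0), "t3": (3, 0)}

if __name__ == "__main__":
    print("scattered:  ", evaluate(NETS, SCATTERED, REGION_HEIGHT))
    print("constrained:", evaluate(NETS, CONSTRAINED, REGION_HEIGHT))
```

In this toy case the restricted placement lowers both metrics at once, which is the effect topological restrictions are meant to have on a real design; actual trace delays depend on the routing resources of the specific FPGA and are not modelled here.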

Published: 2022-11-01
Section: SECTION II. INFORMATION PROCESSING ALGORITHMS