In many scientific applications, it is necessary to compute sums of floating-point numbers. Summation is a building block for many numerical algorithms, such as dot products, Taylor series, polynomial interpolation, and numerical integration. However, summing large sets of numbers in finite-precision IEEE 754 arithmetic can be very inaccurate due to the accumulation of rounding errors. There are various ways to reduce rounding errors in floating-point summation. One of them is the use of multiple-precision arithmetic libraries. Such libraries provide data structures and subroutines for processing numbers whose precision exceeds that of the IEEE 754 floating-point formats. In this paper, we consider multiple-precision summation on hybrid CPU-GPU platforms using MPRES, a new software library for multiple-precision computations on CPUs and CUDA-compatible GPUs. Unlike existing multiple-precision libraries based on the binary representation of numbers, MPRES uses a residue number system (RNS). In RNS, a number is represented as a tuple of residues obtained by dividing the number by each modulus in a given set, and multiple-precision operations such as addition, subtraction, and multiplication are naturally divided into groups of reduced-precision operations on residues, performed in parallel and without carry propagation. We consider the algorithm for the addition of multiple-precision floating-point numbers in MPRES, as well as three summation algorithms: (1) recursive summation, (2) pairwise summation, and (3) block-parallel hybrid CPU-GPU summation. Experiments show that the hybrid algorithm allows full utilization of the GPU’s resources and therefore demonstrates better performance. In addition, the parallel computation of the digits (residues) of multiple-precision significands in RNS reduces computation time.
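
To make the RNS idea concrete, the following minimal Python sketch shows residue-wise addition without carry propagation, together with recursive and pairwise summation built on top of it. It is an illustration under stated assumptions, not the MPRES implementation: the moduli set `MODULI` and the helper names `to_rns`, `rns_add`, `from_rns`, `recursive_sum`, and `pairwise_sum` are hypothetical, and the sketch covers only non-negative integers in RNS, not the exponent handling that floating-point significands require.

```python
from math import prod

# Hypothetical toy moduli (pairwise coprime); NOT the moduli set used by MPRES.
MODULI = (7, 11, 13, 15)  # dynamic range: 7 * 11 * 13 * 15 = 15015


def to_rns(x, moduli=MODULI):
    """Represent a non-negative integer as a tuple of residues."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli=MODULI):
    """Add two RNS numbers; each residue is handled independently (no carries)."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def from_rns(residues, moduli=MODULI):
    """Recover the integer via the Chinese remainder theorem."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        m_i = big_m // m
        # pow(m_i, -1, m) is the modular inverse of m_i modulo m (Python 3.8+).
        total += r * m_i * pow(m_i, -1, m)
    return total % big_m


def recursive_sum(values):
    """Recursive (sequential) summation: accumulate left to right."""
    acc = to_rns(0)
    for v in values:
        acc = rns_add(acc, v)
    return acc


def pairwise_sum(values):
    """Pairwise summation: split in half, sum each half, add the two results."""
    if not values:
        return to_rns(0)
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return rns_add(pairwise_sum(values[:mid]), pairwise_sum(values[mid:]))


if __name__ == "__main__":
    data = [123, 456, 789, 1000]
    rns_data = [to_rns(x) for x in data]
    print(from_rns(recursive_sum(rns_data)))  # equals sum(data) while below 15015
    print(from_rns(pairwise_sum(rns_data)))
```

In this exact integer setting both summation orders give the same result as long as the true sum stays below the product of the moduli. The sketch is meant only to show that each residue position can be processed independently, which is the property that makes the operation easy to parallelize.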
