Abstract
This work is motivated by the need to solve realistic problems with complex energy, space, and angle dependence, which requires parallel multigroup transport sweeps combined with efficient acceleration of the thermal upscattering. We present various iterative schemes based on the two-grid (TG) diffusion synthetic acceleration (DSA) method. In its original form, the TG method is used with the Gauss-Seidel iterative scheme over energy groups, which makes it impractical for parallel computation. We therefore formulate a Jacobi-style version. Furthermore, we propose a new scheme that reduces the overall number of transport sweeps by removing the need to fully converge the within-group iterations before the TG step. This is made possible by an additional within-group DSA solve after each transport sweep. Fourier analyses are carried out to ascertain the effectiveness of the proposed scheme, corroborated by massively parallel numerical results for practical problems. We discuss several implementation strategies for the new scheme, paying particular attention to the consequences for overall efficiency of adding diffusion solves with a relatively low number of degrees of freedom per process.
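As a minimal illustration of the iteration structure described above, the following sketch shows one outer iteration of the proposed scheme. The callables sweep, wg_dsa, and tg_dsa are hypothetical stand-ins for the transport sweep, the within-group DSA solve, and the TG acceleration step; none of these names come from PDT or the paper itself.

    def outer_iteration(phi_old, source, sweep, wg_dsa, tg_dsa, thermal_groups):
        # Jacobi-style loop over energy groups: every group's scattering source
        # is built from the previous outer iterate, so the group sweeps are
        # mutually independent and can execute in parallel.
        phi_new = dict(phi_old)
        for g in thermal_groups:
            # A single transport sweep per group, instead of fully converged
            # within-group iterations.
            phi_half = sweep(g, phi_old, source)
            # The additional within-group DSA solve after each sweep; this is
            # what removes the need to converge the within-group iterations
            # before the TG step.
            phi_new[g] = phi_half + wg_dsa(g, phi_half - phi_old[g])
        # TG step: a one-group diffusion solve over the thermal range that
        # attenuates the slowly converging upscattering error modes.
        return tg_dsa(phi_new)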
Acknowledgments
This material is based upon work supported by the U.S. Department of Energy, National Nuclear Security Administration, under award number(s) DE-NA0002376.
Notes
a By effectivity, we mean how effective a particular scheme is in accelerating the convergence (reducing the number of iterations); by efficiency, we mean computational efficiency, i.e., how expensive it is to perform one iteration of the scheme.
b Of course, in the case of an eigenvalue problem, the fixed source problem (1) represents only one iteration in the overall fission source update scheme.
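As context for note b, such an outer fission source update is typically a power iteration; a minimal sketch follows, in which solve_fixed_source and fission_source are hypothetical callables supplied by the caller, not part of the paper's formulation beyond problem (1) itself.

    def power_iteration(solve_fixed_source, fission_source, phi, k=1.0,
                        tol=1e-8, max_iters=500):
        # Each pass through this loop is one fission source update; the call to
        # solve_fixed_source is one instance of the fixed source problem (1),
        # driven internally by the transport iterations discussed in the paper.
        for _ in range(max_iters):
            src = fission_source(phi)            # fission source from current flux
            phi = solve_fixed_source([s / k for s in src])
            src_new = fission_source(phi)
            k_new = k * sum(src_new) / sum(src)  # classic power-iteration update
            if abs(k_new - k) <= tol * abs(k_new):
                return phi, k_new
            k = k_new
        return phi, k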
c Typically, Message Passing Interface (MPI) ranks; however, depending on the mode of execution, a process can also represent an OpenMP thread, a POSIX thread, etc.
d Note that the number of sweeps in the fast range was the same for all methods (69 608), such that the fast flux is fully resolved in that range to within machine underflow precision.
e Although not currently implemented in PDT, this would also allow for nested parallelism, with these block inverses themselves performed in parallel.
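A minimal sketch of what such nested parallelism could look like, assuming each diffusion block exposes a solve method (the names below are illustrative, not PDT's API), would dispatch the independent block inverses to a thread pool beneath the existing MPI decomposition:

    from concurrent.futures import ThreadPoolExecutor

    def apply_block_inverses(blocks, rhs_blocks, n_workers=4):
        # The per-block inverses are mutually independent, so they can be
        # dispatched to worker threads without any communication.
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            return list(pool.map(lambda b, r: b.solve(r), blocks, rhs_blocks))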
f Unless the matrices are saved for each group; we did not pursue this option because, at the time of implementation, the Hypre library did not provide functions for this purpose in its public interface. Instead, we formulated the gaDSA scheme, which effectively circumvents this issue, as we will show momentarily.
g Our preliminary results obtained by using a combination of different iterative methods for different parts of the solution spectra are encouraging, and we plan to report these results in a future publication.
h We used what we call semivolumetric partitioning to distinguish it from the fully volumetric case; in the language of Ref. 1, it falls into the category of volumetric partitioning schemes.
i For the sake of completeness, we state the aggregation settings here: a single group set containing all groups, polar angular aggregation (16 angles per angle set), cells per cell set, and cell sets per process.
j Note that the Richardson scheme performs one sweep per transport iteration plus one sweep to compute the right side; on top of that, GMRES requires an additional sweep before every restart, one at the end of the calculation, and one at the beginning to compute the initial residual.
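Under this accounting, for N_it transport iterations and N_r restarts, the total sweep counts come out as follows (a direct tally of note j, not a formula stated in the paper):

    N_sweeps(Richardson) = N_it + 1
    N_sweeps(GMRES)      = (N_it + 1) + N_r + 1 + 1 = N_it + N_r + 3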