Abstract
Two types of broadcast in algorithms are determined: (1) a data broadcast, where one data value is used for more than one computation and (2) a computational broadcast where one variable is computed in more then one computation. Both types of broadcast are preferred to be eliminated when a processor array implementation is desired by using VLSI technology.
When the algorithm computes only one variable value for each index vector then the computational broadcast can be eliminated in a straight forward manner by introducing counter values resulting in a single assignment code. However in cases when the algorithm computes two or more variable values that are specified by a different computational broadcast, has not been considered. As far as is known it has been solved by deriving localized algorithms in single assignment code heuristically.
In this paper we define this problem in terms of a system of affine recurrence equations and analyze the data dependencies introduced. Then we show a synthesis procedure that eliminates the computational broadcast and a few examples of implementation are shown. The QR decomposition algorithm is also presented in a localized single assignment code by using the proposed method and several different parallel implementations are discussed.
∗University “kiril i Metodij” of Skopje, PMF Institute za Informatika, Arhimedova 5, p.f. 162, YU-91000 Skopje, Macedonia, Yugoslavia.
∗University “kiril i Metodij” of Skopje, PMF Institute za Informatika, Arhimedova 5, p.f. 162, YU-91000 Skopje, Macedonia, Yugoslavia.
Notes
∗University “kiril i Metodij” of Skopje, PMF Institute za Informatika, Arhimedova 5, p.f. 162, YU-91000 Skopje, Macedonia, Yugoslavia.