Abstract
Folding transformations on processor arrays, as introduced in [1], result in smaller processor arrays, more work for the processing elements, a decrease in I/O time, pipelineable implementations and circular data flow. Some implementations of folding transformations have been considered by Megson and Evans in [2], Choffrut and Culik in [3], Yaacoby and Cappello in [4], and by the authors in [1, 5-7].
Planar processor arrays are considered in this paper, and the generalised procedure given in [7] is applied. The examples show how the complexity of the data communications and the processor operations is analysed; the solutions chosen are those that offer regular data flow without data collision. Under this procedure a set of processors requires switching functions, which can be avoided by applying the double mapping technique introduced in [5].
The most common types of matrix algorithms are presented, and all the possible planes (lines) of symmetry and vectors of the interlocking translation are determined. Two examples, the matrix multiplication algorithm and the LU decomposition algorithm, are shown.
The efficiency analysis shows that the resulting implementation utilises the processor array with double the efficiency.