Original Articles

Automatic generation of fast algorithms for matrix–vector multiplication

Pages 626-644 | Received 28 Jun 2015, Accepted 03 Sep 2016, Published online: 16 Mar 2017

References

  • R. Ahmed and B.L. Evans, Optimization of signal processing algorithms, Conference Record of the Asilomar Conference on Signals, Systems and Computers, IEEE, Pacific Grove, California, 1997, pp. 1401–1406.
  • N. Ahmed and K.R. Rao, Orthogonal Transforms for Digital Signal Processing, 1st ed., Springer, Berlin, Heidelberg, 1975.
  • R.E. Blahut, Fast Algorithms for Digital Signal Processing, 1st ed., Addison-Wesley, Boston, 1985.
  • I.N. Bronshtein, K.A. Semendyayev, G. Musiol, and H. Mühlig, Handbook of Mathematics, Springer, Berlin, Germany, 2015, p. 889.
  • R.A. Brualdi, Combinatorial Matrix Classes, Cambridge University Press, Cambridge, UK, 2006, pp. 2–22.
  • A. Cariow, Strategies for the synthesis of fast algorithms for the computation of the matrix–vector products, J. Signal Process. Theory Appl. 3 (2014), pp. 1–19.
  • H.K. Garg, Digital Signal Processing Algorithms: Number Theory, Convolution, Fast Fourier Transforms, and Applications, 1st ed., Computer Science and Engineering, CRC Press, Boca Raton, Florida, 1998.
  • K. Goto and R.A. van de Geijn, Anatomy of high-performance matrix multiplication, ACM Trans. Math. Softw. 34 (2008), pp. 1–25. doi: 10.1145/1356052.1356053
  • D. Henderson, S.H. Jacobson, and A.W. Johnson, The theory and practice of simulated annealing, in Handbook of Metaheuristics, F. Glover and G.A. Kochenberger, eds., International Series in Operations Research and Management Science, Vol. 57, chap. 10, Springer, New York, 2003, pp. 287–319.
  • A.A. Hopgood, Intelligent Systems for Engineers and Scientists, CRC Press, Boca Raton, Florida, 2011, pp. 151–153.
  • A. Juels and M. Wattenberg, Stochastic hillclimbing as a baseline method for evaluating genetic algorithms, Tech. Rep., University of California at Berkeley, Berkeley, CA, 1994.
  • S. Ledesma, G. Aviña, and R. Sanchez, Practical considerations for simulated annealing implementation, in Simulated Annealing, C.M. Tan, ed., chap. 20, InTech, Vienna, Austria, 2008, pp. 401–420.
  • C.D. Meyer, Matrix Analysis and Applied Linear Algebra, SIAM: Society for Industrial and Applied Mathematics, Philadelphia, 2001.
  • P.A. Milder, F. Franchetti, J.C. Hoe, and M. Püschel, Computer generation of hardware for linear digital signal processing transforms, ACM Trans. Des. Autom. Electron. Syst. 17 (2012), pp. 1–33. doi: 10.1145/2159542.2159547
  • B. Mitra, S. Jha, and P.P. Chaudhuri, A simulated annealing based state assignment approach for control synthesis, VLSI Design, Fourth CSI/IEEE International Symposium, New Delhi, 1991.
  • C. Mouilleron, Efficient computation with structured matrices and arithmetic expressions, Ph.D. diss., École normale supérieure de Lyon (ENS Lyon), 2011.
  • M. Püschel, J.M.F. Moura, B. Singer, J. Xiong, J. Johnson, D. Padua, M. Veloso, and R.W. Johnson, Spiral: A generator for platform-adapted libraries of signal processing algorithms, J. High Perform. Comput. Appl. 18 (2004), pp. 21–45. doi: 10.1177/1094342004041291
  • D.J. Rabideau and A. Steinhardt, Simulated annealing for mapping DSP algorithms onto multiprocessors, Signals Syst. Comput. 1 (1993), pp. 668–672. doi: 10.1109/ACSSC.1993.342603
  • C. Rambabu and I. Chakrabarti, An efficient hillclimbing-based watershed algorithm and its prototype hardware architecture, J. Signal Process. Syst. 52 (2008), pp. 281–295. doi: 10.1007/s11265-007-0157-3
  • P.A. Regalia and S.K. Mitra, Kronecker products, unitary matrices, and signal processing applications, SIAM Rev. 31 (1989), pp. 583–613.
  • S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, Pearson, London, UK, 2009, pp. 111–114.
  • M. Schmid and R. Schneider, Parallel Simulated Annealing Techniques for Scheduling and Mapping DSP – Applications onto Multi-DSP Platforms, Proceedings of the International Conference on Signal Processing Applications and Technology, Miller Freeman, Inc., Orlando, Florida, 1999.
  • R. Singh and S.K. Arya, Design of IIR digital filter using simulated annealing, Proceedings of the International Conference on Advanced Computing and Communication Technologies, Rohtak, Haryana, 2011.
  • D.G. Spampinato and M. Püschel, A Basic Linear Algebra Compiler, Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, no. 23, CGO '14, ACM, Orlando, FL, 2014, pp. 23–32.
  • A. Ţariov, Strategies of rationalization of computing matrix–vector products, Metody Inf. Stos. 13 (2008), pp. 147–158.
  • A. Ţariov, Algorithmic aspects of computing rationalization in digital signal processing, PPH ZAPOL, Szczecin, Poland, 2011.
  • S.V. Vaseghi, Advanced Digital Signal Processing and Noise Reduction, Wiley, 2008, pp. 128–130 (chap. 4).
  • R. Vuduc and H.J. Moon, Fast sparse matrix vector multiplication by exploiting variable block structure, in High Performance Computing and Communications, L.T. Yang, O.F. Rana, B. Di Martino and J. Dongarra, eds., Vol. 3726, Springer, Berlin, Heidelberg, 2005, pp. 807–816.
  • B. Wess, Minimization of data address computation overhead in DSP programs, Design Automat. Embed. Syst. 4 (1999), pp. 167–185. doi: 10.1023/A:1008961206784
