
Efficient CNN Accelerator on FPGA

