Efficient CNN Accelerator on FPGA

S Kala & S Nalesh

Abstract

Convolutional neural networks (CNNs) are classical models for computer vision and machine learning applications such as video surveillance, pattern recognition, weather forecasting, traffic, and safety. CNNs involve computationally intensive operations and require high off-chip memory bandwidth, which makes them challenging to deploy on real-time embedded systems. Compared with central processing units and graphics processing units, field-programmable gate array (FPGA)-based CNN implementations are gaining popularity owing to their flexibility and efficiency. In this work, we present an efficient, high-performance CNN accelerator based on a blocked Winograd-GEMM architecture. We implement the ResNet-18 CNN model on an XC7VX690T FPGA using the proposed architecture. This implementation operates at a clock frequency of 200 MHz and achieves an average throughput of 383 GOPS, which is comparable to other state-of-the-art implementations. This manuscript is an extended version of [S. Kala, J. Mathew, B. R. Jose, and S. Nalesh, “UniWiG: Unified Winograd-GEMM Architecture for Accelerating CNN on FPGAs,” in 2019 32nd International Conference on VLSI Design and 2019 18th International Conference on Embedded Systems (VLSID), Delhi, NCR, India, 2019, pp. 209–214. DOI: 10.1109/VLSID.2019.00055.].
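To illustrate the Winograd idea the abstract refers to, the following is a minimal software sketch of the standard F(2×2, 3×3) minimal-filtering transform applied to one tile. It is not the accelerator's hardware design: the abstract does not state a tile size, so F(2×2, 3×3) is an assumption, and the names `winograd_tile` and `direct_tile` are hypothetical helpers chosen for this example. The sketch shows how a 2×2 output tile can be computed with 16 element-wise multiplications in the transformed domain instead of the 36 required by direct 3×3 filtering.

```python
import numpy as np

# Standard F(2x2, 3x3) Winograd transform matrices (assumed tile size;
# the abstract does not specify which variant the accelerator uses).
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=np.float64)   # input transform
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]], dtype=np.float64)   # filter transform
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=np.float64)    # output transform


def winograd_tile(d, g):
    """Compute a 2x2 output tile from a 4x4 input tile d and a 3x3 filter g."""
    U = G @ g @ G.T        # transformed filter, 4x4
    V = BT @ d @ BT.T      # transformed input tile, 4x4
    M = U * V              # 16 element-wise multiplications
    return AT @ M @ AT.T   # inverse transform back to a 2x2 output tile


def direct_tile(d, g):
    """Reference: direct sliding-window filtering (CNN-style correlation)."""
    out = np.zeros((2, 2))
    for i in range(2):
        for j in range(2):
            out[i, j] = np.sum(d[i:i + 3, j:j + 3] * g)  # 9 multiplies each
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = rng.standard_normal((4, 4))
    g = rng.standard_normal((3, 3))
    print(np.allclose(winograd_tile(d, g), direct_tile(d, g)))
```

When the same tile position is processed across many input and output channels, the element-wise products accumulate over channels and can be grouped into matrix multiplications, which is one common way Winograd convolution is mapped onto GEMM-style hardware. Whether the proposed architecture does exactly this is not stated in the abstract.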

Additional information

Notes on contributors

S Kala

S Kala received a BTech degree in electronics and communication engineering from MG University, India, in 2006 and an MS (Engg) degree from the Center for NanoScience and Engineering (CeNSE), Indian Institute of Science (IISc) Bangalore, India, in 2013. She is currently an Assistant Professor at the Indian Institute of Information Technology (IIIT) Kottayam, Kerala, India. Her research interests include reconfigurable architectures, hardware acceleration of deep learning algorithms, and digital signal processing systems. Corresponding author. Email: [email protected]

S Nalesh

S Nalesh received a BTech degree in electronics and communication engineering from the National Institute of Technology (NIT) Calicut, India, in 2003, an MTech degree in integrated electronics and circuits from the Indian Institute of Technology (IIT) Delhi, India, in 2010, and a PhD degree from CADLab, Indian Institute of Science (IISc) Bangalore, India, in 2018. He is an Assistant Professor in the Department of Electronics, Cochin University of Science and Technology, India. His research interests include hardware accelerators for deep learning, reconfigurable many-core processor architectures and application synthesis, VLSI implementation of DSP algorithms, and hardware security. Email: [email protected]
