Research Article

MHAMD-MST-CNN: multiscale head attention guided multiscale density maps fusion for video crowd counting via multi-attention spatial-temporal CNN

Pages 1777-1790 | Received 30 Apr 2021, Accepted 04 Mar 2023, Published online: 23 Mar 2023

ABSTRACT

Video-based crowd counting and density estimation (CCDE) is vital for crowd monitoring. Existing solutions fail to address issues such as cluttered backgrounds and scale variation in crowd videos. To this end, a multiscale head attention-guided multiscale density maps fusion for video-based CCDE via a multi-attention spatial-temporal CNN (MHAMD-MST-CNN) is proposed. The MHAMD-MST-CNN has three modules: a multi-attention spatial stream (MASS), a multi-attention temporal stream (MATS), and a final density map generation (FDMG) module. The spatial head attention modules (SHAMs) and temporal head attention modules (THAMs) eliminate background influence from the MASS and the MATS, respectively, by mapping multiscale spatial or temporal features to head maps. The multiscale de-backgrounded features are used by the density map generation (DMG) modules to produce multiscale density maps that cope with the scale variation caused by perspective distortion. These multiscale density maps are fused and fed into the FDMG module to obtain the final crowd density map. The MHAMD-MST-CNN has been trained and validated on three publicly available benchmark datasets: Venice, Mall, and UCSD. It achieves competitive results compared with the state of the art in terms of mean absolute error (MAE) and root mean squared error (RMSE).
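The core ideas in the abstract (attention maps gating out background, per-scale density maps, fusion into a final map whose sum gives the count) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the 1x1 projection, the channel-sum "DMG" stand-in, and the averaging fusion are all simplifying assumptions made purely for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def head_attention_mask(features, w):
    # Map features of shape (C, H, W) to a single-channel head map
    # via a 1x1 projection w of shape (C,), squashed to [0, 1].
    logits = np.tensordot(w, features, axes=([0], [0]))  # -> (H, W)
    return sigmoid(logits)

def de_background(features, mask):
    # Suppress background responses by gating every feature channel
    # with the head-attention map (broadcast over channels).
    return features * mask[None, :, :]

def fuse_density_maps(maps):
    # Average the per-scale density maps (all same H, W here); the
    # paper instead feeds the fused maps into an FDMG module.
    return np.mean(np.stack(maps, axis=0), axis=0)

# Toy example: two "scales" of features for an 8x8 frame.
rng = np.random.default_rng(0)
scales = [rng.standard_normal((4, 8, 8)) for _ in range(2)]
weights = [rng.standard_normal(4) for _ in range(2)]

density_maps = []
for feats, w in zip(scales, weights):
    mask = head_attention_mask(feats, w)
    gated = de_background(feats, mask)
    # Stand-in for a DMG module: collapse channels, clamp to >= 0
    # so the map can be read as a non-negative density.
    density_maps.append(np.maximum(gated.sum(axis=0), 0.0))

final_map = fuse_density_maps(density_maps)
count = final_map.sum()  # estimated crowd count for the frame
```

In a trained network the projection weights and the DMG/FDMG modules would be learned convolutions, and the multiscale maps would come from feature pyramids at genuinely different resolutions; this sketch only shows the mask-gate-fuse data flow.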

Acknowledgements

The support and the resources provided by ‘PARAM Shivay Facility’ under the National Supercomputing Mission, Government of India at the Indian Institute of Technology, Varanasi, are gratefully acknowledged.

Disclosure statement

No potential conflict of interest was reported by the authors.
