Abstract
We develop a new method to locally cluster curves and discover functional motifs, that is, typical shapes that may recur several times along and across the curves, capturing important local characteristics. To identify these shared curve portions, our method leverages ideas from functional data analysis (joint clustering and alignment of curves), bioinformatics (local alignment through the extension of high-similarity seeds), and fuzzy clustering (curves belonging to more than one cluster if they contain more than one typical shape). It can employ various dissimilarity measures and incorporate derivatives in the discovery process, thus exploiting complex facets of shapes. We demonstrate the performance of our method with an extensive simulation study and show how it generalizes other clustering methods for functional data. Finally, we provide real-data applications to Italian COVID-19 death curves and omics data related to mutagenesis. Supplementary materials for this article are available online.
Supplementary Materials
Supplementary material includes proofs, additional methods and results. An R implementation (with examples) is available at https://github.com/marziacremona/ProbKMA-FMD.
Acknowledgments
We thank Matthew Reimherr and Piercesare Secchi for discussions about functional data methodology; Kateryna D. Makova and Di (Bruce) Chen for help with the mutagenesis application; Valeria Vitelli and Davide Floriello for their sparse functional clustering code.
Disclosure Statement
The authors report there are no competing interests to declare.
Correction Statement
This article was originally published with errors, which have now been corrected in the online version. Please see the Correction (http://dx.doi.org/10.1080/10618600.2024.2356159).