Research Article

Clustering Individuals on Limited Features of a Vector Autoregressive Model

Pages 768-786 | Published online: 20 May 2020
 

Abstract

Dynamical interplays in emotions have been investigated using vector autoregressive (VAR) models, whose estimates can be used to cluster participants into unknown groups. The present study evaluated a clustering algorithm, the alternating least squares (ALS) algorithm, for accuracy in predicting individual group membership. We systematically manipulated (a) the number of variables in a model, (b) the size of group differences in regression coefficients, and (c) the number of regression coefficients that vary across the groups (i.e., effective features). The ALS algorithm works reliably when there are at least 5 effective features with very large group differences in a 5-variable model, and at least 9 effective features with very large group differences in a 10-variable model. These findings suggest that the ALS algorithm is sensitive to group differences that are present in only a few coefficients of a VAR model, but that the group differences have to be large. We also found that the ALS algorithm outperforms another clustering method, Gaussian mixture modeling. The ALS algorithm was further evaluated with unbalanced sample sizes between groups and with a greater number of groups in the data (Study 2). A real data application was provided to illustrate how to interpret the detected group differences (Study 3).
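To make the clustering idea concrete, the following is a minimal sketch of an alternating least squares loop for VAR-based clustering, not the implementation evaluated in the study. It assumes a lag-1 VAR without intercepts (person-centered data), one shared coefficient matrix per group, and random starting partitions in place of the hierarchical-clustering initialization mentioned in Note 2. The function names `fit_var1`, `sse`, and `als_cluster` are hypothetical.

```python
import numpy as np

def fit_var1(series_list):
    """Pooled least-squares VAR(1) fit for a list of (T, p) arrays.

    Returns B of shape (p, p) such that x[t] is approximated by x[t-1] @ B.
    Assumes each series is already centered (no intercept is estimated).
    """
    X = np.vstack([s[:-1] for s in series_list])
    Y = np.vstack([s[1:] for s in series_list])
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B

def sse(series, B):
    """Sum of squared one-step-ahead prediction errors of one person under B."""
    resid = series[1:] - series[:-1] @ B
    return float(np.sum(resid ** 2))

def als_cluster(data, k, n_starts=5, n_iter=50, seed=0):
    """Alternate between (1) fitting one VAR(1) per group and
    (2) reassigning each person to the best-fitting group.

    Random starts are an assumption of this sketch; the run with the
    smallest total squared error is returned.
    """
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        labels = rng.integers(0, k, size=len(data))
        for _ in range(n_iter):
            coefs = []
            for g in range(k):
                members = [s for s, lab in zip(data, labels) if lab == g]
                if not members:  # empty group: re-seed with one random person
                    members = [data[rng.integers(len(data))]]
                coefs.append(fit_var1(members))
            new_labels = np.array(
                [int(np.argmin([sse(s, B) for B in coefs])) for s in data]
            )
            if np.array_equal(new_labels, labels):
                break  # partition is stable
            labels = new_labels
        total = sum(sse(s, coefs[lab]) for s, lab in zip(data, labels))
        if best is None or total < best[0]:
            best = (total, labels, coefs)
    return best[1], best[2]
```

Calling `als_cluster(data, k=2)` on a list of per-person arrays returns a label per person and one coefficient matrix per group. Because each group shares a single coefficient matrix, the sketch yields no within-group variance estimates for the coefficients.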

Article information

Conflict of interest disclosures: Each author signed a form for disclosure of potential conflicts of interest. No authors reported any financial or other conflicts of interest in relation to the work described.

Ethical principles: The authors affirm having followed professional ethical guidelines in preparing this work. These guidelines include obtaining informed consent from human participants, maintaining ethical treatment and respect for the rights of human or animal participants, and ensuring the privacy of participants and their data, such as ensuring that individual participants cannot be identified in reported results or from publicly available original or archival data.

Funding: This work was not supported.

Role of the funders/sponsors: None of the funders or sponsors of this research had any role in the design and conduct of the study; collection, management, analysis, and interpretation of data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

Acknowledgements: We thank Yudai Iijima for sharing the ESM data. The ideas and opinions expressed herein are those of the authors alone, and endorsement by the authors’ institution is not intended and should not be inferred.

Notes

1 A VAR model can be mapped onto a network diagram with nodes representing variables and with edges representing connectivity (cross-regression coefficients) between the variables. In ESM data, a “network” can be defined for each individual, which provides indices that characterize individual psychological networks (e.g., centrality) in analogy to a social network. There is, however, an ongoing debate on how to conceptualize and configure an individual psychological network, for which the direct application of social-network measures may not be appropriate because of the conceptual differences (Bringmann et al., 2019). Although such a psychological network is beyond our focus, network analysis (on VAR estimates) can be an interesting direction for modeling the interrelations of symptoms and/or emotions.

2 Performance of hierarchical clustering on individual VAR estimates was already examined by Bulteel et al. (2016). The outputs of the hierarchical clustering were used as initial values for the ALS optimization, and these initial values are known to be less accurate than the final predictions of the ALS algorithm. Therefore, we did not include this approach in the current evaluation.

3 RMSE was inversely related to the accuracy in predicting participants’ group membership. This is because (a) we evaluated RMSE only when the correct number of groups was indicated, and (b) even if the indicated number of groups was correct, the ALS algorithm could make a wrong prediction of participants’ group membership, particularly in the conditions with fewer effective features (EFs) and smaller effect sizes. In this case, VAR models were specified on incorrect groupings, which resulted in increased RMSE.

4 Note that the ALS algorithm does not provide variance estimates for regression coefficients (because it assumes no individual differences within a group).
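The mapping in Note 1 from VAR estimates to a network can be sketched as follows. This is an illustration only, under the assumption that `B[i, j]` holds the lagged effect of variable i on variable j and that edge weights are absolute cross-regression coefficients; the function name `strength_centrality` is hypothetical, and, as Note 1 cautions, such social-network measures may not carry over directly to psychological networks.

```python
import numpy as np

def strength_centrality(B, names):
    """Out- and in-strength of each variable in a VAR(1) coefficient matrix.

    Assumes B[i, j] is the effect of variable i at time t-1 on variable j
    at time t. Autoregressive (diagonal) terms are excluded because they
    are not edges between different nodes.
    """
    W = np.abs(np.asarray(B, dtype=float))  # np.abs returns a new array
    np.fill_diagonal(W, 0.0)
    out_strength = W.sum(axis=1)  # total outgoing edge weight per node
    in_strength = W.sum(axis=0)   # total incoming edge weight per node
    return {
        name: {"out": float(o), "in": float(i)}
        for name, o, i in zip(names, out_strength, in_strength)
    }
```

Applied to each group's coefficient matrix, this gives one simple way to compare which variables send or receive the strongest lagged effects across groups.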
