ABSTRACT
Simulation methods were developed to find p-values for univariate mixture distributions. Locating the maximum elbow of the unexplained variance in a kernel density estimate was found to be insufficient evidence that data are multimodal. Theoretical and simulated results show that random distributions exhibit similar elbowing behaviour. Only elbowing, or variance convergence, above the level seen in random distributions indicates that a mode has been found. However, kernel h-distance and variance by step, taken together, were found to be better predictors of confidence intervals for modal data. Confidence intervals at p = 0.1 and p = 0.01 are modelled in equations and can be applied to data with any standard deviation and sample sizes of 12 or more. The methods were tested on generated data with two to five modes.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Data Availability
Data and code are available at: http://hobackas.faculty.udmercy.edu/research-reports/2021-07/cluster-analysis-data-and-code.zip