Research Article

Evaluating reliability of tree-patterns in extreme-K categorical samples problems

Pages 3828-3849 | Received 09 Feb 2021, Accepted 30 Jun 2021, Published online: 21 Jul 2021
 

Abstract

We adopt Exploratory Data Analysis (EDA) approaches to address the difficult extreme-K categorical sample problem. Because the observed data are categorical, all comparisons among populations are performed by comparing their distributions, each represented as a histogram with symbolic bins. We design a distance measure that evaluates the discrepancy between two symbol-based histograms so that Hierarchical Clustering (HC) algorithms can be applied. The resulting binary HC-tree then serves as the basis for our EDA task of discovering tree-patterns of interest. Since the location of each population-leaf within the geometry of a binary HC-tree is expressed as a binary code sequence, a shared binary code segment characterizes the tree-pattern common to all of its members. We then generate a large ensemble of mimicries of the observed dataset from multinomial distributions and construct the corresponding ensemble of binary HC-trees. For each tree-pattern identified from the observed dataset, we evaluate its reliability and uncertainty through two histograms.
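
As a rough illustration of the pipeline summarized above, the following Python sketch builds a binary HC-tree from symbol-count histograms, reads off each leaf's binary code, and checks how often a chosen group of leaves reappears as a branch across multinomial mimicries. The abstract does not specify the distance measure, the linkage rule, or the form of the two histograms, so the total variation distance, the average linkage, the single reliability proportion, and all function names here are assumptions made only for illustration.

```python
import random
from collections import Counter
from itertools import combinations

def tv_distance(h1, h2):
    """Total variation distance between two symbol-count histograms (assumed measure)."""
    n1, n2 = sum(h1.values()), sum(h2.values())
    symbols = set(h1) | set(h2)
    return 0.5 * sum(abs(h1.get(s, 0) / n1 - h2.get(s, 0) / n2) for s in symbols)

def hc_tree(hists):
    """Average-linkage agglomerative clustering (assumed linkage).
    Returns a nested 2-tuple tree whose leaves are population names."""
    clusters = [(name, [name]) for name in sorted(hists)]  # (subtree, members)

    def linkage(a, b):
        total = sum(tv_distance(hists[x], hists[y]) for x in a[1] for y in b[1])
        return total / (len(a[1]) * len(b[1]))

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]

def leaf_codes(tree, prefix=""):
    """Map each leaf to its binary code: '0' for a left turn, '1' for a right turn."""
    if isinstance(tree, str):
        return {tree: prefix}
    left, right = tree
    return {**leaf_codes(left, prefix + "0"), **leaf_codes(right, prefix + "1")}

def common_prefix(strings):
    """Longest code segment shared by all the given binary codes."""
    prefix = strings[0]
    for s in strings[1:]:
        while not s.startswith(prefix):
            prefix = prefix[:-1]
    return prefix

def is_clade(tree, group):
    """True if the leaves sharing the group's common code segment are exactly the group."""
    codes = leaf_codes(tree)
    prefix = common_prefix([codes[g] for g in group])
    return {name for name, code in codes.items() if code.startswith(prefix)} == set(group)

def mimic(hist, rng):
    """One multinomial mimicry: resample the same number of observations
    using the observed symbol proportions."""
    symbols = list(hist)
    weights = [hist[s] for s in symbols]
    return Counter(rng.choices(symbols, weights=weights, k=sum(weights)))

def pattern_reliability(hists, group, n_mimics=200, seed=0):
    """Fraction of mimicry trees in which `group` again forms a branch
    (a single-number simplification of the paper's histogram summaries)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_mimics):
        fake = {name: mimic(h, rng) for name, h in hists.items()}
        hits += is_clade(hc_tree(fake), group)
    return hits / n_mimics

# Hypothetical toy data: four populations over two symbols.
observed = {
    "A": Counter(x=90, y=10), "B": Counter(x=85, y=15),
    "C": Counter(x=10, y=90), "D": Counter(x=15, y=85),
}
tree = hc_tree(observed)
print(leaf_codes(tree))
print(pattern_reliability(observed, {"A", "B"}))
```

In this toy setting, populations whose codes share a leading segment sit on the same branch, and the returned proportion indicates how stable that branch is under resampling. A fuller treatment would record richer per-pattern statistics across the ensemble rather than a single proportion.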

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the National Science Foundation (NSF) of the USA [650042].
