Abstract
Data characteristics are often summarized and represented by a set of variables, and identifying the relationships between these variables is crucial for prediction, hypothesis testing, and decision making. The relationship between two variables is commonly quantified with a correlation coefficient. Once the correlation between a response variable and an independent variable is quantified, it can be used to predict the response for an observed value of that variable. That is, if two variables are correlated, observing one allows us to make predictions about the other, and the stronger the relationship, the more accurate the prediction. Several correlation coefficients have been introduced. Among them, Pearson's correlation coefficient is the most widely used, while Distance Correlation and the Maximal Information Coefficient have been introduced more recently to address its shortcomings. Each of these coefficients is designed to measure association under a different type of trend; for example, Pearson's correlation is suited to linear trends, while Spearman's correlation is suited to monotonic trends. In many applications, however, the underlying relationship is not known in advance, which makes choosing the appropriate correlation coefficient difficult. In this paper, we compare these coefficients through a series of simulations, and we propose a single generic coefficient, built by aggregating the individual coefficients, for general applications.
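To make the distinction between these coefficients concrete, the following sketch computes Pearson's r, Spearman's rho, and distance correlation on three simulated trends (linear, monotonic, and non-monotonic). The function names and test data are illustrative choices, not the paper's implementation; distance correlation is coded directly from its standard double-centering definition rather than taken from a library.

```python
import numpy as np

def pearson(x, y):
    # Pearson's r: normalized covariance; captures linear association only.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))

def spearman(x, y):
    # Spearman's rho: Pearson's r applied to ranks; captures any monotonic trend.
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return pearson(rx, ry)

def distance_correlation(x, y):
    # Sample distance correlation (Szekely et al.): double-center the pairwise
    # distance matrices, then normalize the distance covariance. In the
    # population it is zero if and only if the variables are independent.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    return np.sqrt(dcov2 / np.sqrt((A * A).mean() * (B * B).mean()))

x = np.linspace(-1.0, 1.0, 201)
y_lin = 2.0 * x + 1.0   # linear trend: Pearson is appropriate
y_mono = np.exp(x)      # monotonic nonlinear trend: Spearman is appropriate
y_quad = x ** 2         # non-monotonic dependence: Pearson misses it

print(round(pearson(x, y_lin), 3))               # ~1.0
print(round(spearman(x, y_mono), 3))             # ~1.0 (perfect monotonic)
print(round(pearson(x, y_quad), 3))              # ~0.0 despite clear dependence
print(round(distance_correlation(x, y_quad), 3)) # clearly positive
```

The last two lines illustrate the motivation for the newer coefficients: on a symmetric quadratic relationship, Pearson's r is essentially zero even though y is a deterministic function of x, whereas distance correlation detects the dependence.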