New two-way discrete frequency table with application to English Premier League data

ABSTRACT

A bivariate discrete frequency table, one of the significant exploratory data analysis (EDA) tools, organizes data systematically. The existing frequency table is straightforward, but when the number of elements in the data is large, the table becomes long and hard to read. In this research, we proposed a new bivariate discrete frequency table built by grouping the elements in each variable. The table can be constructed using the R code provided with the article. We illustrated the table using simulations from the bivariate binomial and bivariate Poisson distributions. Real data, obtained from the English Premier League website, are also used to illustrate the new table. The findings indicated that the proposed bivariate frequency table provides a better alternative when the number of elements is substantial and reveals the essential data features.

1. Introduction

Data are more engaging and easier to grasp when presented in tabular or graphical form. Tabular representations are precise and show the reader the apparent features of the data, whereas graphical representations carry more visual weight because they help detect patterns in a dataset (Beniger & Robyn, 1978; Davies, 1929; Gelman, 2011; Gelman et al., 2002; Kastellec & Leoni, 2007; Xu & Wang, 2020). Features hidden in raw data can be uncovered only if the data are organized in a meaningful form, such as a frequency table. A frequency table partitions raw data into classes of appropriate sizes and displays the observations with their respective numbers of occurrences (Kenney, 1939; Manikandan, 2011; Mohammed, Adam, Ali et al., 2020). Generally, the main reason for summarizing raw data is to explore the information they contain. A summary also makes it easier to understand the underlying distribution and the features of the variables, and to choose the statistical tool to use for inference.

Data obtained from measurements such as length, height, weight, or temperature take values within an interval or range. Such measured observations are called continuous data. If $x_{1}, x_{2}, \dots, x_{n}$ are continuous observations, then $\bar{x} \in C_{D}$, $\text{Median} \in C_{D}$, and the $\text{mode} \notin C_{D}$. Continuous data are generally measured values, such as the amount of rainfall, length, or area, whereas discrete data are whole numbers (Gardiner et al., 1979). A set of discrete data is often obtained by counting or enumeration, while continuous data are usually obtained through measurement (Fisher & Marshall, 2009; Kenney, 1939). Discrete data are countable, finite observations, and the table that summarizes them is a discrete frequency table. In such a table the elements are the natural classes; there are no class limits or class boundaries (Gravetter et al., 2020; Kenney, 1939). Discrete frequency tables are classified into two types according to the number of variables. A table that organizes data on a single discrete variable is known as a univariate discrete frequency table, whereas a bivariate discrete frequency table displays data on two joint discrete variables.
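As a small illustration of a univariate discrete frequency table, the following Python sketch lists each distinct element with its number of occurrences. It is not the R code supplied with the article; the function name `univariate_frequency_table` and the sample values are hypothetical and chosen only for illustration.

```python
from collections import Counter


def univariate_frequency_table(values):
    """Return (element, frequency) pairs sorted by element.

    The distinct elements act as the natural classes, so no class
    limits or class boundaries are needed.
    """
    counts = Counter(values)
    return sorted(counts.items())


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    sample = [0, 2, 1, 1, 3, 0, 1, 2, 1, 0]
    for element, frequency in univariate_frequency_table(sample):
        print(f"{element}\t{frequency}")
```

The frequencies always sum to the number of observations, which is a quick check that no observation was dropped.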

The existing bivariate discrete frequency table is straightforward and very useful. However, when the number of elements in the joint discrete data is large, it leads to a very long table that can be difficult to handle. In this research, we proposed a new bivariate discrete frequency table for datasets with a large number of elements. The table is constructed by grouping the elements in the joint discrete data.
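The grouping idea can be sketched in Python as follows. This is only one plausible reading of "grouping the elements", not the authors' R implementation: it assumes that the sorted distinct elements of each variable are split into consecutive classes of `g` elements, with a smaller final class when the number of elements is not a multiple of `g`. The names `group_elements` and `grouped_bivariate_table` are hypothetical, and the article's own treatment of unequal class sizes may differ.

```python
from collections import Counter


def group_elements(elements, g):
    """Split the sorted distinct elements into consecutive classes of size g.

    Assumption: the last class is simply smaller when the number of
    distinct elements is not a multiple of g.
    """
    distinct = sorted(set(elements))
    return [tuple(distinct[k:k + g]) for k in range(0, len(distinct), g)]


def grouped_bivariate_table(pairs, g):
    """Return (x_classes, y_classes, table) for (x, y) pairs.

    table[r][c] counts the pairs whose y falls in y_classes[r] and
    whose x falls in x_classes[c] (rows for Y, columns for X).
    """
    pairs = list(pairs)
    x_classes = group_elements((x for x, _ in pairs), g)
    y_classes = group_elements((y for _, y in pairs), g)
    # Map every element to the index of the class that contains it.
    x_index = {x: c for c, cls in enumerate(x_classes) for x in cls}
    y_index = {y: r for r, cls in enumerate(y_classes) for y in cls}
    counts = Counter((y_index[y], x_index[x]) for x, y in pairs)
    table = [
        [counts[(r, c)] for c in range(len(x_classes))]
        for r in range(len(y_classes))
    ]
    return x_classes, y_classes, table
```

With `g = 2` each class holds two elements, so the table has roughly half as many rows and columns as the ungrouped one; with `g = 1` the sketch reduces to the ordinary bivariate table.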

2. Bivariate discrete frequency table

Let $(x_{1}, y_{1}), (x_{2}, y_{2}), \dots, (x_{n}, y_{n})$ be $n$ pairs of discrete observations of variables $X$ and $Y$. The existing frequency table is given as Table 1. The notations $m_{1}$ and $m_{2}$ denote, respectively, the numbers of elements of $X$ and $Y$ in the joint discrete data; $x_{i}$, $i = 1, 2, \dots, m_{1}$, denote the elements of variable $X$, displayed in the columns; $y_{j}$, $j = 1, 2, \dots, m_{2}$, denote the elements of the second variable $Y$, presented in the rows; and $f_{i j}$ is the joint frequency of $X$ and $Y$ in cell $i j$.
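The layout just described can be sketched in Python as below. This is a minimal illustration under stated assumptions rather than the authors' R code: the function name `bivariate_frequency_table` and the sample pairs are hypothetical, and the elements are assumed to be listed in increasing order.

```python
from collections import Counter


def bivariate_frequency_table(pairs):
    """Return (x_elements, y_elements, table) for (x, y) pairs.

    table[j][i] is the joint frequency of x_elements[i] and
    y_elements[j]: X elements index the columns and Y elements
    index the rows.
    """
    pairs = list(pairs)
    counts = Counter(pairs)
    x_elements = sorted({x for x, _ in pairs})  # m1 distinct elements
    y_elements = sorted({y for _, y in pairs})  # m2 distinct elements
    # Counter returns 0 for combinations that never occur.
    table = [[counts[(x, y)] for x in x_elements] for y in y_elements]
    return x_elements, y_elements, table


if __name__ == "__main__":
    # Made-up pairs, for illustration only.
    sample = [(0, 1), (1, 1), (0, 1), (2, 0), (1, 2), (2, 0)]
    xs, ys, table = bivariate_frequency_table(sample)
    print("y\\x", *xs, sep="\t")
    for y, row in zip(ys, table):
        print(y, *row, sep="\t")
```

The table has $m_{2}$ rows and $m_{1}$ columns, so it grows with the number of distinct elements of each variable rather than with the sample size $n$.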

Table 1. Typical bivariate discrete frequency table

Number of classes

3. Proposed bivariate discrete frequency table

Table 2. General $g$-element bivariate discrete frequency table

Table 3. Bi-element bivariate discrete frequency table with both $m_{1}$ and $m_{2}$ even

Table 4. Bi-element bivariate discrete frequency table with $m_{1}$ even and $m_{2}$ odd

Table 5. Bi-element bivariate discrete frequency table with $m_{1}$ odd but $m_{2}$ even

Mode of the proposed frequency table

4. Results and discussion

Simulation

Table 6. Bivariate discrete frequency table constructed using a sample of size 1000 simulated from the bivariate binomial distribution with parameters $n = 20$, $\pi_{x} = 0.5$, $\pi_{y} = 0.5$

Table 7. Bi-element bivariate discrete frequency table constructed using a sample of size 1000 simulated from the bivariate binomial distribution with parameters $n = 20$, $\pi_{x} = 0.5$, $\pi_{y} = 0.5$

Table 8. Bivariate discrete frequency table constructed using a sample of size 100,000 simulated from the bivariate binomial distribution with parameters $n = 50$, $\pi_{x} = 0.5$, $\pi_{y} = 0.5$

Table 9. Tri-element bivariate discrete frequency table constructed using a sample of size 100,000 simulated from the bivariate binomial distribution with parameters $n = 50$, $\pi_{x} = 0.5$, $\pi_{y} = 0.5$

Table 10. Bivariate discrete frequency table constructed using a sample of size 1000 simulated from the bivariate Poisson distribution with parameters $\mu_{1} = 2.5$, $\mu_{2} = 3.5$, and $\mu_{3} = 2.5$

Table 11. Bi-element bivariate discrete frequency table constructed using a sample of size 1000 simulated from the bivariate Poisson distribution with parameters $\mu_{1} = 2.5$, $\mu_{2} = 3.5$, and $\mu_{3} = 2.5$

Table 12. Bivariate discrete frequency table constructed using a sample of size 1000 simulated from the bivariate Poisson distribution with parameters $\mu_{1} = 6.5$, $\mu_{2} = 5.5$, and $\mu_{3} = 4.5$

Table 13. Tri-element bivariate discrete frequency table constructed using a sample of size 1000 simulated from the bivariate Poisson distribution with parameters $\mu_{1} = 6.5$, $\mu_{2} = 5.5$, and $\mu_{3} = 4.5$

Table 14. Bivariate discrete frequency table constructed using a sample of size 1000 simulated from the bivariate Poisson distribution with parameters $\mu_{1} = 6.5$, $\mu_{2} = 5.5$, and $\mu_{3} = 4.5$

Table 15. Tri-element bivariate discrete frequency table, in which the first class has a different number of elements, constructed using a sample of size 1000 simulated from the bivariate Poisson distribution with parameters $\mu_{1} = 6.5$, $\mu_{2} = 5.5$, and $\mu_{3} = 4.5$

Application

Table 16. Bivariate discrete frequency table constructed using data on the number of wins and clean sheets for English Premier League clubs from season 2006/2007 to 2017/2018

Table 17. Tri-element bivariate discrete frequency table constructed using data on the number of wins and clean sheets for English Premier League clubs from season 2006/2007 to 2017/2018

5. Conclusion

Public interest statement

Acknowledgements

Disclosure statement

Additional information

Funding

Notes on contributors

M. B. Mohammed

H. S. Zulkafli

N. Ali

O. R. Olaniran

References

6. Appendix
