Research Article

Decomposition of measure from symmetry for analyzing collapsed ordinal square contingency tables

Pages 5814-5827 | Received 13 Jun 2022, Accepted 29 Jun 2023, Published online: 24 Jul 2023

Abstract

In some situations, square contingency tables with ordered categories are analyzed by considering collapsed tables where adjacent categories are combined. This study proposes measures to represent the degree of departure from symmetry using collapsed tables. The proposed measures are defined as the arithmetic mean of submeasures of each collapsed 3 × 3 table. Additionally, a theorem affirms that the value of the measure for symmetry is equal to the sum of the value of the proposed measures. Finally, examples are given.

1 Introduction

Data is often organized using a square contingency table with the same classifications for categorical data analysis. Examples include matched pairs data comparisons, inter-rater reliability, changes before and after treatment, and social mobility. Independence between the row and column classifications does not hold in these square contingency tables because many observations fall in or near the main diagonal cells. Then analysis focuses on the off-diagonal cells (i.e., symmetry or asymmetry of the row and column classifications) instead of independence.

Various models have been proposed to analyze the symmetry of a square contingency table (e.g., Bowker Citation1948; Read Citation1977; and McCullagh Citation1978). Moreover, a measure to represent the degree of departure from the model is crucial when the model of symmetry does not fit the data well (e.g., Tomizawa Citation1994, Citation1995; and Tomizawa and Saitoh Citation1999).

Furthermore, to simplify interpretation in data analysis, the categories of ordinal square contingency tables with large dimensions are often combined into three categories (e.g., “high,” “middle,” and “low”). Various methods can be used to reclassify the original categories into three groups. Consequently, when there are numerous ways to combine ordinal categories and it is difficult to choose unique cutpoints, a comprehensive evaluation of the different collapsed patterns is reasonable. For example, contingency tables may be collapsed into a 3 × 3 table in clinical research when analyzing data for clinical scales or laboratory data (see Shinoda et al. Citation2020; Aizawa, Yamamoto, and Tomizawa Citation2021). Additionally, social research often treats scales with ordered categories. As an example, we use the General Social Survey conducted by the National Opinion Research Center at the University of Chicago (Davern et al. Citation2021). Table 1 compares the frequency with which low-income respondents spent a social evening with neighbors or friends in 2012 and 2016. Table 2 shows the frequency with which low-income respondents suffered everyday discrimination in 2021. When the dimension is large or the sample size is small, previous measures cannot be estimated because the sum of symmetric cells is zero. Often, analysis not only compares the degree of departure from the models but also examines the relationships among models using the decomposition of a measure from symmetry. As an example, Yamamoto, Shimada, and Tomizawa (Citation2015) proposed a measure ΨS to represent the degree of departure from symmetry (S) using collapsed 3 × 3 tables. Note that the measure ΨS is described in detail in the next section.

Table 1 Frequency of spending evenings with friends and neighbors for low-income respondents in 2012 and 2016 from Davern et al. (Citation2021).

Table 2 Frequency of everyday discrimination of low-income respondents in 2021 from Davern et al. (Citation2021).

This article proposes Kullback-Leibler (KL)-type measures to represent the degree of departure from the collapsed global symmetry (CoGS) or conditional symmetry (CS), which are denoted by ΨCoGS or ΨCS, and demonstrates that the value of ΨS is equal to the sum of the values of ΨCoGS and ΨCS.

2 Materials (reviews)

Consider an R × R square contingency table with the same row and column ordinal classifications. Let X and Y denote the row and column variables, respectively. Additionally, let $\Pr(X=i, Y=j) = p_{ij}$ for $i=1,\ldots,R$; $j=1,\ldots,R$. The S model (Bowker Citation1948) is defined as

$$p_{ij} = p_{ji} \quad (i=1,\ldots,R;\ j=1,\ldots,R;\ i \neq j).$$

Also, see Bishop, Fienberg, and Holland (Citation1975, p.282). The global symmetry (GS) model proposed by Read (Citation1977) is given as

$$\delta_1 = \delta_2,$$

where

$$\delta_1 = \sum_{i=1}^{R-1}\sum_{j=i+1}^{R} p_{ij} \quad \text{and} \quad \delta_2 = \sum_{j=1}^{R-1}\sum_{i=j+1}^{R} p_{ij}.$$

The CS model proposed by McCullagh (Citation1978) is defined as

$$p_{ij} = \Gamma p_{ji} \quad (i<j).$$

Also, see Agresti (Citation1990, p.361). A special case of this model is the S model when Γ = 1.

Tomizawa (Citation1994) and Tomizawa, Seo, and Yamamoto (Citation1998) proposed measures that represent the degree of departure from S for nominal square contingency tables. Incidentally, Tomizawa (Citation1994) considered KL-type measures using the Shannon entropy and Gini concentration, while Tomizawa, Seo, and Yamamoto (Citation1998) considered a generalization of Tomizawa’s (Citation1994) measures using the power divergence (Cressie and Read Citation1984) or the diversity index (Patil and Taillie Citation1982). For ordinal square contingency tables, Tomizawa, Miyamoto, and Hatanaka (Citation2001) proposed a measure to represent the degree of departure from S. Moreover, Tomizawa (Citation1995) and Tomizawa and Saitoh (Citation1999) considered measures representing the degree of departure from GS and CS, respectively. Tomizawa and Saitoh (Citation1999) also gave the theorem that the measure from S is equal to the sum of the measure from GS and the measure from CS. See Appendixes A, B, and C for the measure ψS,ψGS, and ψCS proposed by Tomizawa (Citation1994), Tomizawa (Citation1995), and Tomizawa and Saitoh (Citation1999), respectively.

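Here, we consider the $(R-1)(R-2)/2$ (being $\binom{R-1}{2}$) ways of collapsing the R × R original table with ordered categories into a 3 × 3 table by choosing cutpoints after the sth and tth rows and after the sth and tth columns for $1 \le s < t \le R-1$. We refer to each collapsed 3 × 3 table as the $T^{st}$ table ($1 \le s < t \le R-1$). In the collapsed $T^{st}$ table, let $G_{kl}^{(s,t)}$ denote the corresponding probability for row value k (k=1,2,3) and column value l (l=1,2,3). That is,

$$\begin{aligned}
G_{11}^{(s,t)} &= \sum_{i=1}^{s}\sum_{j=1}^{s} p_{ij}, & G_{12}^{(s,t)} &= \sum_{i=1}^{s}\sum_{j=s+1}^{t} p_{ij}, & G_{13}^{(s,t)} &= \sum_{i=1}^{s}\sum_{j=t+1}^{R} p_{ij},\\
G_{21}^{(s,t)} &= \sum_{i=s+1}^{t}\sum_{j=1}^{s} p_{ij}, & G_{22}^{(s,t)} &= \sum_{i=s+1}^{t}\sum_{j=s+1}^{t} p_{ij}, & G_{23}^{(s,t)} &= \sum_{i=s+1}^{t}\sum_{j=t+1}^{R} p_{ij},\\
G_{31}^{(s,t)} &= \sum_{i=t+1}^{R}\sum_{j=1}^{s} p_{ij}, & G_{32}^{(s,t)} &= \sum_{i=t+1}^{R}\sum_{j=s+1}^{t} p_{ij}, & G_{33}^{(s,t)} &= \sum_{i=t+1}^{R}\sum_{j=t+1}^{R} p_{ij}.
\end{aligned}$$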
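The collapsing step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `collapse` and `all_collapsed_tables` and the toy 4 × 4 table are assumptions introduced here, and the table is stored as a plain list of lists.

```python
from itertools import combinations


def collapse(p, s, t):
    """Collapse an R x R table into the 3 x 3 table T^{st}.

    Cutpoints fall after the s-th and t-th rows and columns
    (1-based, as in the text), so 1 <= s < t <= R - 1 is required.
    """
    R = len(p)
    if not 1 <= s < t <= R - 1:
        raise ValueError("cutpoints must satisfy 1 <= s < t <= R - 1")
    # Half-open 0-based index ranges of the three collapsed categories.
    bounds = [(0, s), (s, t), (t, R)]
    return [[sum(p[i][j] for i in range(r0, r1) for j in range(c0, c1))
             for (c0, c1) in bounds]
            for (r0, r1) in bounds]


def all_collapsed_tables(p):
    """Return every T^{st} table, keyed by the cutpoint pair (s, t)."""
    R = len(p)
    return {(s, t): collapse(p, s, t)
            for s, t in combinations(range(1, R), 2)}


# Toy 4 x 4 table of counts (hypothetical data, for illustration only).
table = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
tables = all_collapsed_tables(table)
print(len(tables))     # (R-1)(R-2)/2 = 3 collapsed tables for R = 4
print(tables[(1, 2)])  # [[1, 2, 7], [5, 6, 15], [22, 24, 54]]
```

The same function works for counts or probabilities, because collapsing only sums cells.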

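Then the S model is expressed as

$$G_{kl}^{(s,t)} = G_{lk}^{(s,t)} \quad (k=1,2,3;\ l=1,2,3;\ k \neq l),$$

for all s and t ($1 \le s < t \le R-1$). See Yamamoto, Tahata, and Tomizawa (Citation2012). Similarly, the CS model can be expressed as

$$G_{kl}^{(s,t)} = \Gamma G_{lk}^{(s,t)} \quad (k<l),$$

for all s and t ($1 \le s < t \le R-1$). Here, consider a model defined by

$$\Delta_1^{(s,t)} = \Delta_2^{(s,t)},$$

where

$$\Delta_1^{(s,t)} = \sum_{k=1}^{2}\sum_{l=k+1}^{3} G_{kl}^{(s,t)} \quad \text{and} \quad \Delta_2^{(s,t)} = \sum_{l=1}^{2}\sum_{k=l+1}^{3} G_{kl}^{(s,t)},$$

for all s and t ($1 \le s < t \le R-1$). This model is not equivalent to the GS model. This model is referred to as the CoGS model (Yamamoto, Tahata, and Tomizawa Citation2012).

Assuming that $\{G_{kl}^{(s,t)} + G_{lk}^{(s,t)} > 0\}$ for $1 \le s < t \le R-1$ and $1 \le k < l \le 3$, the measure to represent the degree of departure from S proposed by Yamamoto, Shimada, and Tomizawa (Citation2015) can be expressed as

$$\Psi_S = \frac{1}{\binom{R-1}{2}} \sum_{s=1}^{R-2}\sum_{t=s+1}^{R-1} \Psi_S^{(s,t)},$$

where

$$\Psi_S^{(s,t)} = \frac{1}{\log 2} \sum_{k=1}^{2}\sum_{l=k+1}^{3} \left[ G_{kl}^{*(s,t)} \log\left( \frac{G_{kl}^{*(s,t)}}{(G_{kl}^{*(s,t)} + G_{lk}^{*(s,t)})/2} \right) + G_{lk}^{*(s,t)} \log\left( \frac{G_{lk}^{*(s,t)}}{(G_{kl}^{*(s,t)} + G_{lk}^{*(s,t)})/2} \right) \right],$$

$$\Delta^{(s,t)} = \sum_{k=1}^{2}\sum_{l=k+1}^{3} \left( G_{kl}^{(s,t)} + G_{lk}^{(s,t)} \right), \qquad G_{kl}^{*(s,t)} = \frac{G_{kl}^{(s,t)}}{\Delta^{(s,t)}}.$$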

It should be noted that the measure ΨS using collapsed 3 × 3 tables completely differs from the measure proposed by Tomizawa, Miyamoto, and Hatanaka (Citation2001).
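As a rough illustration of how ΨS could be computed from a table, the following Python sketch averages the submeasure over all cutpoint pairs. The helper names (`collapse`, `kl_term`, `psi_s_sub`, `psi_s`) and the two toy tables are assumptions made for this example; the convention 0·log 0 = 0 is also assumed.

```python
from itertools import combinations
from math import log

PAIRS = [(0, 1), (0, 2), (1, 2)]  # (k, l) with k < l, 0-based


def collapse(p, s, t):
    """3 x 3 table T^{st} with cutpoints after rows/columns s and t (1-based)."""
    bounds = [(0, s), (s, t), (t, len(p))]
    return [[sum(p[i][j] for i in range(a, b) for j in range(c, d))
             for (c, d) in bounds] for (a, b) in bounds]


def kl_term(x, y):
    """x * log(x / y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * log(x / y)


def psi_s_sub(G):
    """Submeasure Psi_S^{(s,t)} for one collapsed 3 x 3 table G."""
    if any(G[k][l] + G[l][k] == 0 for k, l in PAIRS):
        raise ValueError("G_kl + G_lk must be positive for every k < l")
    delta = sum(G[k][l] + G[l][k] for k, l in PAIRS)
    total = 0.0
    for k, l in PAIRS:
        a, b = G[k][l] / delta, G[l][k] / delta  # starred quantities
        mid = (a + b) / 2
        total += kl_term(a, mid) + kl_term(b, mid)
    return total / log(2)


def psi_s(p):
    """Psi_S: arithmetic mean of the submeasures over all (s, t)."""
    cuts = list(combinations(range(1, len(p)), 2))
    return sum(psi_s_sub(collapse(p, s, t)) for s, t in cuts) / len(cuts)


# Hypothetical tables: one symmetric, one with all mass above the diagonal.
symmetric = [[1, 2, 3, 4], [2, 1, 2, 3], [3, 2, 1, 2], [4, 3, 2, 1]]
upper_only = [[1, 1, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]]
print(psi_s(symmetric))   # no departure from S
print(psi_s(upper_only))  # maximum departure from S
```

Because only ratios of cell sums enter the formula, the function gives the same value for counts and for the corresponding proportions.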

3 Measures for collapsed 3 × 3 tables

3.1 Measure of departure from CoGS

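Assuming that $\{\Delta^{(s,t)} > 0\}$ for $1 \le s < t \le R-1$, consider a measure representing the degree of departure from CoGS, which is defined as

$$\Psi_{CoGS} = \frac{1}{\binom{R-1}{2}} \sum_{s=1}^{R-2}\sum_{t=s+1}^{R-1} \Psi_{CoGS}^{(s,t)},$$

where

$$\Psi_{CoGS}^{(s,t)} = \frac{1}{\log 2} \sum_{k=1}^{2} \Delta_k^{*(s,t)} \log\left( \frac{\Delta_k^{*(s,t)}}{1/2} \right), \qquad \Delta_1^{*(s,t)} = \frac{\Delta_1^{(s,t)}}{\Delta^{(s,t)}}, \qquad \Delta_2^{*(s,t)} = \frac{\Delta_2^{(s,t)}}{\Delta^{(s,t)}}.$$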

The measure ΨCoGS is between 0 and 1. ΨCoGS has the following characteristics:

  1. ΨCoGS=0 if and only if CoGS holds;

  2. $\Psi_{CoGS}=1$ if and only if $\Delta_1^{(s,t)}=0$ (then $\Delta_2^{(s,t)}>0$) or $\Delta_2^{(s,t)}=0$ (then $\Delta_1^{(s,t)}>0$) for $1 \le s < t \le R-1$.

Note that the definition of the maximum departure from CoGS can also be expressed as pij = 0 (then pji > 0) or pji = 0 (then pij > 0) for all i < j. Although the CoGS model is not equivalent to the GS model, the maximum departure has the same definition in both models.
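The definition of ΨCoGS and its two boundary cases can be checked numerically with a short Python sketch. The function names and the three toy tables below are hypothetical and chosen only to illustrate the properties listed above; the convention 0·log 0 = 0 is assumed.

```python
from itertools import combinations
from math import log


def collapse(p, s, t):
    """3 x 3 table T^{st} with cutpoints after rows/columns s and t (1-based)."""
    bounds = [(0, s), (s, t), (t, len(p))]
    return [[sum(p[i][j] for i in range(a, b) for j in range(c, d))
             for (c, d) in bounds] for (a, b) in bounds]


def kl_term(x, y):
    """x * log(x / y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * log(x / y)


def psi_cogs_sub(G):
    """Submeasure Psi_CoGS^{(s,t)} for one collapsed 3 x 3 table G."""
    d1 = G[0][1] + G[0][2] + G[1][2]  # Delta_1: mass above the diagonal
    d2 = G[1][0] + G[2][0] + G[2][1]  # Delta_2: mass below the diagonal
    delta = d1 + d2
    if delta == 0:
        raise ValueError("Delta must be positive")
    return sum(kl_term(d / delta, 0.5) for d in (d1, d2)) / log(2)


def psi_cogs(p):
    """Psi_CoGS: arithmetic mean of the submeasures over all (s, t)."""
    cuts = list(combinations(range(1, len(p)), 2))
    return sum(psi_cogs_sub(collapse(p, s, t)) for s, t in cuts) / len(cuts)


# Hypothetical tables.
cogs_but_not_s = [[5, 2, 1], [1, 5, 1], [2, 1, 5]]  # Delta_1 = Delta_2 = 4
upper_only = [[1, 1, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]]
skewed = [[5, 3, 3], [1, 5, 3], [1, 1, 5]]          # Delta_1 = 9, Delta_2 = 3
print(psi_cogs(cogs_but_not_s))  # 0.0: CoGS holds although S does not
print(psi_cogs(upper_only))      # 1.0: maximum departure from CoGS
print(psi_cogs(skewed))          # strictly between 0 and 1
```

The first table shows that CoGS can hold without S: the upper and lower totals agree even though individual symmetric cells differ.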

3.2 Measure of departure from CS

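Assuming that $\{\Delta_1^{(s,t)} > 0\}$, $\{\Delta_2^{(s,t)} > 0\}$ and $\{G_{kl}^{(s,t)} + G_{lk}^{(s,t)} > 0\}$ for $1 \le s < t \le R-1$ and $1 \le k < l \le 3$, a measure representing the degree of departure from CS is defined as

$$\Psi_{CS} = \frac{1}{\binom{R-1}{2}} \sum_{s=1}^{R-2}\sum_{t=s+1}^{R-1} \Psi_{CS}^{(s,t)},$$

where

$$\Psi_{CS}^{(s,t)} = \frac{1}{\log 2} \sum_{k=1}^{2}\sum_{l=k+1}^{3} \left[ G_{kl}^{*(s,t)} \log\left( \frac{G_{kl}^{*(s,t)}}{\Delta_1^{*(s,t)} (G_{kl}^{*(s,t)} + G_{lk}^{*(s,t)})} \right) + G_{lk}^{*(s,t)} \log\left( \frac{G_{lk}^{*(s,t)}}{\Delta_2^{*(s,t)} (G_{kl}^{*(s,t)} + G_{lk}^{*(s,t)})} \right) \right].$$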

ΨCS is between 0 and 1. ΨCS has the following characteristics:

  1. ΨCS=0 if and only if CS holds;

  2. $\Psi_{CS}=1$ if and only if $\Delta_1^{(s,t)}=\Delta_2^{(s,t)}$ and $G_{kl}^{(s,t)}=0$ (then $G_{lk}^{(s,t)}>0$) or $G_{lk}^{(s,t)}=0$ (then $G_{kl}^{(s,t)}>0$) for $1 \le s < t \le R-1$ and $1 \le k < l \le 3$.
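The two characteristics of ΨCS listed above can be illustrated with a small Python sketch. Everything named here (`psi_cs_sub`, `psi_cs`, the toy tables, and the choice Γ = 2) is an assumption introduced for the example rather than part of the original analysis.

```python
from itertools import combinations
from math import log

PAIRS = [(0, 1), (0, 2), (1, 2)]  # (k, l) with k < l, 0-based


def collapse(p, s, t):
    """3 x 3 table T^{st} with cutpoints after rows/columns s and t (1-based)."""
    bounds = [(0, s), (s, t), (t, len(p))]
    return [[sum(p[i][j] for i in range(a, b) for j in range(c, d))
             for (c, d) in bounds] for (a, b) in bounds]


def kl_term(x, y):
    """x * log(x / y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * log(x / y)


def psi_cs_sub(G):
    """Submeasure Psi_CS^{(s,t)} for one collapsed 3 x 3 table G."""
    d1 = sum(G[k][l] for k, l in PAIRS)  # Delta_1
    d2 = sum(G[l][k] for k, l in PAIRS)  # Delta_2
    if d1 <= 0 or d2 <= 0:
        raise ValueError("Delta_1 and Delta_2 must both be positive")
    if any(G[k][l] + G[l][k] == 0 for k, l in PAIRS):
        raise ValueError("G_kl + G_lk must be positive for every k < l")
    delta = d1 + d2
    total = 0.0
    for k, l in PAIRS:
        a, b = G[k][l] / delta, G[l][k] / delta  # starred quantities
        total += kl_term(a, (d1 / delta) * (a + b))
        total += kl_term(b, (d2 / delta) * (a + b))
    return total / log(2)


def psi_cs(p):
    """Psi_CS: arithmetic mean of the submeasures over all (s, t)."""
    cuts = list(combinations(range(1, len(p)), 2))
    return sum(psi_cs_sub(collapse(p, s, t)) for s, t in cuts) / len(cuts)


# Hypothetical tables: p_ij = 2 * p_ji for all i < j (CS with Gamma = 2),
# and a 3 x 3 table meeting the conditions for maximum departure from CS.
cs_table = [[3, 2, 4, 6], [1, 3, 2, 4], [2, 1, 3, 2], [3, 2, 1, 3]]
max_table = [[1, 1, 0], [0, 1, 1], [2, 0, 1]]
print(psi_cs(cs_table))   # approximately 0: CS holds
print(psi_cs(max_table))  # approximately 1: maximum departure from CS
```

In the second table the upper and lower totals are equal (both 2), while each symmetric pair has one empty cell, which matches the conditions in item 2.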

3.3 Relationships between the measures

Assuming that $\{\Delta_1^{(s,t)} > 0\}$, $\{\Delta_2^{(s,t)} > 0\}$ and $\{G_{kl}^{(s,t)} + G_{lk}^{(s,t)} > 0\}$ for $1 \le s < t \le R-1$ and $1 \le k < l \le 3$, the following theorems are obtained.

Theorem 1.

The value of ΨS is equal to the sum of the value of ΨCoGS and the value of ΨCS.

Proof.

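Consider that the sum of the value of ΨCoGS and the value of ΨCS can be expressed as

$$\Psi_{CoGS} + \Psi_{CS} = \frac{1}{\binom{R-1}{2}} \sum_{s=1}^{R-2}\sum_{t=s+1}^{R-1} \left( \Psi_{CoGS}^{(s,t)} + \Psi_{CS}^{(s,t)} \right),$$

where

$$\begin{aligned}
\Psi_{CoGS}^{(s,t)} + \Psi_{CS}^{(s,t)}
&= \frac{1}{\log 2} \sum_{k=1}^{2}\sum_{l=k+1}^{3} \left[ G_{kl}^{*(s,t)} \left\{ \log\left( \frac{\Delta_1^{*(s,t)}}{1/2} \right) + \log\left( \frac{G_{kl}^{*(s,t)}}{\Delta_1^{*(s,t)} (G_{kl}^{*(s,t)} + G_{lk}^{*(s,t)})} \right) \right\} \right. \\
&\qquad \left. + \, G_{lk}^{*(s,t)} \left\{ \log\left( \frac{\Delta_2^{*(s,t)}}{1/2} \right) + \log\left( \frac{G_{lk}^{*(s,t)}}{\Delta_2^{*(s,t)} (G_{kl}^{*(s,t)} + G_{lk}^{*(s,t)})} \right) \right\} \right] \\
&= \frac{1}{\log 2} \sum_{k=1}^{2}\sum_{l=k+1}^{3} \left[ G_{kl}^{*(s,t)} \log\left( \frac{G_{kl}^{*(s,t)}}{(G_{kl}^{*(s,t)} + G_{lk}^{*(s,t)})/2} \right) + G_{lk}^{*(s,t)} \log\left( \frac{G_{lk}^{*(s,t)}}{(G_{kl}^{*(s,t)} + G_{lk}^{*(s,t)})/2} \right) \right].
\end{aligned}$$

Consequently, $\Psi_{CoGS}^{(s,t)} + \Psi_{CS}^{(s,t)} = \Psi_S^{(s,t)}$. Then $\Psi_{CoGS} + \Psi_{CS} = \Psi_S$. □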

From Theorem 1, ΨCS is expressed as $\Psi_{CS} = \Psi_S - \Psi_{CoGS}$. Therefore, the measure ΨCS should indicate the degree of departure from S without the influence of the degree of departure from CoGS. That is, ΨCS indicates the degree of departure from S under the condition that there is a structure of CoGS.

Under Theorem 1, Theorem 2 can be obtained considering $\Psi_{CS} \ge 0$.
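The decomposition in Theorem 1 can be verified numerically on any table that satisfies the positivity assumptions. The Python sketch below does this for a hypothetical 4 × 4 table of counts; the function names and the data are assumptions made for illustration.

```python
from itertools import combinations
from math import log

PAIRS = [(0, 1), (0, 2), (1, 2)]  # (k, l) with k < l, 0-based


def collapse(p, s, t):
    """3 x 3 table T^{st} with cutpoints after rows/columns s and t (1-based)."""
    bounds = [(0, s), (s, t), (t, len(p))]
    return [[sum(p[i][j] for i in range(a, b) for j in range(c, d))
             for (c, d) in bounds] for (a, b) in bounds]


def kl_term(x, y):
    """x * log(x / y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * log(x / y)


def submeasures(G):
    """Return (Psi_S, Psi_CoGS, Psi_CS) for one collapsed 3 x 3 table."""
    d1 = sum(G[k][l] for k, l in PAIRS)
    d2 = sum(G[l][k] for k, l in PAIRS)
    delta = d1 + d2
    s1, s2 = d1 / delta, d2 / delta  # Delta_1^*, Delta_2^*
    ps = pc = 0.0
    for k, l in PAIRS:
        a, b = G[k][l] / delta, G[l][k] / delta
        ps += kl_term(a, (a + b) / 2) + kl_term(b, (a + b) / 2)
        pc += kl_term(a, s1 * (a + b)) + kl_term(b, s2 * (a + b))
    pg = kl_term(s1, 0.5) + kl_term(s2, 0.5)
    return ps / log(2), pg / log(2), pc / log(2)


def measures(p):
    """Return (Psi_S, Psi_CoGS, Psi_CS), each averaged over all (s, t)."""
    cuts = list(combinations(range(1, len(p)), 2))
    subs = [submeasures(collapse(p, s, t)) for s, t in cuts]
    return tuple(sum(col) / len(cuts) for col in zip(*subs))


# Hypothetical counts with every off-diagonal cell positive.
counts = [[10, 6, 3, 1], [4, 12, 5, 2], [2, 3, 9, 4], [1, 1, 2, 8]]
psi_s, psi_cogs, psi_cs = measures(counts)
print(psi_s, psi_cogs + psi_cs)  # the two values agree up to rounding
```

The check also reflects Theorem 2: because the ΨCS component is non-negative, the ΨS value is never smaller than the ΨCoGS value.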

Theorem 2.

The value of ΨS is greater than or equal to the value of ΨCoGS. The equality holds if and only if there is a conditional symmetry structure in the R × R table.

From $0 \le \Psi_S \le 1$ and $0 \le \Psi_{CoGS} < 1$ (note that $\Psi_{CoGS} \neq 1$ due to the assumptions that $\{\Delta_1^{(s,t)} > 0\}$ and $\{\Delta_2^{(s,t)} > 0\}$), we see that $0 \le \Psi_{CS} \le 1$. Therefore, (i) $\Psi_S = \Psi_{CoGS}$ if and only if there is a structure of CS ($\Psi_{CS} = 0$), and (ii) $\Psi_S = 1$ and $\Psi_{CoGS} = 0$ if and only if $\Psi_{CS} = 1$. Moreover, according to the KL information, ΨCS represents the degree of departure from CS, and the degree of departure increases as the value of ΨCS increases. That is, ΨCS is the difference between the degree of departure from S and that from CoGS.

4 Approximate confidence interval for measure

Let $n_{ij}$ denote the observed frequency in the ith row and jth column of the table ($i=1,\ldots,R$; $j=1,\ldots,R$). The sample version of ΨCoGS (ΨCS), denoted by $\hat{\Psi}_{CoGS}$ ($\hat{\Psi}_{CS}$), is given by ΨCoGS (ΨCS) with $\{p_{ij}\}$ replaced by $\{\hat{p}_{ij}\}$. Here, $\hat{p}_{ij} = n_{ij}/n$ and $n = \sum\sum n_{ij}$. Assuming that $\{n_{ij}\}$ results from full multinomial sampling, we consider the approximate standard error for $\hat{\Psi}_{CoGS}$ and a large-sample confidence interval for ΨCoGS. Using the delta method, $\sqrt{n}(\hat{\Psi}_{CoGS} - \Psi_{CoGS})$ has an asymptotically (as $n \to \infty$) normal distribution with a mean of zero and a variance $\sigma^2[\Psi_{CoGS}]$. See Appendixes D and E for details of $\sigma^2[\Psi_{CoGS}]$ and $\sigma^2[\Psi_{CS}]$.

Let $\hat{\sigma}^2[\Psi_{CoGS}]$ denote $\sigma^2[\Psi_{CoGS}]$ where $\{p_{ij}\}$ is replaced by $\{\hat{p}_{ij}\}$. Then $\hat{\sigma}[\Psi_{CoGS}]/\sqrt{n}$ is an estimated approximate standard error for $\hat{\Psi}_{CoGS}$, and $\hat{\Psi}_{CoGS} \pm z_{p/2}\,\hat{\sigma}[\Psi_{CoGS}]/\sqrt{n}$ is an approximate $100(1-p)$ percent confidence interval for ΨCoGS, where $z_{p/2}$ is the $100(1-p/2)$th percentile of the standard normal distribution.
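A possible implementation of the interval above is sketched below in Python. It assumes the delta-method variance has the usual multinomial form (the p-weighted sum of squared partial derivatives minus the square of the p-weighted sum), with the partial derivatives of ΨCoGS taken as in Appendix D. All function names and the example counts are hypothetical, and the sketch requires Δ1 and Δ2 to be positive in every collapsed table.

```python
from itertools import combinations
from math import log, sqrt
from statistics import NormalDist


def category(i, s, t):
    """Collapsed category (0, 1, 2) of 0-based index i for cutpoints s < t."""
    return 0 if i < s else (1 if i < t else 2)


def deltas(p, s, t):
    """Return (Delta_1, Delta_2) of the collapsed table T^{st}."""
    R = len(p)
    d1 = sum(p[i][j] for i in range(R) for j in range(R)
             if category(i, s, t) < category(j, s, t))
    d2 = sum(p[i][j] for i in range(R) for j in range(R)
             if category(i, s, t) > category(j, s, t))
    return d1, d2


def psi_cogs(p):
    """Psi_CoGS averaged over all cutpoint pairs (0 * log 0 taken as 0)."""
    cuts = list(combinations(range(1, len(p)), 2))
    total = 0.0
    for s, t in cuts:
        d1, d2 = deltas(p, s, t)
        for d in (d1, d2):
            x = d / (d1 + d2)
            if x > 0:
                total += x * log(2 * x) / log(2)
    return total / len(cuts)


def cogs_gradient(p):
    """Partial derivatives of Psi_CoGS with respect to each p_ij."""
    R = len(p)
    cuts = list(combinations(range(1, R), 2))
    grad = [[0.0] * R for _ in range(R)]
    for s, t in cuts:
        d1, d2 = deltas(p, s, t)  # both assumed positive
        delta = d1 + d2
        upper = d2 / delta ** 2 * log(d1 / d2) / log(2)  # cells counted in Delta_1
        lower = d1 / delta ** 2 * log(d2 / d1) / log(2)  # cells counted in Delta_2
        for i in range(R):
            for j in range(R):
                ci, cj = category(i, s, t), category(j, s, t)
                if ci < cj:
                    grad[i][j] += upper / len(cuts)
                elif ci > cj:
                    grad[i][j] += lower / len(cuts)
    return grad


def cogs_confidence_interval(counts, level=0.95):
    """Return (estimate, standard error, (lower, upper)) for Psi_CoGS."""
    n = sum(map(sum, counts))
    R = len(counts)
    p = [[c / n for c in row] for row in counts]
    g = cogs_gradient(p)
    cells = [(i, j) for i in range(R) for j in range(R)]
    m1 = sum(p[i][j] * g[i][j] for i, j in cells)
    m2 = sum(p[i][j] * g[i][j] ** 2 for i, j in cells)
    se = sqrt(max(m2 - m1 ** 2, 0.0) / n)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    est = psi_cogs(p)
    return est, se, (est - z * se, est + z * se)


# Hypothetical counts, for illustration only.
counts = [[10, 6, 3, 1], [4, 12, 5, 2], [2, 3, 9, 4], [1, 1, 2, 8]]
print(cogs_confidence_interval(counts))
```

A finite-difference comparison of `cogs_gradient` against `psi_cogs` is a convenient way to check the derivative terms before relying on the interval.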

5 Examples

Table 1 shows the cross classifications of the respondents’ opinions about spending the evening with friends or neighbors in 2012 and 2016. Here, we considered low-income respondents (annual income under $15,000). The response categories are (1) “almost every day,” (2) “once or twice a week,” (3) “several times a month,” (4) “about once a month,” (5) “several times a year,” (6) “about once a year,” and (7) “never.” The degree of departure from CS is compared between the 2012 and 2016 data using ΨCS. Table 3 shows that the estimated values of ΨCS for the two data sets are 0.100 and 0.133. Although the confidence intervals overlap, the estimated degree of departure from CS is greater for the data set with the larger value. By contrast, the existing measure ψCS cannot be estimated because the sum of the symmetric cells is 0.

Table 3 Estimates of measures ΨCS, ΨCoGS, and ΨS, approximate standard errors, and 95% confidence intervals applied to the data in Tables 1 and 2.

Table 2 describes the cross classifications of respondents’ opinions regarding everyday discrimination (people act as if they think the respondent is not smart; the respondent is harassed) in 2021. Here, we considered low-income respondents (annual income under $8,000). The response categories are (1) “almost every day,” (2) “at least once a week,” (3) “a few times a month,” (4) “a few times a year,” (5) “less than once a year,” and (6) “never.” The confidence intervals for ΨS and ΨCoGS do not include zero (see Table 3), indicating that the table has neither an S structure nor a CoGS structure. By contrast, the confidence interval for ΨCS includes zero (see Table 3), suggesting that the table has a CS structure. Moreover, Theorem 1 shows that the lack of structure of the S model is due to the lack of structure of the CoGS model rather than that of the CS model. Thus, the CS model may reveal that respondents treated as not smart are harassed more often in daily life. By contrast, existing measures ψCS and ψS cannot be estimated because the sum of symmetric cells is 0.

6 Discussion

The value of ΨCS increases as the difference between the degree of departure from S and that from CoGS increases. This is especially pronounced as the degree of departure from S increases while that from GS decreases. The value of ΨCS reaches the maximum (= 1) when the degree of departure from S is maximized (= 1) and that from CoGS is minimized (= 0). Therefore, ΨCS is useful to visualize the degree of departure from CS on the complete asymmetry with the CoGS structure.

It is also meaningful to consider collapsed 3 × 3 tables only when the original square contingency table has ordered categories because the collapsed tables are obtained by combining adjacent categories. The measure in square ordinal tables should depend on the listing order of the categories. The proposed measures are not invariant under arbitrary similar permutations of row and column categories, except for the reverse order. Moreover, whether the submeasures of the collapsed tables are invariant does not matter because each collapsed 3 × 3 table obtained from an original square table is unique.

We show that even if the estimated measures of ψCS and ψS cannot be calculated because the sum of symmetric cells is zero in the original table, the estimated measures of ΨCS and ΨS can be calculated in some cases. This property may be useful, especially for a small-size sample or a large dimension of R.
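The estimability point can be made concrete with a small Python sketch. It compares the uncollapsed measure ψS of Appendix A with the collapsed measure ΨS on a hypothetical sparse 4 × 4 table in which one symmetric pair of cells is empty; the function names and the counts are assumptions introduced for this illustration.

```python
from itertools import combinations
from math import log

PAIRS = [(0, 1), (0, 2), (1, 2)]  # (k, l) with k < l, 0-based


def kl_term(x, y):
    """x * log(x / y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * log(x / y)


def symmetry_measure(table, pairs):
    """KL-type departure from symmetry over the given (row, col) pairs."""
    if any(table[i][j] + table[j][i] == 0 for i, j in pairs):
        raise ValueError("a symmetric pair of cells sums to zero")
    delta = sum(table[i][j] + table[j][i] for i, j in pairs)
    total = 0.0
    for i, j in pairs:
        a, b = table[i][j] / delta, table[j][i] / delta
        total += kl_term(a, (a + b) / 2) + kl_term(b, (a + b) / 2)
    return total / log(2)


def collapse(p, s, t):
    """3 x 3 table T^{st} with cutpoints after rows/columns s and t (1-based)."""
    bounds = [(0, s), (s, t), (t, len(p))]
    return [[sum(p[i][j] for i in range(a, b) for j in range(c, d))
             for (c, d) in bounds] for (a, b) in bounds]


def psi_s_original(p):
    """Uncollapsed measure psi_S (Appendix A)."""
    return symmetry_measure(p, list(combinations(range(len(p)), 2)))


def psi_s_collapsed(p):
    """Collapsed measure Psi_S: mean over all 3 x 3 collapsed tables."""
    cuts = list(combinations(range(1, len(p)), 2))
    return sum(symmetry_measure(collapse(p, s, t), PAIRS)
               for s, t in cuts) / len(cuts)


# Hypothetical sparse counts: cells (1,3) and (3,1) are both empty.
counts = [[5, 2, 0, 1], [1, 4, 3, 1], [0, 1, 6, 2], [1, 1, 1, 7]]
try:
    psi_s_original(counts)
except ValueError as err:
    print("original measure not estimable:", err)
print("collapsed measure:", psi_s_collapsed(counts))
```

The collapsed measure is not always estimable either: for example, the table with cutpoints s = 1 and t = R − 1 keeps the corner cells (1, R) and (R, 1) uncombined, so an empty corner pair would still cause a failure.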

It should be noted that Yamamoto, Shimada, and Tomizawa (Citation2015) gave the power divergence-type (including the KL) measure to represent the degree of departure from S. However, using the power divergence does not provide a result similar to Theorem 1. The CoGS and CS models do not impose restrictions on the diagonal cell probabilities. Therefore, it seems natural that the proposed measures and their ranges are independent of the diagonal cell probabilities.

The asymptotic normal distribution of $\sqrt{n}(\hat{\Psi}_{CoGS} - \Psi_{CoGS})$ is not applicable when $\Psi_{CoGS} = 0$ or $\Psi_{CoGS} = 1$ because $\sigma^2[\Psi_{CoGS}] = 0$. Additionally, the asymptotic normal distribution for ΨCS is not applicable for the same reason. These issues are common to existing measures and are not specific to the proposed measures. To examine these issues, we conducted Monte Carlo simulations using the proposed measures (see Appendix F). In the simulations, four contingency tables were assumed with true values of the proposed measures ranging from 0.000 to 0.090, and the sampling distributions of the proposed measures were visualized in histograms. When the true value of the measure is 0.000, the asymptotic normal distribution may not be applicable. However, with a large enough sample, the asymptotic normal distribution may be applicable even if the true value of the measure is close to 0.000. Therefore, we would consider using resampling methods or Bayesian methods when the true values of the measures are very close to 0.000 or 1.000, or when the sample size is small. As an example of a Bayesian approach, Momozaki et al. (Citation2021) considered new estimators of measures that can reduce the bias and mean squared error even without a sufficient sample size using the Bayesian estimators of cell probabilities.

7 Conclusion

Considering collapsed 3 × 3 tables can simplify the interpretation. If determining how to choose unique cutpoints is difficult, then it is reasonable to evaluate the collapsed square contingency tables for various patterns. Thus, we consider that all $\Psi_{CoGS}^{(s,t)}$ ($\Psi_{CS}^{(s,t)}$) are combined at the same weights $1/\binom{R-1}{2}$. The proposed measures are useful for comparing the degrees of departure from S, CoGS, and CS in several tables since the proposed measures always range between 0 and 1 and are independent of the sample size and the dimension R. The proposed theorems are also meaningful to comprehend the relation of these three measures.

Acknowledgments

The authors would like to thank the referees for their comments.

Additional information

Funding

This work was supported by a Grant-in-Aid for Research Activity Start-up from JSPS MEXT KAKENHI (Number 21463537).

References

  • Agresti, A. 1990. Categorical data analysis. New York: John Wiley.
  • Aizawa, M., K. Yamamoto, and S. Tomizawa. 2021. Measure of departure from average marginal homogeneity for the analysis of collapsed ordinal square contingency tables. Biometrical Letters 58 (1):81–94. doi: 10.2478/bile-2021-0006.
  • Bishop, Y. M. M., S. E. Fienberg, and P. W. Holland. 1975. Discrete multivariate analysis: Theory and practice. Cambridge: The MIT Press.
  • Bowker, A. H. 1948. A test for symmetry in contingency tables. Journal of the American Statistical Association 43 (244):572–4. doi: 10.1080/01621459.1948.10483284.
  • Cressie, N., and T. R. Read. 1984. Multinomial goodness-of-fit tests. Journal of the Royal Statistical Society: Series B (Methodological) 46 (3):440–64. doi: 10.1111/j.2517-6161.1984.tb01318.x.
  • Davern, M., R. Bautista, J. Freese, S. L. Morgan, and T. W. Smith. 2021. General Social Surveys, 1972-2021 Cross-section [machine-readable data file, 68,846 cases]. Principal Investigator, Davern, M.; Co-Principal Investigators, Bautista, R., Freese, J., Morgan, S.L., and Smith, T.W.; Sponsored by National Science Foundation. NORC ed. Chicago: NORC at the University of Chicago [producer and distributor]. Data accessed from the GSS Data Explorer website at gssdataexplorer.norc.org.
  • McCullagh, P. 1978. A class of parametric models for the analysis of square contingency tables with ordered categories. Biometrika 65 (2):413–8. doi: 10.1093/biomet/65.2.413.
  • Momozaki, T., K. Cho, T. Nakagawa, and S. Tomizawa. 2021. Estimation of measures for two-way contingency tables using the Bayesian estimators. arXiv preprint, https://arxiv.org/abs/2109.09339.
  • Patil, G. P., and C. Taillie. 1982. Diversity as a concept and its measurement. Journal of the American Statistical Association 77 (379):548–61. doi: 10.1080/01621459.1982.10477845.
  • Read, C. B. 1977. Partitioning chi-square in contingency tables: A teaching approach. Communications in Statistics - Theory and Methods 6 (6):553–62. doi: 10.1080/03610927708827513.
  • Shinoda, S., K. Yamamoto, K. Tahata, and S. Tomizawa. 2020. A measure of asymmetry for ordinal square contingency tables with an application to modified LANZA score data. Journal of Applied Statistics 47 (7):1251–60. doi: 10.1080/02664763.2019.1673325.
  • Tomizawa, S. 1994. Two kinds of measures of departure from symmetry in square contingency tables having nominal categories. Statistica Sinica 4 (1):325–34.
  • Tomizawa, S. 1995. Measures of departure from global symmetry for square contingency tables with ordered categories. Behaviormetrika 22 (1):91–8. doi: 10.2333/bhmk.22.91.
  • Tomizawa, S., N. Miyamoto, and Y. Hatanaka. 2001. Theory and methods: Measure of asymmetry for square contingency tables having ordered categories. Australian & New Zealand Journal of Statistics 43 (3):335–49. doi: 10.1111/1467-842X.00180.
  • Tomizawa, S., and K. Saitoh. 1999. Kullback-Leibler information type measure of departure from conditional symmetry and decomposition of measure from symmetry for contingency tables. Calcutta Statistical Association Bulletin 49 (1-2):31–40. doi: 10.1177/0008068319990103.
  • Tomizawa, S., T. Seo, and H. Yamamoto. 1998. Power-divergence-type measure of departure from symmetry for square contingency tables that have nominal categories. Journal of Applied Statistics 25 (3):387–98. doi: 10.1080/02664769823115.
  • Yamamoto, K., F. Shimada, and S. Tomizawa. 2015. Measure of departure from symmetry for the analysis of collapsed square contingency tables with ordered categories. Journal of Applied Statistics 42 (4):866–75. doi: 10.1080/02664763.2014.993362.
  • Yamamoto, K., K. Tahata, and S. Tomizawa. 2012. Some symmetry models for the analysis of collapsed square contingency tables with ordered categories. Calcutta Statistical Association Bulletin 64 (1-2):21–36. doi: 10.1177/0008068320120102.

Appendix A


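Assuming that $\{p_{ij} + p_{ji} > 0\}$ for $1 \le i < j \le R$, the measure to represent the degree of departure from the symmetry proposed by Tomizawa (Citation1994) is given as

$$\psi_S = \frac{1}{\log 2} \sum_{i=1}^{R-1}\sum_{j=i+1}^{R} \left[ p_{ij}^{*} \log\left( \frac{p_{ij}^{*}}{(p_{ij}^{*} + p_{ji}^{*})/2} \right) + p_{ji}^{*} \log\left( \frac{p_{ji}^{*}}{(p_{ij}^{*} + p_{ji}^{*})/2} \right) \right],$$

where

$$\delta = \sum_{i=1}^{R-1}\sum_{j=i+1}^{R} (p_{ij} + p_{ji}), \qquad p_{ij}^{*} = \frac{p_{ij}}{\delta}, \qquad p_{ji}^{*} = \frac{p_{ji}}{\delta}.$$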

Appendix B


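Assuming that $\delta_1 + \delta_2 > 0$, the measure to represent the degree of departure from the global symmetry proposed by Tomizawa (Citation1995) is given as

$$\psi_{GS} = \frac{1}{\log 2} \sum_{i=1}^{2} \delta_i^{*} \log\left( \frac{\delta_i^{*}}{1/2} \right),$$

where

$$\delta_1 = \sum_{i=1}^{R-1}\sum_{j=i+1}^{R} p_{ij}, \qquad \delta_2 = \sum_{j=1}^{R-1}\sum_{i=j+1}^{R} p_{ij}, \qquad \delta_i^{*} = \frac{\delta_i}{\delta},$$

with $\delta = \delta_1 + \delta_2$ as in Appendix A.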

Appendix C


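Assuming that $\delta_1 > 0$, $\delta_2 > 0$ and $\{p_{ij} + p_{ji} > 0\}$ for $1 \le i < j \le R$, the measure to represent the degree of departure from the conditional symmetry proposed by Tomizawa and Saitoh (Citation1999) is given as

$$\psi_{CS} = \frac{1}{\log 2} \sum_{i=1}^{R-1}\sum_{j=i+1}^{R} \left[ p_{ij}^{*} \log\left( \frac{p_{ij}^{*}}{\delta_1^{*} (p_{ij}^{*} + p_{ji}^{*})} \right) + p_{ji}^{*} \log\left( \frac{p_{ji}^{*}}{\delta_2^{*} (p_{ij}^{*} + p_{ji}^{*})} \right) \right].$$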

Appendix D


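Using the delta method, $\sqrt{n}(\hat{\Psi}_{CoGS} - \Psi_{CoGS})$ has an asymptotic variance $\sigma^2[\Psi_{CoGS}]$ given as

$$\sigma^2[\Psi_{CoGS}] = \sum_{i=1}^{R-1}\sum_{j=i+1}^{R} \left( p_{ij} W_{ij}^2 + p_{ji} V_{ji}^2 \right) - \left[ \sum_{i=1}^{R-1}\sum_{j=i+1}^{R} \left( p_{ij} W_{ij} + p_{ji} V_{ji} \right) \right]^2,$$

where

$$W_{ij} = \frac{1}{\binom{R-1}{2}} \sum_{s=1}^{R-2}\sum_{t=s+1}^{R-1} I(c_1) \frac{1}{\log 2} \frac{\Delta_2^{(s,t)}}{(\Delta^{(s,t)})^2} \log\left( \frac{\Delta_1^{(s,t)}}{\Delta_2^{(s,t)}} \right),$$

$$V_{ji} = \frac{1}{\binom{R-1}{2}} \sum_{s=1}^{R-2}\sum_{t=s+1}^{R-1} I(c_2) \frac{1}{\log 2} \frac{\Delta_1^{(s,t)}}{(\Delta^{(s,t)})^2} \log\left( \frac{\Delta_2^{(s,t)}}{\Delta_1^{(s,t)}} \right),$$

$$c_1 = (1 \le i \le s,\ s+1 \le j \le t) \lor (1 \le i \le s,\ t+1 \le j \le R) \lor (s+1 \le i \le t,\ t+1 \le j \le R),$$

$$c_2 = (1 \le j \le s,\ s+1 \le i \le t) \lor (1 \le j \le s,\ t+1 \le i \le R) \lor (s+1 \le j \le t,\ t+1 \le i \le R),$$

and I(·) is the indicator function.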

Appendix E


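Using the delta method, $\sqrt{n}(\hat{\Psi}_{CS} - \Psi_{CS})$ has an asymptotic variance $\sigma^2[\Psi_{CS}]$ given as

$$\sigma^2[\Psi_{CS}] = \sum_{i=1}^{R-1}\sum_{j=i+1}^{R} \left( p_{ij} A_{ij}^2 + p_{ji} B_{ji}^2 \right) - \left[ \sum_{i=1}^{R-1}\sum_{j=i+1}^{R} \left( p_{ij} A_{ij} + p_{ji} B_{ji} \right) \right]^2,$$

where

$$A_{ij} = \frac{1}{\binom{R-1}{2}} \sum_{s=1}^{R-2}\sum_{t=s+1}^{R-1} \frac{1}{(\Delta^{(s,t)})^2 \log 2} \left[ I(c_1) C_{12}^{(s,t)} - I(c_2) D_{12}^{(s,t)} + I(c_3) C_{13}^{(s,t)} - I(c_4) D_{13}^{(s,t)} + I(c_5) C_{23}^{(s,t)} - I(c_6) D_{23}^{(s,t)} \right],$$

$$B_{ji} = \frac{1}{\binom{R-1}{2}} \sum_{s=1}^{R-2}\sum_{t=s+1}^{R-1} \frac{1}{(\Delta^{(s,t)})^2 \log 2} \left[ I(c_7) E_{12}^{(s,t)} - I(c_8) F_{12}^{(s,t)} + I(c_9) E_{13}^{(s,t)} - I(c_{10}) F_{13}^{(s,t)} + I(c_{11}) E_{23}^{(s,t)} - I(c_{12}) F_{23}^{(s,t)} \right],$$

$$\begin{aligned}
C_{kl}^{(s,t)} ={}& \Delta^{(s,t)} \log\left( \frac{\Delta^{(s,t)} G_{kl}^{(s,t)}}{\Delta_1^{(s,t)} (G_{kl}^{(s,t)} + G_{lk}^{(s,t)})} \right) - G_{kl}^{(s,t)} \left\{ \log\left( \frac{\Delta^{(s,t)} G_{kl}^{(s,t)}}{\Delta_1^{(s,t)} (G_{kl}^{(s,t)} + G_{lk}^{(s,t)})} \right) - 1 \right\} \\
& - G_{lk}^{(s,t)} \left\{ \log\left( \frac{\Delta^{(s,t)} G_{lk}^{(s,t)}}{\Delta_2^{(s,t)} (G_{kl}^{(s,t)} + G_{lk}^{(s,t)})} \right) - 1 \right\} - \frac{\Delta^{(s,t)} G_{kl}^{(s,t)}}{\Delta_1^{(s,t)}},
\end{aligned}$$

$$D_{kl}^{(s,t)} = G_{kl}^{(s,t)} \left\{ \log\left( \frac{\Delta^{(s,t)} G_{kl}^{(s,t)}}{\Delta_1^{(s,t)} (G_{kl}^{(s,t)} + G_{lk}^{(s,t)})} \right) - 1 \right\} + G_{lk}^{(s,t)} \left\{ \log\left( \frac{\Delta^{(s,t)} G_{lk}^{(s,t)}}{\Delta_2^{(s,t)} (G_{kl}^{(s,t)} + G_{lk}^{(s,t)})} \right) - 1 \right\} + \frac{\Delta^{(s,t)} G_{kl}^{(s,t)}}{\Delta_1^{(s,t)}},$$

$$\begin{aligned}
E_{kl}^{(s,t)} ={}& \Delta^{(s,t)} \log\left( \frac{\Delta^{(s,t)} G_{lk}^{(s,t)}}{\Delta_2^{(s,t)} (G_{kl}^{(s,t)} + G_{lk}^{(s,t)})} \right) - G_{kl}^{(s,t)} \left\{ \log\left( \frac{\Delta^{(s,t)} G_{kl}^{(s,t)}}{\Delta_1^{(s,t)} (G_{kl}^{(s,t)} + G_{lk}^{(s,t)})} \right) - 1 \right\} \\
& - G_{lk}^{(s,t)} \left\{ \log\left( \frac{\Delta^{(s,t)} G_{lk}^{(s,t)}}{\Delta_2^{(s,t)} (G_{kl}^{(s,t)} + G_{lk}^{(s,t)})} \right) - 1 \right\} - \frac{\Delta^{(s,t)} G_{lk}^{(s,t)}}{\Delta_2^{(s,t)}},
\end{aligned}$$

$$F_{kl}^{(s,t)} = G_{kl}^{(s,t)} \left\{ \log\left( \frac{\Delta^{(s,t)} G_{kl}^{(s,t)}}{\Delta_1^{(s,t)} (G_{kl}^{(s,t)} + G_{lk}^{(s,t)})} \right) - 1 \right\} + G_{lk}^{(s,t)} \left\{ \log\left( \frac{\Delta^{(s,t)} G_{lk}^{(s,t)}}{\Delta_2^{(s,t)} (G_{kl}^{(s,t)} + G_{lk}^{(s,t)})} \right) - 1 \right\} + \frac{\Delta^{(s,t)} G_{lk}^{(s,t)}}{\Delta_2^{(s,t)}},$$

$$\begin{aligned}
c_1 &= (1 \le i \le s,\ s+1 \le j \le t), & c_2 &= (1 \le i \le s,\ t+1 \le j \le R) \lor (s+1 \le i \le t,\ t+1 \le j \le R),\\
c_3 &= (1 \le i \le s,\ t+1 \le j \le R), & c_4 &= (1 \le i \le s,\ s+1 \le j \le t) \lor (s+1 \le i \le t,\ t+1 \le j \le R),\\
c_5 &= (s+1 \le i \le t,\ t+1 \le j \le R), & c_6 &= (1 \le i \le s,\ s+1 \le j \le t) \lor (1 \le i \le s,\ t+1 \le j \le R),\\
c_7 &= (1 \le j \le s,\ s+1 \le i \le t), & c_8 &= (1 \le j \le s,\ t+1 \le i \le R) \lor (s+1 \le j \le t,\ t+1 \le i \le R),\\
c_9 &= (1 \le j \le s,\ t+1 \le i \le R), & c_{10} &= (1 \le j \le s,\ s+1 \le i \le t) \lor (s+1 \le j \le t,\ t+1 \le i \le R),\\
c_{11} &= (s+1 \le j \le t,\ t+1 \le i \le R), & c_{12} &= (1 \le j \le s,\ s+1 \le i \le t) \lor (1 \le j \le s,\ t+1 \le i \le R),
\end{aligned}$$

and I(·) is the indicator function.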

Appendix F


To confirm the sampling distribution of the proposed measures, we conducted Monte Carlo simulations. For a 6 × 6 table, we generated multinomial random samples based on the structures of probabilities in Table F1. The sample sizes considered were n = 180, 360, and 1,800 (i.e., sparseness index = 5, 10, and 50). Each simulation study was performed based on 1,000 trials.

The true values of the proposed measures were ΨCoGS = ΨCS = 0.000 in Table F1a, ΨCoGS = 0.027 and ΨCS = 0.022 in Table F1b, ΨCoGS = 0.059 and ΨCS = 0.061 in Table F1c, and ΨCoGS = 0.087 and ΨCS = 0.092 in Table F1d.

Figs. F1 to F4 represent the sampling distributions. Fig. F1 (the true values are 0.000) showed that the sampling distributions of the proposed measures were not normal distributions centered at the estimates, as expected. Fig. F2 (the true values are 0.022 and 0.027) showed that the asymptotic normal distribution could be applicable with a large enough sample. Furthermore, if the true value was greater than about 0.060, the sampling distribution was found to be approximately normal.

Therefore, we must be careful using asymptotic distributions when the true value of a measure is expected to be 0.000 and/or the sample size is not large enough.
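A simulation of this kind can be sketched in Python as follows. The probability structure used here is a hypothetical symmetric 6 × 6 table (so the true value of ΨCoGS is 0), not one of the structures in Table F1; the function names, the seed, and the number of trials are likewise assumptions chosen to keep the example small.

```python
import random
from itertools import combinations
from math import log


def category(i, s, t):
    """Collapsed category (0, 1, 2) of 0-based index i for cutpoints s < t."""
    return 0 if i < s else (1 if i < t else 2)


def psi_cogs(table):
    """Psi_CoGS averaged over all cutpoint pairs (0 * log 0 taken as 0)."""
    R = len(table)
    cuts = list(combinations(range(1, R), 2))
    total = 0.0
    for s, t in cuts:
        d1 = sum(table[i][j] for i in range(R) for j in range(R)
                 if category(i, s, t) < category(j, s, t))
        d2 = sum(table[i][j] for i in range(R) for j in range(R)
                 if category(i, s, t) > category(j, s, t))
        if d1 + d2 == 0:
            raise ValueError("Delta must be positive")
        for d in (d1, d2):
            x = d / (d1 + d2)
            if x > 0:
                total += x * log(2 * x) / log(2)
    return total / len(cuts)


def simulate(prob, n, trials, seed=0):
    """Draw multinomial tables of size n and return the Psi_CoGS estimates."""
    rng = random.Random(seed)
    R = len(prob)
    cells = [(i, j) for i in range(R) for j in range(R)]
    weights = [prob[i][j] for i, j in cells]
    estimates = []
    for _ in range(trials):
        counts = [[0] * R for _ in range(R)]
        for i, j in rng.choices(cells, weights=weights, k=n):
            counts[i][j] += 1
        estimates.append(psi_cogs(counts))
    return estimates


# Hypothetical symmetric structure: mass decays with distance from the diagonal.
R = 6
raw = [[1.0 / (1 + abs(i - j)) for j in range(R)] for i in range(R)]
norm = sum(map(sum, raw))
prob = [[x / norm for x in row] for row in raw]

estimates = simulate(prob, n=180, trials=200)
print(min(estimates), sum(estimates) / len(estimates), max(estimates))
```

Because the estimates cannot fall below 0, their distribution under a true value of 0 is bounded on the left and skewed to the right, which is one way to see why the normal approximation is questionable at the boundary.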


Fig. F1 The sampling distribution obtained from the structure of probabilities in Table F1a.


Fig. F2 The sampling distribution obtained from the structure of probabilities in Table F1b.


Fig. F3 The sampling distribution obtained from the structure of probabilities in Table F1c.


Fig. F4 The sampling distribution obtained from the structure of probabilities in Table F1d.

Table F1 Patterns of probability structure in a square contingency table.