Original Articles

Social Science Methods for Twins Data: Integrating Causality, Endowments, and Heritability

Pages 88-141 | Published online: 16 May 2011
 

Abstract

Twins have been extensively used in economics, sociology, and behavioral genetics to investigate the role of genetic endowments in a broad range of social, demographic, and economic outcomes. However, the focus in these literatures has been distinct: The economic literature has been primarily concerned with the need to control for unobserved endowments (including, as an important subset, genetic endowments) in analyses that attempt to establish the impact of one variable, often schooling, on a variety of economic, demographic, and health outcomes. Behavioral genetic analyses have mostly been concerned with decomposing the variation in the outcomes of interest into genetic, shared environmental, and non-shared environmental components, with recent multivariate analyses investigating the contributions of genes and the environment to the correlation and causation between variables. Despite the fact that twins studies and the recognition of the role of endowments are central to both of these literatures, they have mostly evolved independently. In this paper, we develop formally the relationship between the economic and behavioral genetic approaches to the analyses of twins, and we develop an integrative approach that combines the identification of causal effects, which dominates the economic literature, with the decomposition of variances and covariances into genetic and environmental factors that is the primary goal of behavioral genetic approaches. We apply this integrative ACE-β approach to an illustrative investigation of the impact of schooling on several demographic outcomes, such as fertility, nuptiality, and health.

Acknowledgments

An earlier version of this paper was presented at the conference on “Integrating Genetics and the Social Sciences” in Boulder, CO, June 2–3, 2010. We are grateful for the many helpful comments and suggestions provided by Jason D. Boardman, Joseph Rodgers, and the participants of the Boulder conference. We also gratefully acknowledge the generous support for this research through NIH grants R01 HD046144 and R01 HD043417.

Notes

1. While not the focus of our discussion here, it is important to point out that there have been many other uses of twins data in the social sciences. Historically, for example, the predominant use probably has been for univariate heritability estimates of the ratio of genetic variance to phenotypic variance in a linear model. Also in economics, the combination of identical and fraternal twins has been used to investigate how intrafamilial allocations (say, of schooling among children) respond to individual-specific endowments (e.g., Behrman et al. 1994). The birth of twins has also been used to represent unexpected increases in fertility, to estimate quantity-quality fertility models, and to study the consequences of fertility on other life-course outcomes (Rosenzweig and Wolpin 1980a, 1980b). Behrman, Kohler, and Schnittker (2010) provide a comprehensive treatment of twins methods for social scientists that includes both conceptual and methodological discussions that are beyond the scope of this paper.

2. An extensive literature discusses these assumptions and the potential implications of their violation (e.g., Behrman et al. 2010; Derks et al. 2006; Guo 2005; Hobcraft 2003; Plomin et al. 2005). There are also several ways to test or relax these assumptions if additional data are available (e.g., Behrman et al. 2010; Neale and Maes 2004; Plomin et al. 2005), including, for example, the incorporation of assortative mating if data on spouses are available, or the consideration of dominance genetic effects if additional sibling categories (half siblings, adopted children) are available.

5. This point holds even if some random assignment (e.g., of incentives for attending school) is used as an instrument to attempt to identify the impact of schooling $x$ on fertility $y$. Such identification occurs only under the assumption that the random assignment does not affect the outcome $y$ through channels other than $x$ (e.g., financial wealth accumulation).

6. We use this presentation because this paper is concerned with integrating the economic and behavioral genetics approaches to twins data. The more conventional way of presenting the economic fixed-effects model is as follows, where the fertility $y_{ij}$ of twin $i$ in pair $j$ is related to schooling $x_{ij}$ as $y_{ij} = \beta x_{ij} + f_j + a_{ij} + v_{ij}$ and $x_{ij} = \alpha_f f_j + \alpha_a a_{ij} + u_{ij}$, where $\beta$ = the effect of schooling on fertility (to be estimated), $f_j$ = family endowments common to both twins in pair $j$, $a_{ij}$ = endowments specific to twin $i$ in pair $j$, $v_{ij}$ = random fertility shocks specific to twin $i$ in pair $j$, $\alpha_f$ = effect of family endowments $f_j$ on schooling $x_{ij}$, $\alpha_a$ = effect of individual-specific endowments $a_{ij}$ on schooling $x_{ij}$, and $u_{ij}$ = disturbance affecting $x_{ij}$ but not $y_{ij}$ except indirectly through $x_{ij}$. The model can also be extended to allow for sibling endowment effects on schooling by specifying $x_{ij} = \alpha_f f_j + \alpha_a a_{ij} + \alpha_s a_{kj} + u_{ij}$, where $\alpha_s a_{kj}$ is the effect of the endowment of twin $i$'s co-twin $k$ on $i$'s schooling $x_{ij}$.
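
The two equations in note 6 can be illustrated with a small simulation. The sketch below is not the authors' estimation code; it assumes MZ twins for whom the individual-specific endowment is identical within a pair, unit-variance normal draws, and hypothetical values for $\beta$, $\alpha_f$, and $\alpha_a$. Under those assumptions, differencing within pairs removes $f_j$ and $a_{ij}$, whereas pooled OLS does not.

```python
import numpy as np

def simulate_twin_pairs(n_pairs, beta, alpha_f=1.0, alpha_a=1.0, seed=0):
    """Simulate schooling x and fertility y for both twins of each pair.

    Assumption for illustration: the pairs are MZ twins, so the
    individual-specific endowment a_ij is the same for both twins.
    """
    rng = np.random.default_rng(seed)
    f = rng.normal(size=(n_pairs, 1))   # family endowment f_j (shared)
    a = rng.normal(size=(n_pairs, 1))   # endowment a_ij, equal within MZ pairs
    u = rng.normal(size=(n_pairs, 2))   # schooling disturbance u_ij
    v = rng.normal(size=(n_pairs, 2))   # fertility shock v_ij
    x = alpha_f * f + alpha_a * a + u
    y = beta * x + f + a + v
    return x, y

def ols_slope(x, y):
    """Pooled OLS slope of y on x, ignoring the pair structure."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / (xc * xc).sum())

def within_pair_slope(x, y):
    """Slope from regressing within-pair differences of y on those of x."""
    dx, dy = x[:, 0] - x[:, 1], y[:, 0] - y[:, 1]
    return float((dx * dy).sum() / (dx * dx).sum())

TRUE_BETA = -0.3  # hypothetical causal effect of schooling on fertility
x, y = simulate_twin_pairs(50_000, TRUE_BETA)
beta_ols = ols_slope(x.ravel(), y.ravel())
beta_within = within_pair_slope(x, y)
print(f"pooled OLS: {beta_ols:.3f}, within-pair: {beta_within:.3f}")
```

In this setup the pooled estimate is pushed away from the true value because the endowments raise both schooling and fertility, while the within-pair estimate stays close to it.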

Figure 1. Path-diagram for the economics fixed-effects model for twins.


7. For a critical discussion of this key assumption, see Griliches (1979) and Bound and Solon (1999).

8. In addition, because siblings other than twins differ in age, the argument that within-sibling estimates control for all relevant social endowments, so that the path $e_{yx}$ can reasonably be assumed to be zero, is weaker for sibling models than for twins, who are born at the same time and thus share factors such as parents' ages, socioeconomic conditions, and so on, all at the same age.

9. More generally, $\alpha_s a_{kj}$ can also represent the effect of any other sibling's specific endowments on $i$'s schooling attainment.

10. If the correlation in measurement error between siblings ($\rho_\epsilon$) is nonzero, , where $\phi = (1 - \rho_w)/(1 - \rho_x)$. Note that the measurement error bias in the within-sibling estimate is decreasing in $\rho_w$ and is smaller in the within-sibling estimate than in the standard estimate if $\rho_w > \rho_x$. We are not aware of any estimates of $\rho_w$. However, what appears to be random noise in cross-sectional data may have a family component if the measurement error is due to such unobserved factors as exaggeration or modesty or to failure to control for school quality, all of which may be shared by siblings.

11. Ashenfelter and Krueger also find that correcting for measurement error leads to larger estimates than those found by conventional ordinary least squares (OLS) models. Behrman et al. and subsequent studies using this method have yielded measurement-error corrected estimates that are usually less than the OLS estimates, suggesting that conventional cross-sectional estimates of the schooling-wage association are, in any case, too large.

12. For a detailed discussion of this ACE model and similar approaches for the study of twins and families, see, for example, Neale and Maes (2004).

Figure 2. ACE model for the analyses of genetic, shared environmental and unique environmental components to variation in phenotype x.


13. The ACE model can be fit using any structural equation program, but some programs are better suited to samples of relatives. Mx (and more recently, its successor OpenMx) is perhaps the single most popular program for estimating behavioral genetics models, but other programs have functions that are also well suited (Neale, Boker, Xie, and Maes 2006; OpenMx Development Team 2010). On their web page, for example, Mplus provides example scripts for assorted models using twins, including those discussed here. Likewise, LISREL scripts are provided in Neale and Cardon (1992).

14. The ACE model can easily be generalized to other relatives by focusing on the correlation among the A factors: for example, parents and offspring share 50% of genes, half siblings 25%, first cousins 12.5%, and so on. Such models also require making assumptions regarding C, which are less definitive than assumptions regarding A. Identifying genetic influences also requires relatives who differ in their level of shared genetic variance, which means that surveys in which all members of a household are interviewed are usually not sufficient for calculating heritability, as the expected child-parent and child-child correlations are all 0.5.
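
The identification point in note 14 can be checked with a rank condition. The sketch below is illustrative only and rests on a strong simplifying assumption that the note itself flags as questionable: every listed relative pair shares C fully, so the expected covariance is $r_A \sigma_A^2 + \sigma_C^2$. The dictionary `R_A` and the helper names are hypothetical.

```python
import numpy as np

# Expected correlation among the additive genetic (A) factors by relative type.
R_A = {
    "MZ twins": 1.0,
    "DZ twins": 0.5,
    "parent-offspring": 0.5,
    "full siblings": 0.5,
    "half siblings": 0.25,
    "first cousins": 0.125,
}

def design_matrix(relative_types):
    """One row [r_A, 1] per relative type.

    Assumes (for illustration) expected covariance = r_A * var_A + 1 * var_C,
    i.e. the shared environment is common to every pair type listed.
    """
    return np.array([[R_A[t], 1.0] for t in relative_types])

def is_identified(relative_types):
    """var_A and var_C are separately identified only if the rank is 2."""
    return int(np.linalg.matrix_rank(design_matrix(relative_types))) == 2

twin_design = is_identified(["MZ twins", "DZ twins"])
household_design = is_identified(["parent-offspring", "full siblings"])
print("MZ + DZ twins identified:", twin_design)
print("household survey identified:", household_design)
```

Because parent-offspring and full-sibling pairs both have $r_A = 0.5$, the two rows of the household design are identical and the genetic and shared environmental variances cannot be separated.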

15. The equal environments assumption requires that environmentally caused similarity for a particular phenotype be the same for both MZ and DZ twins and, thus, that the shared environment correlation between twins be identical for both types of twins. While some critics argue that this assumption is regularly and severely violated (Richardson and Norgate 2005), the validity of this assumption needs to be evaluated in the specific context. Empirical tests of the equal environments assumption often provide support for the acceptability of this assumption. According to the critics, for example, MZ twins experience more similar environments than DZ twins, thereby inflating differences between the two types of twins and, in turn, inflating estimates of heritability. In response to such concerns, the validity of the equal environments assumption has been evaluated using mislabeled twins (twins labeled DZ when they are in fact MZ) or MZ twins who are in fact treated differently (Scarr 1968). Both methods rely on the idea that MZ twins who are treated more individually should show more differences than those who are treated more similarly. Studies using both methods provide evidence for the validity of the assumption. Physical similarity, for example, is unrelated to twin similarity in personality (Morris-Yates, Andrews, Howie, and Henderson 1990; Plomin, Willerman, and Loehlin 1976) and concordance on many psychiatric disorders, with the notable exception of bulimia (Hettema, Neale, and Kendler 1995). Plomin et al. (1976) find evidence that MZ twins who resembled each other more were less similar in personality, leading to a downwardly biased estimate of heritability. Kendler et al. (1993) explore concordance for several common psychiatric disorders as a function of real zygosity, as revealed by biological tests, and perceived zygosity, as reported by twins or their families. In their study, 15% of twin pairs (one or both members) disagreed with the zygosity assigned by investigators, but perceived zygosity had no bearing on concordance for psychiatric disorders, including three disorders commonly studied by sociologists (i.e., major depression, generalized anxiety, and alcoholism).

16. In addition to additive genetic factors, the model can easily be modified to include dominance effects. In standard twins data, however, additive genetic contributions cannot be distinguished from dominance genetic effects except under the restrictive assumption of no shared environmental influences, and our discussion, therefore, focuses on the additive genetic model. For a more extensive discussion of how additive and dominance genetic influences can be incorporated in twins and sibling analyses, see Neale and Maes (2004).

17. Twins data that include information about the characteristics of spouses can potentially identify the extent of assortative mating and can include this aspect explicitly in the analyses (see Neale and Maes 2004). In addition, the assumption of no assortative mating in behavioral genetics analyses tends to be “conservative” in the sense that estimates of heritability in traditional behavioral genetics analyses will be biased towards zero if there is positive assortative mating.

18. In addition to the structural equation (ACE) approach to estimating heritability, DeFries and Fulker (1985) propose a method of estimating heritability ($h^2$) and common environmental influences ($c^2$) with twins data by a simple linear regression of a twin's trait on the co-twin's trait and the degree of genetic relatedness (see also Kohler and Rodgers 2000). In addition, several extensions of DeFries-Fulker analyses have been proposed that allow the consideration of genetic non-additivity (Waller 1994), observed differences in non-shared environment (Rodgers, Rowe, and Li 1994), and binary or censored observations (Kohler and Rodgers 1999).
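
A regression of the kind described in note 18 can be sketched on simulated data. The code below is a minimal illustration, not the authors' implementation: it assumes an unselected sample, standardized traits, and the augmented specification in which the trait is regressed on the co-twin's trait, relatedness $R$ (1.0 for MZ, 0.5 for DZ), and their product, so that the coefficient on the co-twin's trait estimates $c^2$ and the coefficient on the product estimates $h^2$. The variance components used in the simulation are hypothetical.

```python
import numpy as np

def simulate_twins(n_pairs, r_a, h2, c2, rng):
    """Simulate a unit-variance trait for twin pairs under an ACE model.

    r_a is the correlation of the additive genetic factors (1.0 MZ, 0.5 DZ).
    """
    e2 = 1.0 - h2 - c2
    shared_a = rng.normal(size=n_pairs)
    a1 = np.sqrt(r_a) * shared_a + np.sqrt(1 - r_a) * rng.normal(size=n_pairs)
    a2 = np.sqrt(r_a) * shared_a + np.sqrt(1 - r_a) * rng.normal(size=n_pairs)
    c = rng.normal(size=n_pairs)  # shared environment, common to both twins
    t1 = np.sqrt(h2) * a1 + np.sqrt(c2) * c + np.sqrt(e2) * rng.normal(size=n_pairs)
    t2 = np.sqrt(h2) * a2 + np.sqrt(c2) * c + np.sqrt(e2) * rng.normal(size=n_pairs)
    return t1, t2

def df_regression(trait, cotrait, relatedness):
    """OLS of trait on [1, cotrait, R, cotrait * R]; returns c2 and h2 estimates."""
    X = np.column_stack([np.ones_like(trait), cotrait, relatedness,
                         cotrait * relatedness])
    coef, *_ = np.linalg.lstsq(X, trait, rcond=None)
    return {"c2": float(coef[1]), "h2": float(coef[3])}

rng = np.random.default_rng(42)
TRUE_H2, TRUE_C2 = 0.5, 0.2   # hypothetical variance shares
N = 100_000                   # pairs per zygosity group
mz1, mz2 = simulate_twins(N, 1.0, TRUE_H2, TRUE_C2, rng)
dz1, dz2 = simulate_twins(N, 0.5, TRUE_H2, TRUE_C2, rng)

trait = np.concatenate([mz1, dz1])
cotrait = np.concatenate([mz2, dz2])
relatedness = np.concatenate([np.full(N, 1.0), np.full(N, 0.5)])
estimates = df_regression(trait, cotrait, relatedness)
print(estimates)
```

With only two relatedness values the regression is equivalent to fitting separate slopes for MZ and DZ pairs and solving for $h^2$ and $c^2$ from the two slopes.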

19. This specification is also sometimes referred to as the Cholesky decomposition because it is based on the decomposition of the variance-covariance matrix into the product of a lower triangular matrix and its transpose, which is known by that name.
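
The decomposition named in note 19 can be shown numerically. In the sketch below the $2 \times 2$ additive genetic covariance matrix for schooling $x$ and fertility $y$ is hypothetical, and the labels `a_xx` and `a_yy` for the diagonal paths are assumptions made here for illustration; only the cross-path $a_{yx}$ is named in the notes.

```python
import numpy as np

# Hypothetical additive genetic covariance matrix for (x, y).
sigma_a = np.array([[0.50, -0.10],
                    [-0.10, 0.30]])

# Lower triangular factor: sigma_a = paths @ paths.T
paths = np.linalg.cholesky(sigma_a)

a_xx = paths[0, 0]  # path from the first genetic factor to x
a_yx = paths[1, 0]  # cross-path from the first genetic factor to y
a_yy = paths[1, 1]  # path from the second genetic factor to y

reconstructed = paths @ paths.T
print(paths)
print(np.allclose(reconstructed, sigma_a))
```

The zero in the upper-right corner of `paths` reflects the ordering of the variables: the second latent factor affects only $y$, which is why other orderings or rotations can be observationally equivalent.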

Figure 3. Bivariate ACE model for schooling and fertility.


20. The presentation of the bivariate ACE model uses the “Cholesky decomposition” approach. Though this is the most frequently used bivariate ACE specification, there are other specifications of the latent genetic and social endowments that are observationally equivalent (Neale and Cardon 1992; Neale and Maes 2004).

21. To identify the ACE-β model, additional moment conditions that link $x$ (schooling) to the outcome $y$ within the same individual would be required. Whereas an extended ACE framework that adds additional sibling relationships (half-sibs, cousins, etc.) allows the identification of more complex genetic models (e.g., see Neale and Maes 2004), these additional sibling categories do not provide additional moment conditions linking $x$ and $y$ within individuals that would identify the causal pathway $\beta$ from $x$ (schooling) to the outcome $y$ in the ACE-β model.

Figure 4. ACE-β model with direct effect of schooling on fertility.


22. The condition of distinct latent influences (endowments and individual-specific factors) for both $x$ and $y$ implies that all of the coefficients $a_{yx}$, $c_{yx}$, and $e_{yx}$ are equal to zero. A direction of causality model would then try to separate whether the path between $x$ and $y$ is directed from $x$ to $y$ ($x \to y$) or vice versa ($y \to x$).

23. Though $e_{yx} = 0$ is a plausible assumption to achieve the identification of the model parameters in the ACE-β model, it is not the only possible assumption. Alternative assumptions are $c_{yx} = 0$ or $a_{yx} = 0$.

24. Although the path $\beta$ might be seen in the ACE-β framework as absorbing the influence of the individual-specific factors along the cross-path $e_{yx}$ in the conventional ACE model, the interpretation of the two approaches is fundamentally different, and the two specifications imply different moment conditions (see below) and can result in different estimates for all model parameters. In the ACE-β framework, $\beta$ measures the direct causal effect of schooling ($x_{ij}$) on fertility ($y_{ij}$), and all individual-specific shocks to schooling affect fertility only through schooling. In the conventional ACE model, $e_{yx}$ measures the extent to which unobserved shocks that affect schooling also have a direct effect on fertility. Distinctions of this sort are informative, and a conventional ACE and an ACE-β model can lead to very different conclusions. In the empirical illustration we provide below, for instance, we show that the negative relationship between schooling and fertility observed within individuals is attributed in the ACE model to a negative coefficient for the path $a_{yx}$: that is, to genetic factors that have a positive effect on schooling and a negative effect on fertility. In the ACE-β model, the negative association between schooling and fertility is predominantly due to a causal negative effect $\beta$ of schooling on fertility. The interpretation of the ACE-β results is, therefore, much more consistent with the social science literature on the interrelation between schooling and fertility (e.g., Kravdal and Rindfuss 2008).

25. Unique environmental influences affecting schooling $x$, however, are assumed to have no direct effect on fertility $y$, and in both approaches, unique environmental influences on schooling are assumed to affect fertility only through their effect on schooling.

26. It is important to point out that, while the path-diagram of the children-of-twins (COT) design (D'Onofrio et al. 2003) is isomorphic to that of the ACE-β framework, the focus of these models is distinctly different. The COT design focuses on the estimation of the causal connection between parental behaviors and child outcomes, using parents who are twins to provide partial control for parental genetic endowments that also affect these child outcomes. In contrast to this intergenerational perspective, the ACE-β model focuses on behaviors and outcomes that occur across the life course of individuals, using the twins design to control for the genetic and social endowments that affect both $x_{ij}$ (schooling) and $y_{ij}$ (fertility).

27. In some cases, if the data include other sibling categories in addition to twins, dominance and additive genetic effects can be estimated; also, when the data include information on spouses, aspects of assortative mating can be considered.

28. In particular, this relationship follows from the variances/covariances in because

and the right-side term equals $\beta$ under the assumption of the ACE-β model that $e_{yx} = 0$.
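
The logic of note 28 can be traced with a short derivation for MZ twins. This is a sketch under an assumed parametrization, not the equation from the paper: the path labels $a_x, c_x, e_x, a_y, c_y, e_y$ and the latent factors $A_j, C_j, E_{ij}, A'_j, C'_j, E'_{ij}$ are notation introduced here, with the genetic and shared environmental factors common to both MZ twins in pair $j$ and the unique factors independent with unit variance.

```latex
% Assumed ACE-beta parametrization for twin i in MZ pair j
\begin{align*}
x_{ij} &= a_x A_j + c_x C_j + e_x E_{ij}, \\
y_{ij} &= \beta x_{ij} + a_{yx} A_j + c_{yx} C_j + e_{yx} E_{ij}
          + a_y A'_j + c_y C'_j + e_y E'_{ij}.
\end{align*}
% Within-pair differences remove everything shared by the two MZ twins
\begin{align*}
\Delta x_j &= e_x \,\Delta E_j, \\
\Delta y_j &= \beta\,\Delta x_j + e_{yx}\,\Delta E_j + e_y\,\Delta E'_j.
\end{align*}
% Ratio of within-pair covariance to within-pair variance
\[
\frac{\operatorname{Cov}(\Delta x_j, \Delta y_j)}{\operatorname{Var}(\Delta x_j)}
  = \frac{2\beta e_x^2 + 2 e_x e_{yx}}{2 e_x^2}
  = \beta + \frac{e_{yx}}{e_x}.
\]
```

Under this parametrization the ratio reduces to $\beta$ exactly when $e_{yx} = 0$, which is the identifying assumption the note refers to.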

29. Though this definition of heritability would be identical between the ACE-β and the bivariate ACE model, the estimated heritability would differ because the two models would generally yield different parameter estimates.

30. This follows by solving for the variance/covariance matrix of the observed phenotype $P$ in the ACE-β model as given in the Appendix.
