Article Commentary

Recognizing and avoiding bias to improve child custody evaluations: Convergent data are not sufficient for scientific assessment

Abstract

Numerous academics have raised concerns about the scientific validity of custody evaluations. To improve the quality of evaluations, we need to understand the root cause of highly trained people frequently producing reports of limited scientific value (with no scientifically based nexus between the data and the conclusions). Review of 50 evaluations found violations of scientific methodology including cherry-picking data, arbitrary determinations of contested facts, use of ad hoc hypotheses rather than the results of empirical research, presentation of highly speculative inferences as expert opinions, misinterpretation of psychological testing, incorrect interpretation of the behavior of abused children, and reliance on the presence of convergent data rather than hypothesis testing to reach conclusions. The key finding was that errors almost always helped the same parent, indicating that bias drove many of them. Combining cognitive psychology and behavioral economics provides an understanding of the high frequency and power of bias in the reports. Intuition, which is highly susceptible to bias, forges our initial opinions, and confirmation bias then affects the vetting, valuing, remembering, and interpreting of the data used in abortive attempts at rigorous analysis. To improve the scientific validity of evaluations, several steps are needed, including (1) better oversight of evaluations, (2) education of evaluators and courts about scientific methodology, the high risk of bias, and the results of empirically based research, and (3) adoption of more structured approaches to custody evaluations.

Introduction

Researchers and practitioners have raised serious concerns about the scientific validity of custody evaluations, including failure to use the findings of empirical research, presentation of speculations rather than expert opinions, confirmation bias, lack of a consistent methodology, and lack of a nexus between data and conclusions (Emery et al., 2005; Martindale, 2005; Pepiton et al., 2014). In the words of Scott and Emery (2014), “Clinical testimony in custody proceedings often fails to meet even minimal standards of scientific validity” (p. 71). On the surface, the problem would appear to be a lack of training.

On 50 occasions, lawyers have asked me to review custody evaluations performed by a variety of evaluators (psychologists and psychiatrists) in the United States. The most common problems found in the reports were cherry-picking data, making determinations of contested facts without providing a reasonable basis, using ad hoc hypotheses, ignoring the findings of empirically based research, inappropriately discounting allegations of domestic violence and child mistreatment, presenting speculative inferences as expert opinions, minimizing the issues of attachment and availability, and relying on convergent data rather than testing competing hypotheses. Unexpectedly, I found that violations of scientific methodology almost always favored one parent and hurt the other, suggesting that bias was playing a key role.

Case examples

Case 1

The evaluator wrote that mother was overwhelmed emotionally and physically, had poor control of her anger, and suffered from a personality disorder and possibly bipolar disorder. The possible impact on her parenting was not discussed. Instead, the evaluator presented an ad hoc hypothesis, claiming that these weaknesses were balanced by mother being the more democratic and responsive parent. The data, however, showed the opposite. For example, mother objected to the children continuing their activities because she did not have the time and energy to transport them, and she objected to father doing so. While dismissing valid reasons for concern about mother’s parenting, the evaluator essentially fabricated reasons to criticize father’s parenting. The evaluator claimed that father had compulsive personality traits and then went through the criteria for compulsive personality disorder, indicating how each was bad for parenting, even though the data did not support a conclusion that father had these characteristics or met the criteria. Particularly perplexing, the evaluator, although agreeing with father that the children should be allowed to continue their current activities, also criticized father for his position, writing that the children needed to learn to deal with frustration.

As often occurs in custody evaluations, both parents denied common weaknesses on the Minnesota Multiphasic Personality Inventory version 2 (MMPI-2) validity scales (Siegel, 1996). The evaluator asserted that this indicated that father’s ability to see the world in a logical and relevant manner was impaired. Although mother received the same results on the MMPI-2 validity scales, the evaluator did not suggest that they indicated a problem with her ability to understand the world. The evaluator also criticized father for starting the new custody battle, when it was actually mother who initiated it. Minimizing one parent’s weaknesses and exaggerating the weaknesses of the other parent destroys an evaluation’s scientific validity.

Case 2

Mother and the two middle-school-aged children reported that father had serious anger issues. Nevertheless, the children wanted weekly contact with their father. Father sought to divide the children’s time equally. Father denied the allegations of anger problems, said he never yelled, and accused mother of parental alienation. Father’s psychological testing was consistent with the allegations presented by the children and their mother. Moreover, mother produced audiotapes of father harshly berating the children and badmouthing mother to the children. The custody evaluator had the recordings but failed to report this crucial evidence. The children’s medical professionals also indicated that father had serious anger issues and was self-centered. The evaluator dismissed their statements as well. The custody evaluator spent only 20 minutes with each child but nevertheless opined that the children were lying and that the problem was parental alienation by the mother. There was no evidence of a campaign of alienation other than father’s report that he heard mother and the children whispering and assumed mother was badmouthing him. Although father had not been an active parent and had limited availability to personally take care of the children, while mother was a stay-at-home parent, the evaluator recommended that the children spend half of their time at their father’s home, and that if the relationship did not improve, they should primarily be in his home. In time, the judge admonished father for his harsh treatment of the children, the children wanted no contact with their father, and multiple psychologists recommended that the children’s contact with their father cease. The evaluator’s attempt to cure parental alienation that did not exist, by increasing the children’s time in their father's home, led the children to become estranged from their father and to object to all visitation.

Understanding why and how bias powerfully affects well-trained evaluators, and exploring ways to ameliorate the problem, became my primary challenge. The answer was found in the literatures of cognitive psychology and behavioral economics.

How bias & intuition subvert rational analysis

Dual-process theories hold that intuitive and analytic processes operate in parallel to give meaning to observations (Kahneman, 2011). System 1 (intuition) is automatic, outside of conscious awareness, associative, and very vulnerable to bias. Intuition is essentially pattern recognition. System 2 (deliberate, rigorous analysis) is consciously and deliberately initiated, provides rule-based inferences, requires considerable cognitive resources, and is less susceptible to bias. People quickly develop an intuitive understanding of situations, and then the rational analytic system reviews, and hopefully corrects, inaccurate intuitive judgments. Rigorous analysis is most successful when dealing with simple systems in which all variables are known and can be accurately measured. This is not the case in child custody evaluations. Family members’ descriptions of their history yield a mass of data of unclear significance, much of which is contradicted by other data. Therefore, evaluators must engage in sensemaking, a “process of constructing a meaningful representation (i.e., making sense) of some complex aspect of the world,” including the connections between people and events (Lebiere et al., 2013, p. 1). In sensemaking, people must decide which observations are important and what they mean prior to analyzing the mass of accumulated data. Determinations of the significance and meaning of individual pieces of data are generally made intuitively rather than by rigorous hypothesis testing.

Therefore, intuition has a crucial impact at two points in the evaluation process. It provides the initial understanding of what is occurring, and it subsequently affects the evaluator’s assessment of the significance and meaning of each piece of available data considered in the attempt at rational analysis. Intuition can seriously subvert attempts at rigorous analysis since, as Hammond (1996) opined, “intuition is a hazard, a process not to be trusted, not only because it is inherently flawed by ‘biases’ but because the person who resorts to it is innocently and sometimes arrogantly overconfident when employing it” (p. 88). In other words, once the observer has an initial intuition-based impression, confirmation bias, the tendency to search for, vet, value, interpret, and remember facts so that they support the existing belief, markedly undermines attempts at rigorous analysis (Martindale, 2005).

As a result, our “analysis of situations and appraisal of the environment…goes on mainly at the nonconscious level” (Mandler, 1975, p. 241) outside of our conscious control and is dominated by intuitive processes that are highly vulnerable to bias. Our analytic processes often produce rationales for preexisting beliefs, rather than scientifically based opinions (Shafir et al., 1993). In other words, the reasons people present for their opinions are often post hoc rationalizations rather than the true reasons (Nisbett & Wilson, 1977a).

Types of bias

Heuristics

When faced with complex situations that strain their analytic abilities, people sometimes use heuristics, mental shortcuts, to estimate probabilities and values. Heuristics work by attribute substitution. When seeking to assess a complex situation or system, people will unconsciously replace the overall situation with a simpler factor that can be measured, and assume that the assessment of the simple factor provides the same result as analysis of the entire situation would have. There are three types of attribute substitution (heuristics): representativeness, availability, and affect (Tversky & Kahneman, 1974).

The representativeness heuristic refers to substituting an outward, readily assessable characteristic of a situation or person for the target attribute one is actually interested in. When people or experiences have outward similarities, observers tend to assume that deeper similarities also exist. Single cues reminiscent of a prior experience or person may be sufficient to create strong beliefs about the current situation or individual. Moreover, as Nisbett and Wilson (1977b) opined, a single readily apparent positive trait can positively influence the assessment of all other traits (the halo effect), while a single significant negative trait can negatively influence the assessment of other traits (the horn effect).

Availability refers to the preference given to powerful, recent, or repeated experiences when searching for an analogy to facilitate understanding of a new situation. For example, an evaluator who recently had a case in which a parent made false allegations of abuse is more likely to opine that a child’s rejection of a parent is due to parental alienation than if their recent case had been one they believed involved actual abuse.

The affect heuristic refers to how our emotional reactions to people (our countertransference reactions) affect our assessments of them. If an evaluator is significantly more comfortable with one parent than the other, (s)he will tend to assess that parent more positively than the other parent. A frequent scenario entails a parent with psychopathic traits (lack of empathy, ruthlessness, manipulativeness, lack of remorse and guilt, superficial charm) who mistreats the other parent and the children, and a victimized parent who is appropriately anxious about the children’s welfare. Because evaluators find it more pleasant to speak with the charming, psychopathic parent than with the anxious one, the affect heuristic leads them to believe the charming parent’s rendition of events.

Other sources of bias

The Fundamental Attribution Error refers to the tendency to underestimate the influence of situational factors on people’s behavior and to assume that what we observe is the result of their personality (Ross, 1977). For example, the normal anxiety parents suffer when facing the prospect of their children being forced to spend much of the week with an estranged spouse who mistreats them can be misinterpreted as evidence of an anxiety disorder or a problematic need to be in control. Similarly, evaluators sometimes reject valid complaints that a parent has anger issues or lacks empathy because the parent succeeds in being on their best behavior during an observation.

Anchoring refers to the power of initial impressions to constrain final ones (Tversky & Kahneman, 1974). For example, if an evaluator believes a parent is patient but then finds evidence of irritability, the final conclusion will likely be that the parent is not as patient as thought, rather than that the parent is irritable. Anchoring occurs because our schemata (mental constructs organizing knowledge) are resistant to change and are more likely to be modified than discarded. Anchoring is remarkably powerful. For example, a father made allegations of mental illness about his child’s mother, leading the court to relegate her to minimal visitation. Even after multiple evaluations found that the allegations were false, the court continued to greatly limit mother’s time with her daughter. Moreover, when multiple professionals expressed serious concern about father’s treatment of the child, concerns that were far more serious and soundly based than father’s false allegations about mother, the court brushed the reports aside and suggested that mother had somehow manipulated the independent professionals. The mechanism by which anchoring works is confirmation bias: discrepant information is rejected, spun, or forgotten.

The term “motivated bias” refers to the tendency to believe that the action or opinion that would benefit you is the correct one. For example, hoping for future referrals, evaluators may be motivated to support the position they believe the guardian or judge holds. Once an evaluator renders an opinion in a case, that evaluator, and other potential evaluators, are motivated to support the original opinion rather than admit that they or a colleague made a mistake.

General principles for improving analytic rigor & reducing bias

Convergent data are not sufficient

Given the massive amount of data available to custody evaluators, and the ease with which data can be cherry-picked and interpreted to support either parent, convergent data are generally insufficient to render an opinion to a reasonable degree of medical or psychological certainty. Further complicating the situation, Bem’s (1972) self-perception theory would predict that the process of searching for convergent data leads evaluators to become more committed to their hypotheses. Overconfidence in one’s opinion can lead to premature closure of the analysis and failure to consider other possibilities, and it is a major factor in medical/psychiatric error (Croskerry & Norman, 2008).

The need for hypothesis testing

Rigorous hypothesis testing, the core of the scientific method, decreases the risk of confirmation bias and improves the accuracy of assessments (Vallee-Tourangeau et al., 2000). Hypothesis testing entails searching for information that supports each hypothesis and for information that could contradict it. Attempting to use the available data to support each hypothesis helps the evaluator appreciate that some of the data can be interpreted in different ways. It is particularly important to search for data to disprove the hypotheses (Popper, 1959) as well as to support them. For example, when assessing an allegation that a parent is domineering, it is crucial to search for examples of flexibility as well as examples of domineering behavior.

Heuer (2007) described a methodology, the Analysis of Competing Hypotheses (ACH), specifically designed to reduce the impact of cognitive biases on analysis. The steps in his methodology are as follows (a schematic sketch follows the list):

  1. Identify possible hypotheses.

  2. Delineate evidence for and against each.

  3. Prepare a table listing the various hypotheses, the data for and against each, and which evidence is most important.

  4. Simplify the table, removing unimportant information.

  5. Assess the relative likelihood of the hypotheses, focusing on disproving them.

  6. Question the truth and importance of key assumptions and evidence.

  7. Present the relative likelihood of each hypothesis.
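
To make the table described in steps 3 through 5 concrete, the following is a minimal sketch of how an ACH-style matrix might be represented and scored. The class name, the rating codes (C/I/N for consistent, inconsistent, neutral), and the example hypotheses and evidence labels are hypothetical illustrations, not part of Heuer's published procedure or any existing software.

```python
from dataclasses import dataclass, field

@dataclass
class ACHMatrix:
    """Hypothetical, simplified ACH-style matrix: hypotheses as columns,
    evidence items as rows, each cell rated C (consistent), I (inconsistent),
    or N (neutral)."""
    hypotheses: list
    evidence: dict = field(default_factory=dict)  # {evidence item: {hypothesis: rating}}

    def add_evidence(self, item, ratings):
        """Record how each item of evidence bears on each hypothesis (steps 2-3)."""
        self.evidence[item] = ratings

    def inconsistency_counts(self):
        """Step 5: rank hypotheses by how much evidence argues against them;
        the hypothesis with the fewest inconsistencies is the most viable."""
        return {h: sum(1 for ratings in self.evidence.values() if ratings.get(h) == "I")
                for h in self.hypotheses}

# Illustrative use with hypothetical hypotheses and evidence resembling Case 2:
matrix = ACHMatrix(hypotheses=["Alienation by mother", "Rejection caused by father's behavior"])
matrix.add_evidence("Recordings of father berating the children",
                    {"Alienation by mother": "I", "Rejection caused by father's behavior": "C"})
matrix.add_evidence("Father heard mother and children whispering",
                    {"Alienation by mother": "C", "Rejection caused by father's behavior": "N"})
print(matrix.inconsistency_counts())
# {'Alienation by mother': 1, "Rejection caused by father's behavior": 0}
```

Even this simple structure forces the analyst to record, for every hypothesis, the evidence that argues against it, which is precisely the step that confirmation bias tends to skip.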

To avoid bias, it is helpful for an evaluator to engage in metacognition and step back from the issue at hand in order to think about the way a conclusion is being drawn and whether it meets the requirements of scientific assessment (Croskerry, 2005). As they interpret the meaning and significance of individual pieces of data, evaluators should pause and assess whether each datum could be interpreted differently and how reliable it is. At the end of the analysis, it is generally valuable to perform a “premortem” (i.e., assume that the assessment is incorrect and attempt to figure out what went wrong in the analytic process) (Klein, 2007). Having a colleague review the report and look for ways to challenge it can also be helpful.

Focus on empirical research

Empirical research has identified factors that have significant positive and negative effects on children’s welfare after divorce. For example, parental narcissism, anxiety, and depression are problematic if the parent cannot shield the child from these issues (Dutton et al., 2011; Gunlicks & Weissman, 2008). Limiting the child’s time with the parent with whom the child has the greatest attachment, and limiting overall parent-child time, are problematic for children, while warmth, emotional support, authoritative parenting, adequate monitoring, support for school work, and age-appropriate expectations are beneficial (Baumrind, 1975; Kelly & Johnston, 2005; Marvin & Schutz, 2009; Moran & Weinstock, 2011). Supporting the child’s strongest attachment(s) is very important to the child’s welfare (Byrne et al., 2005; Kraus & Pope, 2009). As Van der Kolk (2014) noted, attachment is “the secure base from which a child moves out into the world… having a safe haven promotes self-reliance and ... the self-awareness, empathy, impulse control and self motivation that make it possible to become contributing members of the larger social culture” (p. 113).

The 50 reports reviewed generally failed to discuss these issues, or they simply said that the child was attached to both parents, even when the children expressed marked distress about being with one of the parents. Moreover, many reports used aspects of “Parental Alienation Syndrome” (PAS), which has been widely discredited for several years, to opine that a child’s rejection of a parent is the result of parental alienation (Lubit, 2019). Such reports lack scientific validity.

Various writers have noted that some evaluators have limited knowledge of the typical behaviors of abused children and spouses and therefore misinterpret observations and data (Geffner et al., 2009; Pepiton et al., 2014; Saunders & Faller, 2016). For example, not realizing that maltreated children can behave in a warm and friendly way with a harsh parent during an observation session (lest the parent become angry once the observation is over and they are alone together), custody evaluators sometimes misinterpret this behavior as evidence that there was no mistreatment (McDonald, 1998; Pepiton et al., 2014). This is a prominent example of the Fundamental Attribution Error at work. In one case, a child reported significant mistreatment by her father and said that she had nightmares about him. Evidence from psychological testing, and the observations of professionals, was consistent with the child’s report. In my observation of the child and her father, the girl showed no sign of discomfort. When I noted that she seemed comfortable, she reported that if she did not hide her discomfort, he would chastise her when no one was around. How a child and parent behave during observations may not be a valid indication of whether there has been interpersonal violence or abusiveness.

The use of ad hoc hypotheses (hypotheses created to discredit data that contradict the preferred hypothesis), rather than the results of empirical research, should raise serious questions about a report's scientific validity. Claiming that abuse complaints are false, without providing an adequate scientific basis, is an ad hoc hypothesis. Stating that one or two positive aspects of an individual negate the significance of serious negative factors is an ad hoc hypothesis. For example, an evaluator opined that father was a good parent, although he had been an absentee parent, had a bad temper and was being investigated for assaulting his teenage child, because he was “intelligent” and intelligent people can be good parents.

Many reports contained statements that were contrary to empirical research. A number of evaluators claimed that since there was no abuse, the child’s rejection of a parent was the result of parental alienation, although research has shown that various factors can lead a child to refuse visitation, including harsh, self-centered, and restrictive parenting that does not reach the level of abuse (Johnston & Sullivan, 2020; Lubit, 2019). Evaluators at times rejected allegations of abuse because there was no “objective evidence,” although more often than not when abuse occurs there is no “objective evidence.”

Avoiding misuse of empirical research

Evaluators sometimes claim that research is relevant to the case they are dealing with when it is not. Care needs to be taken in interpreting research and applying research findings to specific situations. For example, evaluators and courts frequently assert that it has been shown that children are better off if both parents remain involved and the children spend considerable time with both parents. It is not scientific to assume that this general finding applies to all families. While children benefit from having two good, supportive parents in their lives, they do not benefit from being forced to spend time with a parent who mistreats them (Lamb & Kelly, 2009).

A second type of problem occurs when correlations are assumed to show causation, and the direction of causation is assumed rather than proven. Evaluators sometimes assert that fathers are more likely to stay active if they have overnight visitation, that fathers remaining active is better for children, and that fathers therefore should have early overnights. While it is possible that giving fathers early overnights leads them to stay active, it is also possible that there is no causal connection, but rather that good parents both choose to stay involved and are given overnight visitation. Similarly, Bernet and Baker (2013), in responding to critics who said that there were no data supporting parental alienation theory, wrote that researchers have found a correlation between disparaging statements by a parent and a child’s rejection of the other parent. While it could be that disparaging comments lead children to reject parents, it could also be that when a parent is highly problematic, the children reject that parent and the preferred parent accurately presents negative information about the rejected parent.
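
To illustrate the confounding possibility described above, here is a minimal simulation sketch; all numbers and variable names are hypothetical. A single latent factor drives both early overnights and continued paternal involvement, producing a strong correlation even though neither causes the other.

```python
import random

random.seed(0)

# Hypothetical simulation: a latent "good parent" factor drives BOTH early
# overnights and staying active, with no direct causal link between the two.
n = 10_000
overnights, stays_active = [], []
for _ in range(n):
    good_parent = random.random() < 0.5           # latent confounder
    p = 0.8 if good_parent else 0.2
    overnights.append(random.random() < p)        # early overnights granted
    stays_active.append(random.random() < p)      # father remains involved

# Fathers with early overnights stay active far more often than those without,
# even though overnights had no causal effect in this simulation.
rate_with = sum(a for o, a in zip(overnights, stays_active) if o) / sum(overnights)
rate_without = sum(a for o, a in zip(overnights, stays_active) if not o) / (n - sum(overnights))
print(f"Stays active | overnights: {rate_with:.2f}, no overnights: {rate_without:.2f}")
```

The observed association is real, but acting on it as if it were causal would be unwarranted; distinguishing the two requires design or data the correlation alone cannot provide.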

Gathering and assessing information

The gathering and vetting of information present a prime opportunity for bias to derail attempts at objective analytic assessment. Relying on our memory of what was said, and on our intuition to assess which parent’s version of events is most accurate, leaves the door wide open for bias to enter. Research has shown that clinicians are more likely to remember confirmatory findings than discrepant data, and they sometimes even create distorted memories of data consistent with the final opinion (Arkes & Harkness, 1980). Even with note-taking during interviews, important information is often omitted, especially if it goes against the evaluator’s preferred hypothesis (Lamb et al., 2000). This is a serious issue, given the critical role that searching for contradictory information plays in scientific analysis.

Although the court is the ultimate finder of fact, custody evaluators must often make assumptions about contested facts to reach tentative conclusions. It is helpful for evaluators to annotate these assumptions, explain why they were made, and state how their opinion would change if the court makes different findings of fact. Coming to conclusions about contested facts can be quite difficult. Observation of demeanor is notoriously unreliable (Bond & DePaulo, 2006), and what seems most likely may not be true. It is often helpful to search for data or corroborating information to assess which parent’s version of events is more accurate. For example, Jane’s parents disagreed about how much caretaking each did when mother’s work kept her overseas. Father said he did 99% of the parenting, while mother made the seemingly unlikely assertion that she did 60%. Review of passports and daycare records showed that, 60% of the time, the child was either with mother in Europe or mother was in the US and taking the child to daycare, just as mother asserted, seriously damaging father’s credibility.

It is vital that we hear information from both parents on each issue, and that we give the parents an opportunity to respond to the assertions of the other parent. It is crucial to avoid the temptation to believe the assertions that fit our preferred hypothesis or support the parent we feel more comfortable with. Unless there is a very solid basis for believing one parent over the other on a given issue, we should accept that we do not know the truth and should inform the court of the limitations of our knowledge.

Ignoring base rates is another frequent cause of error in assessments (Tversky & Kahneman, 1982). When there is no highly accurate method of assessing an issue, it is crucial to know the base rate for the issue (i.e., how frequently it occurs). For example, if the base rate for fabricating abuse allegations is 10%, 10 out of a group of 100 cases will be fabrications. If a method of assessing whether allegations of abuse are fabricated has a 20% false positive rate, then out of the 90 children who are telling the truth, the test will claim that 18 are making false allegations. Therefore, of the 28 cases in which the test finds people to be making false allegations (assuming it correctly flags all 10 actual fabrications), 18 are actually telling the truth. In general, when the false positive rate is greater than the base rate, a positive finding by the test is more likely to be false than true. In forensic evaluations, where information is contested and the evaluator’s ability to assess the accuracy of information is limited, an understanding of base rates (Bayesian analysis) is necessary to assess the likelihood that a problem exists (Proeve, 2009; Tversky & Kahneman, 1982).
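
For readers who want to reproduce the arithmetic, the sketch below applies Bayes' rule to the numbers in the example above. The function name is an illustrative choice, and the assumption of perfect sensitivity (the method flags every actual fabrication) mirrors the implicit assumption in the example; it is not a property of any published instrument.

```python
def positive_predictive_value(base_rate, false_positive_rate, sensitivity=1.0):
    """Probability that a positive ('fabrication') finding is correct, via Bayes' rule."""
    true_positives = base_rate * sensitivity                  # actual fabrications flagged
    false_positives = (1 - base_rate) * false_positive_rate   # truthful children flagged
    return true_positives / (true_positives + false_positives)

# Numbers from the example above: a 10% base rate of fabricated allegations and
# a 20% false positive rate, with every actual fabrication flagged.
ppv = positive_predictive_value(base_rate=0.10, false_positive_rate=0.20)
print(f"Share of 'fabrication' findings that are correct: {ppv:.0%}")  # about 36%
# Equivalently, roughly 64% of positive findings (18 of the 28 per 100 cases) are wrong.
```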

In assessing an evaluator’s abilities, it is important to see if the evaluator’s assessments of situations yield results consistent with base rates. If the evaluator finds that most children being interviewed are fabricating abuse, when the base rate is under 10%, the evaluator’s method of assessing for fabrication is flawed and likely biased.

Not wanting to make children take sides, or believing that children are not reliable, many evaluators fail to gather crucial information from them. However, children can often provide invaluable information, including their perspective on which parent is warmer, which parent is more patient, how much time each parent has historically spent with them, what each parent does with them, where they feel safest, to whom they have the greatest attachment, and what each parent says about the other (Pepiton et al., 2014). “Children are not only relevant and competent witnesses to the process of their parent’s divorce, they are also the most reliable witnesses of their own experience” (Butler et al., 2002, p. 99). Research has shown that children’s allegations of mistreatment are far more likely to be true than fabricated (Everson & Boat, 1989; Trocme & Bala, 2005).

Debiasing training

People who believe they are inherently objective are more vulnerable to bias than people who are aware of the risk (Uhlmann & Cohen, 2007). Commons et al. (2004) assert that “forensic psychiatrists underestimate the biasing effects of their own conflicts of interest and of other factors… a state of relative denial exists among respondents as to the power of potentially biasing factors to affect their decision making” (p. 73). Even those who accept that they are vulnerable to bias often have undue confidence in introspection as a debiasing method (Neal & Brodsky, 2016).

Morewedge et al. (2015) have used computer-based interactive training to teach subjects about common sources of cognitive bias in order to improve decision-making. Regardless of the technique used, repeated practice and training are needed to prevent biases from affecting decisions (Banaji & Greenwald, 2013).

Improved oversight

Crucial role of feedback

Obtaining feedback on assessments is central to improving diagnostic accuracy (Schiff, 2008). However, custody evaluators generally receive relatively little feedback, so they do not know when they have made a mistake. Peer review of reports, before submission to the court, would likely improve scientific rigor and reduce bias. While peer review would add expense to the forensic evaluation process, effective peer review could reduce the amount of litigation, saving court time and litigants’ money. Moreover, intensive review of reports would provide needed training and lead evaluators to take greater care in their work. Feedback is crucial if experience is to yield greater expertise.

Vetting of experts and their opinions

Courts will generally accept as an expert any psychologist, social worker, or psychiatrist. However, “psychology training and knowledge currently does not provide the expertise to perform the complex function of evaluating and comparing non-commensurable factors” (Scott & Emery, 2014, p. 71). Many evaluators lack “adequate scientific training and are therefore unaware of the serious limitations of the data they collect, the validity of the testing instruments they use, or the rigor needed to make inferences and draw conclusions from this information” (Kelly & Johnston, 2005, p. 233). Clinical training in psychology and psychiatry often encourages people to speculate about clients rather than limiting assertions to what can be scientifically determined. A significant shift in perspective is needed when clinicians engage in forensic work.

Lawyers frequently cite an evaluator’s extensive experience as evidence of skill. In reality, however, years of clinical experience without intensive oversight do not lead to better judgments (Garb, 1989). In fact, a dangerous feedback loop is often created over time in which the evaluator becomes increasingly confident based on having had many cases, even though all may have been incorrectly assessed. Overconfidence is a major cause of errors (Croskerry & Norman, 2008).

Not all opinions by experts are “expert opinions.” Frequently, experts proffer opinions that are speculations rather than true expert opinions. To forge an expert opinion, the evaluator needs to follow scientific methodology, including generating competing hypotheses, searching for information to support or disprove each, and making use of relevant empirical research and solidly established principles of the field. Searching only for confirmatory data, and relying on ad hoc hypotheses and personal theories, is not scientific.

Roles for professional organizations

Professional organizations could play a valuable role in improving reports by providing education, certification, and review of the work of evaluators. Expansion of current guidelines, including specification of the required content of reports, would be beneficial. For example, perhaps evaluators could be required to opine on each of the factors that empirical research has shown to be significant. Professional organizations could develop position papers concerning the factors that are important and list key articles as authoritative. Reports could be required to present a list of all significant allegations by each parent and the response of the other parent. Perhaps evaluators could be required to ask children and adults about specific important issues, such as how much time each parent has spent with the children in the past, what each parent’s availability will be to personally take care of the children in the future, how patient each parent is, whether either parent has anger issues, and with whom the child feels safest. Reports could be required to contain clearly described testing of competing hypotheses.

Simplifying the question

Weighing multiple non-commensurate factors can be very difficult. Assessing the full range of parenting skills (including weekend and holiday skills) may be irrelevant or misleading in determining the best arrangement for the child. If weekends and holidays will be divided equally and the issue is how to divide up weeknights, it is most efficacious to focus on who is the better parent for the nights in question. If a parent is often away, not home until late, or preoccupied with work on a given weeknight, detailed assessment of parenting skills for that night has limited value. If availability and attachment were generally given high priority, the difficult task of assessing a myriad of non-commensurate factors might become unnecessary.

Sometimes, ascertaining which parent presents the lesser risk is more important than determining who is probably the better parent. If there is a significant possibility of harm in one of the homes (such as when allegations of mistreatment appear plausible), focusing on the greater risk of harm will generally be more important.

Conclusion

The synergy between the heuristics and other biasing mechanisms that affect initial intuitive assessment, and the confirmation bias that impairs attempts at rigorous analysis, makes it difficult to produce scientifically rigorous reports. The common practice of searching for and presenting convergent data as the basis for opinions in custody cases further exacerbates the problem. Testing competing hypotheses, with a focus on attempting to disprove them, is crucial to scientific assessment.

Encouragement to follow current custody guidelines, greater familiarity with the results of relevant empirical research, and training in scientific methodology are all needed. Nevertheless, by themselves they are insufficient to ensure that most custody evaluations meet reasonable scientific standards. Evaluators also need to appreciate the power and danger of implicit bias, have the time and cognitive resources necessary for scientific analysis, and be motivated to invest the required resources (Gawronski & Creighton, 2013). Currently, the required motivation is often lacking. Evaluators generally do not get feedback on the accuracy of their work and therefore do not realize there is a problem. Moreover, experts have relative immunity for their work. This immunity, intended to enable experts to speak freely and thereby provide courts with optimal information, has a serious unintended consequence: it decreases experts’ motivation to intensively develop their skills and to engage in rigorous scientific analysis. Relative immunity also permits experts to present speculations rather than limiting themselves to truly expert opinions, and at times it permits evaluators to misrepresent data as they seek to bolster their ultimate opinion. Families and children need and deserve protection from poor reports more than evaluators need immunity, which often undermines their work.

References