Review Article

A critical discussion of pediatric gender measures to clarify the utility and purpose of “measuring” gender

Abstract

Background

Pediatric gender clinics and researchers commonly use scales to measure different dimensions of gender (e.g. identity, dysphoria, satisfaction). There has been little investigation into the relevance and consumer acceptability of these scales within contemporary understandings and experiences of gender.

Aims

This study aimed to comparatively review and evaluate measures of gender used with children and adolescents, to inform the use of gender measures in pediatric populations.

Methods

A narrative review of the literature was conducted to identify measures that are used to describe dimensions of gender within pediatric populations. The measures were evaluated for their inclusivity, validity, and utility.

Results

Nineteen measures were identified. Most pediatric gender measures are not inclusive of non-binary genders and do not accommodate some understandings and expressions of gender. Many are based on outdated terminology and stereotyped expectations of gender expression, and some are potentially distressing for the young person completing the measure. Some gender measures, used in conjunction with self-identification and as an adjunct to clinical interviews, hold clinical utility for understanding gender. If a measure is deemed clinically helpful, it is vital that the purpose of the measure is explained to the young person and that they are supported through its administration.

Discussion

This review is a guide for choosing gender measures for clinical practice or research purposes. Specialist gender services and researchers should aim to provide an open, accepting, and affirmative approach; any gender measure should be chosen with consideration of its validity, and whether the measure adds value over and above self-identification and talking together about gender. There is a need for the development, and validation in pediatric populations, of measures that ensure the inclusivity of non-binary genders, language tailored to target ages and timepoints in gender transition, and updated, culturally appropriate language and examples.

Introduction

Trans is an umbrella term that describes gender identities which differ from the gender presumed at birth, including people who identify as non-binary, agender, bigender, male, female, genderqueer, genderfluid, or cultural expressions of gender diversity (such as the Sistergirls and Brotherboys of First Nations in Australia or Two-Spirit people of First Nations in North America). Being trans may also be part of someone’s history, rather than a part of their current identity. Some trans people experience gender dysphoria, distress due to the difference between the gender presumed at birth and their experienced gender (American Psychiatric Association, Citation2013). Gender dysphoria may be associated with physical characteristics, misgendering and/or social and cultural expectations of cisnormative gender roles (Nicholas, Citation2019; Schulz, Citation2018).

For trans young people who wish to receive puberty suppression and/or gender-affirming hormones (including estrogen or testosterone), a diagnostic assessment for gender dysphoria is recommended by existing guidelines (Coleman et al., Citation2022; Hembree et al., Citation2017; Telfer et al., Citation2018). Measurements of gender have been developed to quantify and categorize aspects of gender identity, as an adjunct to clinical assessment. However, literature regarding empirically based gender assessments, comparisons, and validations is limited, particularly for trans children and adolescents. Given increased referrals to pediatric trans care, it is timely to critically examine how gender measures may reflect the diversity of trans identities and support optimal care.

The use of gender measures may potentially increase or reduce barriers to accessing gender-affirming care. For example, where a medical diagnosis of Gender Dysphoria or Gender Incongruence is required to access care, the use of an inappropriate measure might prevent a young person from receiving care. Such barriers are important to consider, as lack of access to gender-affirming care can lead to an increase in psychological distress, including suicidality, anxiety, and depression (Cohen-Kettenis et al., Citation2011; de Vries & Cohen-Kettenis, Citation2012). Conversely, access to gender-affirming medical care and gender-affirming hormone treatments has been associated with good or improved psychosocial wellbeing in trans young people who request it (Achille et al., Citation2020; Allen et al., Citation2019; Arnoldussen et al., Citation2022; Chen et al., Citation2023; Doyle et al., Citation2023; Grannis et al., Citation2021; Green et al., Citation2022; Khatchadourian et al., Citation2014; Kuper et al., Citation2020; Lavender et al., Citation2023; López de Lara et al., Citation2020; Mahfouda et al., Citation2018; Mahfouda et al., Citation2017; Turban et al., Citation2020).

Psychometric tests used for diagnosis are validated against a “gold standard” of diagnosis. For Gender Incongruence (World Health Organization, Citation2018) and Gender Dysphoria (American Psychiatric Association, Citation2013), like other conditions defined by an individual’s subjective experience, the “gold standard” is a diagnosis based on structured clinical interviews by an appropriately trained and experienced health professional, or the consensus diagnosis arrived at by a multidisciplinary clinical team. In a gender clinic setting in which a thorough assessment will be carried out, it is worth questioning whether quantitative gender measures add value. There is a long history of use of standardized, validated psychometric measures of many characteristics, for example, depression symptoms, autism spectrum traits, or cognitive abilities. A body of psychology literature provides guidance regarding the ideal properties of psychometric tests, their evaluation, and their use and misuse (see e.g. Irwing et al., Citation2018; Kline, Citation2015). Essential considerations include a precise definition of the constructs to be measured; psychometric properties (including reliability, validity, responsiveness to change, and establishment of population norms); and when a test is being selected for use, clarity regarding the purpose of measurement, appropriateness of the test for this purpose, and applicability of results to different populations. Consumer acceptability of a measure has more recently been recognized as an essential part of its evaluation. The question “has this measure been validated, and is it appropriate for this specific purpose?” gives rise to a multifaceted consideration of all of the above. If gender is to be measured, these standards should be applied to evaluate pediatric gender measures and their use.

Shulman et al. (Citation2017) conducted a comparative review of the use of gender measures with trans adults, and Bloom et al. (Citation2021) published a review of measures of gender identity, expression, and dysphoria in trans children and adolescents. These reviews are valuable; however, they did not critically analyze the development and use of the identified measures. Gender-specific health services should aim to ensure that young people feel safe and supported at all times during assessment and care. This study aims to evaluate the psychometric properties of each pediatric gender measure and to critically appraise its inclusiveness, clinical utility, consumer acceptability, and alignment with contemporary understandings of gender. We aim to assist professionals in clinical and research settings in their selection of gender measures, and guide future development of improved measures, whilst discussing broader questions about the measurement of gender.

Methods

A narrative review of gender measures that are used, or developed for use, in pediatric trans populations was conducted. Measures included were those developed before May 2023 that focused on evaluating dimensions related to gender, including gender identity and expression, gender preoccupation, gender satisfaction, and gender dysphoria. Studies reporting the use of adult measures in pediatric populations were included. Measures were excluded if they did not focus on identity (e.g. measures of internalized transphobia), or if they had only been studied in adult populations. The identified scales and citations referenced in the published literature were used to guide further search using a snowball sampling method. A literature search was also conducted with variations of the following search terms: transgender, transgender youth, gender diverse youth, gender dysphoria, gender dysphoria youth, gender incongruence, gender measures, gender scales, youth gender measures, youth gender scales. Validation studies of each measure were searched for by using search terms containing the title of the measure. All available validation studies have been included.

The following measures were identified: Body Image Scale (Lindgren & Pauly, Citation1975), Body Image Scale – Gender Spectrum (McGuire, Spencer, et al., Citation2016), Gender Feeling Amplitude (Riley, Citation2017), Gender Identity/Gender Dysphoria Questionnaire for Adolescents and Adults (Deogracias et al., Citation2007), Gender Identity Interview for Children (Zucker et al., Citation1993), Gender Identity Questionnaire for Children – Parent Report (Johnson et al., Citation2004), Genderqueer Identity Scale (McGuire et al., Citation2019), Gender Preoccupation and Stability Questionnaire – 2nd Edition (Bowman et al., Citation2022), Perth Gender Picture (Moore et al., Citation2021), Recalled Childhood Gender Identity and Experience – Gender Spectrum (Parent and Child versions; Berg et al., Citation2016a; Berg et al., Citation2016b), Recalled Childhood Gender Identity/Gender Role Questionnaire (Zucker et al., Citation2006), Trans Youth CAN! Gender Distress Scale (Bauer et al., Citation2021a), Trans Youth CAN! Gender Positivity Scale (Bauer et al., Citation2021b), Utrecht Gender Dysphoria Scale (Cohen-Kettenis & Van Goozen, Citation1997), Transgender Congruence Scale (Kozee et al., Citation2012) and Utrecht Gender Dysphoria Scale – Gender Spectrum (McGuire, Catalpa, et al., Citation2016). The measures identified are described and discussed below in alphabetical order, with revised versions of a measure described immediately following the original measure. The relevant parameters of each measure are discussed, such as their intended purpose, development, scoring, psychometric properties, clinical utility, appropriateness, and contemporary relevance, as well as any other salient features. Table 1 summarizes the properties of the identified gender measures. In discussing the measures below, we use the terminology used by the authors at the time of original publication.

Table 1. Properties of pediatric gender measures.

Results

Body Image Scale for Transsexuals

The Body Image Scale for Transsexuals (BIS) was published in 1975. It was developed to evaluate adults seeking gender-affirming surgery and to monitor body image during the treatment process (Lindgren & Pauly, Citation1975), and has been used in adolescent gender clinics (de Vries et al., Citation2014). The BIS quantifies an individual’s satisfaction with specific parts of their body, as body image dissatisfaction is common in trans populations (Jones et al., Citation2016; McGuire, Doty, et al., Citation2016). The BIS lists 30 body parts, and raters respond on a 5-point Likert scale from 1 (very satisfied) to 5 (very dissatisfied). A higher score reflects greater body dissatisfaction. If the individual responds to a given item with a score of 3 or higher, they are asked whether they would consider changing the body part if it were possible through medical or surgical treatment (yes/no). There are separate male and female scales by sex registered at birth. Lindgren and Pauly (Citation1975) specify to use the opposite binary version of the scale once an individual has undergone “complete sex-reassignment surgery;” however, the meaning of “complete” is subjective. In addition, longitudinal and pre-post intervention study designs require the use of consistent scale versions.
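The scoring rules described above can be illustrated with a minimal Python sketch. It assumes that responses are stored as a mapping from body part to rating. The function name `score_bis`, the use of a summed total as the overall score, and the three example body parts are illustrative assumptions, not part of the published scale.

```python
def score_bis(responses):
    """Score BIS-style ratings (1 = very satisfied ... 5 = very dissatisfied).

    Returns the summed score and the items rated 3 or higher, which the
    scale follows up with a yes/no question about wanting to change them.
    Summing the ratings is an assumption made for this sketch.
    """
    for part, rating in responses.items():
        if rating not in (1, 2, 3, 4, 5):
            raise ValueError(f"rating for {part!r} must be an integer from 1 to 5")
    total = sum(responses.values())
    follow_up = [part for part, rating in responses.items() if rating >= 3]
    return total, follow_up


# Hypothetical ratings for three of the 30 listed body parts.
example = {"nose": 2, "hips": 4, "shoulders": 3}
total, follow_up = score_bis(example)
print(total, follow_up)
```

A higher total corresponds to greater dissatisfaction, and the follow-up list identifies the items for which the change question would be asked.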

Lindgren and Pauly (Citation1975) divided the body parts itemized on the scale into three groups—“primary gendered” (e.g. penis, vagina), “secondary gendered” (e.g. hips, figure, muscles) and “hormonally unresponsive (neutral)” (e.g. nose, shoulders). Lindgren and Pauly’s (Citation1975) pilot study of the scale included 16 birth-registered male and 16 birth-registered female “transsexuals” aged between 17 and 46 years old. Seven of these participants repeated the scale after gender-affirming hormone treatment and surgery; all follow-up participants demonstrated improvements in their scores. The small sample size and large loss to follow up were potential sources of bias.

The language and concepts used in the BIS are easy to understand. The BIS has been shown in a small sample to detect subtle improvements in body image that occur following gender-affirming medical care (Lindgren & Pauly, Citation1975). The BIS appears to perform well across countries, and has been used to compare body satisfaction outcomes in different countries (Shirdel-Havar et al., Citation2019). High internal consistency reliability has been reported (on a modified version of the scale) with a Cronbach’s alpha of .90 (van de Grift et al., Citation2017). The BIS may be clinically useful for services providing gender-affirming treatment, to quantify pretreatment dissatisfaction with the body features which may change with treatment, and guide discussion about treatment wishes, provided that an individual has the body parts included on the scale. The scale has not been validated for use in children and young adolescents but is used widely in practice with adolescents.
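Internal consistency figures such as the Cronbach's alpha quoted above are computed from item-level responses. The sketch below applies the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the toy data are hypothetical; the data are not drawn from any of the cited studies.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("every respondent needs the same number (>= 2) of items")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data: four respondents answering three items on a 1-5 scale.
toy = [[1, 2, 1], [2, 2, 3], [4, 3, 4], [5, 5, 4]]
print(round(cronbach_alpha(toy), 3))
```

When every item ranks respondents identically, alpha equals 1, which is why values very close to 1 can signal redundant items rather than a better scale.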

The BIS has several weaknesses. In one study, overall BIS scores did not significantly correlate with the severity of gender dysphoria as measured with the Utrecht Gender Dysphoria Scale, implying that in people seeking gender-affirming treatment, the degree of body dissatisfaction may not correspond to the level of gender dysphoria (van de Grift et al., Citation2017), or perhaps that neither scale is an ideal measure of gender dysphoria. The BIS should strictly be used to evaluate body image, not gender dysphoria. The title of the BIS includes the outdated, potentially offensive term “transsexual”. The classification of “gendered” body parts in the BIS is problematic, as there are many additional body parts that individuals experience as gendered. For example, Lindgren and Pauly classify “hands” and “feet”, as “hormonally unresponsive—neutral”; however, it is common for people who have gender dysphoria to express distress about the size and appearance of their hands and feet. “Face” is classified as hormonally unresponsive, whereas facial appearance obviously changes with estrogen or testosterone treatment. van De Grift, Cohen-Kettenis, Steensma, et al. (Citation2016) offer an alternative classification of items in the BIS based on six body areas: social and hair items, head and neck region, muscularity and posture, hip region, chest region, and genitals. These alternative subscales avoid the assumption of which body parts individuals experience as gendered and may allow for a more clinically relevant interpretation of the scores than the original subscales. The BIS assumes a male-female binary, and has not been validated for use with non-binary people. Lastly, the assessment of body image by individual body parts fails to capture the holistic approach to body image in relationship to gender dysphoria (Becker et al., Citation2016; van de Grift, Cohen-Kettenis, Elaut, et al., Citation2016).

Body Image Scale – Gender Spectrum (BIS-GS)

A revised version of the BIS, the Body Image Scale – Gender Spectrum (BIS-GS), was developed in 2016 by McGuire, Spencer and colleagues. It contains 33 items combining the “male” and “female” body parts from the original BIS. The same five-point response scale ranging from 1 (very satisfied) to 5 (very dissatisfied) is used, plus the option to check a ‘don’t have’ box if a rater does not have the body part (e.g. vagina). Raters then indicate on the five-point scale how they feel about not having that body part and whether they would want to change this. The items are the same as on the original BIS, with minor modifications such as ‘hair (on head)’ to reduce ambiguity, so the BIS-GS retains both the easily understood language and the brevity of the original BIS.

The BIS-GS addresses some limitations of the BIS, as it is inclusive of the entire gender spectrum. The addition of the ‘don’t have’ option provides insight into an individual’s feelings toward body parts they do not have but may wish for. The single scale, listing a greater range of body parts for all, opens a greater potential for trans people to respond based on their own preferred names for body parts and their personal body experience. This is preferable to the cisnormative binary configuration imposed by the BIS. However, confusion may arise if an individual uses a name for a body part in a personal, non-anatomical way. To date there is no available research published using the BIS-GS, thus no evidence demonstrating psychometric properties or consumer acceptability in adolescents or adults. Many gender clinics still use the original BIS, likely because the BIS-GS has not been published in peer-reviewed literature. In addition, the BIS-GS is expected to be replaced by the Body Part Satisfaction Scale—Gender Spectrum (BOPS-GS; Berg et al., Citation2020) in order to address problems with the administration of the BIS-GS, including misunderstanding of the instructions (National Center for Gender Spectrum Health, n.d.). The BOPS-GS is currently undergoing psychometric validation; however it is not yet available for use. To summarize, the revised BIS-GS improves on the language and conceptual limitations of the BIS and is more inclusive of the entire gender spectrum, but has unknown psychometric properties; a further revised and validated version of the scale may be forthcoming.

Gender Feeling Amplitude

The Gender Feeling Amplitude (GFA) aims to enhance clinical assessment through identifying the young person’s feelings and levels of distress regarding their gender identity and diversity (Riley, Citation2017). It is a very quick self-report measure, consisting of 68 words or phrases. Examples include ‘shy’, ‘supported’, ‘hopeful’, ‘self-conscious’, ‘depressed’, and ‘as if I’m not being seen properly’. The young person is prompted with the instruction “Circle 10 or so words that feel true to you”. The GFA is intended to be used to provide a snapshot of the young person’s current feelings about gender, to support clinical exploration. After completion, the clinician prompts further discussion of the circled items. Clinicians can seek the adolescent’s consent to discuss some or all items with parents, to facilitate family understanding and acceptance. The only published study of the GFA is the pilot by Riley (Citation2017). Over half of the 67 young people (aged 10–20) circled the items ‘self-conscious’, ‘awkward’ and ‘don’t fit in’, and the average number of items circled was 11.7.

The GFA was developed in consultation with 110 trans adults, who were asked to describe feelings about, and experiences of, their gender identity and expression during childhood and adolescence. The words that were related to ‘feelings’ were then ranked in order from positive to negative and formed the basis of the GFA. The consultation with trans individuals to create the GFA is a major strength of this measure. The measure is inclusive of all gender identities, as it captures a young person’s current feelings irrespective of their gender identity and uses one version for all. Speed of completion is another strength. The GFA does not generate a meaningful numerical score, so is unsuitable for use in quantitative research. Additionally, its consumer acceptability has not been evaluated. Lastly, young people may not understand some of the words, for example “marginalized” or “perverted”, depending on age, literacy, and English language proficiency. The GFA could be useful in an assessment or therapeutic setting to support conversation about gender.

Gender Identity/Gender Dysphoria Questionnaire for Adolescents and Adults (GIDYQ-AA)

The Gender Identity/Gender Dysphoria Questionnaire for Adolescents and Adults (GIDYQ-AA) assesses dimensions of gender identity and gender dysphoria within the last 12 months (Deogracias et al., Citation2007). The scale aims to measure gender dysphoria, gender uncertainty and “gender identity transitions” between what the authors describe as two defined poles of male and female. The GIDYQ-AA is a 27-item self-report questionnaire which can be administered to adults and/or adolescents in both clinical and non-clinical settings. There are separate male and female (“sex registered at birth”) versions of the scale; the authors state this is because it is “less contrived and easier to understand.” Items are categorized into subjective, social, somatic, and sociological parameters which are scored on a scale from 1 (always) to 5 (never). An example item for participants presumed female at birth is “In the past 12 months, have you felt satisfied being a woman?” When the scale is administered to adolescents (<18) the words woman/man are changed to girl/boy (Deogracias et al., Citation2007). The GIDYQ-AA was originally constructed by administering the survey to 462 subjects: 389 university students who were not known to be trans, and a separate group of 73 adolescents and adults attending a gender clinic who met the DSM-IV criteria for “Gender Identity Disorder” and had not undergone gender-affirming surgery (Deogracias et al., Citation2007).

The GIDYQ-AA shows strong psychometric properties. The internal consistency reliability (α = 0.97) meets the Nunnally and Bernstein (Citation1994) guideline for use in applied research. However, approaching a perfect reliability (i.e. α = 1) indicates a potential redundancy in items. The measure shows convergent validity in its correlation with the UGDS (Schneider et al., Citation2016). In regard to the accuracy of the measure, the GIDYQ-AA stated cutoff score demonstrates a specificity (the ability to identify a “true negative”; correctly exclude non-gender-referred participants) of 99.7% and sensitivity (ability to identify a “true positive”; correctly include gender-referred participants) of 90.4%.
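The definitions of sensitivity and specificity given above translate directly into arithmetic on classification counts. The following Python sketch is illustrative only: the function names are hypothetical, and the counts are hypothetical values chosen so that the group sizes match the 73 clinic-referred and 389 non-referred participants and the results round to the reported percentages. The study's actual classification table is not reproduced here.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of gender-referred participants correctly included."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of non-gender-referred participants correctly excluded."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (assumed for illustration, not taken from the study).
sens = sensitivity(true_pos=66, false_neg=7)
spec = specificity(true_neg=388, false_pos=1)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

The complement of sensitivity (1 minus sensitivity) is the proportion of people who meet criteria but are missed by the cutoff, which is the quantity of most concern when a measure is used as a gateway to care.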

Singh et al. (Citation2010) validated the GIDYQ-AA further by comparing a sample of 44 adolescents (ages 13–18) with a diagnosis of “Gender Identity Disorder” (GID; World Health Organization, Citation1992) with two control groups: one consisting of adolescents referred for various clinical concerns other than gender, and a second comparison group of adolescents who had been referred for “fetishistic cross-dressing”. The GIDYQ-AA differentiated GID-diagnosed individuals from both control groups.

The GIDYQ-AA’s simple language, brevity, and strong discriminant and convergent validity suggest good clinical and research utility at face value. As it pertains to feelings within the last 12 months, it is responsive to change over time. However, the male and female versions of the scale mean it is not inclusive of identities outside of the gender binary. One question in the GIDYQ-AA refers to this spectrum: “In the past 12 months, have you felt uncertain about your gender, that is, feeling somewhere between a man and a woman?” This assumes erroneously that non-binary gender is associated with uncertainty and confusion. Several questions conflate gender identity and gender expression in a limiting way. The inclusion of one question about intersex status (“In the past 12 months, have you thought of yourself as a ‘hermaphrodite’ or ‘intersex’ rather than as a man or woman?”) is potentially confusing, and incorrectly conflates trans identity and intersex variation. One study found that only 52.5% of transgender participants felt that the GIDYQ-AA accurately reflected their experience of gender dysphoria, with non-binary/agender participants reporting significantly lower subjective ratings than binary trans participants (Galupo & Pulice-Farrow, Citation2020). In conclusion, the GIDYQ-AA has some psychometric strengths, but also some problematic language and assumptions that make it unsuitable for continued use.

Gender Identity Interview for Children (GIIC)

The Gender Identity Interview for Children (GIIC) was intended for use with children (3–12 years old) who were clinically referred for “problems in gender identity development” (Zucker et al., Citation1993). The GIIC is a 12-item structured interview: the interviewer codes the child’s responses. The tool was designed as a dimensional measure to “putatively access symptoms of cognitive and affective gender identity confusion” (Zucker et al., Citation1993, p. 444). The conceptualization of gender diversity underlying the measure is outdated, and does not align with the current definition of Gender Dysphoria in DSM-5 (American Psychiatric Association, Citation2013) or Gender Incongruence in ICD-11 (World Health Organization, Citation2018). One example is their definition of “cognitive gender confusion” which occurs when “a child may misclassify his or her sex…or be mistaken about its invariance over time (e.g. believing that one can change sex when older)” (Zucker et al., Citation1993). The definition delegitimizes children’s experiences by framing their self-identified gender as confused, mistaken and abnormal, which is inconsistent with the current best practice of a gender-affirming approach (Riggs, Citation2019).

Each question is scored on a three-point scale ranging from 0 to 2. A zero indicates a child gave a “factually correct” (“sex-typical”) response. For example, to the question “Are you a boy or a girl?”, a response congruent with the birth-registered sex scores 0, and the opposite (described as “putatively deviant and without ambiguity”) scores 2. Some questions in the GIIC suggest further probing questions based on initial responses.

The interview’s language and scoring design reflect cisgenderist perspectives, such as the categorization of trans children as “deviant”. An answer which is factually correct from the child’s point of view is scored as factually incorrect by the interviewer. The repetition in the 12 questions, and the probing approach, may be confusing or intimidating. For example, the first question asks, “Are you a boy or a girl?”, which is directly followed by the question “Are you a (opposite of first response)?” This could be puzzling and destabilizing for the child and undermines self-knowledge. The questions present binary gender as the expected norm, and while there is an opportunity for answers to be expanded, children with non-binary genders may feel excluded and/or feel pressure to provide a binary response.

The GIIC demonstrates discriminant validity, as gender-clinic referred children showed greater affective and cognitive gender “confusion” than both the general clinical and “normal” control groups (Zucker et al., Citation1993). Wallien et al. (Citation2009) further validated the GIIC in Dutch and Canadian clinical samples, with similar results across samples. In the combined samples, internal consistency reliability (Cronbach’s α = 0.85) was suitable for basic (but not applied) research (Nunnally & Bernstein, Citation1994). Inter-rater reliability was very high (r = 0.99; Wallien et al., Citation2009). A cutoff score of three or four gender-atypical responses has been reported to be a helpful prompt for a more thorough diagnostic assessment (Wallien et al., Citation2009; Zucker et al., Citation1993). Applying a cutoff score of 4 gender-atypical responses, there is a specificity of 93.9% but a markedly lower sensitivity of 65.8% (Zucker et al., Citation1993). This means that about one-third (34.2%) of children meeting diagnostic criteria (DSM–III–R/DSM–IV–TR) are missed by this measure.

In summary, the GIIC perpetuates cisgenderist and binary assumptions, may be experienced as distressing for the child, and has low sensitivity when used as suggested by the authors. It seems likely that the information gathered by the GIIC could better be obtained by a clinical interview listening to the concerns of the child and parents or carers.

Gender Identity Questionnaire for Children – Parent Report (GIQC)

The Parent Report Gender Identity Questionnaire for Children (GIQC) examines gender-role behaviors and gender identity, as reported by parents of children aged 3–12 years (Johnson et al., Citation2004). This measure is a revised version of the ‘behavior preference’ questionnaire by Elizabeth and Green (Citation1984) who developed it using data from 702 twins aged 4–12 years. It consists of 16 items, reflecting observable behaviors that correspond to the core features of an ICD-10 “Gender Identity Disorder” (GID) diagnosis. Item responses range on a five-point Likert scale. Three items contain a ‘not applicable’ option. Mean GIQC total scores are calculated by summing the 14 items and then dividing by 14 (Johnson et al., Citation2004). Lower mean scores indicate more “cross-gendered behavior”. There are boy and girl versions, by gender registered at birth. An example item is “He imitates female characters seen on TV or in the movies.”
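The mean-score calculation described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the 14 scored items are coded 1 to 5; the function name `giqc_mean_score` and the example ratings are hypothetical, and the handling of ‘not applicable’ responses is not specified here and is therefore left out.

```python
def giqc_mean_score(item_scores):
    """Mean of the 14 scored items (assumed coding: 1 to 5 per item).

    Lower means indicate more "cross-gendered behavior" as defined by
    the measure's authors.
    """
    if len(item_scores) != 14:
        raise ValueError("expected exactly 14 scored items")
    if any(not 1 <= score <= 5 for score in item_scores):
        raise ValueError("each item must be rated from 1 to 5")
    return sum(item_scores) / 14


# Hypothetical parent ratings for the 14 scored items.
ratings = [2, 3, 1, 2, 4, 2, 3, 2, 1, 3, 2, 2, 3, 2]
print(round(giqc_mean_score(ratings), 2))
```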

The GIQC demonstrates strong psychometric properties. In the original publication, 325 children (aged 3–12 years) made up the gender-referred group who had been assessed at a gender identity clinic in the United States, and 504 children (aged 2.5–12 years) were in the comparison group (Johnson et al., Citation2004). Using a cutoff of a total score less than 3.54, the scale has a specificity of 95% but a lower sensitivity of 86.8% (Johnson et al., Citation2004). The GIQC strongly discriminated gender-referred children from the comparison group, with a large effect size (Cohen’s d = 3.70) (Cohen, Citation1998). The GIQC was also used in a cross-national context in Canada and the Netherlands and showed strong convergent validity with clinician DSM-IV diagnoses of GID (Cohen-Kettenis et al., Citation2006). The GIQC had a strong inter-rater reliability (r = 0.90) for a sample of birth-presumed males aged 4–17.5 years (Johnson et al., Citation2004). Caldarera et al. (Citation2019) have validated an Italian version of the GIQC in a community sample of 1148 children (aged 3–12 years).
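An effect size such as the Cohen's d reported above expresses the difference between two group means in units of standard deviation. The sketch below uses the common pooled-standard-deviation form; whether the original authors used exactly this variant is an assumption, and the function name and toy scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Toy mean scores for a comparison group and a referred group.
comparison = [4.0, 5.0, 6.0]
referred = [1.0, 2.0, 3.0]
print(cohens_d(comparison, referred))
```

By Cohen's conventional benchmarks a d of 0.8 is already considered large, so a value of 3.70 indicates that the score distributions of the two groups barely overlap.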

The brevity and simple language make the GIQC appealing for both clinical and research settings (Johnson et al., Citation2004). Chen et al. (Citation2016) suggested using it for children under age 12 in pediatric gender-affirming clinical care settings, because it has clinician-perceived utility. Parent-report measures that assess children’s gender-related behaviors allow research in community samples of younger children (Yu et al., Citation2010). Johnson et al. (Citation2004) found an absence of age effects.

However, some limitations are apparent. The separate girl and boy versions of the measure are poorly inclusive of non-binary gender identities. The measure is heavily reliant on cultural expectations of gender roles; not all children (trans or cis) match these expectations. This is important, as measures that reflect a child’s own experience and knowledge of their gender are helpful to gain an understanding of their perspective and evaluate the effectiveness of interventions (Olson-Kennedy et al., Citation2016). Therefore, the parent-report GIQC, if used, should be interpreted in conjunction with the child’s self-report.

Genderqueer Identity Scale (GQI)

The Genderqueer Identity Scale (GQI) was developed as a tool to measure non-binary and genderqueer identities and expression across time, including before, during and after medical transition (McGuire et al., Citation2019). The GQI consists of 23 questions across four subscales: (i) challenging the gender binary (e.g. “I enjoy it when people are not sure if I am male or female”); (ii) social construction of gender (e.g. “The way I think about my gender has been influenced by experiences in my life”); (iii) theoretical awareness (e.g. “The way I show my gender is important because I push society to question traditional gender roles”); and (iv) fluidity over time (e.g. “In the future, I think my gender will be fluid or change over time”). These were developed based on a literature review, prior qualitative interviews and clinical experience. Respondents use a 5-point Likert scale from 0 (strongly disagree) to 4 (strongly agree).

McGuire et al. (Citation2019) used three samples for the pilot exploratory factor analysis, one Dutch gender clinic involving 327 people aged 17–68, one Dutch LGB community group involving 290 people aged 18–85, and one LGBTQ community sample from two universities in the USA involving 150 people aged 18–61. A fourth sample of 510 trans and genderqueer individuals (aged 18–74) was then recruited after final revisions to retest the factor structure and confirm reliability. The measure demonstrated construct validity, acceptable scale reliability and internal consistency for identifying genderqueer people among clinical and community samples of LGBTQ individuals (McGuire et al., Citation2019). Further, a study by Catalpa et al. (Citation2020) found the GQI can distinguish between genderqueer, binary trans and cisgender sexual minority individuals. They found that genderqueer individuals had significantly higher scores on challenging the binary and gender fluidity subscales.

A strength of the GQI is its inclusivity of gender identities beyond male and female. This enables the representation of diverse genders that are often excluded from research and health care (Schulz, Citation2018; Vincent, Citation2019). It may help to measure the experiences of non-binary and genderqueer individuals who often experience barriers to health care. The GQI can be repeated at multiple timepoints. However, the samples tested included only those 17 years and older, and suitability for use in children and adolescents is unclear. The language used in the GQI appears easy to understand for this age range, but may need to be simplified if it is utilized in a younger age group.

Gender Preoccupation and Stability Questionnaire – 2nd edition (GPSQ-2)

The Gender Preoccupation and Stability Questionnaire—2nd Edition (GPSQ-2; Bowman et al., Citation2022) aims to measure short-term fluctuations in gender dysphoria for the purpose of assessing the effectiveness of gender-affirming care. The GPSQ-2 focuses on gender preoccupation and gender instability. The questionnaire consists of 14 items, rated on a 5-point response scale from 0 (never) to 4 (all the time). These items are then summed to yield a total score between 0 and 56, with higher scores indicating more intense experiences of gender dysphoria. Example items include “Over the past two weeks how often have you thought about your gender?” and “Over the past two weeks how often has your sense of what gender you identify with changed at all?”
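The scoring rule described above (14 items, each rated 0 to 4, summed to a total of 0 to 56) can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and it assumes no reverse-scored items and no rule for missing responses, since the description above mentions only summation.

```python
def gpsq2_total(responses):
    """Sum 14 item ratings (each an integer from 0 to 4) into a 0-56 total.

    Assumptions for illustration: every item is answered and no item is
    reverse-scored; incomplete or out-of-range input raises an error.
    """
    if len(responses) != 14:
        raise ValueError("expected exactly 14 item responses")
    if any(rating not in range(5) for rating in responses):
        raise ValueError("each rating must be an integer from 0 to 4")
    return sum(responses)


# A respondent answering 2 on every item would receive a total of 28.
example_total = gpsq2_total([2] * 14)
```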

The GPSQ-2 is a revised version of the Gender Preoccupation and Stability Questionnaire (GPSQ; Hakeem et al., Citation2016). Where the GPSQ was designed for use with adult populations, the GPSQ-2 is for use with persons over age 13. In addition, the GPSQ-2 addresses issues faced by the GPSQ regarding validity and comprehensibility. A pilot study of the revised measure involved structured interviews with trans adolescents and adults to assess the relevance, comprehensibility, and comprehensiveness of the items (Bowman et al., Citation2022). The GPSQ-2 was then validated in a community sample of 141 people aged 14–73 (Bowman et al., Citation2022). The measure demonstrated construct validity and internal consistency reliability. A follow-up survey was administered after two weeks, demonstrating good test-retest reliability (ICC = 0.88).

Strengths of the GPSQ-2 include brevity, clarity, consumer acceptability, inclusivity across the gender spectrum, and the ability to capture gender fluidity. In addition, the two-week reference period accommodates the fluctuating nature of dysphoric feelings. However, further research is required to assess the ability of the GPSQ-2 to detect clinically important changes in gender dysphoria among people receiving gender-affirming care. No further validation studies have been published, so it is unclear how the GPSQ-2 performs in clinical samples, or exclusively adolescent samples.

Perth Gender Picture (PGP)

The Perth Gender Picture (PGP; Moore et al., Citation2021) is a pictorial and narrative tool used during face-to-face clinical consultation with young people aged 11–18 to reflect on and communicate gender identity. The PGP was developed with the aims of reducing the requirement for English literacy, allowing for color and creativity in its use, facilitating expression of the whole gender spectrum, enabling the use of the young person’s own words to describe their gender at present and its development over time, and being clinically useful across a range of ages, developmental stages and cognitive capacities. The measure reflects a non-binary model of gender identity and expression.

The young person is invited to use colored markers to show on a diamond-shaped diagram their current gender identity, how it was in the past, and how they hope or wish it will be in five or ten years in the future. They can show nuanced degrees of “neither male nor female”, “something else/something different”, “male”, “female”, “masculine”, “feminine”, and “outside the box”. The young person can indicate more than one place at the same time or show change over time. It is easy to show “male”, “female”, “masculine”, “feminine”, “neither” and/or “something else” at the same time. A standardized script is used by the clinician alongside the diagram, and the young person is invited to describe each marked place in their own words. The interpretation of the picture is shared between the young person and the clinician, with the goal of improving communication and deepening understanding. This customizability allows the clinician to use language that is appropriate for the individual client.

Strengths include documented high client acceptability (Moore et al., Citation2021), inclusiveness of non-binary and changing gender identities, reduced imposition of cultural stereotypes of gender expression, reduced literacy requirement, potential to be repeated at multiple time points, and eliciting of detailed self-report including future wishes. The lack of numerical scoring could be seen as a strength, as it reflects the endless possibilities and subjectivity of gender experiences. It is also a limitation, as it cannot generate quantitative data for research purposes. Because of this, the measure has not been psychometrically validated.

Recalled Childhood Gender Identity scale (RCGI)

The Recalled Childhood Gender Identity Scale (RCGI) (Zucker et al., Citation2006) assesses recalled gender-typed behavior, its “normality” relative to social/cultural norms, and relative closeness to the mother and father during childhood. The RCGI is a 23-item self-report quantitative questionnaire which requires adults and adolescents to reflect upon the time frame of 0–12 years old. It consists of male and female versions, based on birth-presumed gender. The respondent selects the answer which describes them best. Twenty-two items are rated on a 5-point scale, and one is rated on a 4-point scale. Some items also have an additional option for a participant to indicate that the behavior did not apply. An example item is “As a child, I felt…. (1) very masculine; (2) somewhat masculine; (3) masculine and feminine equally; (4) somewhat feminine; (5) very feminine; (f) I did not feel masculine or feminine.”

By referencing the existing literature, Zucker et al. (Citation2006) assessed behaviors with established sex-differences as a method of operationalizing social and cultural norms. Deviations from these norms are conceptualized as indicators of “Gender Identity Disorder” (American Psychiatric Association, Citation2013). The authors assert that the degree of identification with the parent of either gender may be indicative of the child’s gender role. This is problematic, as neither “gendered” behavior nor relationship to parents equates to gender identity.

Zucker et al. (Citation2006) administered the scale to a variety of subsamples, over a period of 23 years. The samples include women and men; adolescents attending a gender identity clinic; heterosexual versus homosexual adults; women with congenital adrenal hyperplasia compared to their female relatives; parents of gender-referred children and intersex children; and adolescents attending a gender identity clinic compared to adolescent boys referred for “transvestic fetishism”. In the last-mentioned sample of 87 adolescents (29 gender-referred birth-presumed males; 25 gender-referred birth-presumed females; 33 birth-presumed males with “transvestic fetishism” [TF]), both gender-referred groups had significantly higher “cross-gender scores” when compared with the TF group (p<.05). The RCGI demonstrated discriminative validity between the gender-referred birth-presumed female and birth-presumed male groups and the TF group (Cohen’s d = 1.65 and 2.67 respectively).
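Effect sizes such as the Cohen’s d values above express the difference between two group means in standard deviation units. The sketch below shows one common form of the calculation, using the pooled standard deviation. This variant is an assumption, because the source does not state which formula was used, and the function name and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation.

    Assumption for illustration: the pooled-SD variant is used; other
    variants of d exist and give slightly different values.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Invented scores for illustration only (not data from any study).
d = cohens_d([2, 4, 6], [1, 3, 5])
```

By the conventional benchmarks, values around 0.8 are already regarded as large, so the values reported above indicate very wide separation between the groups.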

The RCGI’s advantages include brevity, ease of use, avoidance of easily outdated specific examples of toys or activities, and documented strong discriminant validity. Since the scale asks for recollection of past behaviors during childhood, it is not responsive to change over time, and may be subject to recall bias. The use of the words “sissy” and “tomboy” in items is problematic, as these may be experienced as derogatory. Also, as these are presented as terms directed at the individual by others, which are not necessarily a reflection of the person’s internal sense of gender, their clinical relevance is doubtful. The RCGI makes many assumptions regarding normative gendered behavior, for example, ascribing significance to haircuts and clothing. It assumes that gender is a male/female binary. It also assumes that families consist of a ‘mother’ and ‘father’, unlike many families. The items referencing the parent-child relationship are of uncertain significance, lack evidence for the reasoning in including these items, and may be experienced as puzzling or intrusive by young people completing the RCGI. These problems make it unsuitable for continued use.

Recalled Childhood Gender Identity and Experience – Gender Spectrum (RCGIE-GS)

The revised RCGI, modified by Berg et al. (Citation2016a), aims to be more inclusive of identities across the gender spectrum. There is a single scale for all, unlike the original RCGI which has male and female versions. There is a self-report adolescent version and a parent-report version for adolescents. Importantly, the RCGIE-GS distinguishes between gender expression and gender identity. It also acknowledges cultural stereotypes, e.g. “As a child, I liked to play in a way that was stereotypically considered for girls.”

The RCGIE-GS contains 32 items: 13 items are scored from 0 to 5, and 19 items are scored from 0 to 4. Six of the items are presented for the rater to respond on a number line rating scale. For example, for item 3 “as a child, my favorite toys and games were…” respondents rate on a scale from 0 (not at all masculine) to 4 (masculine). The last two questions relate to whether, as a child, the individual disliked their anatomy. The RCGIE-GS also diverges from the original RCGI as it no longer measures the respondent’s relative closeness to their mother and father and instead focuses on their relationship to a brother and/or sister. The rationale for this change was not described. All items are the same across the parent and youth self-report versions; however, in the parent version, questions begin with “Growing up, my child’s …” and refer to age 0–12 (Berg et al., Citation2016b).

The potential strength of the RCGIE-GS is that it is a single scale for all genders and uses language which is more inclusive of non-binary identities, at the cost of some increase in length but with maintained ease of use. However, it retains some derogatory terms (“sissy” and “tomboy”). Items reflecting on childhood gender feelings ask about both internal thoughts on gender and whether as a child this was communicated to others. Since many trans children grow up in family environments that are unsupportive of diverse gender identities, a low score on this item may reflect social context rather than degree of gender dysphoria. It is difficult to know how the change in the RCGIE-GS, to measuring an individual’s closeness to siblings, rather than closeness to parents as in the RCGI, affects the validity of this measure. Indeed, it is unclear if closeness to either parents or siblings has relevance to gender identity. No studies have yet been published regarding the RCGIE-GS’s consumer acceptability, validity, reliability, or clinical utility.

Trans Youth CAN! Gender Distress Scale (TYC-GDS)

The Trans Youth CAN! Gender Distress Scale (TYC-GDS) assesses feelings of distress associated with gender dysphoria (Bauer et al., Citation2021a). The scale comprises 14 items across two subscales—social distress (e.g. “I worry that people will always treat me as the wrong gender”) and body-related distress (e.g. “I wish I had been born in a different body”). There is a male and female version of the scale, administered based on birth-presumed gender. Some of the items in the TYC-GDS were adapted from the Utrecht Gender Dysphoria Scale and revised to be inclusive of non-binary gender identities. Other items were developed based on input from trans youth as well as expert researchers and clinicians.

The TYC-GDS was validated in a sample of 161 youth aged 10–15 recruited from gender clinics (Saewyc et al., Citation2022). Confirmatory factor analysis supported a 2-factor model. The total score as well as both subscale scores were significantly correlated with a number of existing measures, demonstrating convergent and divergent validity.

A strength of the TYC-GDS is the involvement of trans youth in its development. The measure examines both social and body-specific aspects of gender dysphoria. It is brief, and uses simple language suitable for younger adolescents. While the measure is intended to be inclusive of all gender identities, the use of male and female versions of the scale promotes a binary conceptualization of gender (e.g. “I dislike having breasts because they make me feel like I’m not my true gender”). Further research is required to assess the ability of the measure to detect changes in young people receiving gender-affirming care, as well as validate the measure in other samples.

Trans Youth CAN! Gender Positivity Scale (TYC-GPS)

Created by the same team as the TYC-GDS, the Trans Youth CAN! Gender Positivity Scale (TYC-GPS) is intended to measure positive experiences of gender (Bauer et al., Citation2021b). Consisting of 11 items, the TYC-GPS comprises two subscales—positive social experiences (e.g. “I feel happy that society sees me on the outside for who I am on the inside”), and body-specific positivity (e.g. “I feel like my body fits with the real me”). Unlike the TYC-GDS, the TYC-GPS has only one version of the scale which can be administered to anyone regardless of their birth-presumed gender or current gender identity.

The methods of development and validation of the TYC-GPS are the same as described above for the TYC-GDS. The results of the confirmatory factor analysis again confirmed a 2-factor solution, and the total scale score and subscale scores each demonstrated convergent and divergent validity.

Like the TYC-GDS, the strengths of this measure include its brevity, clarity, simple language, the social- and body-specific subscales, and the involvement of trans youth in its development. In addition, the use of a single scale for all genders is more inclusive of non-binary identities than the male and female versions of its counterpart.

Transgender Congruence Scale (TCS)

The Transgender Congruence Scale (TCS) measures the extent to which a trans individual feels comfortable and authentic in relation to their gender identity (Kozee et al., Citation2012). The measure consists of 12 items across two subscales: appearance congruence (e.g. “My outward appearance reflects my gender identity”) and gender identity acceptance (e.g. “I am happy that I have the gender identity I do”). The TCS was initially designed and validated for use among trans adults but has been used frequently among young people, including in the largest yet multicenter study of outcomes of gender-affirming hormone treatment in adolescents (Chen et al., Citation2023).

The TCS was validated across two studies in adults, where the scale demonstrated internal consistency; convergent validity with measures of meaning in life and life satisfaction; and discriminant validity with measures of anxiety, depression, and body dissatisfaction (Kozee et al., Citation2012). To the best of our knowledge the TCS has not been validated among young people; however, Thoma et al. (Citation2023) found that it demonstrated strong reliability among a sample of 1943 trans adolescents aged 14–18.

It may be a limitation that the TCS focuses so heavily on appearance congruence, as this could exclude participants who do not wish to adhere to gendered social norms and standards of appearance or dress. However, the scale has been assessed by an expert panel of four trans individuals who deemed that the items accurately reflect the intended construct. Despite not being normed for use with adolescents, the wording of items is clear and likely to be easily understood by young people. In addition, the TCS is a brief scale at only 12 items which allows for ease of administration. The TCS should be further validated to confirm its suitability for use with younger populations.

Utrecht Gender Dysphoria Scale (UGDS)

The Utrecht Gender Dysphoria Scale (UGDS) is a 12-item self-report measure assessing the severity of current gender dysphoria (Cohen-Kettenis and van Goozen, Citation1997). The scale is intended to capture social, subjective, and somatic indicators of gender dysphoria. Each question is scored on a 5-point scale, with higher scores indicating more severe gender dysphoria. An example item is “I am dissatisfied with my beard growth because it makes me look like a boy.” It has a male version (UGDS-M) and a female version (UGDS-F) administered based on birth-registered sex, with a language change to “man” and “woman” for adults.

Steensma et al. (Citation2013) validated the UGDS among 1119 participants (across multiple identity-based subsamples) aged 12–75. Results indicated internal consistency reliability (Cronbach’s α = 0.98) acceptable for applied contexts (Nunnally & Bernstein, Citation1994) for both UGDS-F and UGDS-M and a strong ability to discriminate between clinically and non-clinically referred individuals with gender dysphoria. Using a cutoff score of 40, they reported a sensitivity of 88.3% for clinically referred participants and a specificity of 99.5% for the UGDS-M, as well as a sensitivity of 98.5% and specificity of 97.9% for UGDS-F. The UGDS thus demonstrates strong accuracy without substantial tradeoffs between sensitivity and specificity. However, Schneider et al. (Citation2016) found evidence of ceiling effects for birth-registered females, suggesting an inability to differentiate between moderate and severe gender dysphoria.

de Vries et al. (Citation2014) administered the UGDS to trans adolescents prior to gender-affirming medical intervention using birth-presumed gender, and then again following intervention based on affirmed gender, with the goal of measuring change in gender dysphoria after treatment. However, it is not possible to differentiate which changes are due to gender affirmation, and which are due to the change between male and female version, unless both scales are used at both time points.

The UGDS-M and UGDS-F versions differ substantially. They both use simple language; however, eleven of 12 questions in the UGDS-M are framed using negative language, while the UGDS-F only has four questions framed in a negative way. The UGDS-M contains three items which may capture suicidality (“life would be meaningless if I had to live as a boy”, “only as a girl my life would be worth living”, and “better not to live than to live as a boy”) whereas the UGDS-F contains no such items. The UGDS-F asks about “behaving sexually as a girl” (conflating stereotypical sexual scripts with gender), whereas the UGDS-M does not. It is unclear why the male and female scales have this asymmetry. The UGDS is not inclusive of non-binary gender identities. Galupo and Pulice-Farrow (Citation2020) found that only 54.6% of trans participants felt that the UGDS accurately reflected their experience of gender dysphoria, with non-binary/agender participants reporting the lowest subjective ratings.

Utrecht Gender Dysphoria Scale – Gender Spectrum (UGDS-GS)

This revision of the UGDS is intended for use in adolescents and adults to measure the severity of gender dysphoria in a way that is more inclusive of the full spectrum of gender identities (McGuire, Catalpa et al., Citation2016). Its construct validity was recently evaluated in an adult sample of cisgender, binary trans, and non-binary/genderqueer individuals (McGuire et al., Citation2020). A single scale is used for all participants, using the language “assigned sex” and “affirmed gender.” An example item is “it would be better not to live, than to live as my assigned sex.” The response scale ranges from one (disagree completely) to five (agree completely) with higher scores indicative of greater gender dysphoria. The scale was developed by professionals working in a gender-affirming clinic setting, with formal feedback from gender-diverse adolescents and their parents, clinicians and researchers, and cis and trans peers at gender health scientific meetings to guide revisions. The initial and final iterations of the measure were then pilot tested in two online convenience samples, each including binary transgender, non-binary or genderqueer, and cisgender LGBQ participants aged over 18, who gave feedback on language, inclusivity and ease of understanding. The pilot data were used for exploratory factor analysis and principal component analysis. Three larger online samples of nonbinary and genderqueer (n = 587), binary transgender (n = 297), and LGBQ cisgender (n = 121) adults were then recruited to complete confirmatory factor analysis, testing measurement invariance across the groups. A two-factor explanatory model (“gender dysphoria” and “gender affirmation”) was found to be the best fit, and the measure was found to operate similarly among binary transgender and non-binary/genderqueer individuals (McGuire et al., Citation2020).

The scale appears promising, as the single scale can be used for all gender identities, improving the inclusion of all genders, allowing more accurate comparison between individuals in research, and greater ease of administration. The process of its development, with the meaningful participation of trans individuals to evaluate and contribute to iterations of the measure, is commendable. It appears compatible with longitudinal use, avoiding confusion as to which version to use after social and/or medical gender affirmation. Children and adolescents may have difficulty understanding some language; good literacy and language proficiency are required. A helpful addition would be if the UGDS-GS asked responders to state in their own words their assigned sex/presumed gender and experienced/affirmed gender. This would provide essential context regarding their ratings, and greater clarity for professionals using the scale. The scale has not yet been validated in child and adolescent populations.

Discussion

This review has encompassed a broad range of gender measures and is more comprehensive than those previously published, identifying more measures in use. This review raises many questions and implications for current clinical and research practice. Why do gender clinics and researchers continue to use gender measures? Is it truly meaningful or useful to attempt to quantify, categorize, or systematically clarify gender identity or gender dysphoria, using standardized measures, at any age? Do gender measures serve to perpetuate the pathologization of diverse gender identity? Which measures are most useful and least harmful? Are any current measures ideal for use? What would be the ideal properties of a pediatric gender measure, and how would it best be developed? These are questions that we suggest clinicians and researchers reflect on when choosing whether to use a gender measure, which measure to use, and in the development of gender measures.

Considering psychometric properties

Most of the reviewed measures are quantitative scales, and some have strong psychometric properties (see ). The GIDYQ-AA and UGDS demonstrate good accuracy without extreme sacrifice of sensitivity (true positive rates) to achieve high specificity (true negative rates). Several measures demonstrate strong discriminant validity. There is a large variability in estimates of internal consistency of the same measures over different studies. Further research is needed to establish, for most measures, convergent validity with a Gender Dysphoria or Gender Incongruence diagnosis where this is the goal. The majority of psychometric investigations of the measures considered in this review were conducted using adult samples. The psychometric properties of tests should not be assumed consistent across samples that differ from the validation sample. Therefore, further research is needed to establish psychometric properties in children and adolescents in many of the measures included. Most importantly, for many measures, there are concerns regarding construct validity: the beliefs and assumptions about gender diversity reflected in many measures limit their contemporary applicability. Many of the gender measures reviewed above use problematic and offensive language, convey cisnormativity, reinforce gender stereotypes, are not inclusive of non-binary gender identities, and may not be applicable to young people across the different stages of their gender journeys.
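Internal consistency, discussed above, is most often summarized with Cronbach’s α, which compares the sum of the individual item variances with the variance of respondents’ total scores. The sketch below illustrates the standard formula. The function name and the example data are hypothetical, and the sketch assumes complete responses with all items scored in the same direction.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Assumptions for illustration: no missing responses, at least two items,
    and all items keyed in the same direction.
    """
    n_items = len(rows[0])
    item_columns = list(zip(*rows))  # one tuple of scores per item
    sum_item_variances = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in rows])
    return (n_items / (n_items - 1)) * (1 - sum_item_variances / total_variance)


# Invented data for illustration: two items that rise and fall together.
alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
```

Because α depends on the variances observed in a particular sample, the same measure can yield different estimates across studies, which is consistent with the variability noted above.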

Considering inclusivity

Trans people are often aware that dominant narratives of binary gender identity may more readily result in access to gender-affirming medical care, which may have implications for how young people respond to measures (Vincent, Citation2019). Many measures are founded on outdated cisnormative and heteronormative conceptualizations of male and female gender: for example, many do not accommodate the possibility of trans male/masculine people having feminine gender expression. An increasing number of young people who identify as non-binary are presenting to pediatric gender clinics (Telfer et al., Citation2015), so there is a clear requirement for inclusive gender measures. Thus, all measures separated into male/female versions should now be used with caution. Some newer measures aim to be inclusive of diverse gender identities. For example, the UGDS-GS, RCGIE-GS and BIS-GS (and the updated versions of these) combined previously binary scales into single scales that can be used for all trans individuals. However, none of these measures have been validated in adolescent trans populations, and only the UGDS-GS has been validated in non-binary people of any age. Language, gender expression and community acceptability have evolved significantly over the past two decades through cultural and social change. Measures need to be updated to remain relevant to trans young people and their experiences and understandings of gender. Moreover, without these updates, conducted in collaboration with and ideally led by trans people, it is unethical to make assumptions about non-binary experiences of gender, and the language to describe such experiences and understandings. Thorough and appropriate research will allow for a greater understanding of gender modalities (Ashley et al., Citation2024), and help develop the most accurate and inclusive language to describe and measure dimensions of all gender identities. It is likely that the more a measure reflects the diverse reality of gender experience, the more likely the young person is to feel understood and express their thoughts and feelings openly, creating a good foundation for a clinical relationship as well as more accurate data.

Considering impact

Clinicians and researchers should consider the impact of administering gender measures. The language and content of some questions, and the conceptualizations of gender communicated by some measures, could be distressing or even harmful (e.g. “sissy” in the RCGIE-GS). Derogatory language in a measure provided to a client by a clinical or research team serves to reinforce stigma and cisgenderist attitudes. New measures should use respectful language which has been evaluated for consumer acceptability by trans young people. The sensitive and personal nature of questions needs careful consideration. The UGDS scale for those registered male at birth asks the young person to respond to “I dislike urinating in a standing position” and “I dislike having erections”. The UGDS scale for those registered female at birth includes: “I like to behave sexually as a girl/woman”. The BIS asks whether the individual is satisfied or dissatisfied with their penis, vagina and clitoris. For an adolescent, being confronted with such deeply personal questions can be intrusive, offensive, or confusing, especially if they are unsure of the purpose of such questions. Clinicians should consider the goals of assessment, and endeavor to achieve these without causing unnecessary distress. It may be useful to administer the measure after the clinician explains the purpose of the questions. The young person should be granted privacy for completing the measure (i.e. from any accompanying adults or siblings).

In addition, the clinician should consider the process of administering a gender measure, and how this may affect the clinician-client relationship. Negative experiences within healthcare are common for trans young people, and can be detrimental to their mental health (Strauss, Winter, et al., Citation2022). Clinicians should make every attempt to reduce any client discomfort within gender-specific health services so that young people feel safe and supported. This includes the use of validated gender measures only as appropriate. Otherwise, this may exacerbate existing distrust that is often present between trans communities and clinical settings. Critically assessing how overt and subtle cisgenderist, binary, and heterosexist ideas may pervade gender measures is vital, as these embedded assumptions can pathologize and alienate trans young people, hindering rapport and communication.

Considering language

Language is crucial when selecting questionnaires for use with children and adolescents. The language of the GIIC, intended for young children, has been shown to be difficult for preschool-aged children to understand (Dèttore et al., Citation2010). The language used in the GFA and UGDS-GS could also be inappropriate for younger adolescents, due to language and reading disorders that are common in the general child and adolescent population (McLeod & McKinnon, Citation2007). Hence, accommodation for these must be considered in all language-based measures. In addition, as the co-occurrence of gender diversity and autism is becoming well-established (Kallitsounaki & Williams, Citation2022; Mahfouda et al., Citation2019; Strang et al., Citation2018; Strauss et al., Citation2021), measures should ensure that they accommodate the communication needs of young autistic people. The Perth Gender Picture was developed with the goal of reducing dependence on written language, and consumer feedback from young people suggested that some young people prefer it for this reason (Moore et al., Citation2021).

Culturally specific language is a concern in many of the measures discussed. There has been little investigation into the psychometric properties of measures across cultures and languages. Notable exceptions are Shirdel-Havar et al. (2019), who compared Dutch and Iranian samples, and Wang et al. (2021), who validated a Chinese-language version of the GIDYQ-AA. In addition, the Transgender Congruence Scale has been translated into French and Brazilian Portuguese (Irineu et al., 2023; Martin-Storey et al., 2021). Because expectations of gender roles vary between cultures, we recommend that future research examine the applicability of gender measures across different cultural and linguistic contexts, and that clinicians consider the cultural relevance of measures for individual clients.

Many measures frame questions about gender experience in negative language. For example, of the 12 items in the UGDS scale for those presumed male at birth, 11 are framed negatively (Schneider et al., 2016). The dominant narrative of trans experience is one of distress and suffering (Schulz, 2018). Measures that frame trans individuals’ experiences as entirely negative reinforce this distress narrative and fail to capture the enjoyment or satisfaction of expressing and being affirmed in one’s gender. While it is important to measure distress, it is also important to acknowledge how measures may reinforce the narrative that there is something “wrong” or “abnormal” about being trans. Measures should capture the spectrum of feelings about gender that an individual may experience, from very dysphoric to euphoric, and everything in between. The TYC-GPS is an important development in capturing gender positivity. For evaluating response to gender-affirming medical and surgical treatments, measures that assess satisfaction with treatment, gender euphoria, and self-rated quality of life, and that are able to reflect positive change, are likely to be more clinically relevant than dysphoria scales, which were typically developed for diagnostic purposes and are framed negatively.

Finally, it is unclear, for many of the published measures, whether trans people contributed to their development. The UGDS-GS, TYC-GDS and TYC-GPS are notable exceptions. Ideally, measures should be informed through community consultation, inclusive of people who identify as non-binary, trans people with disability, and people from culturally and linguistically diverse backgrounds, including Indigenous populations. Measures should be developed by teams that include trans contributors and are, where possible, led by trans people. This is especially pertinent for measures for use in children and adolescents, where understandings of dimensions of gender are evolving, and language is simultaneously adapting to these newer understandings. It is vital to develop gender assessment measures in consultation with trans individuals to ensure that they meet the needs of trans people and accurately reflect trans experiences (Vincent, 2018).

When use of a gender measure is appropriate

While we have discussed many limitations of existing gender measures, there are contexts in which using a gender measure can be helpful. The BIS-GS (Lindgren & Pauly, 1975; McGuire, Spencer, et al., 2016), in some circumstances, may have value in that it discretely measures satisfaction with specific body parts. Using such a measure in an appropriate context may allow the young person to indicate concerns which they are unable or unwilling to express verbally and may facilitate informed consent discussions about the potential impacts of desired gender-affirming treatments. In addition, the administration of gender measures using current, inclusive and appropriate language (through measures co-designed with trans community members) may facilitate conversations of gender exploration, helping the young person to put words to their feelings. Measures can be followed by conversations which provide an opportunity to expand on the complexities of gender experiences with the clinician. The GFA (Riley, 2017) and PGP (Moore et al., 2021) aim to facilitate these conversations. A quantitative measure of gender dysphoria could be used to investigate what helps to alleviate dysphoria. Gender euphoria, gender positivity and gender congruence scales could be used to evaluate treatment outcomes.

Conclusion

Gender diversity is increasingly recognized as part of a spectrum of human variation, and many people do not believe it is appropriate to classify gender difference. Throughout this paper, evaluating the purpose and utility of each measure has remained central. There is a need for broader clinical and cultural investigation: are gender measures necessary in the gender-affirming care of trans young people? Trans researcher and community involvement in the development and validation of newer, more ideal measures would enable gender clinics to reduce reliance on older measures. It would be informative to explore clinicians’ perceived utility of measures, over and above asking the young person to self-identify and describe their experience of gender in a structured way. We encourage clinical teams and researchers to actively reflect upon, and periodically review, the measures they administer. Most pertinently, any current clinical or research use of measures that frame trans people as abnormal, use discriminatory language, or measure only distress without measuring wellbeing and euphoria, needs urgent reevaluation. Any measure should be used precisely, for a purpose that is consistent with its known psychometric properties. Research actively including trans young people’s voices is required to develop gender measures that are culturally appropriate, safe and welcoming for trans young people of all identities, abilities and cultural and linguistic backgrounds, which can measure positive as well as negative aspects of gender experience: this could establish a new standard of gender measures which are acceptable to young trans people, as well as psychometrically rigorous. Listening to young people express their own identities should remain central.

Disclosure statement

The Perth Gender Picture measure was created by Dr Julia K. Moore and the team at the Gender Diversity Service, Perth Children’s Hospital, Western Australia, and published by authors JM, PS, AL, LS, CT and SM.

Additional information

Funding

AL is supported by an NHMRC Emerging Leadership Fellowship [#2010063]. PS is supported by a Suicide Prevention Australia Post-Doctoral Fellowship. This research has been supported by a Perth Children’s Hospital Foundation Grant and a grant from the Raine Medical Research Foundation.

References

  • Achille, C., Taggart, T., Eaton, N. R., Osipoff, J., Tafuri, K., Lane, A., & Wilson, T. A. (2020). Longitudinal impact of gender-affirming endocrine intervention on the mental health and well-being of transgender youths: Preliminary results. International Journal of Pediatric Endocrinology, 2020(1), 8. https://doi.org/10.1186/s13633-020-00078-2
  • Allen, L. R., Watson, L. B., Egan, A. M., & Moser, C. N. (2019). Well-being and suicidality among transgender youth after gender-affirming hormones. Clinical Practice in Pediatric Psychology, 7(3), 302–311. https://doi.org/10.1037/cpp0000288
  • American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders: DSM-5. APA. https://doi.org/10.1176/appi.books.9780890425596
  • Arnoldussen, M., van der Miesen, A. I. R., Elzinga, W. S., Alberse, A. E., Popma, A., Steensma, T. D., & de Vries, A. L. C. (2022). Self-perception of transgender adolescents after gender-affirming treatment: A follow-up study into young adulthood. LGBT Health, 9(4), 238–246. https://doi.org/10.1089/lgbt.2020.0494
  • Ashley, F., Brightly-Brown, S., & Rider, G. N. (2024). Beyond the trans/cis binary: Introducing new terms will enrich gender research. Nature, 630(8016), 293–295. https://doi.org/10.1038/d41586-024-01719-9
  • Bauer, G., Churchill, S., Ducharme, J., Feder, S., Gillis, L., Gotovac, S., Holmes, C., Lawson, M., Metzger, D., Saewyc, E., Speechley, K., Temple, J., & for the Trans Youth CAN! Research Team. (2021a). Trans Youth CAN! Gender Distress Scale (TYC-GDS). Trans Youth CAN! Research Team.
  • Bauer, G., Churchill, S., Ducharme, J., Feder, S., Gillis, L., Gotovac, S., Holmes, C., Lawson, M., Metzger, D., Saewyc, E., Speechley, K., Temple, J., & for the Trans Youth CAN! Research Team. (2021b). Trans Youth CAN! Gender Positivity Scale (TYC-GPS). Trans Youth CAN! Research Team.
  • Becker, I., Nieder, T., Cerwenka, S., Briken, P., Kreukels, B., Cohen-Kettenis, P., Cuypere, G., Haraldsen, I., & Richter-Appelt, H. (2016). Body image in young gender dysphoric adults: A European multi-center study. Archives of Sexual Behavior, 45(3), 559–574. https://doi.org/10.1007/s10508-015-0527-z
  • Berg, D., DeWitt, J., Spencer, K., McGuire, J., Zucker, K. J., Mitchell, J. N., Bradley, S. J., et al. (2016). The recalled childhood gender identity and experience questionnaire-gender spectrum-adolescent version. [Adapted from The recalled childhood gender identity/gender role questionnaire: Psychometric properties. Sex Roles 54(7-8), 469–483.]. University of Minnesota.
  • Berg, D., Spencer, K., Rider, G. N., Strang, J. F., Edwards-Leeper, L., Leibowitz, S. F., & McGuire, J. (2020). Body Part Satisfaction-Gender Spectrum (BOPS-GS). [Unpublished]. National Center for Gender Spectrum Health, Department of Family Medicine and Community Health, University of Minnesota Medical School.
  • Bloom, T. M., Nguyen, T. P., Lami, F., Pace, C. C., Poulakis, Z., Telfer, M., Taylor, A., Pang, K. C., & Tollit, M. A. (2021). Measurement tools for gender identity, gender expression, and gender dysphoria in transgender and gender-diverse children and adolescents: A systematic review. The Lancet Child & Adolescent Health, 5(8), 582–588. https://doi.org/10.1016/S2352-4642(21)00098-5
  • Bowman, S. J., Hakeem, A., Demant, D., McAloon, J., & Wootton, B. M. (2022). Assessing gender dysphoria: Development and validation of the gender preoccupation and stability questionnaire–2nd edition (GPSQ-2). Journal of Homosexuality, 71(3), 666–690. https://doi.org/10.1080/00918369.2022.2132440
  • Caldarera, A. M., Marengo, D., Gerino, E., Brustia, P., Rollè, L., & Cohen-Kettenis, P. (2019). A parent-report gender identity questionnaire for children: Psychometric properties of an Italian version. Archives of Sexual Behavior, 48(5), 1603–1615. https://doi.org/10.1007/s10508-018-1372-7
  • Catalpa, J. M., McGuire, J. K., Fish, J. N., Rider, G. N., Bradford, N. J., & Berg, D. (2020). Predictive validity of the genderqueer identity scale (GQI): Differences between genderqueer, transgender and cisgender sexual minority individuals. In Non-binary and genderqueer genders (pp. 187–196). Routledge.
  • Chen, D., Berona, J., Chan, Y.-M., Ehrensaft, D., Garofalo, R., Hidalgo, M. A., Rosenthal, S. M., Tishelman, A. C., & Olson-Kennedy, J. (2023). Psychosocial functioning in transgender youth after 2 years of hormones. The New England Journal of Medicine, 388(3), 240–250. https://doi.org/10.1056/nejmoa2206297
  • Chen, D., Hidalgo, M. A., Leibowitz, S., Leininger, J., Simons, L., Finlayson, C., & Garofalo, R. (2016). Multidisciplinary care for gender-diverse youth: A narrative review and unique model of gender-affirming care. Transgender Health, 1(1), 117–123. https://doi.org/10.1089/trgh.2016.0009
  • Cohen, J. (1998). Statistical power analysis for the behavioral sciences (2nd ed.). Erlbaum.
  • Cohen-Kettenis, P. T., & Van Goozen, S. (1997). Sex reassignment of adolescent transsexuals: A follow-up study. Journal of the American Academy of Child and Adolescent Psychiatry, 36(2), 263–271. https://doi.org/10.1097/00004583-199702000-00017
  • Cohen-Kettenis, P. T., Steensma, T. D., & de Vries, A. L. C. (2011). Treatment of adolescents with gender dysphoria in the Netherlands. Child and Adolescent Psychiatric Clinics of North America, 20(4), 689–700. https://doi.org/10.1016/j.chc.2011.08.001
  • Cohen-Kettenis, P. T., Wallien, M., Johnson, L. L., Owen-Anderson, A. F. H., Bradley, S. J., & Zucker, K. J. (2006). A parent-report gender identity questionnaire for children: A cross-national, cross-clinic comparative analysis. Clinical Child Psychology and Psychiatry, 11(3), 397–405. https://doi.org/10.1177/1359104506059135
  • Coleman, E., Radix, A. E., Bouman, W. P., Brown, G. R., de Vries, A. L. C., Deutsch, M. B., Ettner, R., Fraser, L., Goodman, M., Green, J., Hancock, A. B., Johnson, T. W., Karasic, D. H., Knudson, G. A., Leibowitz, S. F., Meyer-Bahlburg, H. F. L., Monstrey, S. J., Motmans, J., Nahata, L., … Arcelus, J. (2022). Standards of care for the health of transgender and gender diverse people, version 8. International Journal of Transgender Health, 23(Suppl 1), S1–S259. https://doi.org/10.1080/26895269.2022.2100644
  • de Vries, A. L., & Cohen-Kettenis, P. (2012). Clinical management of gender dysphoria in children and adolescents: The Dutch approach. Journal of Homosexuality, 59(3), 301–320. https://doi.org/10.1080/00918369.2012.653300
  • de Vries, A. L., McGuire, J. K., Steensma, T. D., Wagenaar, E. C., Doreleijers, T. A., & Cohen-Kettenis, P. (2014). Young adult psychological outcome after puberty suppression and gender reassignment. Pediatrics, 134(4), 696–704. https://doi.org/10.1542/peds.2013-2958
  • Deogracias, J. J., Johnson, L. L., Meyer-Bahlburg, H. F. L., Kessler, S. J., Schober, J. M., & Zucker, K. J. (2007). The gender identity/gender dysphoria questionnaire for adolescents and adults. Journal of Sex Research, 44(4), 370–379. https://doi.org/10.1080/00224490701586730
  • Dèttore, D., Ristori, J., & Casale, S. (2010). GID and gender-variant children in Italy: A study in preschool children. Journal of Gay & Lesbian Mental Health, 15(1), 12–29. https://doi.org/10.1080/19359705.2011.530569
  • Doorn, C. D., Kuiper, A. J., Verschoor, A. M., & Cohen-Kettenis, P. T. (1996). Het verloop van de geslachtsaanpassing: Een 5-jarige prospectieve studie [The course of sex reassignment: A 5 year prospective study]. Rapport voor de Nederlandse Ziekenfondsraad [Report for the Dutch National Health Council].
  • Doyle, D. M., Lewis, T. O., & Barreto, M. (2023). A systematic review of psychosocial functioning changes after gender-affirming hormone therapy among transgender people. Nature Human Behaviour, 7(8), 1320–1331. https://doi.org/10.1038/s41562-023-01605-w
  • Drummond, K. D., Bradley, S. J., Peterson-Badali, M., & Zucker, K. J. (2008). A follow-up study of girls with gender identity disorder. Developmental Psychology, 44(1), 34–45. https://doi.org/10.1037/0012-1649.44.1.34
  • Elizabeth, P. H., & Green, R. (1984). Childhood sex-role behaviors: Similarities and differences in twins. Acta Geneticae Medicae et Gemellologiae, 33(2), 173–179. https://doi.org/10.1017/s0001566000007200
  • Galupo, M. P., & Pulice-Farrow, L. (2020). Subjective ratings of gender dysphoria scales by transgender individuals. Archives of Sexual Behavior, 49(2), 479–488. https://doi.org/10.1007/s10508-019-01556-2
  • Grannis, C., Leibowitz, S. F., Gahn, S., Nahata, L., Morningstar, M., Mattson, W. I., Chen, D., Strang, J. F., & Nelson, E. E. (2021). Testosterone treatment, internalizing symptoms, and body image dissatisfaction in transgender boys. Psychoneuroendocrinology, 132, 105358. https://doi.org/10.1016/j.psyneuen.2021.105358
  • Green, A. E., DeChants, J. P., Price, M. N., & Davis, C. K. (2022). Association of gender-affirming hormone therapy with depression, thoughts of suicide, and attempted suicide among transgender and nonbinary youth. The Journal of Adolescent Health, 70(4), 643–649. https://doi.org/10.1016/j.jadohealth.2021.10.036
  • Hakeem, A., Črnčec, R., Asghari-Fard, M., Harte, F., & Eapen, V. (2016). Development and validation of a measure for assessing gender dysphoria in adults: The Gender Preoccupation and Stability Questionnaire. International Journal of Transgenderism, 17(3-4), 131–140. https://doi.org/10.1080/15532739.2016.1217812
  • Hembree, W. C., Cohen-Kettenis, P. T., Gooren, L., Hannema, S. E., Meyer, W. J., Murad, M. H., Rosenthal, S. M., Safer, J. D., Tangpricha, V., & T’Sjoen, G. G. (2017). Endocrine treatment of gender-dysphoric/gender-incongruent persons: An endocrine society clinical practice guideline. The Journal of Clinical Endocrinology and Metabolism, 102(11), 3869–3903. https://doi.org/10.1210/jc.2017-01658
  • Hill, D. B., Menvielle, E., Sica, K. M., & Johnson, A. (2010). An affirmative intervention for families with gender variant children: Parental ratings of child mental health and gender. Journal of Sex & Marital Therapy, 36(1), 6–23. https://doi.org/10.1080/00926230903375560
  • Irineu, R. d A., Ribeiro, V. V., Sebastião, T. F., Crow, K., van Mersbergen, M., & Behlau, M. (2023). Cross-cultural adaptation to Brazilian Portuguese of the vocal congruence scale and transgender congruence scale. CoDAS, 36(2), e20230050–e20230050. https://doi.org/10.1590/2317-1782/20232023050pt
  • Irwing, P., Booth, T., & Hughes, D. J. (Eds.). (2018). The Wiley handbook of psychometric testing: A multidisciplinary reference on survey, scale and test development. John Wiley & Sons.
  • Johnson, L. L., Bradley, S. J., Birkenfeld-Adams, A. S., Kuksis, M. A. R., Maing, D. M., Mitchell, J. N., & Zucker, K. J. (2004). A parent-report gender identity questionnaire for children. Archives of Sexual Behavior, 33(2), 105–116. https://doi.org/10.1023/B:ASEB.0000014325.68094.f3
  • Jones, B. A., Haycraft, E., Murjan, S., & Arcelus, J. (2016). Body dissatisfaction and disordered eating in trans people: A systematic review of the literature. International Review of Psychiatry, 28(1), 81–94. https://doi.org/10.3109/09540261.2015.1089217
  • Kallitsounaki, A., & Williams, D. M. (2022). Autism spectrum disorder and gender dysphoria/incongruence. A systematic literature review and meta-analysis. Journal of Autism and Developmental Disorders, 53(8), 3103–3117. https://doi.org/10.1007/s10803-022-05517-y
  • Khatchadourian, K., Amed, S., & Metzger, D. L. (2014). Clinical management of youth with gender dysphoria in Vancouver. The Journal of Pediatrics, 164(4), 906–911. https://doi.org/10.1016/j.jpeds.2013.10.068
  • Kline, P. (2015). A handbook of test construction (psychology revivals): Introduction to psychometric design (1st ed.). Routledge. https://doi.org/10.4324/9781315695990
  • Kozee, H. B., Tylka, T. L., & Bauerband, L. A. (2012). Measuring transgender individuals’ comfort with gender identity and appearance: Development and validation of the Transgender Congruence Scale. Psychology of Women Quarterly, 36(2), 179–196. https://doi.org/10.1177/0361684312442161
  • Kuper, L. E., Stewart, S., Preston, S., Lau, M., & Lopez, X. (2020). Body dissatisfaction and mental health outcomes of youth on gender-affirming hormone therapy. Pediatrics, 145(4), e20193006. https://doi.org/10.1542/peds.2019-3006
  • Lavender, R., Shaw, S., Maninger, J. K., Butler, G., Carruthers, P., Carmichael, P., & Masic, U. (2023). Impact of hormone treatment on psychosocial functioning in gender-diverse young people. LGBT Health, 10(5), 382–390. https://doi.org/10.1089/lgbt.2022.0201
  • Lindgren, T. W., & Pauly, I. B. (1975). A body image scale for evaluating transsexuals. Archives of Sexual Behavior, 4(6), 639–656. https://doi.org/10.1007/BF01544272
  • López de Lara, D., Pérez Rodríguez, O., Cuellar Flores, I., Pedreira Masa, J. L., Campos-Muñoz, L., Cuesta Hernández, M., & Ramos Amador, J. T. (2020). Psychosocial assessment in transgender adolescents. Anales de Pediatria, 93(1), 41–48. https://doi.org/10.1016/j.anpede.2020.01.004
  • Mahfouda, S., Moore, J. K., Siafarikas, A., Hewitt, T., Ganti, U., Lin, A., & Zepf, F. D. (2018). Gender-affirming hormones and surgery in transgender children and adolescents. The Lancet Diabetes & Endocrinology, 7(6), 484–498. https://doi.org/10.1016/S2213-8587(18)30305-X
  • Mahfouda, S., Moore, J. K., Siafarikas, A., Zepf, F. D., & Lin, A. (2017). Puberty suppression in transgender children and adolescents. The Lancet Diabetes & Endocrinology, 5(10), 816–826. https://doi.org/10.1016/S2213-8587(17)30099-2
  • Mahfouda, S., Panos, C., Whitehouse, A. J. O., Thomas, C. S., Maybery, M., Strauss, P., Zepf, F. D., O’Donovan, A., van Hall, H.-W., Saunders, L. A., Moore, J. K., & Lin, A. (2019). Mental health correlates of autism spectrum disorder in gender diverse young people: Evidence from a specialised child and adolescent gender clinic in Australia. Journal of Clinical Medicine, 8(10), 1503. https://doi.org/10.3390/jcm8101503
  • Martin-Storey, A., Cotton, J. C., Le Corff, Y., Michaud, A., & Beauchesne-Lévesque, S. (2021). A French translation of the transgender congruence scale: Validation and associations with distress, well-being, and perceived transition status. Transgender Health, 6(1), 23–30. https://doi.org/10.1089/trgh.2020.0037
  • McGuire, J. K., Beek, T. F., Catalpa, J. M., & Steensma, T. D. (2019). The genderqueer Identity (GQI) scale: Measurement and validation of four distinct subscales with trans and LGBQ clinical and community samples in two countries. The International Journal of Transgenderism, 20(2-3), 289–304. https://doi.org/10.1080/15532739.2018.1460735
  • McGuire, J., Berg, D., Catalpa, J. M., Morrow, Q. J., Fish, J. N., Nic Rider, G., Steensma, T., Cohen-Kettenis, P. T., & Spencer, K. (2020). Utrecht gender dysphoria scale – Gender spectrum (UGDS-GS): Construct validity among transgender, nonbinary, and LGBQ samples. International Journal of Transgender Health, 21(2), 194–208. https://doi.org/10.1080/26895269.2020.1723460
  • McGuire, J., Catalpa, J., Berg, D., & Spencer, K. (2016). The Utrecht gender dysphoria scale – Gender spectrum. University of Minnesota.
  • McGuire, J., Doty, J. L., Catalpa, J. M., & Ola, C. (2016). Body image in transgender young people: Findings from a qualitative, community based study. Body Image, 18, 96–107. https://doi.org/10.1016/j.bodyim.2016.06.004
  • McGuire, J., Spencer, K., Rider, G. N., & Berg, D. (2016). Body image scale – Gender spectrum. University of Minnesota.
  • McLeod, S., & McKinnon, D. H. (2007). Prevalence of communication disorders compared with other learning needs in 14 500 primary and secondary school students. International Journal of Language & Communication Disorders, 42(S1), 37–59. https://doi.org/10.1080/13682820601173262
  • Moore, J. K., Thomas, C. S., van Hall, H.-W., Strauss, P., Saunders, L. A., Harry, M., Mahfouda, S., Lawrence, S. J., Zepf, F. D., & Lin, A. (2021). The Perth Gender Picture (PGP): Young people’s feedback about acceptability and usefulness of a new pictorial and narrative approach to gender identity assessment and exploration. International Journal of Transgender Health, 22(3), 337–348. https://doi.org/10.1080/26895269.2020.1795960
  • National Center for Gender Spectrum Health. (n.d). Gender Affirmative Lifespan Approach (GALA™). Retrieved 3 May, from https://license.umn.edu/product/gender-affirmative-lifespan-approach-gala
  • Nicholas, L. (2019). Queer ethics and fostering positive mindsets toward non-binary gender, genderqueer, and gender ambiguity. The International Journal of Transgenderism, 20(2-3), 169–180. https://doi.org/10.1080/15532739.2018.1505576
  • Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory. McGraw-Hill.
  • Olson-Kennedy, J., Cohen-Kettenis, P., Kreukels, B. P., Meyer-Bahlburg, H. F., Garofalo, R., Meyer, W., & Rosenthal, S. M. (2016). Research priorities for gender nonconforming/transgender youth: Gender identity development and biopsychosocial outcomes. Current Opinion in Endocrinology, Diabetes, and Obesity, 23(2), 172–179. https://doi.org/10.1097/MED.0000000000000236
  • Riggs, D. W. (2019). Working with transgender young people and their families: A critical developmental approach. Springer.
  • Riley, E. (2017). The gender feeling amplitude: An instrument to assist clinicians with the assessment of gender diverse adolescents. Sexual Health, 14(5), 436–441. https://doi.org/10.1071/SH17009
  • Saewyc, E. M., Gotovac, S., Villalobos, M. C., Scheim, A., Vandermorris, A., & Bauer, G. (2022). Development and validation of new gender distress and gender positivity scales for young transgender adolescents in Canada. Journal of Adolescent Health, 70(4), S10. https://doi.org/10.1016/j.jadohealth.2022.01.130
  • Schneider, C., Cerwenka, S., Nieder, T. O., Briken, P., Cohen-Kettenis, P. T., De Cuypere, G., Haraldsen, I. R., Kreukels, B. P. C., & Richter-Appelt, H. (2016). Measuring gender dysphoria: A multicenter examination and comparison of the Utrecht gender dysphoria scale and the gender identity/gender dysphoria questionnaire for adolescents and adults. Archives of Sexual Behavior, 45(3), 551–558. https://doi.org/10.1007/s10508-016-0702-x
  • Schulz, S. L. (2018). The informed consent model of transgender care: An alternative to the diagnosis of gender dysphoria. Journal of Humanistic Psychology, 58(1), 72–92. https://doi.org/10.1177/0022167817745217
  • Shirdel-Havar, E., Steensma, T. D., Cohen-Kettenis, P., & Kreukels, B. P. C. (2019). Psychological symptoms and body image in individuals with gender dysphoria: A comparison between Iranian and Dutch clinics. The International Journal of Transgenderism, 20(1), 108–117. https://doi.org/10.1080/15532739.2018.1444529
  • Shulman, G. P., Holt, N. R., Hope, D. A., Mocarski, R., Eyer, J., & Woodruff, N. (2017). A review of contemporary assessment tools for use with transgender and gender nonconforming adults. Psychology of Sexual Orientation and Gender Diversity, 4(3), 304–313. https://doi.org/10.1037/sgd0000233
  • Singh, D., Deogracias, J. J., Johnson, L. L., Bradley, S. J., Kibblewhite, S. J., Owen-Anderson, A., Peterson-Badali, M., Meyer-Bahlburg, H. F., & Zucker, K. (2010). The gender identity/gender dysphoria questionnaire for adolescents and adults: Further validity evidence. Journal of Sex Research, 47(1), 49–58. https://doi.org/10.1080/00224490902898728
  • Steensma, T. D., Kreukels, B. P. C., Jürgensen, M., Thyen, U., De Vries, A. L. C., & Cohen-Kettenis, P. T. (2013). The Utrecht gender dysphoria scale: A validation study. Archives of Sexual Behavior.
  • Strang, J. F., Powers, M. D., Knauss, M., Sibarium, E., Leibowitz, S. F., Kenworthy, L., Sadikova, E., Wyss, S., Willing, L., Caplan, R., Pervez, N., Nowak, J., Gohari, D., Gomez-Lobo, V., Call, D., & Anthony, L. G. (2018). “They thought it was an obsession”: Trajectories and perspectives of autistic transgender and gender-diverse adolescents. Journal of Autism and Developmental Disorders, 48(12), 4039–4055. https://doi.org/10.1007/s10803-018-3723-6
  • Strauss, P., Cook, A., Watson, V., Winter, S., Whitehouse, A., Albrecht, N., Wright Toussaint, D., & Lin, A. (2021). Mental health difficulties among trans young people diagnosed with an autism spectrum disorder: Findings from trans pathways. Journal of Psychiatric Research, 137, 360–367. https://doi.org/10.1016/j.jpsychires.2021.03.005
  • Strauss, P., Winter, S., Waters, Z., Watson, V., Wright Toussaint, D., & Lin, A. (2022). Perspectives of trans and gender diverse young people accessing primary care and medical transition services: Findings from trans pathways. International Journal of Transgender Health, 23(3), 295–307. https://doi.org/10.1080/26895269.2021.1884925
  • Telfer, M. M., Tollit, M. A., & Feldman, D. (2015). Transformation of health-care and legal systems for the transgender population: The need for change in Australia. Journal of Paediatrics and Child Health, 51(11), 1051–1053. https://doi.org/10.1111/jpc.12994
  • Telfer, M. M., Tollit, M. A., Pace, C. C., & Pang, K. C. (2018). Australian standards of care and treatment guidelines for transgender and gender diverse children and adolescents. The Medical Journal of Australia, 209(3), 132–136. https://doi.org/10.5694/mja17.01044
  • Thoma, B. C., Jardas, E. J., Choukas-Bradley, S., & Salk, R. H. (2023). Perceived gender transition progress, gender congruence, and mental health symptoms among transgender adolescents. The Journal of Adolescent Health, 72(3), 444–451. https://doi.org/10.1016/j.jadohealth.2022.09.032
  • Turban, J. L., King, D., Carswell, J. M., & Keuroghlian, A. S. (2020). Pubertal suppression for transgender youth and risk of suicidal ideation. Pediatrics, 145(2), e20191725. https://doi.org/10.1542/peds.2019-1725
  • van De Grift, T. C., Cohen-Kettenis, P. T., Steensma, T. D., De Cuypere, G., Richter-Appelt, H., Haraldsen, I. R. H., Dikmans, R. E. G., Cerwenka, S. C., & Kreukels, B. P. C. (2016). Body satisfaction and physical appearance in gender dysphoria. Archives of Sexual Behavior, 45(3), 575–585. https://doi.org/10.1007/s10508-015-0614-1
  • van de Grift, T. C., Cohen-Kettenis, P., Elaut, E., De Cuypere, G., Richter-Appelt, H., Haraldsen, I. R., & Kreukels, B. P. C. (2016). A network analysis of body satisfaction of people with gender dysphoria. Body Image, 17, 184–190. https://doi.org/10.1016/j.bodyim.2016.04.002
  • van de Grift, T. C., Elaut, C. E., Cerwenka, T. S., Cohen-Kettenis, P., De Cuypere, P. C. G., Richter-Appelt, P. C. H., & Kreukels, P. C. B. (2017). Effects of medical interventions on gender dysphoria and body image: A follow-up study. Psychosomatic Medicine, 79(7), 815–823. https://doi.org/10.1097/PSY.0000000000000465
  • Vincent, B. (2018). Studying trans: Recommendations for ethical recruitment and collaboration with transgender participants in academic research. Psychology & Sexuality, 9(2), 102–116. https://doi.org/10.1080/19419899.2018.1434558
  • Vincent, B. (2019). Breaking down barriers and binaries in trans healthcare: The validation of non-binary people. The International Journal of Transgenderism, 20(2-3), 132–137. https://doi.org/10.1080/15532739.2018.1534075
  • Wallien, M. S. C., Quilty, L. C., Steensma, T. D., Singh, D., Lambert, S. L., Leroux, A., Owen-Anderson, A., Kibblewhite, S. J., Bradley, S. J., Cohen-Kettenis, P. T., & Zucker, K. J. (2009). Cross-national replication of the gender identity interview for children. Journal of Personality Assessment, 91(6), 545–552. https://doi.org/10.1080/00223890903228463
  • Wang, Y., Feng, Y., Su, D., Wilson, A., Pan, B., Liu, Y., Wang, N., Guo, B., Han, M., Zucker, K. J., & Chen, R. (2021). Validation of the Chinese version of the gender identity/gender dysphoria questionnaire for adolescents and adults. The Journal of Sexual Medicine, 18(9), 1632–1640. https://doi.org/10.1016/j.jsxm.2021.05.007
  • World Health Organization. (1992). The ICD-10 classification of mental and behavioural disorders clinical descriptions and diagnostic guidelines. WHO.
  • World Health Organization. (2018). The ICD-11 for mortality and morbidity statistics. WHO.
  • Yu, L., Winter, S., & Xie, D. (2010). The child play behavior and activity questionnaire: A parent-report measure of childhood gender-related behavior in China. Archives of Sexual Behavior, 39(3), 807–815. https://doi.org/10.1007/s10508-008-9403-4
  • Zucker, K. J., Bradley, S., Sullivan, C. B. L., Kuksis, M., Birkenfeld-Adams, A., & Mitchell, J. N. (1993). A gender identity interview for children. Journal of Personality Assessment, 61(3), 443–456. https://doi.org/10.1207/s15327752jpa6103_2
  • Zucker, K. J., Mitchell, J. N., Bradley, S. J., Tkachuk, J., Cantor, J. M., & Allin, S. M. (2006). The recalled childhood gender identity/gender role questionnaire: Psychometric properties. Sex Roles, 54(7-8), 469–483. https://doi.org/10.1007/s11199-006-9019-x