Education Policy

High-stakes test anxiety among Taiwanese adolescents: a longitudinal study

Article: 2321019 | Received 31 Aug 2023, Accepted 10 Feb 2024, Published online: 27 Feb 2024

Abstract

This study examined changes in test anxiety under the reform of the examination system in Taiwan. We sampled 46,361 Grade 9 students in Taiwan over 9 consecutive years starting from 2011 to collect data on their test anxiety, cram school attendance frequency, and academic achievement. Students’ test anxiety level was compared between three periods: Basic Competence Test (2011–2013), Comprehensive Assessment Program (CAP 1; 2014–2016), and CAP 2 (2017–2019). The results indicated that, first, students’ test anxiety increased during CAP 1 but decreased in CAP 2. Second, the test anxiety level of students with an upper-intermediate level of academic achievement increased. Finally, cram school attendance frequency did not differ among the three periods. We provide suggestions for countries with a similar social climate. For potentially controversial social issues, such as test anxiety, policy makers should develop corresponding measures in advance on the basis of empirical evidence.

High-stakes testing (HST) refers to tests whose results have a major impact on the interests of an individual or institution. For example, test results may determine whether a student is admitted to a particular school, an individual is offered a job, or a school receives government subsidies for the succeeding semester. Abu-Alhija (Citation2007) argued that large-scale testing brings anxiety to both teachers and students. Segool et al. (Citation2013) studied countries with HST systems and suggested that the perceived test anxiety of American students in Grades 3–5 is higher for HST compared with classroom exams. Sung et al. (Citation2016) indicated that test anxiety experienced by students in Taiwan, which has an HST system, reflects the presence of long-term stress. Furthermore, students’ test anxiety may be influenced by the social climate. Segool et al. (Citation2014) revealed that test anxiety and the social climate perceived by students are likely to be associated: the more test anxiety a student perceives in their teachers or classmates, the higher their test anxiety. Chao and Sung (Citation2019a) also mentioned that the social climate formed by the test and school admission systems increases students’ test anxiety. For example, students in Taiwan may experience high test anxiety because they are anxious about failing to be accepted into a prestigious public high school, which is the preferred type of school in Taiwan. Taken together, HST systems and results may aggravate students’ test anxiety; therefore, reducing students’ test anxiety is a key objective of any test system reform. Taiwan has implemented a test system reform. The Senior High School Education Act, promulgated by the Ministry of Education (MOE) of Taiwan in 2013, stipulates that the number of students to be admitted by the open admission program in each senior high school by 2019 should be more than 85% of the total number of students admitted. 
In 2014, the criterion-referenced Comprehensive Assessment Program (CAP) replaced the norm-referenced Basic Competence Test (BCTEST; Tseng et al., Citation2019). Whether through reforming the test system or revising the school entrance policy, the main goal is to alleviate students’ test anxiety by eliminating HST from school admission.

However, there is still limited empirical evidence that clarifies whether students’ test anxiety will increase, decrease, or remain unchanged through the transition of the test system. In other words, the relationship between test systems and test anxiety is currently inferred from a subjective perspective, and there is no consensus reached within society. Therefore, to ascertain whether test anxiety among Taiwanese junior high school students has undergone changes due to the test reform, we investigated Grade 9 students’ test anxiety over nine consecutive years, roughly divided into three periods: (1) the end of the BCTEST period (2011–2013); (2) the initial phase of the CAP 1 (2014–2016); and (3) the adaptation phase of the CAP 2 (2017–2019). We compared these three periods to investigate the impact of transitioning from the old test system to the new test system on students’ test anxiety.

Moreover, we investigated students’ cram schooling starting in 2012 to determine whether the test system reform reduced students’ learning burden (e.g. by spending less time in cram schools). Overall, our primary goal was to obtain empirical evidence on whether the test system reform changed students’ test anxiety. Apart from elucidating test anxiety theories, our study could also provide education policymakers with useful information for inclusive decision-making.

Literature review

Current debate

The necessity and advantages of HST for students comprise a controversial topic in education. Consequences aside, HST is established out of good intentions. Madaus and Russell (Citation2010) indicated that for the government, test policy has two main functions. First, it affects what students are taught and ensures that the curriculum aligns with a nation’s or state’s criteria. For example, the standardized test program of each US state is based on the curriculum criteria of that state. Similar to the approach in Australia and Japan, Taiwan’s standardized tests are developed according to the curriculum guidelines issued by the MOE.

Second, test results are a vital indicator of whether taxpayer money is well spent on education. In other words, the effectiveness of education can be assessed through standardized test results. The No Child Left Behind Act passed in 2001 in the US can serve as an example: it stipulates that public and charter schools must improve their students’ standardized test grades rapidly, or the employment of the teachers or school administrators or the operating budget of the school may be compromised (United States Congress, 107th, 1st session, 2001).

Presently, some local governments in Taiwan have proposed the ‘C Reduction Program’ to increase the number of Grade 9 students obtaining B (basic) grades or above while reducing the number of students receiving C (to be enhanced) grades through teacher incentives and remedial education. An advantage of HST is that the results can be an indicator of the efficacy of education policy. However, HST does have numerous negative impacts, chief among which is the stress on teacher groups as well as on students. Teachers are at risk of losing their jobs, and schools are also at risk of budget cuts. These in turn may affect how teachers teach. For example, Pedulla et al. (Citation2003) surveyed more than 4,000 teachers, and 80% of them reported being pressured by HST to reduce the time spent on content unrelated to tests. The HST environment not only drives teachers to improve students’ scores but also leads them to focus more on test skills than on higher-level thinking. In turn, students may sacrifice in-depth learning for higher test scores. As for schools, they may exclude disadvantaged students for the sake of maintaining high overall test results. Nevertheless, HST has persisted for a long time in many countries, including Taiwan.

Taiwan’s test system reform

For more than 40 years (1956–2000), junior high school students in Taiwan were required to take a joint senior high school entrance examination to gain admission to high school. During this period, a uniform national examination was held each year, and students were ranked based on their exam scores and assigned to a senior high school (Sung et al., Citation2010). If students wished to attend a prestigious senior high school, they needed to have excellent test results. Although this system may seem fair (i.e. school admissions determined by students’ test scores), calls for reform had already been registered in the 1990s. Therefore, the MOE began planning and then launched the senior high school multiple-entrance program in the 1990s. The MOE also commissioned the Research Center for Psychological and Educational Testing (RCPET), an institute of the National Taiwan Normal University, to develop the BCTEST. The aim of the BCTEST was to assess the basic competencies and learning outcomes of Grade 9 students. In contrast to the joint examination, the BCTEST had an item bank constructed for this purpose, and tests were generated using items from the item bank. In other words, the BCTEST items were pre-tested and analyzed, and only high-quality items were used for the official examination. The BCTEST, launched in 2001, initially tested five subjects: Chinese, English, mathematics, social studies, and science. Essay writing was added as a sixth component in 2007. This component employed criterion-referenced grading, whereas all the other components had scale scores. From 2001 to 2013, BCTEST scores were a key reference for Grade 9 students’ senior or vocational high school applications (Sung et al., Citation2014; Sung et al., Citation2016).

Initially, the objective of the BCTEST was to assess students’ basic competence and, in theory, criterion-referenced grading should have been adopted; that is, the test result should have been either ‘pass’ or ‘fail’ depending on the scores. However, the previous joint school entrance test system had considerably affected culture. People were used to having all schools ranked by the highest and lowest scores of the students. As such, the public at the time insisted that the BCTEST results be used to rank students’ scores for admission to schools. For school admission, the MOE offered a percentile rank (PR) that was computed according to each student’s total test score.

Years after introducing the multiple-entrance program and BCTEST, the MOE launched the 12-year compulsory education program in 2011. At this time, the MOE commissioned the RCPET, the entity that developed the BCTEST, to set up the CAP to replace the existing BCTEST. A major difference between the two is that, although the CAP is also a standardized test assessing performance in junior high school subjects, its objective is to monitor student competence nationwide rather than to serve as a measure for students to compete with others for school entrance. That is, instead of using the PR value as in the BCTEST, a CAP test report indicates whether a student is at one of three levels for each subject: Mastery (Level A), Basic (Level B), and To Be Enhanced (Level C). In 2014, all Grade 9 students were required to take the CAP, which had fully replaced BCTEST.

The MOE (2013) also highlighted test anxiety reduction as a primary reason for its reform programs. The replacement of the BCTEST, a norm-referenced test, by the CAP, a criterion-referenced test, was expected to result in students being less concerned about their scores, thereby mitigating their test anxiety and correcting the excessive focus on performance (e.g. frequent cram school attendance).

High-stakes tests and test anxiety

HST brings test anxiety. The stress on children and adolescents caused by HST has long been a serious education problem challenging many countries (Putwain, Citation2008; von der Embse et al., Citation2015; Wuthrich et al., Citation2021). Lee and Larson (Citation2000) likened the anxiety associated with school entrance examination experienced by Korean senior high school students to ‘hell’. Hutchings (Citation2015) also used the term ‘examination factories’ to describe the emphasis on standardized testing in the UK that increased test anxiety in students. Howell (Citation2017) suggested that for students, the Literacy and Numeracy domains of the National Assessment Program of Australia are associated with negative experiences. One study also revealed that test anxiety may substantially lower university students’ sleep quality and affect their mood (Adams et al., Citation2021). Quantitatively, studies have shown that many students experience test anxiety. Thomas et al. (Citation2018) indicated that 25% of US undergraduate students fall into the high test anxiety class. Chen et al. (Citation2023) also conducted a study involving over seven thousand Chinese students aged 10 to 19 and found a prevalence rate of 46.7% for test anxiety.

In Taiwan, approximately 22% of Grade 9 students experience high test anxiety (Chao & Sung, Citation2019b). Moreover, Chao and Sung (Citation2019a) indicated that public high schools often have a higher enrollment rate than private and vocational schools, and the majority of society favors public high schools. Such a social climate may cause middle-achieving students to face the uncertainty of further education and the possibility that they would pay the price but not receive the return, thus increasing their test anxiety (Chao & Sung, Citation2019a). Chao and Sung (Citation2023) further demonstrated that the perceived uncertainty of Taiwanese students is strongly associated with test anxiety. A review of the existing body of research on test anxiety shows that longitudinal studies investigating changes in the test system are lacking. Cross-sectional studies provide valuable insights into the association between test anxiety and other factors, as demonstrated in the aforementioned study conducted by Chen et al. (Citation2023). However, ascertaining the relationship between changes in the test system and test anxiety requires longitudinal data. The present study used a longitudinal approach to investigate the potential association between the test system and test anxiety.

The aforementioned research has indicated that test anxiety among students is a growing problem. However, empirical data are limited with respect to the change in test anxiety before and after the implementation of a new testing system. Based on the current statements and evidence, we propose four potential patterns of change, as shown in Figure 1. First, according to the MOE and public expectations in Taiwan, the shift from BCTEST to CAP 1 (from receiving PR values to receiving three levels) can encourage students to concentrate less on their grades, thereby alleviating their test anxiety. In addition, as a result of modifications to the test, the entire social climate will change, and the impact of reducing test anxiety will endure. Under these conditions, variations in test anxiety will follow the pattern depicted in Figure 1, with test anxiety being highest under the old regime and declining steadily under the new regime. The second possibility is that the change in the test system would result in what policymakers anticipated from the beginning: a decrease in test anxiety. However, without significant change in the social climate of examination-based competition and with the continuation of the test system, a new way of comparing test results will evolve. This will restore test anxiety to its prior level, as depicted in Figure 1. In the third condition, the modification of the test system will increase the uncertainty of the students because the test results cannot indicate their school admission, and this increased uncertainty will also increase students’ test anxiety (Chao & Sung, Citation2023). As depicted in Figure 1, the continuation of the test system then leads society to become gradually used to the new method, making students feel less uncertain and thus reducing test anxiety. In the fourth situation, the change in the test system will increase test anxiety owing to the uncertainty it brings to the students, and when the social climate has not changed, this uncertainty will persist, causing test anxiety to remain at a higher level, as illustrated in Figure 1.

Figure 1. Predictions of test anxiety on high-stakes testing change.

Cram schooling phenomenon

The prevalence of cram schooling, in which students attend after-school intensive classes to enhance exam skills, can be viewed as an indicator of the social setting, particularly the social climate in which test scores determine admission decisions. This rooted social climate explains the large number of cram schools in such societies (Bray, Citation2007; Kim & Lee, Citation2010). Kwok (Citation2004) examined the cram schooling phenomenon in Japan, Korea, Taiwan, Hong Kong, and Macau before and after 2000 and found that although formal schools started moving toward education reform, the number of examination-oriented cram schools did not decrease. This ethos persisted even after the turn of the millennium. Byun (Citation2014) indicated that between 2006 and 2007, 80% of Grades 7 and 8 students in Korea attended cram schools. Chung (Citation2013) also reported that the number of cram schools in Taiwan increased continuously from 2002 to 2011. Park and Lee (Citation2021) found that students’ parents are also affected by the HST climate: they would move their families or buy a house to enable their children to access superior cram schooling opportunities.

Notably, cram schooling may not have a considerable effect on academic performance. Kuan (Citation2011), for instance, revealed that attending a math cram school has only a minor effect on students’ math performance. However, the cram school industry remains booming. In Taiwan, Grade 9 students attended cram school for an average of three days a week, and those with more frequent attendance were also those with the highest test anxiety (Chao & Sung, Citation2019a). Students’ test anxiety and the cram schooling phenomenon may be products of the social climate that values test results.

Taken together, students’ cram schooling frequency may reflect society’s overall examination culture and can be an indicator of the overall social climate. If society continues to regard excellent test results as a guarantee of an individual’s future success (Howell, Citation2017) and promotes competition between individuals and perfection, then cram schools will continue to thrive. Conversely, if society begins to regard test results as merely a means to assess a student’s learning results or as an indicator of the effectiveness of the nation’s education policy, then the habit of attending cram schools may diminish, resulting in a reduction in students’ time spent in cram schools. Therefore, we also examined whether the time spent at cram schools declined following the test system reform.

Gender and test anxiety

Regarding gender differences in test anxiety, research has consistently revealed that test anxiety is higher among females than males (Hembree, Citation1988; von der Embse et al., Citation2018). Although the gender difference in test anxiety is a replicable phenomenon, little evidence has been reported on its origins (Putwain & Daly, Citation2014). Cassady and Johnson (Citation2002) suggested that the reason for this difference is that girls tend to overestimate environmental risks. From this viewpoint, girls’ higher anxiety would reflect their greater sensitivity to stressful contexts. Indirect evidence reveals that although girls report higher anxiety, they perform as well as boys do (Cassady & Johnson, Citation2002; Chao & Sung, Citation2019a; Sung et al., Citation2016), or even better (Chapell et al., Citation2005; Stang et al., Citation2020). Therefore, the gender difference in reporting test anxiety is a ‘constant difference’. We thus assumed that the gender difference would persist before and after the test system reform.

Research objective

Our research addressed two questions: First, has test anxiety decreased after the test system reform, as the MOE intended? Second, has the new test system reduced cram school attendance? We compared test anxiety and cram schooling before and after the test system modification.

We collected data for three years under BCTEST and six years under CAP. We separated the data into three eras. BCTEST's final three years (2011–2013) were the first period. The second and third periods were 2014–2016 (CAP 1), when the CAP was launched, and 2017–2019 (CAP 2), when it matured. We compared test anxiety and cram school attendance between these three periods.

Method

Participants

We used stratified random sampling to recruit Grade 9 students in Taiwan each year starting in 2011. Students were selected from northern, central, southern, and eastern Taiwan at an approximate ratio of 3:2:2:1. Subsequently, according to school size, we sampled 40, 80, or 120 Grade 9 students from each school. We first sampled the schools from each region according to the school’s attributes (e.g. public vs. private) and the proportions of the attributes in the specific region. Next, we randomly sampled Grade 9 students from the selected schools. In terms of post hoc data screening, we excluded students who provided incomplete information, namely, (1) incomplete background information, such as sex and ID number (which can be linked to their BCTEST or CAP scores), (2) incomplete Examination Stress Scale (ExamSS) information, and (3) no subsequent participation in the BCTEST or CAP. During the nine study years, the number of Grade 9 students sampled each year ranged from 1,819 to 9,064, and data were obtained from a total of 46,361 students and then analyzed. Table 1 presents the sex of the students sampled each year. Junior high schools in all regions of Taiwan were included in the sample, and the distribution of the study sample was similar to that of Grade 9 students in Taiwan. In addition, we used data weighting to minimize sample bias. Therefore, our study used a representative sample. The informed consent form was distributed to students and their parents, notifying them that they had been selected to participate in an investigation. Participants were informed that they would be queried regarding their levels of test anxiety. The questionnaire administrators, who had abundant experience in administering questionnaires, were provided with relevant training. 
After completing the questionnaire, the students were informed by the questionnaire administrator about how their data would be used; confidentiality-related matters were addressed, such as the data being processed anonymously.

Table 1. Number of grade 9 students sampled each year.

Research materials

Personal information questionnaire

The Personal Information Questionnaire was presented with ExamSS and included the students’ ID number, sex, age, and cram school attendance. The question on cram school attendance was, ‘During your third year of junior high school, how many days a week did you attend cram school (or receive private tuition)?’ The participants chose from six answers: ‘zero days’, ‘1 day’, ‘2 days’, ‘3 days’, ‘4 days’, and ‘5 days’.

Examination stress scale

We used the ExamSS developed by Sung and Chao (Citation2015). The scale includes three dimensions reflecting adolescents’ test anxiety levels: physiological anxiety response (PA; 10 items), cognitive and behavioral response (CB; eight items), and perceived social expectation and social comparison (SS; nine items). PA refers to students’ anxiety response elicited by examinations. An example questionnaire item is, ‘I experience sleep disorder when I think about entrance examinations’. CB refers to students’ thoughts and behaviors elicited when experiencing test anxiety. An example questionnaire item is, ‘The entrance examination is similar to a battle that I cannot afford to lose’. SS refers to the pressure resulting from parents’ and teachers’ expectations and peer comparisons. An example questionnaire item is, ‘I feel jealous when hearing about other students that have been admitted to elite schools’. The participants used a five-point scale to rate the items: 1 = strongly disagree, 2 = disagree, 3 = partially agree, 4 = agree, and 5 = strongly agree. The points were summed for each subscale, and the scores of all the subscales were summed to generate the total score. The total score of the ExamSS ranged from 27 to 135 points; higher scores indicated greater test anxiety. In this study, Cronbach’s α was .90 for the PA subscale, .87 for the CB subscale, and .89 for the SS subscale. For the total scores, Cronbach’s α was .94, indicating that the ExamSS we used was a stable and acceptable instrument.
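The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the item order (PA first, then CB, then SS) is an assumption, and the reliability function uses the standard Cronbach's α formula rather than any procedure confirmed by the authors.

```python
from statistics import pvariance

# Item counts per subscale as reported in the text (10 + 8 + 9 = 27 items).
SUBSCALES = {"PA": 10, "CB": 8, "SS": 9}


def score_examss(responses):
    """Sum 27 five-point ratings into subscale scores and a total score.

    Assumption: `responses` lists the PA items first, then CB, then SS.
    The real questionnaire's item order may differ.
    """
    if len(responses) != sum(SUBSCALES.values()):
        raise ValueError("expected 27 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each rating must be an integer from 1 to 5")
    scores, start = {}, 0
    for name, n_items in SUBSCALES.items():
        scores[name] = sum(responses[start:start + n_items])
        start += n_items
    scores["total"] = sum(scores[name] for name in SUBSCALES)
    return scores


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item ratings per respondent."""
    k = len(rows[0])
    item_vars = [pvariance([row[i] for row in rows]) for i in range(k)]
    total_var = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

A respondent who answers 1 to every item receives the minimum total of 27, and one who answers 5 to every item receives the maximum of 135, which matches the score range stated in the text.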

BCTEST and CAP

The BCTEST is a standardized test developed in 1998 by the RCPET on behalf of the MOE. It tested six subjects: Chinese, English, mathematics, social studies, science, and essay writing. Items in the first five subjects were of the multiple-choice type, and the scale score of each subject ranged from 1 to 80 points. For essay writing, six score levels were employed, and each level was converted to two scale points. The maximum total scale score was therefore 412 points (5 × 80 + 6 × 2). The RCPET then used the scale score to compute the PR on the basis of all test takers nationwide. Regarding the generation of the test items, the RCPET constructed an item bank for the BCTEST. Each item was reviewed and revised by experts in the subjects or in examination in general. After the pretest and test analysis, items meeting the criteria were included in the item bank. From 2011 to 2013, the total number of items for each subject tested in the BCTEST was as follows: Chinese, 48; English, 45; mathematics, 34; social studies, 63; and science, 58. All subjects had a Cronbach’s α for reliability estimation between .91 and .96, suggesting acceptable reliability.
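As a rough check on the score arithmetic, the following Python sketch reproduces the 412-point maximum and shows one common way to turn a total score into a percentile rank. The function names are hypothetical. Treating essay levels as 1 to 6 and defining the PR as the percentage of test takers scoring strictly below, clipped to 1–99, are assumptions; the text does not state the RCPET's exact procedure.

```python
SUBJECT_MAX = 80        # top scale score per multiple-choice subject
N_SUBJECTS = 5          # Chinese, English, mathematics, social studies, science
ESSAY_LEVELS = 6        # number of essay score levels
POINTS_PER_LEVEL = 2    # scale points awarded per essay level


def max_total_scale_score():
    """Maximum total: five subjects at 80 points plus six essay levels at 2 points."""
    return N_SUBJECTS * SUBJECT_MAX + ESSAY_LEVELS * POINTS_PER_LEVEL


def total_scale_score(subject_scores, essay_level):
    """Add the five subject scale scores and the converted essay level."""
    if len(subject_scores) != N_SUBJECTS:
        raise ValueError("expected five subject scores")
    if any(not 1 <= s <= SUBJECT_MAX for s in subject_scores):
        raise ValueError("subject scale scores range from 1 to 80")
    if not 1 <= essay_level <= ESSAY_LEVELS:  # assumption: levels run 1..6
        raise ValueError("essay level must be between 1 and 6")
    return sum(subject_scores) + essay_level * POINTS_PER_LEVEL


def percentile_rank(score, all_scores):
    """Illustrative PR: percent of test takers scoring below, clipped to 1..99."""
    below = sum(1 for s in all_scores if s < score)
    pr = 100 * below // len(all_scores)
    return min(99, max(1, pr))
```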

The CAP was launched in 2014 to replace the BCTEST. Unlike the BCTEST that provides norm-referenced test results, the CAP expresses test results using criterion-referenced grading. When students receive their CAP test reports, their performance on each subject is described as Mastery (Level A), Basic (Level B), or To Be Enhanced (Level C). The subjects tested in the BCTEST and CAP are similar. CAP involves essay writing, Chinese, English (reading), English (listening), mathematics, social studies, and science. After replacing the BCTEST in 2014, the CAP became the largest academic achievement test for junior high schools. As with the BCTEST item bank, each item to be added to the CAP item bank is subjected to a rigorous process. Moreover, some items are specially designed using the common-item method that enables test equating between the BCTEST and CAP. According to the RCPET, each subject tested in the CAP had a test reliability higher than .91, which is similar to the reliability of the BCTEST. In our study, we requested that the RCPET convert students’ CAP results into a scale score used in the BCTEST and provide the PR of each student. Finally, we used the scale scores and PRs of students who took the BCTEST or CAP from 2011 to 2019 as their academic achievement index.

Procedure and data analysis

The ExamSS was conducted consistently between the beginning of March and the end of April each year. Moreover, the BCTEST and CAP have been uniformly held nationwide on a weekend in May every year. Both the BCTEST and CAP lasted two days and followed a rigorous procedure for test taking and test administration.

For the data analysis, we divided the students into three periods: BCTEST, CAP 1, and CAP 2. These periods referred to the final period of the BCTEST system, the initial period of the transition from BCTEST to CAP, and the mid-stage of the transition, respectively. We calculated students’ test anxiety in each of the three periods for female students, male students, and the overall sample. Subsequently, students in these three periods were divided into 10 groups according to their PRs, and the test anxiety level was calculated for each group. We conducted a regression analysis to examine the trend in test anxiety across academic achievement levels and to ascertain whether any linear or nonlinear effect was present (Chao & Sung, Citation2019a). Finally, we performed analysis of variance (ANOVA) and post hoc comparison; test anxiety and cram school attendance (days) were the dependent variables, and the three periods and 10 academic achievement groups were the independent variables. The goal was to identify any significant differences between group means. To analyze the data, we used JASP 0.14.1 (JASP Team, Citation2020).
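The grouping step can be illustrated with a short Python sketch. The analysis itself was run in JASP; the code below is only a hypothetical re-expression of the idea. The group boundaries (PR1–9, PR10–19, up to PR90–99) are an assumption inferred from the group labels used in the Results, and the record layout `(period, pr, examss_total)` is invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def pr_group(pr):
    """Map a percentile rank (1..99) to one of ten achievement groups.

    Assumption: groups are PR1-9, PR10-19, ..., PR90-99.
    """
    if not 1 <= pr <= 99:
        raise ValueError("PR must be between 1 and 99")
    low = (pr // 10) * 10
    return f"PR{max(low, 1)}-{low + 9}"


def group_means(records):
    """Mean ExamSS total for each (period, achievement group) cell.

    `records` is an iterable of (period, pr, examss_total) tuples.
    """
    buckets = defaultdict(list)
    for period, pr, score in records:
        buckets[(period, pr_group(pr))].append(score)
    return {key: mean(values) for key, values in buckets.items()}
```

The resulting cell means are the kind of quantities that a period × achievement-group comparison operates on.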

Results

Changes in students’ test anxiety level

We first examined the average test anxiety of the students during the three periods (Figure 2 and Table 2). In general, test anxiety was the lowest in the BCTEST period. In the CAP 1 period, the average student test anxiety increased by approximately one-fifth of the SD (Cohen’s d = .19). In the CAP 2 period, the average student test anxiety decreased slightly (Cohen’s d = .10). In terms of the effect of sex, the girls’ test anxiety was consistently higher than the boys’ across all three periods, with girls’ scores higher by approximately one-third of the SD (ω² for sex = .02). When considering the standard error of the means, all differences were statistically significant, both between the three periods and between boys and girls.
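For readers unfamiliar with the effect size reported here, the following Python sketch computes Cohen's d with a pooled standard deviation. This is the textbook formula; the text does not specify which variant was used, and the function name and sign convention (second group minus first) are choices made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference (group_b minus group_a), pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_b) - mean(group_a)) / sqrt(pooled_var)
```

Under this definition, d = .19 means that the two period means differ by about one-fifth of the pooled standard deviation, which is how the text describes the CAP 1 increase.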

Figure 2. Examination Stress Scale (ExamSS) total scores of students. BCTEST = Basic Competence Test; CAP = Comprehensive Assessment Program.

Table 2. Descriptive statistics for ExamSS across three periods.

Test anxiety and academic achievement

We also grouped the students according to their academic achievement (i.e. their PRs on the BCTEST or CAP). We eventually obtained ten groups of students with different academic achievement levels (Chao & Sung, Citation2019a; Sung et al., Citation2016). Subsequently, according to the group data, we calculated the ExamSS mean for each group (Figure 3 and Table 3).

Figure 3. Examination Stress Scale (ExamSS) total scores according to different levels of learning achievement. BCTEST = Basic Competence Test; CAP = Comprehensive Assessment Program; PR = percentile rank.

Table 3. Test anxiety and cram school attendance frequency of different academic achievement groups in the three periods.

Regarding the test anxiety trend by academic achievement level, we conducted a cubic regression for each of the three periods. The results indicated that both linear and quadratic trends were statistically significant across the three periods, but the cubic trend was only statistically significant for the CAP 1 period. According to the linear coefficients (Table 4), higher academic achievement was correlated with higher test anxiety in all three periods (β = 0.94, 0.38, and 0.90; ps < .01). As for the quadratic coefficient, a significant inverted-U shape pattern was observed in both the BCTEST and CAP 2 periods (β = −0.92 and −0.67; ps < .05). However, in the CAP 1 period, although the quadratic coefficient was positive (β = 0.51; p = .03), the cubic coefficient was negative (β = −0.76; p < .01). This finding reflected the trend that the curve at this stage did not turn downward until the end (Figure 3 and Table 3).
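To make the trend analysis more concrete, the sketch below fits a cubic polynomial to ten group means by ordinary least squares. It is a minimal illustration under stated assumptions: the function name is hypothetical, the groups are coded 1 to 10 and centered, and the output is a set of raw polynomial coefficients, not the standardized β values reported in Table 4. It does not reproduce the JASP analysis.

```python
def fit_polynomial(xs, ys, degree=3):
    """Least-squares polynomial fit via the normal equations.

    Returns [b0, b1, ..., b_degree] for y ~ b0 + b1*x + ... + b_degree*x**degree.
    """
    n = degree + 1
    # Augmented normal-equation matrix [X'X | X'y].
    aug = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Assumption: the ten achievement groups are coded 1..10 and centered at 5.5
# to keep the normal equations well conditioned.
group_codes = [g - 5.5 for g in range(1, 11)]
```

In such a fit, a positive linear coefficient combined with a negative quadratic coefficient corresponds to the inverted-U shape described above, in which anxiety rises with achievement and then falls again.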

Table 4. Cubic regression of PR groups on ExamSS scores among three periods.

The preceding analysis indicated that, across all three periods, changes in test anxiety according to students’ academic achievement had both linear and nonlinear characteristics. We further examined the means of test anxiety of the three periods and the 10 academic achievement groups using ANOVA and post hoc comparisons (Table 5). A comparison of the curves for the three periods revealed that test anxiety did not change significantly among students whose academic achievement was in the bottom 49% (PR1–49). For higher academic achievement groups starting from PR50–59, students’ test anxiety differed significantly between the CAP 1 and BCTEST periods: students’ test anxiety was greater in the former than in the latter period. During the CAP 2 period, students in the PR70–79 and PR80–89 academic achievement groups exhibited lower test anxiety compared with their counterparts in the previous period. As displayed in Figure 3, during the BCTEST period, test anxiety peaked among students in the PR50–59 and PR60–69 groups. During the CAP 1 period, students in the PR70–79 group had the highest test anxiety. Somewhat similarly, students in PR60–69 and PR70–79 had the highest test anxiety during the CAP 2 period. Together, the results indicated that the test system reform was associated with changes in students’ test anxiety, with differences depending on academic achievement.

Table 5. ANOVA and Tukey’s post hoc tests of test anxiety and days in cram school.

Changes in cram school attendance

For the average number of days per week that students spent in cram schools, the ANOVA results showed significant effects of period, academic achievement, and their interaction. In the post hoc comparison, we observed similar results across the three periods. We also found that students with the lowest academic achievement spent the fewest days in cram schools, and cram school attendance increased along with academic achievement. For students with higher academic achievement (≥ PR50), cram school attendance plateaued and no longer increased even when academic achievement increased further. For students in the PR90–99 group, cram school attendance decreased. A comparison of the three periods indicated that students in the PR90–99 group spent significantly more days in cram school during the CAP 1 and CAP 2 periods than during the BCTEST period.

Figure 4. Average number of days per week that students attended cram school, by level of learning achievement. BCTEST = Basic Competence Test; CAP = Comprehensive Assessment Program; PR = percentile.

Discussion

Impacts of changing the test system

We investigated the HST faced by Grade 9 students in Taiwan and the changes in students’ test anxiety after the test system reform. We collected data from Grade 9 students from 2011 to 2019 and divided the nine years into three periods: the BCTEST, CAP 1, and CAP 2 periods. We obtained three major findings. First, girls experienced more anxiety than boys across all three periods, in line with the assumption that the gender difference in reported test anxiety is a ‘constant difference’. Second, although the effect sizes were small, students experienced the lowest test anxiety in the BCTEST period; their test anxiety increased in the CAP 1 period but then decreased in the CAP 2 period. This trend is consistent with the prediction presented earlier and suggests that students’ perceived uncertainty is likely to affect test anxiety. Third, students’ test anxiety in all three periods differed depending on their academic achievement. Test anxiety increased along with academic achievement. However, beyond a certain academic achievement level, such as PR60–69 in the BCTEST period and PR70–79 in the CAP 1 and CAP 2 periods, test anxiety decreased. The evidence indicated that at the beginning of the test system reform, students’ test anxiety increased, which may be attributed to uncertainty regarding school admission. Some time after the reform, students’ test anxiety decreased, possibly because the public had become accustomed to the new system or because the social climate had changed. For example, people may have grown more willing to attend a high school in their neighborhood rather than a prestigious high school.

Although the study results suggested that test anxiety in the BCTEST period was the lowest, our study only covered the final three years of the BCTEST system, which had been administered for a decade. We also found that test anxiety was lower during the CAP 2 period than during the CAP 1 period. The conclusion here does not suggest that the BCTEST system should be reintroduced, although such a policy has a strong group of supporters. Indeed, some people even prefer the test system from the previous century (i.e. the test system before BCTEST), under which test results were completely linked to an individual’s school ranking. Nonetheless, policymakers are keen on adopting the most advanced education ideas during the test system reform.

The key point of our study is that the test system reform affects not only students but also society as a whole, and relevant impacts may be reflected in students’ adaptation to daily living and test anxiety. Although the effect size of the increase in student test anxiety was small, with a Cohen’s d of only .19, if such an increase is translated into the cost that society has to pay, such as the cost of the school counseling system or the social costs of student deviant behavior, the total could be substantial. Our results clearly indicated that students with upper-intermediate academic achievement were the primary group affected by test anxiety. A possible reason is that under the CAP system, students whose academic achievement is at an intermediate or higher level experience more uncertainty than their counterparts did under the BCTEST system. Under the BCTEST system, test results and school admission were more closely related, and because the BCTEST system had been employed for 10 years, the public tended to be less uncertain about the academic achievement required by particular senior high schools. When the CAP system was implemented, the entire society was forced to move from PR-based test results to the use of Levels A, B, and C for each test subject, and the previously established link between PRs and schools was no longer valid. We found that coarsening the scores into criterion-referenced CAP levels under the new admissions policy did not reduce the anxiety caused by the previous norm-referenced exams. Moreover, for students at the boundary between the Mastery (Level A) and Basic (Level B) thresholds, test anxiety seemed to be even greater, as one item could make or break a student’s Level A standing. Students at this boundary may spend more time preparing for exams and miss many opportunities to learn through extracurricular activities. This is especially true for those with upper-intermediate academic achievement (PR60–69, PR70–79, and PR80–89). Nonetheless, Taiwanese society seems to have gradually adapted to the CAP system. For example, using the school entrance results of the previous year, students can recognize how CAP results are linked to schools. People also began to select a senior high school close to where they lived, an idea proposed by the MOE. Thus, students with upper-intermediate academic achievement eventually experienced less stress.
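For readers unfamiliar with the effect size mentioned above, Cohen’s d expresses a mean difference in units of the pooled standard deviation. The sketch below is a generic illustration, assuming the common pooled-variance formula; the function name `cohens_d` and the toy scores are hypothetical, and the example does not reproduce the study’s reported value of .19.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Toy scores (NOT study data): means differ by one pooled standard deviation.
later_period = [2.0, 3.0, 4.0]
earlier_period = [1.0, 2.0, 3.0]

d = cohens_d(later_period, earlier_period)
```

By the conventional benchmarks, a d of about 0.2 is considered small, which is why the argument in the text rests on the aggregate social cost rather than on the size of the individual-level effect.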

Cram school attendance and social climate

We also examined whether the frequency of students’ cram school attendance, an important indicator of the social climate, changed as a result of the test system reform. Our results indicated that average cram school attendance did not change much across the three periods and was associated with students’ academic achievement: the greater the academic achievement, the greater the number of days spent in cram schools. The exception was the group with the highest academic achievement, whose attendance frequency was slightly lower. This result is consistent with the findings of Chao and Sung (Citation2019a). The failure of the test reform to affect the frequency of cram school attendance suggests that cram schooling is a culture that is resistant to change: once such a culture is formed, convincing the public to stop sending children to cram school is challenging.

Another assumption is that sending children to cram schools is associated with parents’ belief that longer learning hours mean greater competitiveness. In such a competitive climate, parents are unlikely to initiate the change (i.e. reduce their children’s cram school hours) unless they know that others would do the same. Such an action carries some risk: parents fear that their children may fail because they are the only ones spending less time in cram schools. This risk may be driving parents’ anxiety in the examination culture. Therefore, even if the test system reform can reduce students’ test anxiety, reducing the time children spend in cram schools is challenging because of this parental anxiety. Chao and Sung (Citation2019a) indicated that students with upper-intermediate academic achievement may still feel uncertain about their competitiveness even after dedicating more effort, time, and money to cram schooling and studying. Our results are consistent with this phenomenon. Therefore, the uncertainty faced by students who have invested considerable time and money in cram schools and studying should be addressed. A stable testing system can help mitigate this uncertainty. Another possibility is enhancing students’ self-exploration and promoting inclusive school entrance policies.

Examination culture and anxiety

The Programme for International Student Assessment (PISA) is an international evaluation conducted by the Organisation for Economic Co-operation and Development (OECD) that assesses the abilities of 15-year-old students in mathematics, science, and reading across different countries. Taiwan, along with other East Asian regions like Japan, South Korea, Singapore, Vietnam, Hong Kong, and China, often ranks well in PISA, demonstrating the strong academic achievement of students in these countries, which can also be seen as a reflection of cultures that highly value education. However, high academic achievement comes at a cost. For example, the OECD (Citation2023) published the results of PISA 2022, in which Taiwan ranked third, fourth, and fifth in mathematics, science, and reading, respectively. Nonetheless, the level of anxiety among Taiwanese students is also high: among countries scoring above the OECD average in mathematics, Taiwan is surpassed in anxiety only by Japan and Spain (OECD, Citation2023). This phenomenon highlights the socio-educational atmosphere of East Asian countries, which includes a strong link between academic success and future career prospects, the competitiveness of school environments, societal and familial expectations, and a lack of attention to students’ emotional well-being. This situation reflects the complex interplay of cultural, educational, and societal factors, and such an atmosphere is difficult to change in the short term. The findings of this study resonate with this phenomenon. Rather than waiting for a shift in the societal atmosphere, educators can act, and the government can allocate resources, to reduce the competitiveness of the school environment and enhance emotional care for students.

Research contributions and limitations

Our study used data collected over nine consecutive years to examine whether students’ test anxiety and cram school attendance changed after the standardized test system reform. The results have implications for end-of-secondary-school testing systems worldwide and provide useful information for policymakers, especially those looking for a system that uncouples test results from school admission. Toward such an aim, policymakers must consider which students would be more affected by the uncoupling. For example, students who have upper-intermediate academic achievement (at the boundary of the thresholds) may feel more uncertainty. Furthermore, policymakers could consider adopting supplementary measures to lessen the impact.

For test system reform policies, the authorities should prioritize actions that gradually change the social climate. One option, for instance, is to implement a reward system that encourages students and parents to choose a senior high school close to where they live, to ease the social pressure to pursue elite schools. Another approach is to pilot a new system in certain areas or run the old and new systems concurrently for several years before implementing a large-scale transition, to minimize the impact of switching abruptly from one test system to another.

Meanwhile, Sung et al. (Citation2017) reported that more than half of Taiwanese junior high school students do not have clearly defined interests, highlighting the need for counseling policies. If students lack a clear vision of their talents and interests, then the results of large-scale examinations will become their only reference for determining which school to attend in the future. Once large-scale examination results are detached from school admissions, then students will have to use other types of reference. Therefore, policymakers should invest more resources in students’ career exploration before reforming the test system to help students understand their aptitudes and interests.

In terms of research limitations, our study has both internal and external validity issues. Regarding internal validity, our study used data from nine consecutive years to examine the relationships among students’ test anxiety, cram school attendance, and academic achievement. Although the amount of data was large, the number of variables was small compared with other studies. Therefore, although we revealed the changes in test anxiety and cram school attendance before and after the test system reform, we could not identify the psychological process that caused the changes. We could infer that students may be affected by the social atmosphere. For example, as mentioned above, test results previously corresponded closely to admission to a desired high school. When this correspondence becomes fuzzy, students become uncertain about which high school they can enter, and this increase in uncertainty increases test anxiety (Chao & Sung, Citation2023). Nonetheless, our study had no actual data to show that students whose test anxiety was rising also felt strong uncertainty. Future research may be able to transform the current ‘slender’ data (many participants, few variables) into ‘stubby’ data (fewer participants, many variables). That is, it is not necessary to collect data from such a large number of people over such a long time. Rather, data collection can be designed with a small sample but a large number of psychological variables, to explore the causes of changes in test anxiety in more detail.

In terms of external validity, our study focused on the situation in Taiwan, including the social atmosphere of its examination culture and the historical context of its test system reform. Therefore, the results may not generalize directly to other settings. We recommend further research to inform other countries’ decisions on reforming their test systems. For example, in this study we used the ExamSS, which was developed with Taiwanese students, to elucidate the situation of Taiwanese students. For examining changes in the test anxiety of students in other countries, the ExamSS may not be a suitable tool unless its psychometric properties are re-examined, as was done for the Turkish version of the ExamSS (Karaduman & Kilmen, Citation2018). Nonetheless, our research illustrated that the effect of changes in a test system on students’ test anxiety can be examined empirically. In the future, we look forward to similar examinations of students’ test anxiety in countries that are seeking test system reforms. These countries may also adopt policies similar to those mentioned above and examine whether the impact on students can be reduced.

Conclusion

This study assessed the effects of Taiwan’s test system reform on students’ test anxiety. The beginning of the test system reform was accompanied by an increase in students’ test anxiety, which later declined. We also found that the exam system reform did not appear to alter Taiwan’s cram schooling culture. Policymakers and educators should reflect on these phenomena and take the following into account: (1) responding in advance to changes induced by system reform, and (2) devoting more resources to career counseling for students. If the test system reform has little or no influence on the social environment, then stakeholders might explore ways of modifying the overall social climate of the examination culture. These suggestions are founded on the empirical data presented in our study. We hope that scholars interested in and concerned about adolescent test anxiety will collaborate to address the issues raised by our study and provide input to policymakers, educators, and the public.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the Ministry of Science and Technology, Taiwan under Grant [109-2511-H-008-013].

Notes on contributors

Tzu-Yang Chao

Tzu-Yang Chao, Ph. D., is an Assistant Professor in the Graduate Institute of Learning and Instruction, National Central University. His research interests include measurement theory, students' life adjustment and emotions, teacher education, and the traits of high-performing teachers.

Yao-Ting Sung

Yao-Ting Sung, Ph. D., is a Chair Professor in the Department of Educational Psychology and Counseling, National Taiwan Normal University. His research interests include computer assisted testing, psychological and educational testing, and career information analysis and application.

Fen-Lan Tseng

Fen-Lan Tseng, Ph. D., is the vice Deputy Director of the Research Center for Psychological and Educational Testing, National Taiwan Normal University. Her research interests include measurement theory, entrance examination system, item bank construction, and learning assessment.

References

  • Abu-Alhija, F. N. (2007). Large-scale testing: Benefits and pitfalls. Studies in Educational Evaluation, 33(1), 50–68. https://doi.org/10.1016/j.stueduc.2007.01.005
  • Adams, S. K., Mushkat, Z., & Minkel, J. (2021). Examining the moderator role of sleep quality in the relationship among test anxiety, academic success and mood. Psychological Reports, 125(5), 2400–2415. PMID: 34134557. https://doi.org/10.1177/00332941211025268
  • Bray, M. (2007). The shadow education system: Private tutoring and its implications for planners (2nd ed.). International Institute for Educational Planning.
  • Byun, S. (2014). Shadow education and academic success in Republic of Korea. In H. Park & K. Kim (Eds.), Korean education in changing economic and demographic contexts. Education in the Asia-Pacific region: Issues, concerns and prospects (Vol. 23). Springer. https://doi.org/10.1007/978-981-4451-27-7_3
  • Cassady, J. C., & Johnson, R. E. (2002). Cognitive test anxiety and academic performance. Contemporary Educational Psychology, 27(2), 270–295. https://doi.org/10.1006/ceps.2001.1094
  • Chao, T. Y., & Sung, Y. T. (2019a). An investigation of the reasons for test anxiety, time spent studying, and achievement among adolescents in Taiwan. Asia Pacific Journal of Education, 39(4), 469–484. https://doi.org/10.1080/02188791.2019.1671804
  • Chao, T. Y., & Sung, Y. T. (2019b). Examination stress and personal characteristics among Taiwanese adolescents: A latent class approach. Journal of Research in Education Sciences, 64(3), 203–235. https://doi.org/10.6209/JORIES.201909_64(3).0008 (in Mandarin)
  • Chao, T. Y., & Sung, Y. T. (2023). Testing of the uncertainty-of-stress model: Development of the adolescents’ uncertainty scale for Taiwanese adolescents. The Asia-Pacific Education Researcher, 32(4), 531–544. https://doi.org/10.1007/s40299-022-00674-1
  • Chapell, M. S., Blanding, Z. B., Silverstein, M. E., Takahashi, M., Newman, B., Gubi, A., & McCann, N. (2005). Test anxiety and academic performance in undergraduate and graduate students. Journal of Educational Psychology, 97(2), 268–274. https://doi.org/10.1037/0022-0663.97.2.268
  • Chen, C., Liu, P., Wu, F., Wang, H., Chen, S., Zhang, Y., Huang, W., Wang, Y., & Chen, Q. (2023). Factors associated with test anxiety among adolescents in Shenzhen, China. Journal of Affective Disorders, 323, 123–130. https://doi.org/10.1016/j.jad.2022.11.048
  • Chung, I. F. (2013). Crammed to learn English: What are learners’ motivation and approach? The Asia-Pacific Education Researcher, 22(4), 585–592. https://doi.org/10.1007/s40299-013-0061-5
  • Hembree, R. (1988). Correlates, causes, effects, and treatment of test anxiety. Review of Educational Research, 58(1), 47–77. https://doi.org/10.3102/00346543058001047
  • Howell, A. (2017). ‘Because then you could never ever get a job!’: Children’s constructions of NAPLAN as high-stakes. Journal of Education Policy, 32(5), 564–587. https://doi.org/10.1080/02680939.2017.1305451
  • Hutchings, M. (2015). Exam factories? The impact of accountability measures on children and young people. The National Union of Teachers website https://www.teachers.org.uk/files/exam-factories.pdf
  • JASP Team. (2020). JASP (Version 0.14.1) [Computer software].
  • Karaduman, B., & Kilmen, S. (2018). Sınav Stresi Ölçeğinin Türkçeye Uyarlanması ve Ölçme Değişmezliğinin İncelenmesi (Adaptation of the examination stress scale into Turkish and examination of measurement invariance). Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 9(2), 101–115. https://doi.org/10.21031/epod.330426
  • Kim, S., & Lee, J.-H. (2010). Private tutoring and demand for education in South Korea. Economic Development and Cultural Change, 58(2), 259–296. https://doi.org/10.1086/648186
  • Kuan, P. Y. (2011). Effects of cram schooling on mathematics performance: Evidence from junior high students in Taiwan. Comparative Education Review, 55(3), 342–368. https://doi.org/10.1086/659142
  • Kwok, P. (2004). Examination-oriented knowledge and value transformation in East Asian cram schools. Asia Pacific Education Review, 5(1), 64–75. https://doi.org/10.1007/BF03026280
  • Lee, M., & Larson, R. (2000). The Korean ‘examination hell’: Long hours of studying, distress, and depression. Journal of Youth and Adolescence, 29(2), 249–271. https://doi.org/10.1023/A:1005160717081
  • Madaus, G., & Russell, M. (2010). Paradoxes of high-stakes testing. Journal of Education, 190(1-2), 21–30. https://doi.org/10.1177/0022057410190001-205
  • OECD. (2023). PISA 2022 results (Volume I): The state of learning and equity in education. PISA, OECD Publishing. https://doi.org/10.1787/53f23881-en
  • Park, J., & Lee, S. (2021). Effects of private education fever on tenure and occupancy choices in Seoul, South Korea. Journal of Housing and the Built Environment, 36(2), 433–452. https://doi.org/10.1007/s10901-020-09773-1
  • Pedulla, J. J., Abrams, L. M., Madaus, G. F., Russell, M. K., Ramos, M. A., & Miao, J. (2003). Perceived effects of state-mandated testing programs on teaching and learning: Findings from a national survey of teachers. Center for the Study of Testing, Evaluation, and Educational Policy, Boston College.
  • Putwain, D. (2008). Do examinations stakes moderate the test anxiety–examination performance relationship? Educational Psychology, 28(2), 109–118. https://doi.org/10.1080/01443410701452264
  • Putwain, D., & Daly, A. L. (2014). Test anxiety prevalence and gender differences in a sample of English secondary school students. Educational Studies, 40(5), 554–570. https://doi.org/10.1080/03055698.2014.953914
  • Segool, N. K., Carlson, J. S., Goforth, A. N., Von Der Embse, N., & Barterian, J. A. (2013). Heightened test anxiety among young children: Elementary school students’ anxious responses to high‐stakes testing. Psychology in the Schools, 50(5), 489–499. https://doi.org/10.1002/pits.21689
  • Segool, N. K., von der Embse, N. P., Mata, A. D., & Gallant, J. (2014). Cognitive behavioral model of test anxiety in a high-stakes context: An exploratory study. School Mental Health, 6(1), 50–61. https://doi.org/10.1007/s12310-013-9111-7
  • Stang, J. B., Altiere, E., Ives, J., & Dubois, P. J. (2020). Exploring the contributions of self-efficacy and test anxiety to gender differences in assessments. Paper presented at the Physics Education Research Conference 2020. arXiv preprint arXiv:2007.07947.
  • Sung, Y. T., & Chao, T. Y. (2015). Construction of the examination stress scale for adolescent students. Measurement and Evaluation in Counseling and Development, 48(1), 44–58. https://doi.org/10.1177/0748175614538062
  • Sung, Y. T., Chao, T. Y., & Tseng, F. L. (2016). Reexamining the relationship between test anxiety and learning achievement: An individual-differences perspective. Contemporary Educational Psychology, 46, 241–252. https://doi.org/10.1016/j.cedpsych.2016.07.001
  • Sung, Y. T., Huang, L. Y., Tseng, F. L., & Chang, K. E. (2014). The aspects and ability groups in which little fish perform worse than big fish: Examining the big-fish-little-pond effect in the context of school tracking. Contemporary Educational Psychology, 39(3), 220–232. https://doi.org/10.1016/j.cedpsych.2014.05.002
  • Sung, Y.-T., Cheng, Y.-W., & Hsueh, J.-H. (2017). Identifying the career-interest profiles of junior-high-school students through latent profile analysis. The Journal of Psychology, 151(3), 229–246. https://doi.org/10.1080/00223980.2016.1261076
  • Sung, Y.-T., Chou, Y.-T., Wu, P.-Y., Lin, H.-S., & Tseng, F.-L. (2010). A reflection of school-based assessment on the extended open admission program in Taiwan. Journal of Research in Education Sciences, 55(2), 73–113. https://doi.org/10.3966/2073753X2010065502003 (in Mandarin).
  • Thomas, C. L., Cassady, J. C., & Finch, W. H. (2018). Identifying severity standards on the Cognitive Test Anxiety Scale: Cut score determination using latent class and cluster analysis. Journal of Psychoeducational Assessment, 36(5), 492–508. https://doi.org/10.1177/0734282916686004
  • Tseng, F.-L., You, Y.-X., Tsai, I.-F., & Chen, P.-H. (2019). A pilot study of the washback effect of the incorporation of English listening test in the comprehensive assessment program for junior high school students. Journal of Research in Education Sciences, 64(2), 219–252. https://doi.org/10.6209/JORIES.201906_64(2).0008 (in Mandarin)
  • United States Congress (107th, 1st session: 2001). (2001). No child left behind act of 2001: Conference report to accompany H.R. 1. U.S. Government Printing Office.
  • von der Embse, N. P., Jester, D., Roy, D., & Post, J. (2018). Test anxiety effects, predictors, and correlates: A 30-year meta-analytic review. Journal of Affective Disorders, 227, 483–493. https://doi.org/10.1016/j.jad.2017.11.048
  • von der Embse, N. P., Schultz, B. K., & Draughn, J. D. (2015). Readying students to test: The influence of fear and efficacy appeals on anxiety and test performance. School Psychology International, 36(6), 620–637. https://doi.org/10.1177/0143034315609094
  • Wuthrich, V. M., Belcher, J., Kilby, C., Jagiello, T., & Lowe, C. (2021). Tracking stress, depression, and anxiety across the final year of secondary school: A longitudinal study. Journal of School Psychology, 88, 18–30. https://doi.org/10.1016/j.jsp.2021.07.004