EDUCATIONAL ASSESSMENT & EVALUATION

Student strategies when taking open-ended test questions

Article: 1877905 | Received 04 Oct 2020, Accepted 04 Jan 2021, Published online: 22 Jan 2021

Abstract

Assessments are common in undergraduate classrooms, with formats including multiple-choice and open-ended questions (in which students must generate their own answers). While much is known about the strategies that students use when taking multiple-choice questions, there has yet to be a study evaluating the strategies that students employ when answering open-ended test questions. The current study used phenomenography and an open-ended online survey to solicit strategies from 416 students in an upper division reproductive physiology class across two quarters. Fifteen different categories of student strategy were identified across four broad themes: formulating an answer, writing a response, time management, and question details. A detailed account of each of the 15 categories is presented, along with an assessment of how each category is distinct from, and similar to, the others. The results of the current study could have important implications for teaching, as several of the strategies reported by students are likely not in line with those desired by course instructors.

PUBLIC INTEREST STATEMENT

Test and exam questions can have various formats, like multiple-choice, true-false and open-ended. One unique feature of open-ended questions is that students must generate the answers themselves, as opposed to recognizing a correct answer. While much is known about student strategies during multiple-choice exam questions, no study has evaluated what strategies students are using when they are answering open-ended questions. The current study used an online survey to ask students in an upper division physiology class to self-report the type(s) of strategies they use when answering open-ended test questions. Fifteen different categories of strategy were discovered, which were binned into four broad themes: formulating an answer, writing a response, time management, and question details. A detailed account of each of the fifteen categories is presented. The results suggest that several strategies students employ are likely not in line with those desired by course instructors.

1. Introduction

Assessments such as exams are an integral part of many undergraduate classes. While serving the important role of summative assessment for our students, midterm examinations also provide an opportunity for formative assessment, in which students receive feedback about their performance in the class to date (see: Ambrose et al., Citation2010). Assessments can also be used by instructors to gauge student learning, identify student misconceptions and inform their pedagogical practices based on assessment results (Tanner & Allen, Citation2004).

The formats of assessments deployed in post-secondary classrooms are varied and include multiple-choice, open-ended, fill-in-the-blank, true/false and essay questions, with multiple-choice exams being among the most commonly used in college biology classrooms and on standardized tests including Advanced Placement exams and the Medical College Admissions Test (Momsen et al., Citation2013; Zheng et al., Citation2008). The strategies that students use during multiple-choice exams have been investigated and include: error avoidance, elimination of incorrect responses, identifying correct options, checking responses, predicting, time management, and guessing (Millman et al., Citation1965; Prevost & Lemons, Citation2016; Towns & Robinson, Citation1993). Understandably, many of these strategies involve recognition and evaluation of correctness. Research into student strategies on multiple-choice exams is important because it shows instructors that some of the strategies being used by students may be different from those intended by the instructor. For example, guessing is likely not the cognitive strategy the course instructor had in mind when writing a particular question.

In contrast, on open-ended (or “free-response”) questions—which range from fill-in-the-blank to essay questions—students must generate answers themselves. This act of retrieving information is known to be a powerful driver of content retention (Carrier & Pashler, Citation1992; McDaniel & Masson, Citation1985). Indeed, open-ended exam questions have been shown to lead to greater learning of material compared to multiple-choice questions, so long as students are provided feedback (Kang et al., Citation2007). At least some of the potential benefits of open-ended test questions might be due to their impact on student preparation and studying. The type of assessment used can determine the quantity, approach and depth of material that is studied by students (Scouller & Prosser, Citation1994). Students tend to study at a depth that they believe will be consistent with the depth of the exam (Entwistle & Entwistle, Citation1992; Ross et al., Citation2006). Multiple-choice exams have been found to lead students to use study approaches that emphasize surface features of the material, whereas open-ended questions tend to drive students to study the information in greater depth (Thomas & Bain, Citation1984; Watkins, Citation1982).

Like with multiple-choice exams, it is conceivable that students are engaged in behaviours that are different than those anticipated by the instructor when open-ended questions are employed. For example, a professor might pose a question asking students to analyse some data involving a prominent pathway (a higher-order cognitive skill; Bloom, Citation1956; Crowe et al., Citation2008), and the student might respond by writing out the pathway that they had memorized, in the form of a “memory dump” (a lower-order cognitive skill; Bloom, Citation1956; Crowe et al., Citation2008). The student might get (at least) partial credit for the question, despite the fact that they had not indicated any ability to answer the question in a more conceptual manner. Indeed, in a recent study evaluating whether assessments measure student understanding, there was a similar amount of mismatch between understanding and question grade on open-ended (26%) and multiple-choice (28%) questions (Sato et al., Citation2019). This suggests the possibility that students may not be engaging with open-ended questions in the manner intended, much like with multiple-choice tests.

If open-ended test questions are designed with the intentions of having students use higher-order, conceptual thinking, then it is important to identify whether the behaviours and strategies that are sought are actually happening. This leads us to our research questions:

Do students use conceptual thinking when answering open-ended questions on exams? If not, what sort(s) of strategies do they employ during open-ended question answering?

The results of this study could have important implications for teaching: for example, the need to provide more guidance to our students on how to answer open-ended questions, to allow them to practice answering open-ended questions, or to provide them opportunities to receive feedback on their responses to open-ended questions. This study could also be important for constructing and grading open-ended test questions: for example, one may need to consider constructing questions that do not reward students for engaging in behaviours that do not reflect understanding of course content (eg: a “memory dump” in lieu of analysis).

To conduct a study evaluating student strategies during open-ended question-taking, we needed a qualitative research method that would allow us to identify these strategies without imposing any of our own biases, ideas or categories on the research. One such methodology is phenomenography.

1.1. Phenomenography

Phenomenography is a qualitative research method that was first described by Ference Marton in 1981, when reflecting on earlier work examining how first-year university students each had different understandings of the same passage (Marton, Citation1981; Marton & Saljo, Citation1976). Phenomenography is used to understand how different people experience a particular phenomenon. A concept, event or phenomenon (in our case, the open-ended questions of an exam) can be experienced differently by different individuals. For example, students’ answers fell into two broad categories when asked about the forces acting on a car driving along a road (Johansson et al., Citation1985), while five different categories of belief about how sight occurs were identified among high school biology students (Andersson & Karrqvist, Citation1981, retrieved from Marton, Citation1986).

In phenomenography, it is the interaction between the material being learned, the learner and the environment that contributes to the experience (Entwistle, Citation1997). The researcher is trying to understand the thoughts that the learner has, as the learner draws on their unique individual experiences (Entwistle, Citation1997). This is in contrast to phenomenology, in which the researcher seeks to clarify the structure and meaning of a phenomenon (Giorgi, Citation1999). A phenomenographic study is not trying to identify the “correctness” of students’ understanding of the phenomenon, but rather how different individuals might experience the same phenomenon (Marton, Citation1986). The goal is to move beyond a student’s description of an event and understand their underlying meaning (Entwistle, Citation1997).

The methodology of a phenomenographic study has been described elsewhere (see: Han & Ellis, Citation2019; Marton, Citation1986). In brief: data are collected, often via in-person interviews, although this can also be done with surveys (Han & Ellis, Citation2019), drawings (Wenestam, Citation1982, taken from Marton, Citation1986), and even observed behaviour (Marton, Citation2015); the data are then transcribed (if necessary) and categories are developed. Importantly, the categories are not pre-determined in advance; rather, the quotes themselves reveal salient features of the individuals’ beliefs, which then form the basis of the categories. Some quotes are grouped into the same category based on similarities, while categories are separated by differences in the salient features of the quotes. Through an iterative process of making and refining categories, the attributes of each category are eventually made explicit. The goal is to develop categories that capture structurally significant differences in the ways that individuals experience the same phenomenon.

The purpose of the present study was to answer the question: what strategies are students employing when answering open-ended test questions?

2. Methods

2.1. Setting and participants

The participants in this study were from an upper division human reproduction course for biology majors at a large research university in Southern California. The study was performed over two quarters: a summer and a winter quarter. Enrolment in the summer quarter was 83, and enrolment in the winter quarter was 251. One of the authors was the instructor for both courses, and both classes were taught using student-centred learning approaches including clicker questions, worksheets, think-pair-share questions, etc. In both classes, student evaluations included (but were not limited to) two midterms. In the summer course, the midterms were offered at the end of the second and fourth weeks of the five-week class, while the winter quarter had the midterms at the end of the fourth and seventh weeks of the ten-week class. The midterm exams in both quarters were isomorphic (differing only in small details) and consisted largely of open-ended questions. These open-ended questions were generally short answer questions (eg: “If we blocked estrogen receptors in the AVPV nucleus of the hypothalamus (and only there), what would happen to luteinization and ovulation? For each, explain why.”), as opposed to essay style or fill-in-the-blank style questions.

2.2. Data collection

This study was approved by our institutional review board (IRB#: 170886). This IRB covers many faculty/instructors who are engaged in teaching/learning research at our institution. The broad objective of this IRB is for faculty to be able to use data that would normally be collected during a course (eg: test scores; assignments; surveys; overall course grades) for the purposes of understanding student experiences and perspectives, and how pedagogical choices might improve student outcomes. Students are informed at the beginning of the term that any data collected from the normal management of the course—including assessments, surveys and assignments—could be used as part of a research study. This information is disseminated to students through the course syllabus as well as directly in lecture and on the course website, and students are also offered hard copy forms outlining the study details during class time. In accordance with the IRB guidelines, students are informed that personally identifying data will be removed, and any data used for study purposes will be anonymized. Importantly, students are given the opportunity to voluntarily withdraw any of their work, survey responses, etc., from the study at any point by completing the opt-out form provided to them on the course website. Students are explicitly told that withdrawing from the study will not impact their grades, and that their opt-out status will not be shared with the course faculty member until after final grades are posted. Once the course is complete and final grades are posted, any students who opted out have their data removed.

Data were collected using an open-ended survey asking students about their strategies when answering open-ended questions. While this approach prevents one from asking follow-up questions, it allows for the collection of a much wider range of experiences with relative ease (Han & Ellis, Citation2019), which was the ultimate goal of this study. Immediately after both midterms, students were asked to complete a survey using the course’s online platform (Canvas). The survey consisted of five Likert-scale questions that asked students about their experiences of the exam just taken (eg: whether they felt the exam was fair, whether they had enough time, etc) and an optional open-ended question that asked: “Did you have any specific strategies when answering the free-response questions? If so, please describe any strategies you used.” This question was intended to be open-ended, as allowing subjects to include what they believe to be the important features of a phenomenon is a critical element of phenomenographic research (Marton, Citation1986). Completing this last question—from which all data of the current study were derived—was voluntary for the students. Students were given roughly 48 hours to complete the survey. Of the students who completed the survey, the percentage of students who volunteered an answer to the strategy question was 71% (MT1 summer), 87% (MT1 winter), 71% (MT2 summer) and 82% (MT2 winter) (more information below). The survey was completed anonymously, in an attempt to have the students answer as honestly as possible. As the survey was distributed using Canvas’ “survey” function, the authors were unable to trace which students had completed the open-ended survey question.

Of the 83 students registered in the summer section, 49 completed the survey after midterm 1 and, of those 49, 35 provided an answer to the open-ended question about strategies used during open-ended questions. In the winter section, 188 students completed the survey after midterm 1, and 164 of those students provided an answer to the open-ended question about strategies used during open-ended questions for midterm 1. For midterm 2, 68 of the 83 students in the summer session completed the survey and, of those 68, 48 students provided an answer to the open-ended question about strategies used during free-response questions. Two hundred and five of the 251 students in the winter session completed the survey after midterm 2, and 169 of those students provided an answer to the open-ended question about strategies used during free-response questions after midterm 2. Responses from both sections and both midterms were combined, as this study was trying to assess the full scope of different possible strategies students report using when engaging with open-ended exam questions. Combining all of these answers, there were a total of 416 respondents to the question “Did you have any specific strategies when answering the free-response questions? If so, please describe any strategies you used.” Ten of these 416 respondents included answers that were not informative or helpful for our study (eg: “Watching podcasts, flash cards, problem sets, practice midterm” is an example of a student presumably confusing strategies they used while studying for the exam with those used during the exam itself) and were therefore eliminated from the analysis. Several students listed more than one strategy, which resulted in 448 total strategies reported.

2.3. Category development

To categorize student responses, a model described by Han and Ellis (Citation2019) was employed. There were no predetermined categories, nor was there any idea of how many categories might be discovered. The analysis began by simply reading many of the answers students provided to get an idea of the breadth and scope of strategies. This process can be referred to as “familiarization”. Then, the elements of responses that seemed most important were considered (the “reduction and condensation” phase), followed by a “classification” phase in which answers were compared in terms of their similarities and differences, which must be clear and explicit (Entwistle, Citation1997). Indeed, it was important that statements were categorized based on what students actually reported, as opposed to us trying to infer “what they meant”. The process was iterative; if a response ever generated a novel category, the authors would go back through earlier responses to see if any of them might also be “binned” into the new category. When a new category was developed, the category was labelled with a title that clearly articulated the theme of that category, often called “labelling with descriptors”, according to Han and Ellis (Citation2019). The labels of the categories were subject to change as part of the iterative process. Importantly, it was possible for a given statement to be included in more than one category.

Similar to the methodology of Crawford et al. (Citation1994), after an initial meeting of the two authors to develop labelled categories based on roughly 30 statements, each author separately coded another 20 statements using the draft categories. Disagreements in categorization were then discussed and resolved, after which point one person coded all remaining responses.

After category development was finished, the authors noticed that the categories fit broadly into different themes. In the results below, categories are presented as belonging to four different themes.

2.4. Quantification of student implemented strategies

Each category’s frequency was calculated by dividing the number of times that category was mentioned (in total, across the two midterms) by the total number of strategies reported (448). This method allows for over-representation of students who offered more than one strategy (as opposed to a method that quantifies the number of students who used each category). In addition, each student is (potentially) sampled twice because the strategies were pooled across the two midterms, and because strategy information was collected anonymously it was not possible to trace whether a given student kept or altered their strategies between midterms.
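To make this calculation concrete, the short Python sketch below shows one way such frequencies could be computed once responses have been coded into category labels. It is not part of the study’s analysis; the responses and labels shown are illustrative only, and the actual study divided category counts by the 448 pooled strategies.

    from collections import Counter

    # Toy input: each element is one coded survey response; a response may carry
    # more than one strategy category (labels here are illustrative only).
    coded_responses = [
        ["include only important information"],
        ["drawing", "brainstorm in writing"],   # one student, two strategies
        ["mental processing"],
        ["include as much information as possible"],
    ]

    # Pool every strategy mention across all responses (in the study, responses
    # were pooled across both midterms and both course sections).
    all_strategies = [label for response in coded_responses for label in response]
    total_strategies = len(all_strategies)  # 448 in the actual study

    # Frequency of a category = mentions of that category / total strategies reported.
    frequencies = {category: count / total_strategies
                   for category, count in Counter(all_strategies).items()}

    for category, frequency in sorted(frequencies.items(), key=lambda item: -item[1]):
        print(f"{category}: {frequency:.1%}")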

3. Results

Responses were collected from 416 students, from which there were 448 strategies, which were divided into 15 categories. These categories seemed to fit into four broad themes: formulating an answer—in which students described the processes through which they arrived at an answer; writing a response—in which students described their methodology in physically scribing their answers onto the paper; time-saving measures—in which students reported techniques they consciously employ to save time during the test; and question details—in which students mentioned focusing on elements of the question as their strategy. In Table 1, each of the categories is listed within the four themes along with representative statements for each. Below, each category is described within the theme to which it belongs, along with example statements and the characteristics of each statement that caused it to be binned into its respective category.

Table 1. Student Strategies When Taking Open-Ended Test Questions

3.1. Formulating an answer

The categories in this theme involved students describing the processes they use while organizing and preparing their answers. These processes often occur before the student actually begins to answer the question. In many of these statements, it seemed that the students were engaged in a form of “memory dump”, so that they had information to fall back on in the event that they lost their train of thought. The categories herein include processes that are entirely mental/psychological, those that involve writing down information before attempting to answer questions, and those that involve drawing.

The category “brainstorm in writing” had to include writing something down, and the information being written down could not be a “drawing” or “diagram”. It also had to be clear that the writing was happening prior to actually attempting to answer the question, otherwise their comments would be counted under one of the “writing a response” theme categories (below). Students often mentioned writing down notes or ideas on the margins or back of the test page as a way to collect their thoughts. The following statement illustrates this point:

“For some questions, I wrote notes on the side of the question to organize my thoughts before writing the answer.”—Statement 153

In this example, the student has explicitly said that they write things down and that this happens before answering the question. That the student has said “notes” and not “drawing” separates this statement from one that might be included in the category “drawing” (see below). Similarly, in the statement:

“I first jot out notes and then write out my response”—Statement 109

The student has explicitly indicated that they are writing notes (not drawings) and this is happening before they begin to write their answer to the question.

The second category in the theme of “formulating an answer” is “drawing”. To be included in this category, the students had to mention that what they were putting onto the paper was either a diagram, image, drawing or even a flowchart with arrows, as opposed to writing words or notes, which separates this category from “brainstorm in writing”. All statements that included reference to drawing fell into the theme of “formulating an answer” and not “writing a response”. For example, in the statement:

“Usually I begin answering by drawing a picture or an arrow sequence of what is happening before I put it into words”—Statement 76

It is clear that this was part of the student’s preparation for answering the question, not something they did while answering the question (… “before I put it into words”). Also, the student is mentioning both a picture and an arrow sequence—both of which would be sufficient for inclusion in this category in their own right. In another example, the student is explicit about using pathways on the side of the page to inform what will become their answer:

“It’s easy for me to picture pathways or certain facts and write them down on the side so I get an idea of what I know about a certain hormone or pathway. Then I can use this information to construct an answer that includes the facts while also making my best assumption for the answer.”—Statement 43

While it is unclear whether the student refers to the drawing of the pathway in their answer, it is clear that the drawing was part of the preparation before they answered the question. Because the student also explicitly mentioned writing facts on the side of the page, this statement was binned into the “brainstorm in writing” category as well.

As the name suggests, “write normal then deviate” involves students writing out a pathway or system as it normally occurs, before considering what the question is asking (many of our questions involve pathological states or fictitious scenarios in which certain cells or receptors are not working). These statements appeared to be a type of “memory dump”, in which the students put down everything they know about a given system so that they can turn to that information when actually answering the question(s). In this way, “write normal then deviate” is a component of the student formulating their answer. As an example, consider the following:

“I would explain the way the correct pathway should work, and then address the change in the pathway that was mentioned in the test question.”—Statement 174

The student indicates that writing out the normal situation is part of their process of formulating an answer rather than the answer itself (indicated by the “and then” portion of their statement). In this case, the normal pathway is included in their written answer (as opposed to being written in the margins or on the back of a page), but the authors believed that this element was an important part of the student formulating an answer. In another example:

“If the question asks what happens as a consequence when something abnormal happens in the body, I first write down what normally happens. And from there I can explain what happens when something is abnormal”—Statement 179

The distinction between writing out “what normally happens” and their answer to the question is clearer here. The difference between statements in this category and “brainstorm in writing” is that the students were explicit about what they were writing: the normal circumstances of the system in question. It is conceivable that some statements that were categorized as “brainstorm in writing” were notes about how a system normally works, but in the absence of a student saying that, these statements could not be coded that way. The authors felt compelled to include “write normal, then deviate” as its own category because it offers insight into a specific behaviour that would not be clear if these comments had been binned into the “brainstorm in writing” category. Since this study was trying to capture the breadth of student strategies, the authors felt that keeping this category separate from “brainstorm in writing” was reasonable.

The category “write as much as possible before answering” is notable in that students had to mention that they wrote down everything they could think of for a given topic before they attempted to answer the question. For example:

“I write everything that I know down before I write my response”—Statement 8

That the student indicates that they are including “everything [they] know” but that this “everything” is not actually part of their answer separates this comment from one that would be included in the category “include as much information as possible” (see below for a description of that category). This category also differs from the “brainstorm in writing” category in that students in “brainstorm in writing” do not explicitly report including everything they can think of for a given topic. In this way, the “write as much as possible before answering” category could be considered a sub-category of the broader “brainstorm in writing” category. In another example:

“I would try and write about everything that pertains to that topic first and make little bullets. Then I would read over it again and analyze what I know with what the question is asking and then write down my answers.”—Statement 124

We again see that the student takes the time to write down everything they know about a topic before they even start to consider what the question is actually asking.

The final category in the theme of “formulating an answer” is “mental processing”. At its simplest, in this category students told us how they think about a problem. Students reported thinking about or visualizing an answer or pathway, and all processes herein were focused on internal processing rather than externally representing material. For example:

“I try to visualize the pathways in my head.”—Statement 9

Here, the student is not writing (as with the “brainstorm in writing” category) or drawing (as with the “drawing” category) the pathways involved on paper, rather all processing is happening in their mind. In another example:

“I would run through the whole pathway in my head and sometimes would draw it out then erase it after if necessary.”—Statement 122

This statement is an example that was coded into two categories: “mental processing” and “drawing”. The student reported thinking about the pathway mentally—without writing anything down—which we coded as “mental processing”. The student then mentioned that they drew the pathway. That they erased it afterward suggests that the student engaged in these drawings as a way to organize and process their thoughts, which allowed the statement to be categorized in the “formulating an answer” theme. It is certain that all students engage in “mental processing” at some point when answering exam questions. However, statements could not be considered in this category unless students explicitly mentioned that this mental processing was a conscious part of their strategy for answering open-ended questions.

3.2. Writing a response

The theme “writing a response” involves students reporting strategies they used while physically writing their answers to the questions. Largely, students told us about how much information they chose to include in their answers: either limiting their answers to important/relevant information, or writing longer answers that include as much information as possible.

In the category “include as much information as possible”, students reported that they included everything they could think of as part of their answer. This category is distinct from “write as much as possible before answering” in that here, the students are writing all they can as part of their answer. In order to be added to this category, students had to express that they were intentionally including as much information as they could. Consider the following example:

“Just writing as much as possible with the amount of time we had for the exam.”—Statement 75

Presumably, students in this category are including as much information as possible as a way to maximize the number of points they receive on their exams (the instructor did not penalize for excessive information and there were no word limits on the exams), despite the fact that this action might cause them to run out of time. Another example is more illustrative:

“My method for taking the test was to answer the questions thoroughly and with as much detail as I could. This might have backfired because i was rushing to finish and ended up not completing two questions as best i could but just wrote anything i could down. I tried to answer each question as well as I could by getting to the main points but I often rambled on.”—Statement 18

Here, the student acknowledges including as much information as possible (ie: “ … just wrote anything I [sic] could down.”), and then they acknowledge that this strategy may not have been appropriate for a timed exam. It is as if they understood that their approach was sub-optimal, but that they could not help including more and more information in their responses. It is conceivable that when students include everything that they can think of, they are—in a way—brainstorming in writing. That is: they may simply be writing out their inner monologue on their way to developing an answer. But since these statements did not describe this writing as part of a formulating process, but rather as an intentional part of their answer to the questions, we categorized them in the theme of “writing a response”.

In contrast, another category in this theme is “include only important information”, in which students attempted to limit their answers to what (they believed) were the most important pieces of information needed to completely answer a question. For example:

“I tried putting down key points because if I tried putting in too much info I would run out of time.”—Statement 25

This student seems to have realized that including lots of information runs the risk of running out of time. It is important to note that we did not have a way of checking whether students who report including only important information actually included information that was important. It might be that the student above included several pieces of information that were not actually relevant, or that important pieces of information were excluded from their answer. Nevertheless, students who reported focusing on what they believed was only the important information were included in this category. In another statement:

“Trying to be precise and concise when answering the questions.”—Statement 93

The student explicitly mentions their intentions of only including important pieces of information in their answers, and it is clear that this is part of their answer, rather than them processing or formulating their answer.

A third category that developed in the “writing a response” theme was “answering similar to discussion section”. At the institution at which the research was conducted, lecture classes have corresponding discussion sections that are organized by faculty and run by instructional assistants (IAs). In this particular class, students were given old exam questions to attempt at the beginning of section, followed by answering the questions in a group setting. After this, the IAs share the answers to the questions, modelling how to answer questions correctly and concisely. While we imagine that students who report answering in a manner similar to discussion section imply they are including only important information, the students have not said so explicitly. For that reason, we felt compelled to have a unique category for “answering similar to discussion section”, because we are unsure exactly what that means for these students. For example:

“Answered questions in the way they were answered on discussion sheets”—Statement 154

This is one instance in which the chosen method of soliciting student feedback (an online form) is limiting. The inability to ask follow-up questions means that this particular student cannot be probed to find out what exactly they mean. This comment might well have been placed into a different category (eg: include only important information) with some clarifying follow-up questions.

The final category in the “writing a response” theme is “point value indicates quantity”. In this category, students reported including a quantity of information that was dependent on the point value of the question. Consider the two examples below:

“I look at how many points the question is worth and try to provide enough information that correlates to how many points are available.”—Statement 71

“I look at the number of points awarded and try to estimate how the question will be graded and make sure I have everything complete.”—Statement 178

In both of these examples, the point guide is being used to ensure that the students are including enough information (as opposed to ensuring they are not including too much). This category is almost like a cross between “include only important information” and “include as much information as possible”, where the student wants to ensure they are including everything important, but is perhaps unsure how much information is actually important to the instructor. Instead of writing as much as they can, they stop when they have included the number of elements required to get full points. Presumably, the students have a way of rank-ordering which elements are most important, so they know what to include and exclude in their answers.

3.3. Time-saving measures

In the time-saving measures theme, students reported using techniques that they believe save time during the exams. Whether the approaches mentioned actually save time is not something that was measured in this study. However, that students are consciously engaging in strategies to manage their time has led to these statements being grouped into this theme.

The category “transcription methods” includes statements in which students report using different strategies to reduce the amount of time that it takes to write their answers. These strategies included (but are not limited to): using bullet points; shorthand; abbreviations; and arrows to indicate increasing or decreasing effect. Consider the following examples:

“It saves time to just bullet point all the important information”—Statement 1

“I used bullet format and shorthand (arrows, abbreviations, etc.) instead of writing in complete words and sentences.”—Statement 153

This category is unique in that students had to report strategies that focused on the elements of their writing that save time. Whereas students in the “include only important information” category might have (for example) written in complete sentences while talking about important points, students in this category may have (for example) written in bullet points. Whether they were talking about important or unimportant information is another issue altogether.

In the category “order of questions answered”, students reported selecting specific questions to answer first, rather than simply answering questions in the order they were presented. Most of the students who reported using this strategy suggested that they targeted questions for which they (believed they) knew the correct answers, before going on to questions they perceived as more difficult. Consider the examples below:

“When answering the free response questions, I focused on the set of FR questions I knew the answers to first and then went back to the others to work them out.”—Statement 30

“Answer the questions I had a definite answer for first and come back to the others later.”—Statement 50

Strategies were binned into this category if the student made any explicit indication that they engage with questions in an order that differs from the order of presentation.

The fourth and final theme that emerged from our survey was “question details”, in which students reported focusing on elements of the question. While the other themes dealt with students’ strategies for formulating responses, writing their answers down, and developing methods to save time, strategies in “question details” involve students reporting focusing on, or even annotating, the question.

In the category “analyze what the question is asking”, students report that they want to ensure that they understand what is being asked of them. Presumably, this is a first step for every student who answers a test question, and we also presume that all students in this category engaged in strategies involved in answering the questions once they knew what was being asked. But only students who reported on this process—of trying to figure out what was being asked of them—were binned into this category. The major difference between this category and “mental processing” is that for “mental processing”, students report engaging in thoughts about their answer, or how they will answer the question. In “analyze what the question is asking”, students are focusing on understanding the question. It is as if the students need to understand what is being asked of them before they can move on to the process of “mental processing”. For example:

“Make sure to understand the question before trying to answer it.”—Statement 21

The student is reporting that the process is focused on the question before they move on to worrying about what the answer may be. In another example:

“Read the question at least twice to make sure it is clear what is being asked, helps avoid confusion.”—Statement 116

The student is clear that they feel the need to understand the question in order to avoid confusion that may result if things are unclear.

The second category in the “question details” theme is “focus on the question vocabulary”. In this category, students report using key words or phrases in the question or its stem to inform their thought processes. For example:

“I look for key words (vocabulary) and use the vocabulary to guide my answers.”—Statement 130

“I tried looking for key phrases in the question itself like ‘increased adenylyl cyclase’; that way I knew like ‘oh cAMP would increase’, etc.”—Statement 178

Both of these students report that certain terms in the question acted as prompts for them to start thinking about specific elements or pathways important for the question/answer. In a way, “focus on question vocabulary” is like “analyze what the question is asking”, only these students are making explicit reference to elements of the question’s vocabulary; that is, they are reporting behaviours that are more specific than those in the “analyze what the question is asking” category. If a student reported focusing on key components/words of the question or stem, their statement was binned into this category.

The third category of the “question details” theme is “annotate question”. Statements in this category all mentioned efforts by students to highlight or underline certain elements of the question or stem. For example:

“First I underlined the key concepts that the question was asking. The [sic], I tried to write the keywords and diagrams that are related first before answering.”—Statement 131

This statement was binned into two categories: the student underlined “key concepts” (although it is unclear to us what those concepts might be), so we categorized the statement as “annotate question”. Further, the student mentioned writing keywords and diagrams prior to answering the question. For this, we binned the statement into “drawing”. In another example:

“I like to underline parts of the question to make sure I don’t forget to include parts!”—Statement 187

In this case, the student is using underlined elements of the question as a guide to avoid omitting important information from their answer. This is similar to some of the students in the “focus on question vocabulary” category, in which elements of the question are used to ensure that the students do not omit important information in their answers. The difference between this category and “focus on the question vocabulary” is the specific mention of annotating/altering/highlighting some element of the question itself. In this way, “annotating” could be considered a sub-category of “focus on question vocabulary”. The authors found it particularly interesting that students reported the annotation of the question as an important strategy that they used, so this category received its own unique label.

The final category in the “question details” theme is “ask for clarification”, in which students report asking the instructor for help if they were unsure about any element of the question or answer. For example:

“I raised my hand if I wasn’t sure whether or not to go into more detail for some of the problems”.—Statement 206

The reason that this student wanted help was that they were unsure how much information was required for a complete response. In another example:

“The questions [asked were] not so clear, so [I] took lot of time just to understand the question, so my strategy was to ask the TA for clarification”—Statement 56

In this example, the reason the student asked for assistance was that they did not understand the question. In this case, the student also reported taking time to understand what the question was asking, which was also binned as “analyze what the question is asking”.

3.4. Quantification of reported strategies

In quantifying the strategies that students reported, “include only important information” was the most frequent strategy employed, representing 20.3% (91/448) of all strategies reported. Other categories that were frequently reported were “drawing” (63/448 = 14.1%), “mental processing” (56/448 = 12.5%) and “include as much information as possible” (54/448 = 12.1%). Some categories were very infrequent, including “point total suggests quantity of information” (5/448 = 1.1%) and “answering similar to discussion section” (1/448 = 0.2%). A breakdown of the quantification of each category can be found in Table 2.

Table 2. Quantification of student strategies

4. Discussion

To our knowledge, this is the first study to evaluate student strategies employed while answering open-ended (or free-response) test questions. The qualitative methodology of phenomenography was used to survey hundreds of students in an upper division physiology course across two quarters. Fifteen different categories of strategies were identified, which fell into four themes. Previous research on student strategies while answering test questions has focused on multiple-choice exams (Kim & Goetz, Citation1993; Prevost & Lemons, Citation2016; Towns & Robinson, Citation1993), or strategies that students employ while preparing to take exams (Hartwig & Dunlosky, Citation2012; Karpicke et al., Citation2009; Kornell & Bjork, Citation2007; Ross et al., Citation2006). Below, some of the more interesting results are discussed, as well as the implications of the work for teaching and learning.

4.1. Strategies that students employ

Students in this study reported behaviours that ranged from planning and processing events prior to answering questions, to describing the quantity of information they decided to include in their answers. Clearly, students are not only engaging in the single strategy that they report here. Every student who includes “as much information as possible” in their answers has to have gone through some “mental processing” prior to writing their answers, and each person who reported using “analyze what the question is asking” almost certainly puts pen to paper in order to provide an answer to the question. Below is a brief discussion of each of the themes of categories that students reported using.

4.1.1. Formulating an answer

The most popular category in this theme was “drawing”. Drawing is a strategic process that involves the external representation of an internal idea. Drawings and diagrams are common throughout scientific disciplines and can help promote learning (see: Ainsworth et al., Citation2011; Quillin & Thomas, Citation2015). When students generate their own diagrams summarizing written passages, performance on subsequent tests is significantly higher than that of control groups who write summaries of the same passage (Gobert & Clement, Citation1999). Perhaps, then, it is not surprising that many of the students in this study reported drawing as a way to structure their ideas and knowledge before they began to answer the question. What was surprising was that all of the students who mentioned drawing either explicitly or implicitly said that this was part of their process during answer development, and not (necessarily) part of their actual answer. It is conceivable that students may have made reference to their drawings in their answers to test questions, but it is not certain that this ever happened.

The authors were surprised that students would use a strategy like “writing down as much as possible before answering” a question during a timed exam. While this strategy might be a good way of ensuring the answer to the question contains all the information the student feels is important, that is only true if the student has unlimited time. If the student is going to write down everything they can think of, it would be prudent to include all of those elements in their answer (in our class there are no word limits or point deductions for verbosity). Going to the trouble of writing everything down, reading it, and then picking the salient points from their notes to include in their answer would certainly cut into the time they have to take the exam. A simple correction to this behaviour, putting all of that information directly into the answer space (akin to the “include as much information as possible” category), would make more sense during a timed exam.

Other categories in the theme of “formulating an answer” (“brainstorm in writing”, “mental processing” and “write normal, then deviate”) were not particularly surprising. Jotting down notes in page margins has been reported when students take multiple-choice tests (Kim & Goetz, Citation1993). All students are engaged in mental processing throughout any test-taking procedure (although for some of our students, this was the only strategy they reported). And “write normal, then deviate” seems a reasonable approach to physiology questions, in which the question is often presented as a pathology of the system. Anecdotally, the course instructor did suggest this approach to answering questions on the first day of class in all quarters involved.

4.1.2. Writing answers

There seems to be a split between the students who attempt to include only important information and those who choose to include as much as they can. It is curious that such large numbers of students are engaging in activities that are polar opposites of each other. Because students were not interviewed in person, and since the data were collected anonymously, there was no opportunity to follow up on any of the answers provided to find out more about their rationale, their test-taking background, or how they performed on the exams. For example, perhaps students who write down as much as they can think of have been taught to do that previously, or they utilize this strategy as a way to maximize points without penalty. Perhaps students with more experience answering open-ended questions on a timed exam prefer to include only important information and move on to the next unanswered question. Future studies might use semi-structured interviews to solicit students’ opinions in order to determine why students are using the strategies they use.

The strategy of “including as much info as possible” could be a good strategy provided the student has unlimited time (which was not the case for our students) and there are no deductions for length. Presumably, students are engaging in this behaviour to maximize the number of points they receive when there is no penalty for verbosity. However, writing in a manner that introduces redundancy has been shown to dilute the essential message being conveyed (Krifka, Citation2002), and may not be an optimal strategy for communicating in writing (Hsia, Citation1977). In addition to redundancy, this strategy also allows for the inclusion of irrelevant, or even incorrect, information in student answers. Unless there are penalties for such practices, there is nothing discouraging them from happening. In the current study, there was no penalty for inclusion of irrelevant or incorrect points, which may have encouraged this “memory dump” practice among the students. If nothing else, this practice might encourage students to answer questions in a much more superficial manner than is desired. Previous studies have suggested that metacognition can improve students’ habits, test-taking methods, and subsequent scores (Baird, Citation1986; Siegesmund, Citation2016). Perhaps metacognitive exercises that encourage students to consider whether all the information they are writing is helpful to their answers might help them become more concise and communicate more effectively.

The category “including only important information” seems an effective way for students to demonstrate their knowledge of the subject while not spending time on unnecessary details. One of the limitations of this study—because of the way student responses were collected—is that it is unknown whether the students who claim to only include important information are actually including important information, and only important information. After all, novice learners have difficulty separating important conceptual pieces of information from superficial features of problems (Chi et al., Citation1981; Smith et al., Citation2013). A follow-up study might assess whether those who claim to include “only important information” are actually doing so, and (if not) whether an intervention might be designed to help our students become more expert-like in identifying (and including) important information in their answers to open-ended questions.

One category stood out for how few students used it: “point total suggests quantity of information”. It represented only 1.1% of all strategies reported, even though it seems to us a reasonable proxy for the quantity of information to include in an answer. A simple but potentially effective implication of this finding is that instructors might consider directing their students to this strategy (eg: if a question is worth 2 points, one probably needs to provide two pieces of information in the answer). Cueing students has been effective at producing desired behaviours in the past (Knight et al., Citation2015), and might warrant consideration here.

4.1.3. Time-saving measures

Two categories of behaviours were reported in which students attempt to save time: “order of questions answered” and “transcription methods”. The approach of first answering questions for which students (believe they) know the answer (or alternately, skipping questions about which students are unsure) has been well documented in the multiple-choice literature (Hong et al., Citation2006; McClain, Citation1983; Millman et al., Citation1965; Rindler, Citation1980; Stenlund et al., Citation2018). In many of these studies, use of this strategy varies with achievement level. Often, high-performing students use the “skipping” strategy more than low-performing students (Kim & Goetz, Citation1993; McClain, Citation1983; Stenlund et al., Citation2018). Rindler (Citation1980) found this approach was more popular with “middle ability” students (GPA = 2.20 to 2.79) than with “high” or “low ability” students. Interestingly, Rindler found that mid-performing students who did not skip questions outperformed those who did, but that high-performing students who skipped some questions outperformed high-performing students who did not skip. In the current study, several students reported focusing on questions for which they knew the answers and then turning to the questions about which they were less certain, consistent with studies on multiple-choice exams. However, this study did not assess whether this strategy was more popular with high- or low-performing students.

The category of “transcription methods” may not have precedent in the student-strategies-during-tests literature. This is hardly surprising, as the bulk of the previous work has focused on multiple-choice exams, where students do not typically write their answers out. In the current study, the instructor often uses abbreviations, shorthand or symbols when writing out answers to questions, and encourages the students to do the same during their tests, so this strategy was not unexpected. While just over 8% of students reported using this strategy, it is not known whether the number of students who actually use it is even higher.

4.1.4. Question details

Several students in our study reported that their strategy was determining what the question is asking. Virtually every student must have done this at some point in generating their responses, but only some mentioned it as a conscious strategy they employ during test-taking. While it may seem obvious that all students must determine what is expected of them, it has been suggested that students should consciously employ this strategy when engaging with test materials, as part of a broad error-avoidance strategy (Millman et al., Citation1965). Similarly, Millman et al. (Citation1965) suggest that asking for clarification if something is unclear is another error-avoidance strategy that students should invoke, and the current study had some (n = 3) students report using this strategy as well.

Regarding students annotating different elements of questions: students taking multiple-choice tests often make markings on, or beside, the questions and answer options (Kim & Goetz, Citation1993). Indeed, underlining or annotating text increases students’ ability to recall the marked information (compared to information that was not marked), even if that information is not of particular importance (Nist & Hogrebe, Citation1987). Whether the annotating/underlining done by the students in this study actually improved their test scores is not known. It is also unknown whether the information students underlined was actually important (although the instructor of the course involved in the current study attempts to keep test questions as concise as possible, without including information that might be considered “unnecessary”). Nevertheless, several students in this study reported using this technique. Presumably, students binned into the category “focus on vocabulary” are also focusing on key words in the question or stem as a way to avoid error. While they did not explicitly state that they annotate elements of the question, their focus on particular elements serves the same purpose.

4.2. Strategies on multiple-choice questions versus open-ended questions

Many of the strategies employed by students in our study echo those reported by others during multiple-choice exams. Specifically, the categories brainstorm in writing (Kim & Goetz, Citation1993; Towns & Robinson, Citation1993), mental processing (Prevost & Lemons, Citation2016), order of questions (Hong et al., Citation2006; McClain, Citation1983; Millman et al., Citation1965; Rindler, Citation1980; Stenlund et al., Citation2018), analyze what the question is asking (Millman et al., Citation1965; Stenlund et al., Citation2018), annotate the question (Kim & Goetz, Citation1993; Nist & Hogrebe, Citation1987), and drawing (Hong et al., Citation2006) have all been reported as strategies employed during multiple-choice exams. Perhaps it is not surprising that the bulk of the categories not reported in the multiple-choice literature involve students writing their own responses. For example, “include only important information”, “include as much as possible” and “transcription methods” are not realistic strategies to use on a multiple-choice exam.

Conversely, several strategies that are frequently reported in the multiple-choice literature did not appear in our study. The absence of some of these strategies is not surprising. For example, “guessing” and “eliminating” are often reported in the multiple-choice literature (Millman et al., Citation1965; Stenlund et al., Citation2018; Towns & Robinson, Citation1993), but these strategies are not possible on open-ended questions, as there are no options provided from which students might guess or eliminate. Predicting is a strategy that students, particularly higher-performing students (McClain, Citation1983), employ during multiple-choice tests (McClain, Citation1983; Prevost & Lemons, Citation2016), yet it was not reported by students in the current study. Although this strategy was never explicitly stated, it may have been an important part of some of the strategies we did observe, such as “write normal, then deviate”. Perhaps the process of writing normally allows the student to predict what they believe will happen when the pathway is broken. In this way, the student’s written response becomes their predicted answer to the question.

Studies of multiple-choice test-taking strategies have often separated strategies into those that are specific to the field in question and those that are more general. Millman and colleagues (Citation1965) described test-taking strategies that were independent of the discipline. They referred to these strategies collectively as “test-wiseness”: procedures one could implement during a test to improve their outcome, regardless of the field of study. These strategies included: omitting items that resist a quick response (a time-saving measure); clearly determining the nature of the question (error avoidance); and eliminating incorrect or impossible options (deductive reasoning). Similarly, Prevost and Lemons (Citation2016) studied test-taking strategies on multiple-choice exams through the lens of domain-specific versus domain-general strategies. Like “test-wise” strategies, domain-general strategies are general thinking tools that could be used by any test-taker across many disciplines; examples include brainstorming, analyzing visual representations, and asking for clarification when a question is unclear. In contrast, domain-specific strategies required knowledge of biology: if a strategy drew on content knowledge from the course, it was considered domain-specific. Examples of domain-specific strategies include predicting, recalling facts, and checking one’s answer. We believe that most of the categories in our study would likely be considered “domain general” or “test-wise”. All of the categories in the question details and time-saving themes, as well as “brainstorming” and “mental processing”, have been described as “test-wise” by others (Millman et al., Citation1965; Prevost & Lemons, Citation2016; Towns & Robinson, Citation1993). The categories “drawing”, “write normal, then deviate” and “answer like discussion section” would likely be domain-specific, as each requires knowledge of the course content. It is unclear whether the categories “write as much as possible” and “include only important information” would be considered domain-general or domain-specific. The information included would certainly be specific to the course, but the actions described could be performed by any student in any course; for example, one could imagine a student in a chemistry class writing as much as possible about a given topic. Nevertheless, it was never our goal to ascribe the categories derived in the current study to general versus specific domains.

4.3. Educational implications

Many of the categories observed (eg: include as much information as possible; include as much as possible before answering; the paucity of students using the point total as a barometer for content) suggest that students may have very little experience writing answers to open-ended questions. Instructors who use open-ended questions might therefore provide some guidance about practices that improve performance and help students manage their time. Perhaps an approach like cognitive apprenticeship, in which the instructor or teaching assistant models how to answer questions with an emphasis on certain strategies over others (Anderson, Citation1993), could lead to improved test-taking strategies and outcomes. Suggestions might include: focusing on the most important pieces of information, using abbreviations or shorthand (if appropriate), and using the point total for the question as a gauge of the quantity of information required.

It would also be important for students to practice answering open-ended questions repeatedly and to receive feedback on their performance. This could happen during class and/or discussion sections. One option might be to have students practice answering a particular question and then engage in peer review with their classmates. If the professor were to provide a rubric for students (eg: attending to details that are important, and highlighting details that do not contribute anything of substance to the answer), not only would each student receive valuable feedback, but the act of providing feedback for peers might also improve their own answers to open-ended questions (Liu et al., Citation2002).

The current study also has implications for the way that open-ended questions are created and graded. Several of the students in our study reported using strategies that amount to “memory dumps”. Of particular note is the category “include as much information as possible”, in which students intentionally fill the answer space with anything they can think of; this category accounted for 12% of all responses. That students might be including irrelevant or incorrect information in their answers (as a result of regurgitating memorized information) without penalty is an important consideration. For some students, it appears that open-ended questions, much like multiple-choice questions, are not always engaged with in the manner intended by the instructor. It is as if these students are answering the question without understanding the material. If we want our students to engage in higher-order, conceptual thinking during the exam, we need to devise ways to encourage those types of processes and discourage lower-order cognitive processes. Ideally, questions would be constructed so that “memory dump” answers do not receive any points; that is, questions for which regurgitation of memorized facts is of no value. For example, students might build, draw or develop a graph, plan or experiment to address a novel problem based on what they have learned. Instructors might also consider awarding no points for answers that appear to be regurgitated from class notes, or introducing a word limit in the question stem to discourage verbosity.

While it was beyond the scope of this study to determine whether there was a disconnect between student responses and their grades on the test (ie: where students receive high marks for answers that do not reflect conceptual understanding), others have found that this phenomenon can occur (Hubbard et al., Citation2017; Sato et al., Citation2019). It is important for instructors to be aware of the strategies that students are using on open-ended questions, and to consider how they might mitigate any discrepancies between their desired behaviours (ie: higher-order cognitive thinking) and those in which the students are actually engaged.

4.4. Limitations and future studies

The approach taken in this study was to survey a large number of students in order to gain an understanding of the breadth of the strategies students employ. To do this, an online survey was deployed, which allowed a large number of responses to be collected quickly and easily. Open-ended questionnaires like ours are not an uncommon way to collect data in phenomenographic studies (Crawford et al., Citation1994; Loughland et al., Citation2002; Marton & Saljo, Citation1976; see: Han & Ellis, Citation2019). Having said that, this method has the shortcoming of preventing one from asking follow-up questions. There were several cases in which a follow-up question would have been helpful in identifying the meaning of a student’s comment. For example, one student explained that they tried to model their exam answers on what was done during discussion sections, but the authors were not sure exactly what was meant by that: were they trying to be concise? Were they modifying the way they write (ie: using bullet points)? A future study might employ semi-structured interviews with students to allow some probing when answers are ambiguous.

Another issue that arose after the completion of the current study was the desire to tie reported strategies to student performance. That is: do “high-performing” students engage in one particular type of strategy more than “low-performing” students? A future study might consider probing students’ reasons for engaging in one particular strategy over another, as well as tracking the strategies of high- and low-performing students.
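One way such a follow-up analysis could be organised is to pair each student’s coded strategy category with their exam score and cross-tabulate the two. The sketch below is purely illustrative: the pandas library, the column names, the category labels and the toy scores are our own assumptions and are not drawn from the study’s methods or dataset.

```python
# Hypothetical sketch only: the column names, category labels and scores below are
# illustrative and are not taken from the study's data.
import pandas as pd

# Toy data standing in for coded survey responses paired with exam results
responses = pd.DataFrame({
    "strategy": [
        "brainstorm in writing", "include as much as possible",
        "order of questions answered", "brainstorm in writing",
        "include as much as possible", "focus on vocabulary",
    ],
    "exam_score": [88, 72, 91, 84, 69, 90],
})

# Split students into "low" and "high" performers at the median exam score
median_score = responses["exam_score"].median()
responses["performance"] = responses["exam_score"].apply(
    lambda s: "high" if s >= median_score else "low"
)

# Proportion of each strategy within each performance group
print(pd.crosstab(responses["strategy"], responses["performance"], normalize="columns"))
```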

A final limitation of the current study pertains to the relationship between what students report and what they actually do. When students report including “only important information”, it is unclear whether the information they included is actually important. Undergraduate students have a tendency to treat superficial features as important, and those features may not actually help their test performance. In future work, associating each student’s reported strategy with their exam answers would allow one to determine whether the information they included was, in fact, important.

Declaration of interest

We have no conflict(s) of interest to disclose.

Acknowledgements

The authors are grateful to Drs. M. Owens, T. Bussey and L. McDonnell for helpful discussion during the development of the study and manuscript. We are also very grateful to Drs. C. Wieman and S. Gilbert for constructive feedback on earlier versions of the manuscript.

Data availability statement

The data from this study are stored on an encrypted Google Drive owned by UCSD.

Additional information

Funding

This work was not financially supported.

Notes on contributors

Matthew Nedjat-Haiem

Matthew Nedjat-Haiem received his Bachelor’s (2018) and Master’s (2019) degrees in biology from the University of California, San Diego. He is currently in medical school at the University of California, Davis.

James E. Cooke

James Cooke is an assistant teaching professor at the University of California, San Diego. His research interests broadly focus on assessments, and how they can be used to improve student learning.

Broadly speaking, our group focuses on assessments. Within this frame, we are curious about whether (and how) assessments can help improve student outcomes, including retention of course content, and about how students engage with assessments. Our studies of student outcomes have included multi-stage collaborative exams and whether the testing effect can overcome jargon-induced learning obstacles. The current study is a good example of our work examining how students engage with assessments. We are particularly interested in how the strategies used by students might differ from those desired by the instructor, and in how students might be using strategies that are detrimental to their own performance. The results of the current study have opened up a new branch of our research group, in which we are evaluating why students use the strategies they do, and whether we can help students improve their test-taking strategies in order to improve their performance and retention of course content.

References