
Why are neuropsychologists so reluctant to embrace modern assessment techniques?

Pages 209-219 | Received 18 Apr 2018, Accepted 09 Sep 2018, Published online: 12 Jan 2019

Abstract

Objective: Computerized tests and use of the internet offer many opportunities for improvement of neuropsychological assessment over traditional paper-and-pencil tests. Nevertheless, many clinical neuropsychologists are conservative in their choice of tests when assessing patients; the majority still seems to prefer using well-established paper-and-pencil tests.

Method: This deliberately one-sided opinion paper discusses several reasons that may explain the reluctance to embrace modern techniques. These reasons are of a psychometric, technical, theoretical, and strategic nature.

Conclusions: A range of issues related to each of these reasons needs to be resolved before digital assessment techniques can fulfill their promise. In the meantime, it seems wise to be cautious and critical in adopting digital assessment techniques.

Introduction

Information technology (IT) is becoming more and more important in everyday life. IT is also being increasingly used to improve health care. This special issue of The Clinical Neuropsychologist provides various examples. eHealth and mHealth (i.e., internet and mobile phone applications in health care) are developing quickly. These techniques may make it easier to consult a health care professional or to adhere to a treatment. They also promise to play an important role in the assessment of emotional and cognitive behavior. Frequent behavioral and mood sampling by smartphone is increasingly common (Dogan, Sander, Wagner, Hegerl, & Kohls, Citation2017). It is even conceivable, for example, that the way we use our tablets and smartphones might reveal subtle mood problems that herald depression, or subtle functional problems that are the first signs of Alzheimer’s disease (Stringer et al., Citation2018).

Over the past few decades, many neuropsychologists have attempted to apply IT to modernize neuropsychological assessment by computerizing tests and questionnaires. Even though this modern approach clearly has many advantages, the field has not adopted it on a large scale (e.g., Miller & Barr, Citation2017; Rabin, Barr, & Burton, Citation2005; Rabin, Paolillo, & Barr, Citation2016; Rabin et al., Citation2014). On the contrary, most neuropsychologists seem to stick to well-established paper-and-pencil tests and use modern techniques only sparingly.

The present contribution aims to discuss several possible reasons for this conservatism. The discussion will focus on psychometric, technical, theoretical, and strategic obstacles. In doing so, this paper will deliberately be one-sided: it questions claims frequently made by proponents of computerized assessment techniques, and it looks for arguments in defense of traditional tests. Contrasting opinions can be found in other contributions to this special issue. The paper reflects the author’s opinions, not with the aim of convincing others to adopt the same stance, but of inciting them to weigh the arguments and form their own opinion. The paper does not aim to be an exhaustive review of the pertinent literature.

It will not treat ‘semi-digital’ tests, i.e., tests in which the stimuli are presented by a digital device while the instructions are given and the responses are recorded by the clinician.

Psychometric obstacles

Several traditional neuropsychological tests have been digitized, and new tests have been imported from cognitive neuroscience and adapted for application in clinical assessment. Examples of traditional tests that have been digitized are reaction speed tests, continuous performance tests, and card sorting tests. Examples of paradigms that have been imported from cognitive science for clinical applications are the Simon task (Craft & Simon, Citation1970; Simon & Small, Citation1969; Zurron et al., Citation2018), the Stop-signal task (Logan, Cowan, & Davis, Citation1984; Nigg, Citation2017), and the Flanker task (Eriksen & Eriksen, Citation1974; Zelazo et al., Citation2014).

Reliability of traditional and computerized tests

For traditional paper-and-pencil tests, there is much information on psychometric properties available in test manuals, research articles, and handbooks (Lezak, Howieson, Bigler, & Tranel, Citation2012; Strauss, Sherman, & Spreen, Citation2006). It is often a priori assumed that computerized administration increases test reliability, because instructions are more standardized and scoring is entirely objective. Notwithstanding these claims, digitizing a test does not guarantee high reliability. A review of well-known computerized cognitive test batteries reported a median reliability of about 0.70 (range 0.17–0.87) (Feenstra, Vermeulen, Murre, & Schagen, Citation2017). These reliabilities are moderate and certainly not better than reliabilities of paper-and-pencil tests (Calamia, Markon, & Tranel, Citation2013). Reliabilities in the order of 0.7 may be acceptable for research applications, but one needs considerably higher reliability when diagnostic decisions on individual patients are concerned (Nunnally, Citation1978). Concurrent validity also is low (median 0.49) (Feenstra et al., Citation2017). In other words, the correspondence between traditional tests and the same tests in digital format is often only moderate.
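To see why reliabilities in this range are problematic for decisions about individual patients, consider the standard error of measurement (SEM) from classical test theory. The figures below are purely illustrative and do not refer to any particular test:

$$\mathrm{SEM} = \sigma_X \sqrt{1 - r_{XX}}$$

On an IQ-style metric with $\sigma_X = 15$, a reliability of $r_{XX} = 0.70$ gives $\mathrm{SEM} \approx 8.2$, so a 95% confidence interval around an observed score spans roughly $\pm 16$ points; at $r_{XX} = 0.90$ the interval narrows to about $\pm 9$ points.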

The situation is similar for tests based on paradigms borrowed from cognitive science and adapted for clinical applications. The knowledge base of these paradigms, extensive as it may be, does not relieve their clinical adaptations of the obligation to meet the same psychometric standards as any other test (American Educational Research Association, American Psychological Association, National Council on Measurement in Education, & Joint Committee on Standards for Educational and Psychological Testing, Citation2014). Descent from basic science does not by itself guarantee high reliability. On the contrary, the theoretically most interesting variables of these paradigms are often composite measures, i.e., mathematical combinations of two or more basic variables, such as the difference between or the ratio of scores obtained in two experimental conditions. Such composite scores are notoriously unreliable, because they contain the error variance of each of the component variables. For example, the most important variable in a Sternberg memory-scanning paradigm is the slope of reaction times as a function of memory set size (Sternberg, Citation1966). This slope is not very reliable: its split-half reliability is 0.63 (Neubauer, Riemann, Mayer, & Angleitner, Citation1997). Retest reliability after one day is unacceptably low (r = 0.29) (Chiang & Atkinson, Citation1976), and improves only after more practice with the task (to r = 0.70–0.78 after three days of practice).
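The point about composite scores can be made precise with the classical formula for the reliability of a difference score between two conditions with (approximately) equal variances; the numbers in the example are hypothetical and serve only to illustrate the mechanism:

$$r_{DD} = \frac{r_{XX} + r_{YY} - 2\,r_{XY}}{2\,(1 - r_{XY})}$$

Two condition scores that are each measured with reliability 0.80 but correlate 0.60 with each other thus yield a difference score with a reliability of only $(0.80 + 0.80 - 1.20)/(2 - 1.20) = 0.50$: the higher the correlation between the two conditions, the less reliable their difference.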

Another example is the Cambridge Neuropsychological Test Automated Battery (CANTAB; www.cambridgecognition.com), which is probably one of the best researched computer test batteries. The CANTAB contains tests derived from experimental paradigms of cognitive (neuro)psychology and neuroscience, such as Delayed Matching to Sample (recognizing stimuli after a delay), Intra-Extra Dimensional Set Shifting (analog to card sorting tests), and a Stop-signal task. Several of these tests have low reliability (Strauss et al., Citation2006).

Moreover, in most cases these paradigms were originally used in scientific studies with young, intelligent, healthy subjects. As a result, the tasks are often too difficult for patients, causing floor effects and low reliability, and they need to be simplified at the risk of losing construct validity.

Equivalence of computer tests and paper-and-pencil tests

Once a paper-and-pencil test has been digitized, it has become a new test, and its psychometric properties have to be investigated anew (American Educational Research Association et al., Citation2014). Developers of computer tests are often focused on the IT aspects of their work and tend to neglect the psychometric aspects. However, if one wants to use the new computer test interchangeably with the old paper-and-pencil version, one needs to be sure that both versions are equivalent. This requires not only studies that establish the psychometric properties of the digital version, but also a comparison with the properties of the original version. This type of work is expensive because two versions have to be compared. Test publishers will only do this work if they see an attractive return on investment. This type of work is also unrewarding for researchers and university departments, because by itself it is unlikely to yield new scientific insights, and its results will be difficult to publish in high-impact journals. Consequently, it is hard to find funding for such studies. No wonder that many computer tests are released without sufficient psychometric information.

Quality of normative data

Many frequently used traditional tests, such as the Wechsler tests, are published by commercial companies. Other frequently used tests have been published by researchers in scientific journals, or are otherwise in the public domain, for example fluency tests. Test publishing companies develop their tests and collect normative data before they put the tests on the market. The normative data and the psychometric properties are documented in the test manuals. Tests published in journals are also usually accompanied by information on psychometric properties and some normative data. Consequently, users are informed about the qualities—and possible weaknesses—of the tests and the standardization samples. Importantly, test characteristics remain the same for a number of years, until the publisher releases a new version of the test or, in the case of public domain tests, until the test’s author or other researchers publish additional information.

Computerized tests and batteries, however, are often released shortly after the software is ready, typically with a bare minimum of psychometric and normative documentation. Take again the CANTAB as an example. The battery has been available for about 30 years. Its content has expanded over the years, some tests were adapted, and the battery’s hardware has kept up with technical developments. The present version is administered on a tablet, while the first versions made use of a personal computer with a special adapter that turned the conventional monitor into a touch screen. Normative data, as well as reference data from a range of patient samples, have been collected step by step over the years, and parts of these data were published in numerous scientific papers. Although the CANTAB is undoubtedly a very useful cognitive battery for many research applications, for neuropsychologists working in a clinical setting it is extremely difficult to get a clear view of the battery’s psychometric properties and of the characteristics of its normative data. Moreover, one cannot be sure that a test result obtained several years ago can be compared to the corresponding result obtained today. After all, the test may have been adapted in the meantime, the hardware may have changed with unknown effects on test performance, and the normative database may have evolved. From a clinical consumer’s point of view, these are important disadvantages that may lead people to decide not to purchase the battery.

Note that the CANTAB is presented here as a typical example; the same goes for many other computerized test batteries.

Some researchers collect large amounts of normative data via the internet in a relatively brief time span, aiming at a broad audience (e.g., www.testmybrain.org or www.memory.uva.nl) (Feenstra, Murre, Vermeulen, Kieffer, & Schagen, Citation2018; Hartshorne & Germine, Citation2015; Murre, Janssen, Rouw, & Meeter, Citation2013). This way of working enables them to considerably reduce the time necessary to develop tests and to collect adequate norms. In principle, this promotes stable specifications of the final product. Of course, online testing via the internet entails new problems; for a more in-depth discussion, see, e.g., Feenstra et al. (Citation2017), Germine et al. (this issue), and Kessels (this issue).

For the time being, however, one may conclude that traditional paper-and-pencil tests generally have psychometric properties at least equal to, and often better than, those of many computer tests. It remains to be shown whether digital tests can be constructed that are superior to their paper-and-pencil analogs. Validity issues will be discussed in the section on theoretical obstacles below.

Technical obstacles

The area of computerized testing abounds in technical problems. Perhaps the most important problem is the gulf between the time it takes to do preparatory psychometric work and the tremendous speed of developments in hardware, software, browser versions, peripherals, et cetera. By the time a test is ready for release with sufficient psychometric documentation and normative data, the hardware or software for which the test was developed may have become outdated. Paper-and-pencil tests do not suffer from these problems. For an in-depth discussion of these and other technical issues, see Feenstra et al. (Citation2017) and Kessels (this issue).

The administration of computerized tests may be more thoroughly standardized than is the case for paper-and-pencil tests, but this comes at the cost of flexibility. When a clinician administers a paper-and-pencil test to a patient, he/she may decide on the spot to deviate from the prescribed procedure and to adapt the task to a particular problem encountered with this patient. Or, if he/she notices that the patient has apparently misunderstood the instructions, he/she may interrupt testing and explain the instructions more fully. These options of ‘testing the limits’ and of giving a second opportunity are impossible for most computerized tests. Testing the limits can be very informative about the causes of test failure, while giving a second opportunity may prevent invalid test results from going unnoticed.

Patients from ethnic minority groups who are not tested in their native language are at increased risk of misunderstanding instructions (Nell, Citation2000). In addition, older individuals may have increased difficulty using these devices (Tierney et al., Citation2014), or they may simply press buttons too long or inadvertently touch the wrong buttons. People who lack experience with electronic devices for other reasons (e.g., because they lack the resources to use them) may have similar difficulties. Developers should take these issues into account and apply technology that is also suited to ethnic and cultural minorities, as well as to older patients. More generally, test constructors should take future demographic developments into account: North America, and some countries in other parts of the world, are experiencing rapid and drastic changes in the composition of their populations.

At any rate, these technical factors may render computerized testing invalid due to ethnic and age bias. Paper-and-pencil tests, with their face-to-face administration procedures, run these risks to a much lesser degree, since the test administrator can more readily notice misunderstood instructions and inadvertent response errors.

There is one technique that has proven very promising with respect to flexibility: adaptive testing, especially when it makes use of item response theory (IRT) (De Champlain, Citation2010). Adaptive testing by itself is not new; it has been applied for many years in traditional tests. For example, tests with increasingly difficult items often have stopping rules, or one may skip the first few easy items, which are administered only if the person fails the starting items (e.g., subtests of several WAIS versions). However, IRT methods allow more flexible adaptive testing by constantly calculating which test item would best be administered next, given the responses to the previous items. IRT methods may also continuously estimate the reliability of a series of test responses, allowing one to stop testing as soon as a predefined level of reliability has been reached. This, however, requires fast, computerized calculations. IRT methods can raise test reliability while reducing the necessary administration time (De Champlain, Citation2010); a minimal sketch of such a procedure is given below. Neuropsychologists should try to exploit the opportunities offered by IRT for clinical assessment much more than has been done until now. This cannot be done by the clinical community itself, however, since it requires rather advanced statistical knowledge and a large database to calculate the necessary item parameters; it should therefore be done by test developers. Another limitation of the IRT approach is that it cannot be applied to every neuropsychological test. It is suited to tests consisting of independent items of varying difficulty, such as vocabulary tests, but such tests make up only a minority of the neuropsychological instrumentarium.
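To make the mechanics of such an adaptive procedure concrete, the sketch below simulates a simple IRT-driven test session. It is a minimal illustration only, assuming a two-parameter logistic (2PL) model, a hypothetical item bank with made-up parameters, item selection by maximum Fisher information, and a stopping rule based on the posterior standard error; it does not reproduce the procedure of any existing battery.

```python
# Minimal sketch of IRT-based adaptive testing (2PL model), using only NumPy.
# All item parameters and thresholds are hypothetical illustrations.
import numpy as np

rng = np.random.default_rng(0)
a = rng.uniform(0.8, 2.0, size=50)       # hypothetical discrimination parameters
b = rng.uniform(-2.5, 2.5, size=50)      # hypothetical difficulty parameters

theta_grid = np.linspace(-4, 4, 161)     # latent-ability grid for EAP estimation
prior = np.exp(-0.5 * theta_grid**2)     # standard-normal prior (unnormalized)

def p_correct(theta, a_i, b_i):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a_i * (theta - b_i)))

def item_information(theta, a_i, b_i):
    """Fisher information of one item at ability theta."""
    p = p_correct(theta, a_i, b_i)
    return a_i**2 * p * (1.0 - p)

def simulate_response(true_theta, a_i, b_i):
    """Simulated examinee answers probabilistically according to the 2PL model."""
    return rng.random() < p_correct(true_theta, a_i, b_i)

true_theta = 0.5                         # simulated examinee ability
posterior = prior.copy()
administered = []

for _ in range(len(a)):
    # EAP estimate of ability and its posterior standard deviation.
    post = posterior / posterior.sum()
    theta_hat = np.sum(theta_grid * post)
    se = np.sqrt(np.sum((theta_grid - theta_hat)**2 * post))
    if se < 0.30:                        # stop once precision is sufficient
        break
    # Pick the unused item with maximal information at the current estimate.
    info = item_information(theta_hat, a, b)
    info[administered] = -np.inf
    i = int(np.argmax(info))
    administered.append(i)
    # Administer the item and update the posterior with its likelihood.
    correct = simulate_response(true_theta, a[i], b[i])
    p = p_correct(theta_grid, a[i], b[i])
    posterior *= p if correct else (1.0 - p)

print(f"items used: {len(administered)}, theta estimate: {theta_hat:.2f} (SE {se:.2f})")
```

In practice, the item parameters would have to come from a large calibration sample, which is exactly why, as noted above, this work falls to test developers rather than to individual clinicians.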

Theoretical obstacles

Extensiveness of the body of knowledge

One important factor that may explain why neuropsychologists are so reluctant to embrace new computerized tests is the impressive body of knowledge about traditional paper-and-pencil tests that has been accumulated over the decades. This knowledge concerns not only the quality of norms and test reliability (as discussed above), but perhaps most importantly validity. It suffices to take a look at overview works such as Lezak et al. (Citation2012) or Strauss et al. (Citation2006) to see how extensive this validity knowledge base is. Well-established tests like the Trail Making Test (TMT) (Reitan, Citation1955) or Rey’s Auditory Verbal Learning Test (RAVLT) (Rey, Citation1964) have figured in an impressive number of publications. For example, PsycINFO lists almost 11,000 papers on the TMT and 4,900 on the RAVLT. The knowledge base of the CANTAB, by contrast, is comparatively modest: PsycINFO contains 950 publications on the CANTAB (accessed April 17, 2018). This is of course a very respectable number, but it cannot compete with well-known traditional tests.

To give a more substantive example of the validity of traditional tests: we know how effective word list learning tasks like the RAVLT are, especially the delayed recall condition, in detecting early Alzheimer’s disease (Backman, Jones, Berger, Laukka, & Small, Citation2005). They are more accurate than neurochemical and neuroimaging biomarkers (Albert et al., Citation2018; Cui et al., Citation2011; Gomar, Conejero-Goldberg, Davies, Goldberg, & Alzheimer’s Disease Neuroimaging Initiative, Citation2014; Schmand, Huizenga, & van Gool, Citation2010). The Geneva Task Force for the Roadmap of Alzheimer’s Biomarkers has recognized delayed recall tests as Alzheimer biomarkers in themselves (Cerami et al., Citation2017). Some new memory tests are even more promising with regard to early detection of dementia (Rentz et al., Citation2013), and they are beginning to prove their value (Mowrey et al., Citation2018). These tests are not necessarily computerized; they may as well be in paper-and-pencil format.

Theoretical paradigms and their practical value

Proponents of modern assessment techniques often blame traditional tests for not being grounded in cognitive theory. They call these tests’ construct validity into question: ‘It is unclear which cognitive process is measured by test X’. This reproach is unfounded for many traditional tests. The problem is that their theoretical background has been forgotten. In general, scientific theories and their corresponding paradigms and methods rise when their inventors are young and productive, and often wane and are forgotten as soon as the inventors have retired or have passed away (Kuhn, Citation1962).

An example of a waning paradigm is the Brown-Peterson paradigm of decay in immediate memory (Brown, Citation1958; Peterson & Peterson, Citation1959). It was used in fundamental and clinical memory research during the second half of the 20th century. The paradigm is still used in occasional studies (Ricker, Vergauwe, & Cowan, Citation2016), but attempts to apply it in the clinic have not been very fruitful.

Examples of the contrary are the Stroop Color-Word Test (Stroop, Citation1935) and, again, the RAVLT (Rey, Citation1964). The Stroop test is a traditional test that is still very popular, but its theoretical background seems to have been forgotten. One needs to read Stroop’s original paper and scan its references to find out that it was grounded in a long research tradition on interference and inhibition. Stroop himself dated the beginning of this tradition around 1890 in the US, but it was probably Wilhelm Wundt in Germany around 1880 who began studying the speed of reading and of naming colors (Jensen & Rohwer, Citation1966).

In the middle of the 20th century, André Rey borrowed his verbal learning test (RAVLT) from the work of pioneering memory researchers like Ebbinghaus, who lived and worked around 1900. Rote learning of lists of unrelated syllables or words was a favorite method in the early days of memory research. Modern memory paradigms may be much more sophisticated than the classical paradigms that served as models for the now well-established clinical tests, but this does not imply that these tests lacked any theoretical foundation. Moreover, even if the reproaches of not being grounded in theory and of having unclear construct validity were justified, clinical tests may still be of practical use, for example if they have superior predictive or ecological validity.

The well-established tests have survived the ravages of time; they have proven their value over the years. Many other tests that may have been frequently used in the past did not survive, because they turned out not to be valid or sufficiently reliable, or because they did not convey clinically relevant information. Thus, the traditional tests that clinicians continue to use are a positive selection, while over the years inferior instruments were abandoned. It is only understandable that many neuropsychologists do not want to throw away their old shoes before they have new ones that fit comfortably.

Strategic obstacles

Finally, an extremely important strategic issue for clinical neuropsychology as a professional discipline is the ongoing proliferation of computerized tests and test batteries. Dozens, if not hundreds, of computerized tests and test batteries are available nowadays, and their number still seems to be growing. Of course, growth and modernization of the neuropsychological armamentarium is to be welcomed, but one major downside of unlimited proliferation is a lack of comparability of measurements and assessment procedures. The abundance of available computerized tests and batteries, and the many options every neuropsychologist has for choosing tests according to his/her own insight and liking, may be a threat to clinical neuropsychology. The profession runs the risk of becoming splintered and fragmented. Assessments of individual patients, as well as results of scientific studies carried out at different points in time or in different institutions, may become totally incomparable.

Professional organizations of neuropsychologists should consider playing a regulating role to prevent chaos and to promote harmonization of neuropsychological assessment procedures. Otherwise modernization might jeopardize the future of neuropsychology, both in the clinic and in the academic world. It seems sensible to be cautious when selecting assessment techniques, even if this implies some degree of conservatism.

Conclusions

Modern neuropsychological assessment techniques are promising, but they have to satisfy the same psychometric requirements as paper-and-pencil tests. Many computer tests are released too soon, with insufficient psychometric qualities, insufficient documentation, or insufficient normative data. Many computer tests and batteries do not live up to their promises and need a great deal of additional work on reliability, validity, and norms. Traditional tests, on the other hand, are backed by an enormous knowledge base, can be applied more flexibly, and do not suffer as many technical problems as computer tests. Moreover, the current proliferation of computerized tests is a threat to harmonization in clinical neuropsychology.

Perhaps it’s wise to be somewhat conservative.

Acknowledgements

I thank my colleague professor Jaap Murre for his input and for his critical reading of an earlier version of this manuscript. I also thank the guest editors and the anonymous reviewers for their helpful suggestions.

Disclosure statement

Ben Schmand is emeritus professor of clinical neuropsychology at the University of Amsterdam. He initiated the Advanced Neuropsychological Diagnostics Infrastructure (see www.andi.nl). He holds a position at Philips Research, Eindhoven, The Netherlands.

References

  • Albert, M., Zhu, Y., Moghekar, A., Mori, S., Miller, M. I., Soldan, A., … Wang, M. C. (2018). Predicting progression from normal cognition to mild cognitive impairment for individuals at 5 years. Brain: A Journal of Neurology. doi:10.1093/brain/awx365
  • American Educational Research Association, American Psychological Association, National Council on Measurement in Education, & Joint Committee on Standards for Educational and Psychological Testing. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
  • Backman, L., Jones, S., Berger, A. K., Laukka, E. J., & Small, B. J. (2005). Cognitive impairment in preclinical Alzheimer’s disease: A meta-analysis. Neuropsychology, 19(4), 520–531. doi:10.1037/0894-4105.19.4.520
  • Brown, J. (1958). Some tests of the decay theory of immediate memory. The Quarterly Journal of Experimental Psychology, 10, 12–21.
  • Calamia, M., Markon, K., & Tranel, D. (2013). The robust reliability of neuropsychological measures: Meta-analyses of test-retest correlations. The Clinical Neuropsychologist, 27(7), 1077–1105. doi:10.1080/13854046.2013.809795
  • Cerami, C., Dubois, B., Boccardi, M., Monsch, A. U., Demonet, J. F., Cappa, S. F., & Geneva Task Force for the Roadmap of Alzheimer’s Biomarkers. (2017). Clinical validity of delayed recall tests as a gateway biomarker for Alzheimer’s disease in the context of a structured 5-phase development framework. Neurobiology of Aging, 52, 153–166. pii:S0197-4580(16)30149-X
  • Chiang, A., & Atkinson, R. C. (1976). Individual differences and interrelationships among a select set of cognitive skills. Memory & Cognition, 4(6), 661–672. doi:10.3758/BF03213232
  • Craft, J. L., & Simon, J. R. (1970). Processing symbolic information from a visual display: Interference from an irrelevant directional cue. Journal of Experimental Psychology, 83(3, Pt.1), 415–420.
  • Cui, Y., Liu, B., Luo, S., Zhen, X., Fan, M., Liu, T., … Alzheimer’s Disease Neuroimaging Initiative. (2011). Identification of conversion from mild cognitive impairment to Alzheimer’s disease using multivariate predictors. PloS One, 6(7), e21896. doi:10.1371/journal.pone.0021896
  • De Champlain, A. F. (2010). A primer on classical test theory and item response theory for assessments in medical education. Medical Education, 44(1), 109–117. doi:10.1111/j.1365-2923.2009.03425.x
  • Dogan, E., Sander, C., Wagner, X., Hegerl, U., & Kohls, E. (2017). Smartphone-based monitoring of objective and subjective data in affective disorders: Where are we and where are we going? Systematic review. Journal of Medical Internet Research, 19(7), e262. doi:10.2196/jmir.7006
  • Eriksen, B. A., & Eriksen, C. W. (1974). Effects of noise letters upon the identification of a target letter in a nonsearch task. Perception & Psychophysics, 16(1), 143–149.
  • Feenstra, H. E., Vermeulen, I. E., Murre, J. M., & Schagen, S. B. (2017). Online cognition: Factors facilitating reliable online neuropsychological test results. The Clinical Neuropsychologist, 31(1), 59–84. doi:10.1080/13854046.2016.1190405
  • Feenstra, H. E. M., Murre, J. M. J., Vermeulen, I. E., Kieffer, J. M., & Schagen, S. B. (2018). Reliability and validity of a self-administered tool for online neuropsychological testing: The Amsterdam Cognition Scan. Journal of Clinical and Experimental Neuropsychology, 40(3), 253–273. doi:10.1080/13803395.2017.1339017
  • Gomar, J. J., Conejero-Goldberg, C., Davies, P., Goldberg, T. E., & Alzheimer’s Disease Neuroimaging Initiative. (2014). Extension and refinement of the predictive value of different classes of markers in ADNI: Four-year follow-up data. Alzheimer’s & Dementia: The Journal of the Alzheimer’s Association, 10(6), 704–712. doi:10.1016/j.jalz.2013.11.009
  • Hartshorne, J. K., & Germine, L. T. (2015). When does cognitive functioning peak? The asynchronous rise and fall of different cognitive abilities across the life span. Psychological Science, 26(4), 433–443. doi:10.1177/0956797614567339
  • Jensen, A. R., & Rohwer, W. D. J. (1966). The Stroop Color-Word Test: A review. Acta Psychologica, 25(1), 36–93.
  • Kuhn, T. S. (1962). The structure of scientific revolutions. Chicago, IL: University of Chicago Press.
  • Lezak, M. D., Howieson, D. B., Bigler, E. D., & Tranel, D. (2012). Neuropsychological assessment (5th ed.). Oxford, New York: Oxford University Press.
  • Logan, G. D., Cowan, W. B., & Davis, K. A. (1984). On the ability to inhibit simple and choice reaction time responses: A model and a method. Journal of Experimental Psychology: Human Perception and Performance, 10(2), 276–291.
  • Miller, J. B., & Barr, W. B. (2017). The technology crisis in neuropsychology. Archives of Clinical Neuropsychology: The Official Journal of the National Academy of Neuropsychologists, 32(5), 541–554. doi:10.1093/arclin/acx050
  • Mowrey, W. B., Lipton, R. B., Katz, M. J., Ramratan, W. S., Loewenstein, D. A., Zimmerman, M. E., & Buschke, H. (2018). Memory binding test predicts incident dementia: Results from the Einstein Aging Study. Journal of Alzheimer's Disease: JAD, 62(1), 293–304. doi:10.3233/JAD-170714
  • Murre, J. M., Janssen, S. M., Rouw, R., & Meeter, M. (2013). The rise and fall of immediate and delayed memory for verbal and visuospatial information from late childhood to late adulthood. Acta Psychologica, 142(1), 96–107. doi:10.1016/j.actpsy.2012.10.005
  • Nell, V. (2000). Cross-cultural neuropsychological assessment: Theory and practice. Mahwah, NJ: Lawrence Erlbaum Associates, Publishers.
  • Neubauer, A., Riemann, R., Mayer, R., & Angleitner, A. (1997). Intelligence and reaction times in the Hick, Sternberg and Posner paradigms. Personality and Individual Differences, 22(6), 885–894. doi:10.1016/S0191-8869(97)00003-2
  • Nigg, J. T. (2017). Annual research review: On the relations among self-regulation, self-control, executive functioning, effortful control, cognitive control, impulsivity, risk-taking, and inhibition for developmental psychopathology. Journal of Child Psychology and Psychiatry, and Allied Disciplines, 58(4), 361–383. doi:10.1111/jcpp.12675
  • Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
  • Peterson, L., & Peterson, M. J. (1959). Short-term retention of individual verbal items. Journal of Experimental Psychology, 58(3), 193–198.
  • Rabin, L. A., Barr, W. B., & Burton, L. A. (2005). Assessment practices of clinical neuropsychologists in the United States and Canada: A survey of INS, NAN, and APA Division 40 members. Archives of Clinical Neuropsychology: The Official Journal of the National Academy of Neuropsychologists, 20(1), 33–65. pii:S0887617704000538
  • Rabin, L. A., Paolillo, E., & Barr, W. B. (2016). Stability in test-usage practices of clinical neuropsychologists in the United States and Canada over a 10-year period: A follow-up survey of INS and NAN members. Archives of Clinical Neuropsychology: The Official Journal of the National Academy of Neuropsychologists, 31(3), 206–230. doi:10.1093/arclin/acw007
  • Rabin, L. A., Spadaccini, A. T., Brodale, D. L., Grant, K. S., Elbulok-Charcape, M. M., & Barr, W. B. (2014). Utilization rates of computerized tests and test batteries among clinical neuropsychologists in the United States and Canada. Professional Psychology-Research and Practice, 45(5), 368–377. doi:10.1037/a0037987
  • Reitan, R. M. (1955). The relation of the trail making test to organic brain damage. Journal of Consulting Psychology, 19(5), 393–394.
  • Rentz, D. M., Parra Rodriguez, M. A., Amariglio, R., Stern, Y., Sperling, R., & Ferris, S. (2013). Promising developments in neuropsychological approaches for the detection of preclinical Alzheimer's disease: A selective review. Alzheimer’s Research & Therapy, 5(6), 58. doi:10.1186/alzrt222
  • Rey, A. (1964). L'examen clinique en psychologie. Paris: Presses Universitaires de France.
  • Ricker, T. J., Vergauwe, E., & Cowan, N. (2016). Decay theory of immediate memory: From Brown (1958) to today (2014). Quarterly Journal of Experimental Psychology (2006), 69(10), 1969–1995. doi:10.1080/17470218.2014.914546
  • Schmand, B., Huizenga, H. M., & van Gool, W. A. (2010). Meta-analysis of CSF and MRI biomarkers for detecting preclinical Alzheimer’s disease. Psychological Medicine, 40(1), 135–145. doi:10.1017/S0033291709991516
  • Simon, J. R., & Small, A. M. J. (1969). Processing auditory information: Interference from an irrelevant cue. Journal of Applied Psychology, 53(5), 433–435.
  • Sternberg, S. (1966). High-speed scanning in human memory. Science, 153(3736), 652–654.
  • Strauss, E., Sherman, E. M. S., & Spreen, O. (2006). A compendium of neuropsychological tests: Administration, norms, and commentary (3rd ed.). Oxford, New York: Oxford University Press.
  • Stringer, G., Couth, S., Brown, L. J. E., Montaldi, D., Gledson, A., Mellor, J., … Leroi, I. (2018). Can you detect early dementia from an email? A proof of principle study of daily computer use to detect cognitive and functional decline. International Journal of Geriatric Psychiatry, 33(7), 867–874. doi:10.1002/gps.4863
  • Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18(6), 643–662.
  • Tierney, M. C., Naglie, G., Upshur, R., Moineddin, R., Charles, J., & Jaakkimainen, R. L. (2014). Feasibility and validity of the self-administered computerized assessment of mild cognitive impairment with older primary care patients. Alzheimer Disease and Associated Disorders, 28(4), 311–319. doi:10.1097/WAD.0000000000000036
  • Zelazo, P. D., Anderson, J. E., Richler, J., Wallner-Allen, K., Beaumont, J. L., Conway, K. P., … Weintraub, S. (2014). NIH toolbox cognition battery (CB): Validation of executive function measures in adults. Journal of the International Neuropsychological Society: JINS, 20(6), 620–629. doi:10.1017/S1355617714000472
  • Zurron, M., Lindin, M., Cespon, J., Cid-Fernandez, S., Galdo-Alvarez, S., Ramos-Goicoa, M., & Diaz, F. (2018). Effects of mild cognitive impairment on the event-related brain potential components elicited in executive control tasks. Frontiers in Psychology, 9, 842. doi:10.3389/fpsyg.2018.00842