Original Articles

In defence of high-speed memory scanning

Pages 2020-2075 | Received 15 Jan 2016, Accepted 30 May 2016, Published online: 25 Aug 2016
 

Abstract

This paper reviews some of the evidence that bears on the existence of a mental high-speed serial exhaustive scanning process (SES) used by humans to interrogate the active memory of a set of items to determine whether it contains a test item. SES was first proposed in the 1960s on the basis of patterns of reaction times (RTs). Numerous later studies supported, elaborated, extended, and limited the generality of SES, while critics claimed that SES never occurred, that predictions from SES were violated, and that other mechanisms produced the RT patterns that led to the idea. I show that some of these claims result from ignoring variations in experimental procedure that produce superficially similar but quantitatively different RT patterns and that, for the original procedures, the most frequently repeated claims that predictions are violated are false. I also discuss evidence against the generality of competing theories of active-memory interrogation, especially those that depend on discrimination of directly accessible “memory-strength”. Some of this evidence has been available since the 1960s but has been ignored by some proponents of alternative theories. Other evidence presented herein is derived from results of one relevant experiment described for the first time, results of another described in more detail than heretofore, and new analyses of old data. Knowledge of brain function acquired during the past half century has increased the plausibility of SES. The conclusion: SES is alive and well, but many associated puzzles merit further investigation, suggestions for which are offered.

Notes

1 This feature of the paradigm depends on the influence of npos being selective: It affects the search process, but not perceptual or response processes. See Sternberg (Citation1969b) for evidence of such selectivity; see also Schweickert, Fisher, and Sung (Citation2012) and Sternberg (Citation1998).

2 In an alternative approach, Amit, Sagi, and Usher (Citation1990) developed a neural network model with three attractor subnetworks that can generate the speed, linearity, and exhaustiveness of SES.

3 Appendix B contains evidence of the extent to which the performance of many subjects depends on providing feedback and tangible incentives.

4 In Appendix D, I describe variants of the Sf task in which the R–S interval is brief, which produce substantial effects of probe recency and frequency, and I consider the available evidence for such effects when the R–S interval is long.

5 As formulated, if the serial comparison process occurs, it follows the decision based on memory strength of the probe. An alternative is that the two processes occur in parallel, with <, so that when scanning is called for, RT=. One way in which these alternatives might be discriminated is mentioned in Section 12.4.

6 In the hybrid model, errors can arise only during the strength-discrimination stage. If this assumption is relaxed, support for the model is also provided by Banks and Atkinson (Citation1974). The conclusion from the Hockley and Corballis experiment should be regarded as tentative, because they used a brief (500 ms) R–S interval (see Appendix D).

7 But in relation to serial recall, the roles of decay, and of rehearsal in counteracting its effects, are controversial (Lewandowsky & Oberauer, Citation2015).

8 Examples are Bunge, Ochsner, Desmond, Glover, and Gabrieli (Citation2001) and Schneider-Garces et al. (Citation2009).

9 Stephen Monsell was the first investigator to use this task with an adequate number of subjects under conditions that produced an acceptably low error rate. His experiments, but not the others described below, included instructions to subjects not to rehearse the P-set.

10 Using the Sv task, Gaffan (Citation1977, Exp. 2) found a strong N-probe recency effect averaging about 150 ms with P-sets of complex pictures, but no such effect with P-sets of words. In Exp. McEl89.2, using the Monsell task with npos = 3, 4, 5, and 6 words, the effect increased slightly with npos, and averaged about 50 ms: Equations of linear functions fitted to mean RT for P-probes, distant N-probes, and recent N-probes were 587 + 19.8 npos, 670 + 13.6 npos, and 687 + 22.3 npos, respectively. Mean error rates for these probe types were 13%, 5%, and 19%, respectively. Results of Monsell's (Citation1978) corresponding experiments, in which the effect also increased with npos and averaged about 25 ms, were qualitatively similar, but slopes of the fitted linear functions were substantially greater, zero-intercepts substantially smaller, and error rates much lower.

11 In considering the results of Exp. Donk12 it is worth noting that although subjects worked for ten sessions, they received feedback about the accuracy but not the speed of their responses.

12 One exception is the study by Burrows and Okada (Citation1971), who compared the Sv task and the Monsell task within subjects, with npos-values of 1, 2, 3, and 4, and found surprisingly small and non-significant differences between tasks in the magnitudes of serial-position effects. However, their subjects, who received no feedback, were highly error-prone (5.7% in the Sv task, compared to 1.3% for the same npos values in Exp. Stern66.1), with a slower mean RT of 607 ms (compared to 492 ms in Exp. Stern66.1), and with a mean slope of 23.7 ms/digit, more characteristic of the Monsell task, so they may not have been performing optimally.

13 Also, for npos < 6, subjects did not know npos until the probe appeared.

14 McElree and Dosher (Citation1989) used their cued-response task, rather than the Meyer, Irwin, Osman, and Kounios (Citation1988) “speed–accuracy decomposition” task, in which the process is permitted to go to completion on a random subset of trials (rather than the response being cued on all trials), which permits testing whether the cued-response procedure is eliciting the same process as in a standard RT experiment.

15 The most thorough analysis of data from this task is provided by Ashby et al. (Citation1993).

16 The figure that Townsend and Ashby use to illustrate the “memory scanning task” (Citation1983, Figure 6.1) actually shows the Ashby task. However, Ashby et al. (Citation1993, p. 543) comment that “the use of simultaneous presentations makes the memory scanning task more similar to visual search (Atkinson et al., Citation1969) and in visual search it is thought that subjects search through a visual short-term store (Townsend & Roos, Citation1973)”. Because the Ashby task does not require the subject to use the visual array representation, what the subject actually chooses to do may depend on previous experience and level of practice, and we may find substantial individual differences. Whether or not a subject is using a visual representation may be revealed by the RT pattern (see ), but mixed strategies present problems for such inference.

17 If responses are made by left and right hands, such a pattern might be complicated by the Simon effect (Kornblum, Stevens, Whipple, & Requin, Citation1999; Lu & Proctor, Citation1995), which, e.g., would shorten RTs when the target is further to the right if the positive response is made by the right hand.

18 The memory sets were fixed for 48 trials, but they called this a “varied-set” procedure.

19 Standard errors are based on differences across four sessions.

20 Ashby et al. (Citation1993, Exp. Ashby93) displayed the memory set for npos s. Schneider and Shiffrin (Citation1977, Exp. Schnei77) displayed it for as long as the subject wished on each trial. Franklin and Okada (Citation1983, Exp. Frank83) displayed it for 150 ms, and used a probe delay of only 500 ms.

21 Slope ratios in Exp. Ashby93 (1.33, 1.36, 1.88, and 1.90) differed markedly across the four subjects, with two subjects having values close to 2.0.

22 A similar proposal was made by Horn and Usher (Citation1992).

23 “Time compressed” because the rate of item activations in the memory might be as much as 50 times greater than the rate at which the items were presented; “dynamic” because the memory is maintained by a repeating sequential activation process.

24 However, evidence supporting theta adaptation in the rat is provided by Geisler et al. (Citation2010).

25 See, for example, Sederberg, Kahana, Howard, Donner, and Madsen (Citation2003), Siegel et al. (Citation2009), Axmacher et al. (Citation2010), Fuentemilla, Penny, Cashdollar, Bunzeck, and Duzel (Citation2010), Kawasaki, Kitajo, and Yamaguchi (Citation2014), Lisman and Jensen (Citation2013), Lisman (Citation2010), Roux and Uhlhaas (Citation2014), and references therein. See also Schon, Newmark, Ross, & Stern (Citation2016), who found activity in the parahippocampal region that increased with npos during the probe delay, but who unfortunately used the novel negatives task.

26 The study by Zarahn, Rakitin, Abela, Flynn, and Stern (Citation2006), in which it is concluded that maintenance and search involve different brain regions, used the Ashby task and reported the mean slope of the function to be 61 ms/letter, substantially greater than what is found for Sv or Sf tasks. It is important to ask the same question of those tasks.

27 It is hard to believe that execution of an unrelated response must await the trough. If not, why must this response do so?

28 Examples are recency judgements (Hacker, Citation1980; McElree & Dosher, Citation1993; Sternberg, Citation1969a, Exp. 8), and context recall (Sternberg, Citation1967b; Citation1969a, Exps. 6, 7).

29 Other challenges to the current version of the model, in particular, the exhaustiveness of the process, include the partial selectivity of search of categorized word lists (Naus, Citation1974; Naus, Glucksberg, & Ornstein, Citation1972; see Sternberg, Citation1975, Section 7.2) and the fact that in a procedure that usually elicits fast exhaustive search, the process for some subjects is slower and self-terminating (see Sternberg, Citation1975, Section 7.3).

30 As discussed in Section 2.4, because Exp. Ashby93 used simultaneous visual presentation of the P-set, and also because of properties of the mean data, it can be argued that the process underlying those data differs from that underlying the Sf or Sv tasks.

31 See Sternberg (Citation1998), Fig. 14.15, for additional evidence.

32 An alternative is an effect on pa, if that feature of the model is retained.

33 Given the large effects that feedback and reward can have on performance in these tasks, described in Appendix B, there may be special concerns about differences in motivation between clinical populations and their controls. Also, the possibility of strategy differences being responsible for slope effects cannot be overlooked. As discussed in Sternberg (Citation1975, Section 7.3), greater mean slopes are sometimes associated with βneg ≈ 2βpos, suggesting a self-terminating search strategy. Both of these reports of slower scanning rates by subjects with MS claim no significant interaction of slope with response type, but neither of them provides the data separately for P-trials and N-trials.
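
The link between βneg ≈ 2βpos and self-terminating search can be made explicit with a short derivation. This is a sketch under standard simplifying assumptions that are not stated in the note itself: a serial search with a constant mean comparison time c per item, and a matching item equally likely to occupy each of the npos positions on P-trials.

```latex
% Illustrative assumptions: serial search, mean comparison time c per item,
% match equally likely at each of the n_pos positions on P-trials.
\begin{align*}
\text{N-trials (every item compared):}\quad
  & E[\text{comparisons}] = n_{\mathrm{pos}}
  && \Rightarrow\ \beta_{\mathrm{neg}} = c, \\
\text{P-trials (stop at the match):}\quad
  & E[\text{comparisons}]
    = \frac{1}{n_{\mathrm{pos}}}\sum_{k=1}^{n_{\mathrm{pos}}} k
    = \frac{n_{\mathrm{pos}}+1}{2}
  && \Rightarrow\ \beta_{\mathrm{pos}} = \frac{c}{2}.
\end{align*}
% Hence beta_neg = 2 * beta_pos under self-termination, whereas an
% exhaustive search compares all n_pos items on both trial types,
% giving beta_neg = beta_pos = c.
```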

34 “ … determinations of oscillation frequencies were very noise sensitive, raising doubts about the conclusion. Rigorous testing of this relationship will require resolution of the controversy about which brain regions are responsible for short-term memory maintenance and better methods for noninvasive measurement of the oscillatory frequencies at those locations.” (Lisman & Jensen, Citation2013, p. 1005)

35 That they actually lowered the ongoing theta frequency was inferred indirectly, as they were not able to measure it during the stimulation. They assumed that their stimulation would not change the relevant gamma frequency. In an improved version of such an experiment, theta and gamma would be measured and an attempt would also be made to drive theta at a higher frequency, as well, chosen to decrease the span.

36 However, the effects were small; the analysis was complicated by requiring correction for a global effect of fc on RT; the foreperiod as well as the number of clicks the subject heard before seeing the probe were confounded with fc; it appears to have been assumed (or found) that there were no individual differences in fγ; and, as in the Vosskuhl et al. (Citation2015) study, the evidence that the oscillation frequency was actually influenced by fc was indirect. In an improved version of such an experiment, tACS might be used (Helfrich et al., Citation2014), as Vosskuhl et al. did, instead of clicks; if possible, fc values would be determined in relation to measurements of individual gamma frequencies; and the effect of the manipulation on fγ would be measured.

37 Personal communication, Joshua Jacobs, December 2014.

38 See Appendix C for an outline of the analysis, and some of the findings for individual rats.

39 Although both distributions are positively skewed for all values shown of nγ and npos, and the Pearson moment coefficient of skewness decreases as both nγ and npos increase, the rate of decrease is slower for the RTs.

40 In an effort to collect memory-span and memory-scanning data in one procedure, Puffe used a large range of P-set sizes, fitted bilinear functions to the data as Burrows and Okada (Citation1975) had done, and assumed that the breakpoint of the bilinear function is a measure of the memory span. Unfortunately, no measure of precision of the estimated spans was provided.

41 Another reason to trust these results (in addition to the large sample sizes) is that all conditions included feedback on speed and accuracy as well as performance-dependent payoffs, whose importance is discussed in Appendix B.

42 These data have been treated as if only the span measure is subject to error.

43 They believe that the correlation they found was between a slower memory-search process and memory span.

44 A different connection between memory span and scanning rate was suggested by Cowan et al. (Citation1998), Cowan et al. (Citation2003), and Hulme et al. (Citation1999), who have proposed that the time intervals between items during immediate memory recall reflect a rapid sequential activation process akin to high-speed memory scanning.

45 It is not implausible that such independence might be violated. For example, if the quality of encoding varies from trial to trial, and influences the comparison time, then this could create a negative covariance between the durations of the encoding and comparison stages, as well as a positive covariance among comparison durations.

46 Schneider and Shiffrin (Citation1977, p. 30) mention the need to elaborate the model.

47 However, analyses of components of variance for Exps. Stern75a and Stern75b, in which the stimulus ensemble consisted of (highly familiar) digits, showed significant contributions of probe differences on both P-trials and N-trials.

48 Error rates are sufficiently low so that the number of trials is a good approximation of the number of trials on which responses were correct.

49 This depends on assuming that the contributions of probe differences to RT in the Sf task are unaffected by npos and are the same for a probe, whether it is a P-probe or an N-probe.

50 Sample sizes are especially small in Exp. Stern66.1: only 16 trials per subject per value of npos . This experiment was continued for two additional exploratory sessions in which between-subject differences in presentation rate and probe delay were introduced—differences that appeared to have no systematic effect on . It is helpful to consider the mean variances over the three sessions. For these combined data, linear functions fit better (Pcnt Lin = 99.2 for N-trials; 88.8 for P-trials), and the slopes are closer to being equal (4,763 ± 1,778 for N-trials; 3,764 ± 1,350 for P-trials).

51 Dosher and McElree (Citation1992, p. 401) also cite Schneider and Shiffrin (Citation1977) to support their assertion that “Predictions of linear increases in variability … also fail”. However, the latter authors say (p. 30) that “On the whole, despite a certain amount of noise in the data, the variances are approximately a linear function of the load”.

52 If the five conditions are satisfied, and the positive minimum mentioned in Condition 1 is large, then Dosher and McElree (Citation1992, p. 401) are correct that the SES model predicts “that the minimum RT should depend fairly strongly on list length”.

53 As examples of the effect of sample size and distributional shape and spread on the bias, for Gaussian, uniform, and exponential distributions, respectively, as the sample size grows from 10 to 100, the sample minimum shrinks by about 96%, 28%, and 9% of the standard deviation.
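
The dependence of the sample minimum on sample size can be checked numerically. The sketch below is illustrative rather than the author's own calculation: it uses the closed-form expected minimum for the uniform and exponential cases, and integrates the density of the minimum for the Gaussian case. The function names and integration settings are assumptions made here.

```python
import math


def expected_min_gaussian(n, lo=-10.0, hi=10.0, steps=20000):
    """Expected minimum of n standard-normal draws.

    Trapezoidal integration of x * n * phi(x) * (1 - Phi(x))**(n - 1),
    the density of the minimum of n independent draws.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        survival = 0.5 * math.erfc(x / math.sqrt(2.0))  # 1 - Phi(x)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x * n * pdf * survival ** (n - 1)
    return total * h


def shrinkage_in_sd_units(expected_min, sd, n_small=10, n_large=100):
    """Drop in the expected sample minimum from n_small to n_large, in SD units."""
    return (expected_min(n_small) - expected_min(n_large)) / sd


# Standard normal: SD = 1.
gaussian = shrinkage_in_sd_units(expected_min_gaussian, sd=1.0)
# Uniform(0, 1): E[min] = 1 / (n + 1), SD = 1 / sqrt(12).
uniform = shrinkage_in_sd_units(lambda n: 1.0 / (n + 1), sd=1.0 / math.sqrt(12.0))
# Exponential with rate 1: E[min] = 1 / n, SD = 1.
exponential = shrinkage_in_sd_units(lambda n: 1.0 / n, sd=1.0)

print(f"Gaussian: {gaussian:.1%}  uniform: {uniform:.1%}  exponential: {exponential:.1%}")
```

The three values should be close to the percentages quoted in the note, which illustrates how strongly the bias of a sample minimum depends on the shape of the lower tail.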

54 These findings were used in Section 6.1 of Sternberg (Citation1975) as evidence against the search being self-terminating. In both studies, the rates of increase of min(RT) with npos on P-trials and same-category N-trials were both significantly greater than zero, but the difference between these rates was not significant. Also in that section, the approximately linear and equal rates of increase of RT variance were described, with the rate of increase used as additional evidence against the search being self-terminating.

55 Quantiles were estimated using the Hyndman and Fan (Citation1996) Type 8 median-unbiased estimator.
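
For readers unfamiliar with the Type 8 estimator, a minimal sketch follows. It uses the plotting-position formulation h = (n + 1/3)p + 1/3 with linear interpolation between adjacent order statistics. The function name and the example data are hypothetical, and this is not the code used for the analyses reported here.

```python
import math


def quantile_type8(data, p):
    """Hyndman-Fan Type 8 (approximately median-unbiased) sample quantile."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must not be empty")
    # 1-based fractional index into the order statistics.
    h = (n + 1.0 / 3.0) * p + 1.0 / 3.0
    if h <= 1.0:
        return xs[0]   # below the first plotting position: clamp to the minimum
    if h >= n:
        return xs[-1]  # above the last plotting position: clamp to the maximum
    j = math.floor(h)
    g = h - j
    # Interpolate between the j-th and (j+1)-th order statistics (1-based).
    return xs[j - 1] + g * (xs[j] - xs[j - 1])


# Hypothetical RTs in ms, for illustration only.
example_rts = [412.0, 388.0, 455.0, 501.0, 397.0, 430.0, 476.0]
low_quantile = quantile_type8(example_rts, 0.10)
median = quantile_type8(example_rts, 0.50)
print(low_quantile, median)
```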

56 For each subject, these ratios were calculated separately for P-trials and N-trials, and then averaged. The values reported are the means and standard errors of these averages over subjects.

57 The histograms of RTs on N-trials for this one subject have been published at least four times: by Hockley (Citation1984, ), Dosher and McElree (Citation1992, b), Dosher and Sperling (Citation1998, Figure 25d), and Hockley (Citation2008, ). Estimates of the minima of these data have, themselves, never been reported, nor have any estimates of the precision of such estimates, nor have this subject's error rates, nor the histograms or minima for the other five subjects. Because the bin size in these histograms is 50 ms, and the bin that represents the shortest RTs in the sample (300–350 ms) is occupied for all npos values, all that we can say with certainty about the change of the sample minimum as npos increases from 3 to 6 (and the mean over subjects increases by about 147 ms), is that it lies between a decrease of 50 ms and an increase of 50 ms.

58 As the R–S interval was only 0.5 s, a variant discussed in Appendix D, conclusions for the Sf task must be tentative.

59 Of course, the validity of these simulated minima depends on the goodness of fit of ex-Gaussian distributions to the data, about which questions can be raised: Whereas only 4% of the memory scanning data sets in Hockley (Citation1984) deviated significantly (p < .05) from the ex-Gaussian distribution, 36% of the data sets from Exp. 1 of Hockley and Corballis (Citation1982) did. And it is not clear what the relation is of the distribution based on averaged parameters to the individual distributions that gave rise to the parameters that were averaged. To support claims about the “leading edge”, it would be far better to determine the minima, or low quantiles, directly.
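
To make concrete what a simulated minimum from an ex-Gaussian distribution involves, here is a minimal sketch. The parameter values, sample sizes, and function name are hypothetical and are not taken from any of the studies cited; the ex-Gaussian variate is generated as the sum of a Gaussian and an independent exponential component.

```python
import random
import statistics


def mean_sample_minimum(n, mu, sigma, tau, reps=2000, seed=1):
    """Average, over reps simulated samples of size n, of the sample minimum
    of ex-Gaussian variates (Gaussian(mu, sigma) plus exponential with mean tau)."""
    rng = random.Random(seed)
    minima = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]
        minima.append(min(sample))
    return statistics.fmean(minima)


# Hypothetical ex-Gaussian parameters in ms, for illustration only.
MU, SIGMA, TAU = 400.0, 40.0, 100.0

min_small = mean_sample_minimum(10, MU, SIGMA, TAU)
min_large = mean_sample_minimum(100, MU, SIGMA, TAU)
print(f"mean minimum, n=10: {min_small:.1f} ms; n=100: {min_large:.1f} ms")
```

Because the simulated minimum falls as the sample grows even when the distribution is unchanged, a comparison of minima across conditions is only as good as the fitted distribution and the match in sample sizes.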

60 The first three cumulants are the same as the first three central moments, which, for this experiment, are shown in Panels H1, H2, and H3 of . Because this method should ideally be applied to stable data from individual subjects, rather than means, the procedure described is a compromise. Nonetheless, it is worth determining where it leads.

61 Also, the sets of estimated cumulants for each of the six distributions from the three experiments satisfied inequalities that are required if the associated distributions are non-degenerate, unimodal, and contain only non-negative values (Johnson & Rogers, Citation1951; Mallows, Citation1956).

62 This simulation shows that the change in the RT distribution as npos increases from 1 to 4 is associated with an increase in the bias of the sample minimum, but that this increase is only 2.7 ms/item.

63 Promising as these findings are, the experiments that provided the data used to estimate the distributions were not designed for this purpose. More suitable experiments would provide more practice to achieve better stability, and an analysis that permitted removal of any effects of nuisance factors such as probe differences.

64 See, e.g., data in B (discussed in Section 8), from the AM condition.

65 Monsell (Citation1978) found this for stimulus ensembles containing consonants (Exp. 1) and two-syllable words (Exp. 2), using instructions that asked subjects not to rehearse. McElree and Dosher (Citation1989) found it for two-syllable words, in both their pilot experiment and their Exp. 2, with no instructions about rehearsal. For reasons that are unclear, this was not found in the Monsell task of Exp. Nosof11, with a stimulus ensemble containing consonants, also discussed by Donkin and Nosofsky (Citation2012, ): An effect of npos on the RT for the most recently presented probe (lag 1) is shown by all four subjects in that experiment, with a mean of about 15 ms/item, and all four subjects show large effects of recency. However, there is a clear difference between these data and their data using the Sv task with digits (Exp. Donk12, 2012, ), in which two of their three subjects show a substantial effect of npos on the for lag 1, and no subject shows much effect of recency.

66 See also Diener (Citation1988, p. 375) who suggests that “In the absence of the delay, the memory set may not be stored in a form that is amenable to the search that results in the typical set-size effect.”

67 Roeber and Kaernbach (Citation2004) attempted to replicate and extend Exp. Stern69a, but their replication failed; this is not surprising, as they used the novel negatives task.

68 Mean rates of speeded-response errors on P-trials and N-trials, respectively, were 2.4% and 1.6% (AM) and 3.0% and 4.1% (LTM). The recall score was the number of correct letters in correct positions. Mean errors per string were 1.8 (recall control) and 1.7, 1.9, 2.0, and 1.9 for npos = 1, 2, 3, and 4, respectively. Lines fitted by least squares to data from P-trials and N-trials, respectively, are 431 + 25.2npos and 461 + 36.2npos (AM) and 477 + 62.6npos and 528 + 59.4npos (LTM). Corresponding lines fitted to data excluding npos = 4 are 418 + 32.9npos and 464 + 34.6npos (AM) and 462 + 71.4npos and 513 + 67.9npos (LTM).
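
The least-squares fits reported in this note are ordinary straight-line fits of mean RT on npos. A minimal sketch of such a fit is given below. The four data points are invented and were chosen only so that the fit reproduces the AM P-trial line quoted above (431 + 25.2npos); they are not the experiment's data, and the function name is an assumption.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented mean RTs (ms) for npos = 1..4; NOT data from the experiment.
n_pos = [1, 2, 3, 4]
mean_rt = [456.0, 482.0, 506.0, 532.0]

intercept, slope = fit_line(n_pos, mean_rt)
print(f"RT = {intercept:.1f} + {slope:.1f} * npos")
```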

69 In the Wickens et al. (Citation1981) and Wickens et al. (Citation1985) experiments using words, the novel-negatives task was used (N-probes were never repeated), so that subjects could base their decisions on memory strength, as in the Monsell task. Also, sets were presented simultaneously, as in the Ashby task. In the Wickens et al. (Citation1985) experiment using consonants, sets were presented simultaneously, and subjects were slow, with mean zero-intercepts and slopes of 625 ms and 52 ms/consonant, respectively. In Conway and Engle (Citation1994), mean slopes based on the two smallest sets (npos = 2,4) were 194 ms/item for words, and 96 ms/item for consonants, far outside the range of slopes for high-speed scanning. Also, lists were presented simultaneously during the learning phase, as in the Ashby task. In Zysset and Pollmann's (Citation1999) similar study, using consonants, slopes (only P-trials reported) for the two smallest sets (npos = 4,6) in their primary memory condition were 57.2 and 57.3 ms/item. In all four studies, analyses started with median RTs; as slopes based on medians tend to be smaller than slopes based on means, all these slopes must be considered above the range for high-speed scanning.

70 In Exp. Stern69a, with four subjects, linear functions fitted to the mean data are 336 + 57npos (AM) and 467 + 105npos (LTM); the mean ratio of slopes in the LTM and AM conditions is 1.98 ± 0.22. In the improved experiment, with 12 subjects, mean slopes in AM and LTM conditions are 32.5 ± 2.0 and 62.6 ± 4.5 ms/item, respectively, with the mean ratio 2.00 ± 0.19. Standard errors are based on between-subject differences.

71 These authors have recently argued for a common rate for several sequential processes in memory and have related this idea to the LIJ model discussed in Section 3.

72 In the Monsell task, a brief filled delay was found to reduce the slope from 42.2 to 30.5 ms/item (Exp. Monsl78.1). According to Monsell (Citation1978, p. 481), for direct-access memory-strength models, “if decay [of strength] decelerates … then the effective rate of decay, and hence the slope, will be smaller after a delay.” Could such a dramatic difference (from reducing the slope in his experiment to doubling it in this one) be produced by plausible adjustments of strength criteria in such models? It seems unlikely.

73 In each of the four parts of Exp. Stern75b, 100 test trials were preceded by 40 practice trials. The mean error rate was 2.4%. In each of the three parts of Exp. Stern75a, 120 test trials were preceded by 48 practice trials. The mean error rate was 2.9%. In both experiments, different P-sets were used in different parts.

74 Evidence favouring this possibility is provided by -, usually between 20 and 40 ms when P-trials and N-trials are equiprobable, which was 2, 32, 28, and 36 ms for nneg = 1, 2, 4, and 8, respectively, in Exp. Stern75b, and 12, 36, and 32 ms for nneg = 2, 4, and 6, respectively, in Exp. Stern75a.

75 The absence of an effect of nneg was described in Sternberg (Citation1963) and mentioned in Sternberg (Citation1966); the data are shown in in Sternberg (Citation1975). This finding appears not to have been considered by advocates of strength-based theories of performance in this task (e.g., Monsell, Citation1978 and McElree & Dosher, Citation1989, who believe that recency across trials is one determinant of RT, and Hockley & Murdock, Citation1987). Nor have these findings been considered by those who argue that repetition priming plays a role in the fixed-set procedure (Jou, Citation2014; Stadler & Logan, Citation1989).

76 In a review of the diffusion model, Ratcliff and McKoon (Citation2008, p. 876) say “For recognition memory, for example, drift rate would represent the quality of the match between a test word and memory.” In his analysis of Exp. Hock84 (an Sv task with letters), Ratcliff (Citation1988) found that the primary effect of the increase in npos from 3 to 6 was a systematic reduction in rk for matches (from .405 to .294), and a smaller reduction in rk for non-matches. Also, the separation between the starting point and the “yes” boundary (hence, ej) increased slightly but systematically. In their “VM blocked” condition in their Sf task with words, Strayer and Kramer (Citation1994, Exp. 2) found that the primary effect of the increase in npos from 2 to 6 was a reduction in rk for matches from .362 to .267, and a negligible effect on rk for non-matches. They also found an increase in the separation between starting point and match boundary from .037 to .063. Ratcliff's (Citation1978) earlier findings were similar, but because he used the Monsell task rather than the Sv task, they are not relevant. In contrast to these findings, Donkin and Nosofsky (Citation2012) found, in fitting a version of the parallel self-terminating LBA model to their data from three subjects in an Sv task (Exp. Donk12), that npos influenced the ej as well as the rk.

77 Applying the diffusion model to an experiment in which the relative proportion of two responses was varied, Ratcliff and McKoon (Citation2008, p. 899) concluded that “[a] difference in starting point accounted for most of the proportion effect” and that fitting an effect of proportion on the drift rate as well “increased the chi square goodness of fit value by only 1%”.

78 In retrospect, a full practice session should have been provided, to reduce variability. Because of the importance of knowing how the effects of these two factors combine, a better experiment, perhaps with both factors varied within subjects and with more than three values of npos, should be run.

79 Mean error rates were similar for P-trials and N-trials and differed little across npos values: 1.3%, 1.5%, and 1.4% for values 1, 2, and 4, respectively. However, they did vary with response probability: For low, medium, and high probabilities, mean error rates were 2.4%, 1.2%, and 0.6%, respectively; not surprisingly, subjects tended to make the high-probability response when the low-probability response was called for, more than the reverse. Means of var(RT) differed little between P-trials and N-trials: 7,021 and 6,995 ms², respectively. Also, they were influenced little by response probability: 6,907, 6,903, and 7,214 ms² for low-, medium-, and high-probability responses, respectively. However, they were influenced by npos, similarly for P-trials and N-trials: 5,607, 6,587, and 8,830 ms², for npos values of 1, 2, and 4, respectively.

80 As discussed in Sternberg (Citation1969b), another way of expressing the equality of these effects on P-trials and N-trials is that the effects of response probability and response type (positive or negative) are additive, consistent with their selectively influencing distinct processes arranged in stages.

81 To show this, let τ = duration of the decision process, γ = half of the mean effect of npos (from 1 to 4) on τ, π = half of the mean effect of Pr{response} (from .75 to .25) on τ, and $\bar{\tau}$ = the mean of τ over the four conditions, and write τ as τ(Pr{response}, npos). Then $\tau(.75, 1) = (\bar{\tau} - \gamma)(\bar{\tau} - \pi)/\bar{\tau}$; $\tau(.75, 4) = (\bar{\tau} + \gamma)(\bar{\tau} - \pi)/\bar{\tau}$; $\tau(.25, 1) = (\bar{\tau} - \gamma)(\bar{\tau} + \pi)/\bar{\tau}$; and $\tau(.25, 4) = (\bar{\tau} + \gamma)(\bar{\tau} + \pi)/\bar{\tau}$. Next, combine these to determine the effects of npos on τ (in this case, three times the slope of the function) under the two conditions of response probability: τ(.75, 4) − τ(.75, 1) and τ(.25, 4) − τ(.25, 1).
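
The combination step can be written out explicitly. The sketch below writes $\bar{\tau}$ for the mean of τ over the four conditions and assumes that the four expressions in this note have the multiplicative form $(\bar{\tau} \pm \gamma)(\bar{\tau} \pm \pi)/\bar{\tau}$. That form is consistent with the stated definitions of γ and π, since the four values then average to $\bar{\tau}$.

```latex
% \bar{\tau}: mean of \tau over the four conditions;
% \gamma, \pi: half-effects of n_pos and Pr{response}, as defined in the note.
\begin{align*}
\tau(.75,4)-\tau(.75,1)
  &= \frac{(\bar{\tau}-\pi)\,\bigl[(\bar{\tau}+\gamma)-(\bar{\tau}-\gamma)\bigr]}{\bar{\tau}}
   = 2\gamma\left(1-\frac{\pi}{\bar{\tau}}\right), \\
\tau(.25,4)-\tau(.25,1)
  &= \frac{(\bar{\tau}+\pi)\,\bigl[(\bar{\tau}+\gamma)-(\bar{\tau}-\gamma)\bigr]}{\bar{\tau}}
   = 2\gamma\left(1+\frac{\pi}{\bar{\tau}}\right).
\end{align*}
% The two effects of n_pos average to 2\gamma and differ by 4\gamma\pi/\bar{\tau},
% so under this multiplicative combination the effect of n_pos is larger
% for the low-probability response.
```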

82 Because the two groups are independent, the assignment to create the pairs is arbitrary, but the result might depend on the assignment. For this reason I created 1000 random assignments and determined, for each, the observed value, the predicted value, and the difference between them. The values reported are the means of the thousand differences determined in this way. It turns out that the results were not especially sensitive to the assignment: While the mean difference between predicted and obtained values was 21.4 ms, the range of this difference was small (20.7, 22.1 ms).
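
The random-assignment procedure can be sketched as follows. The two groups, their values, and the pair statistic `toy_difference` are all invented for illustration. The actual observed and predicted values are defined in the main text, not here, and this is not the analysis code that was used.

```python
import random
import statistics


def pairing_sensitivity(group_a, group_b, pair_statistic, n_assignments=1000, seed=0):
    """Randomly pair members of two independent, equal-sized groups many times.

    For each random assignment, average pair_statistic over the pairs; return
    the mean of those averages together with their minimum and maximum.
    """
    if len(group_a) != len(group_b):
        raise ValueError("this sketch assumes equal group sizes")
    rng = random.Random(seed)
    shuffled_b = list(group_b)
    per_assignment = []
    for _ in range(n_assignments):
        rng.shuffle(shuffled_b)  # a new arbitrary pairing of the two groups
        values = [pair_statistic(a, b) for a, b in zip(group_a, shuffled_b)]
        per_assignment.append(statistics.fmean(values))
    return statistics.fmean(per_assignment), min(per_assignment), max(per_assignment)


# Invented per-subject effects (ms) for two independent groups.
group_a = [30.0, 42.0, 25.0, 38.0, 33.0, 29.0]
group_b = [55.0, 61.0, 47.0, 70.0, 52.0, 64.0]


def toy_difference(a, b):
    """Hypothetical stand-in for (predicted - observed); depends on the pairing."""
    return a * b / 100.0 - (a + b) / 4.0


mean_diff, low, high = pairing_sensitivity(group_a, group_b, toy_difference)
print(f"mean difference {mean_diff:.2f} ms, range ({low:.2f}, {high:.2f}) ms")
```

A narrow range relative to the mean, as reported in the note, indicates that the conclusion does not hinge on any particular arbitrary pairing.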

83 The puzzling “translation effect”, investigated thoughtfully by Clifton and his associates (e.g., Clifton, Sorce, & Cruse, Citation1977; see Sternberg Citation1975, Section 7.1) would perhaps be less mysterious if it reflects activation of both the P-set and its translations, rather than comparison to both.

84 If the process is indeed exhaustive, and we consider time to respond to P-probes as the similarity of the probe to two non-matching set members is varied in the same way as described for N-probes, then a similar data pattern, with effects of the same size, would be expected.

85 The prediction of additivity could fail if the similarity of a probe to a member of the set along a particular dimension changes the way in which that dimension is processed in subsequent comparisons.

86 Note that Ix = 0 if and only if the effect of the mean number of shared features is linear, because it is equivalent to − ( + )/2 = ( + )/2 − .

87 Townsend and Fific (Citation2004) also show that with a negative interaction (Ix < 0), a more elaborate analysis based on the RT distributions can distinguish among alternative parallel processes.

88 To test whether is linear with the number of set members to which an N-probe is similar in the same way (i.e., with respect to the same dimension), at least two members of the P-set must have the same value on one of the dimensions, and for both dimensions to be relevant to the decision there should be additional members. An example of such a P-set is (1,1), (1,2), (3,3), (4,4). Then the N-probes (5,5), (3,5), and (1,5), symbolized [0,0], [1,0], and [2,0], are similar with respect to the same dimension to 0, 1, and 2 members, respectively.

89 The interesting Sv experiment by Townsend and Fific (2004), which used an ensemble of Serbian pseudowords with Serbian subjects and showed strong effects of similarity, satisfies neither requirement (2) nor (3). However, Yang, Fific, and Townsend (2014) revealed that their experiment also included a condition with npos = 4, so that requirement (3) could be satisfied, given an adequate analysis. Assuming these violations to be unimportant, the analysis shows that whether the comparison process is serial or parallel varies with the subject (one consistently serial, one consistently parallel) and with the probe delay (the remaining three subjects serial with a delay of 0.7 s, parallel with a delay of 2 s). The experiment by Huesmann and Woocher (1976) used the novel-negatives task, in which N-probe words were presented only once during the experiment, which invites the use of strength discrimination. Chase and Calfee's (1969) experiments did not satisfy requirement (2). Dick and Hochhaus’s (1975) subjects were extremely slow and inaccurate. Also, the attempt by Hockley and Corballis (1982, Exp. 2) satisfied neither requirement (1) (using a variant of the Sf task with an R–S interval of only 0.5 s) nor (2).

90 Mean RT for nsim = 0, 1, and 2 was 510, 534, and 625 ms, respectively; the corresponding error percentages were 0.3%, 2.5%, and 7.8%.

91 The ingenious experiments by Mewhort and Johns (2000) and Johns and Mewhort (2002, 2003) fail to satisfy requirements (2) or (3). With ensembles of coloured shapes, their subjects are substantially slower than those of Checkosky (1971), who received no feedback, perhaps because they received feedback only on accuracy. With ensembles of words, and npos = 4, their subjects may also be slower and less accurate than others. For example, Juola and Atkinson (1971), with npos = 4 words, obtained a mean RT of 712 ms and 0.3% errors; in their accuracy condition, with npos = 4 words, Banks and Atkinson (1974) obtained a mean RT of 828 ms and 1.3% errors (averaged over npos values of 2, 3, 4, 5, and 6). In contrast, averaging over Exps. 5 and 6 in Mewhort and Johns (2000), and Exp. 4 in Johns and Mewhort (2002), all using P-sets containing four words, the mean RT was 931 ms and the error rate 3.4%. However, under their conditions, Johns and Mewhort show persuasively that subjects search for probe features in the P-set rather than comparing the probe as a whole to P-set members, just as Clifton and Gutschera (1971) showed that subjects sometimes engage in such “hierarchical search” when the stimulus ensemble consists of two-digit numbers and the “features” are the tens digit and the units digit.

92 Also included in his experiment were “pure” P-sets, containing just letters or just digits. On trials with such P-sets, the category of the probe was the same as that of the P-set, so that subjects knew the category of the probe before it was presented, unlike trials with mixed P-sets, on which the category of the probe was uncertain. Data from pure trials are thus omitted from the present analysis. Also, it should be mentioned that Darley's experiment is best described as an Ashby task, as members of the P-set on each trial were displayed simultaneously, with the letters and digits in different columns.

93 Values read from plot in Atkinson et al. (1974, Figure 21).

94 Darley's data also provide three contrasts to answer the question more directly. Let be the mean RT when ns = s and nd = d. Then, for npos = 3,  = 602 ms and  = 602 ms; for npos = 4,  = 638 ms and  = 633 ms; and for npos = 5,  = 667 ms and  = 653 ms. The differences, whose mean implies that βs – βd = 3.2 ± 2.0 ms, provide evidence for sequential comparison that is again suggestive, but not conclusive.

95 If we are able to influence only one of the two processes, and thus disrupt the 2:1 ratio, then having found such an exact 2:1 ratio becomes even more mysterious.

96 Selective attention to evidence that favours the author's position is perhaps to be expected, given the considerable self-discipline required to follow Chamberlin's (1890) method of multiple working hypotheses.

97 I realize that these first two steps may be far from the ideal analysis. A Hilbert transformation should probably be applied to the filter output, and the assumption that the results do not depend strongly on the filter band should be tested.

98 Such a decrease was found from 50 to 500 ms (Bertelson, 1961; Smulders et al., 2005), from 50 to 1000 ms (Bertelson & Renkin, 1966; Soetens, 1998), from 100 to 2000 ms (Hale, 1967), from 500 to 2000 ms (Theios & Walter, 1974), from 250 to 750 ms (Ells & Gotts, 1977), and from 100 to 1000 ms (Pashler & Baylis, 1991).

99 After the speeded responses in the experiment by Darley (1973) discussed in Section 11, subjects were required to recall the subset of the P-set that had not been probed. Atkinson et al. (1974, p. 227) report that results were essentially the same in another, similar experiment in which such recall was not required.
