Journal of Intelligent Transportation Systems
Technology, Planning, and Operations
Volume 25, 2021 - Issue 6
Original Articles

Driving with and without automation on the motorway – an observational study

Pages 587-608 | Received 27 Mar 2018, Accepted 02 Mar 2020, Published online: 13 Mar 2020

Abstract

User-related assessment of a level 3 automated driving system, providing functions such as lane- and distance-keeping, stop & go driving, lane change and overtaking, was carried out on a motorway in Germany with the aim of assessing user-related issues of automated driving, i.e., behavior when driving with automation on motorways, user experiences, reactions and acceptance. Twenty-one persons drove twice along the test route, once with the system switched off and once with the system active. Driving data were logged, driving behavior was observed by two observers in the car, and the drivers answered questionnaires. The results revealed that the drivers used the system as it was intended to be used and that the system affected driving positively in several ways, resulting in better speed adaptation, less speed variation, better distance keeping, better lane choice, better indicator usage and fewer dangerous lane changes. No differences with regard to subjective workload were shown. The system was perceived as being both useful and satisfactory. However, it was found that the system did not react to other drivers’ intention to make a lane change, especially in situations where they were attempting to merge onto the motorway. Further development is needed to improve the system’s ability to recognize other vehicles’ intention to enter the vehicle’s own lane. Also, the phenomenon of the driver “feeling guilty” because of the system’s reckless behavior, i.e., not allowing other drivers to merge onto the motorway or hindering cars behind due to lengthy overtaking, should be investigated further.

Introduction

Automated driving systems are aimed at improving safety, efficiency and comfort, but since driver reactions and acceptance play a decisive role in making these systems a reality on roads, understanding user-related issues is of substantial importance. Brackstone and McDonald (2000), reviewing the issue of driver interaction with Advanced Vehicle Control and Safety Systems, concluded that even if behavioral research issues were well advanced, much work was still required before these systems could be readily deployed. There are a large number of user-related issues concerning automated driving applications. Stanton and Young (1998), as well as Saffarian et al. (2012), presented a number of psychological issues pertinent to vehicle automation that they noted should be considered in empirical evaluation studies. These issues include locus of control, the trust the driver has in the automated system, the situation awareness of the driver, the mental representation that the driver develops of the automated system, the mental and physical workload associated with automation, feedback, driver stress and its implications. The relevance of these and additional issues is briefly described below:

  1. Behavior related issues: People adapt their behavior as a response to changes in the road-vehicle-user system (Smiley, 2000). Reallocation of attention is an intelligent response to the change (OECD, 1990). The adaptive process develops dynamically over time, based on operator experiences of interaction with the automated system (Parasuraman & Manzey, 2010). de Winter et al. (2014), in their literature review, concluded that drivers of a highly automated car ‘are strongly inclined to engage in non-driving tasks, such as watching a DVD or even sleeping’ (p. 206).

  2. Understanding of automation issues: An insufficient and/or erroneous mental model that the driver develops of the automated system may lead to increased risk of user errors (Preece et al., 2002). Jenness et al. (2008), in a survey among early adopters of in-vehicle technology, found that system owners often do not understand the limitations of the systems and manufacturers’ warnings. Heikoop et al. (2018) concluded that mental models ‘… are not self-correcting, but rather become increasingly complex with time’, suggesting that if drivers do not receive prior information or training on the automated driving system they use, they could retain an inaccurate mental model.

  3. Trust and reliance related issues: Over- or under-reliance on automation can have crucial effects on automation outcome. Users rely more on automation they trust more (Merritt, 2011). Over-trust may lead to misuse of automation, with the driver failing to override the system when necessary, whereas under-trust may yield disuse of automation, with users failing to rely on automation when doing so would improve performance (Parasuraman & Riley, 1997). There is variation among users: younger and older users rely on automation differently (McBride et al., 2010). Merritt et al. (2012) found that user trust in automation was influenced by both implicit and explicit attitudes.

  4. Locus of control (i.e. the extent to which removal of control from the driver affects the performance of the vehicle/driver entity). Locus of control refers to the extent to which individuals believe they can control events affecting them. Drivers with an internal locus of control believe their vehicle performance derives primarily from their own actions, while drivers with an external locus of control believe the behavior of the vehicle is due to the automated system (Halpert & Hill, 2015). Stanton and Young (1998) argue that some drivers may perceive that they are in overall control of the vehicle when it is in automated mode whereas others may not. According to them, research findings had shown that people with an internal locus of control generally perform better than individuals with an external locus of control, which might be attributed to the degree of task engagement for the individual. An internal locus of control may lead drivers to take on an active role, while an external locus of control might lead a driver to assume a passive role with the automated system. Stanton and Young (1998) found that the passive drivers failed to intervene when the automated system failed whereas the active drivers took control of the situation.

  5. Resuming control of driving is an important issue in automation. de Winter et al. (2014), based on the findings in their literature review, suggest that drivers during highly automated driving are susceptible to drowsiness, which might evoke longer response times than during manual driving. However, if drivers receive a timely warning, they are able to safely regain control in a critical situation. Merat et al. (2014) examined how different methods of transferring control of a Level 3 automated vehicle affected the driver’s ability to resume control and found ‘an overall better performance by drivers when control was transferred after a fixed duration of 6 min, compared to when the automated system disengaged if drivers removed their visual attention away from the road centre’ (p. 281). Gold et al. (2018) studied a number of take-over performance variables based on more than 700 take-over situations in level 3 conditional automated driving in a series of driving simulator experiments. They found that time-budget, traffic density and repetition of take-over situations strongly influenced drivers’ take-over performance, while drivers’ age, lane and non-driving related tasks explained only a minor part of the variance in take-over performance.

  6. There is a non-negligible risk that skill degradation accompanies automation of the driving tasks due to overreliance, as the reinforcement coming from constant engagement in the driving task becomes absent (Trimble et al., 2014). Endsley and Kiris (1995) refer to Shiff (1983), who found that ‘despite initial manual training, those subjects who had been operating as supervisory controllers of automation in a simulated process control task were slower and more inefficient in bringing the system under control than were subjects who had operated only in a manual mode’ (p. 381). de Winter et al. (2014), in their literature review, found indications that ‘after seeing or experiencing automated driving, drivers show poorer lane keeping performance, shorter headways, or delayed reaction times as compared to drivers who have not seen/experienced automated driving’ (p. 211).

  7. Automation monitoring may lead to increased mental workload of the driver. Humans are inefficient in monitoring automation (Parasuraman et al., 1993; Singh et al., 1997; Sheridan, 2002; Strand, 2014). Overreliance on automation contributes to this inefficiency (Parasuraman & Riley, 1997). Banks et al. (2014), in a case study, concluded that the number of processes conducted by the driver appears to increase as the level of automation increases. In intermediate levels of automation, driver decision-making remains apparent (only at full automation can this decision be removed). The addition of sub-system monitoring increases task loading and hence driver workload, as the driver must remain aware of system state and operation. Automation affects the workload of younger and older users differently (McBride et al., 2010). However, de Winter et al. (2014), in their literature review, concluded that almost all simulator and on-road studies provided strong evidence that highly automated driving reduced workload as compared to manual driving.

  8. Stress is a factor that may affect driver workload and safety (McDonald & Davey, 1996). Funke et al. (2007) explored, among other things, the effects of stress and vehicle automation on driver performance, manipulating stress by exposing drivers to a loss-of-control experience. They found that both stress and automation influenced subjective distress, with higher levels of distress under the stressful driving conditions and lower levels of distress under the automated speed control conditions; however, the two factors did not interact. Reimer et al. (2010) evaluated the extent to which vehicle-parking-assist systems affected driver stress by using heart rate measurements along with self-reported ratings and found that participants exhibited lower average heart rates and reported lower stress levels when using the assistive parking system.

  9. Boredom in a low-task-load environment might lead to distraction (Cummings et al., 2013). According to Farmer and Sundberg (1986), ‘Boredom is a common emotion, with boredom proneness a predisposition with important individual differences.’ (p. 4). Stark and Scerbo (1998) found significant correlations between boredom proneness, workload, and complacency potential, which might indicate that the psychological state of boredom may be a factor that induces complacency (Prinzel et al., 2001). Also, Sawin and Scerbo (1995), in their vigilance task study, found an association between boredom proneness and vigilance performance.

  10. Fatigue may affect driving performance negatively (Brown, 1997) and it reportedly contributes to a significant share of car accidents (Maycock, 1995). Prolonged driving may induce a variety of fatigue symptoms such as drowsiness, boredom, irritation, physical discomfort and daydreaming (Matthews & Desmond, 1998). Matthews and Desmond (2002) found that fatigue induction elicited various subjective fatigue and stress symptoms, raised reported workload, increased heading error, reduced steering activity, and reduced perceptual sensitivity on a secondary detection task. Their results suggest that task-induced fatigue is associated with impaired performance evaluation in ‘underload’ and ‘interventions should be geared toward enhancing driving motivation, rather than reducing attentional demands on the driver’ (p. 681). They conclude that ‘passive fatigue’ (associated with tasks requiring monitoring the environment but infrequent response) ‘may become increasingly common in intelligent vehicle highway systems as control passes from driver to vehicle, and it merits further investigation’ (p. 681).

  11. The situation awareness of the driver concerning the driving context and the operational status of the system is of vital relevance. According to Endsley’s (1995) definition: ‘Situational awareness is the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning and a projection of their status in the near future’ (p. 36). Endsley’s (1995) model of situation awareness is arranged into three hierarchical levels: 1) Perception of the elements in the environment, 2) Comprehension of the current situation, 3) Prediction of future status. Endsley also proposes that situation awareness is discussed in terms of mode awareness, spatial awareness and time awareness. Increased automation may increase the tendency of shifting attention away from the driving task (Carsten et al., 2012). Endsley and Kiris (1995) studied the automation of a navigation task and found that situation awareness was lower under automated conditions than under manual conditions and that low situation awareness corresponded with out-of-the-loop performance decrements in decision time following a failure of the system. Based on a review of earlier studies, Stanton et al. (2001) conclude that ‘loss of situational awareness is correlated with poor system performance’ and ‘people who have lost situational awareness may be slower to detect problems with the system they are controlling as well as requiring additional time to diagnose problems and conduct remedial activities when they are finally detected’ (p. 199). Heikoop et al. (2018), in a driving simulator study, investigated the impact of a mental secondary task on driver situation awareness during platoon driving and found that mental demand impaired situation awareness. Based on a meta-analysis and narrative review of simulator and on-road studies, de Winter et al. (2014) concluded that highly automated driving can improve situation awareness compared to manual driving if drivers are motivated or instructed to detect objects in the environment. However, if drivers are engaged in non-driving tasks, situation awareness deteriorates.

  12. The ‘out-of-the-loop’ performance problem, which impairs the driver’s ability to take over in the event of automation failure, is attributed to loss of situation awareness and skill degradation, which lead to declining operator performance (Endsley & Kiris, 1995). There is no failure-free system (Perrow, 1984). If reliability is below 70 per cent, it is better to have no automation at all (Wickens & Dixon, 2007). Automation failure detection is better when automation reliability varies (Parasuraman et al., 1993). When exposed to automation failure, drivers perform better with a lower level of automation (Nilsson, 1995; Vollrath et al., 2011). Complacency and over-reliance may cause loss of situation awareness, leading to errors when automation fails (Parasuraman & Manzey, 2010).

  13. Automation-related complacency may occur when the operator has to perform both manual tasks and supervise automation. It can be described in terms of an attention allocation strategy where the operator may attend to manual tasks at the expense of the automated task, especially when task load is high (Parasuraman & Manzey, 2010). Wiener (1981) defined complacency as ‘a psychological state characterized by a low index of suspicion’ (p. 117). Parasuraman and Manzey (2010) concluded that ‘attention allocation away from the automated task associated with complacency may include not only fixation failures but attention failures as well’ (p. 389). Complacency may lead to the consequence that a system malfunction, anomalous condition, or failure is missed or that the reaction is delayed (Parasuraman & Manzey, 2010). Parasuraman et al. (1993) reported considerable individual differences in the performance effects associated with automation complacency. Also, Prinzel et al. (2001), investigating the relationship between individual differences of complacency potential, boredom proneness and automation-induced complacency, found that individual differences in personality are related to whether an individual will succumb to automation-induced complacency. de Winter et al. (2014), in their literature review, found indications that highly automated driving evokes complacent behavior, for example not disengaging the automated driving system in a speed limit violation scenario, thereby allowing the vehicle to drive above the posted speed limit.

  14. Automation bias is reflected in omission errors (the user fails to respond to a critical situation because the automation aid failed to notify him/her) and commission errors (following a wrong recommendation) made by operators when decision aids are imperfect (Parasuraman & Manzey, 2010). According to Mosier and Skitka (1996), automation bias is characterized by operators using the decision aid as a heuristic replacement for vigilant information seeking and processing, and commission errors can be the result of not seeking out confirmatory or disconfirmatory information or of discounting other sources of information in the presence of computer-generated cues. Automation bias can lead to decisions that are not based on a thorough analysis of all available information but that are biased by the advice of the decision aid, and it can compromise performance considerably in case of automation failures (Parasuraman & Manzey, 2010). One reason for automation bias is that users have a tendency to ascribe greater power and authority to automated aids than to other sources of advice (Parasuraman & Manzey, 2010). User trust in automated aids as powerful agents with superior capability might lead users to overestimate the performance of the aids, as they may ascribe to the aid greater authority than to other humans or themselves (Parasuraman & Manzey, 2010). Another contributory factor to automation bias is the phenomenon of diffusion of responsibility. When sharing monitoring and decision-making tasks with an automated aid (or other humans in a group), humans may reduce their own effort compared to when they work individually on a given task. The operator may perceive him/herself as less responsible for the outcome and, as a consequence, reduce his/her own effort in monitoring and analyzing other available information (Domeinski et al., 2007).

  15. In a review of empirical studies of complacency and bias in human interaction with automated and decision support systems, Parasuraman and Manzey (2010) found that complacency and automation bias represent different manifestations of overlapping automation-induced phenomena, where attention plays a central role. Further, they found that automation complacency and automation bias occurred in both naive and expert operators and that they cannot be prevented by training or instructions. An integrated model of complacency and automation bias put forward by Parasuraman and Manzey (2010) implies that these issues result from the dynamic interaction of personal, situation and automation-related characteristics.

  16. Usability of a product is of importance for its success among potential users. There are various definitions of usability, but as Harvey et al. (2011) expressed, ‘consideration of the context of use is essential in defining usability criteria and this will be different for each system under investigation’ (p. 563).

  17. Acceptance is a key factor for intended use of new technology in the vehicle (Adell et al., 2014a). However, as Adell et al. (2014b) put it: ‘Despite the recognized importance of acceptance there is no established definition of acceptance, and there are almost as many ways to measure acceptance as there are researchers trying to do so’ (p. 73).

A literature search in Scopus and Web of Science revealed that numerous driving simulator tests with automated driving have been carried out, studying different aspects of automation.

These studies cover various aspects of driver-automation interaction, but there are very few studies investigating the interaction of the driver/automation with surrounding vehicles/drivers. Gouy et al. (2014) examined ‘contagion’ effects from the short time headway held in platoons on the drivers of unequipped vehicles. Schopf et al. (2018) studied cooperative lane-change maneuvers in automated motorway driving, and Kauffmann et al. (2018) studied which aspects of the lane change behavior of an automated vehicle make it cooperative from the perspective of other drivers.

Kyriakidis et al. (2019), based on interviews conducted with 12 expert researchers in the field of human factors of automated driving, concluded, among other things, that the researchers emphasized the importance of studies on public roads to validate findings from simulator experiments. Besides, real-life studies are important for identifying situations relevant for automated driving applications, generating hypotheses for further studies and studying interactions with road users around the observed driver. However, our literature search revealed that very few real-life/on-road tests have been carried out with automated vehicles. One study addressed the issue of enabling a comfortable highly automated driving style by collecting driving data from participants manually driving in different road environments (Bellem et al., 2016). Another study investigated driver engagement in secondary tasks (Naujoks et al., 2016), and a third one investigated the learning process, as well as the development of trust, acceptance and the mental model for interacting with adaptive cruise control (Beggiato et al., 2015).

Rosenfeld et al. (2015) pointed out that “although individual drivers have different driving styles and preferences, current systems do not distinguish among users” and introduced an approach to combine machine learning algorithms with demographic information and behavioral driver models into existing automated assistive systems. The authors claim that their proposed approach can reduce the interactions between drivers and automated systems by adjusting parameters relevant to the operation of these systems based on the specific driver and driving context.

Findings from user-related real-life tests of Advanced Driver Assistance Systems (ADAS) (Várhelyi & Mäkinen, 2001; Hjälmdahl & Várhelyi, 2004a; Adell et al., 2008; Adell et al., 2011; Kaufmann & Risser, 2012; Várhelyi et al., 2015) show that even when these systems deliver the expected positive effects, unwanted behavioral changes can also occur, altering the expected potential safety effects of the system in question. Hence, it is important to examine not just the expected effects, but also potential unwanted effects. A comprehensive evaluation for user-related assessment permitting the investigation of all the above issues would comprise tests in a naturalistic driving environment (real traffic), as well as tests in a driving simulator with naïve (normal) test drivers. Observation of driver behavior in real traffic yields the highest validity in terms of results, whereas driving simulator experiments allow for the staging of situations where, e.g., situation awareness and potential complacency can be studied. In either case, carrying out all these tests is resource- and time-consuming; hence, an evaluation plan must focus on the most rewarding tests.

Considering the available resources and time, the aim of the study described in the present paper was to assess user-related issues of automated driving functions, i.e., behavior when driving with automation on motorways, as well as user experiences, reactions and acceptance.

Method

The system

Within the AdaptIVe EU-project, a ‘motorway-automation’ system was developed. The system, installed in a passenger car, provides the following automated driving functions: (a) lane keeping and distance keeping; (b) stop & go driving; (c) lane change and overtaking. These driving functions were developed as level 3 (conditional automation) functions (SAE, 2014).

The operational conditions of the system were motorway settings with speeds between 60 and 130 km/h (while driving without the system active, the test drivers could also choose speeds higher than 130 km/h) and traffic flow varying from free flow to a traffic jam. The system was to operate in all but extreme weather conditions. The functionalities of the system included keeping the lane and the distance to the vehicle ahead; if the preceding vehicle in the same lane drove slower than the system-predefined safe speed or the driver-selected speed, an overtaking maneuver was initiated. The lane-change function monitored the areas on both sides of the vehicle and decided whether a lane change was possible. If the planned maneuver was approved by the system, the driver was informed via a display in order to ensure mode awareness. Otherwise, the system waited until either the maneuver could safely be conducted or the need for a lane change no longer existed. Once the overtaking maneuver was completed, the car continued traveling in automated lane-keeping/vehicle-following mode. The driver could change the desired speed at any time, similarly to setting the desired speed with an active cruise control system. The system registered this new set speed and displayed it to the driver as a confirmation. The system was always activated after entering the motorway and deactivated shortly before leaving the motorway by the safety driver (who sat in the front seat, next to the test driver). The test drivers could take over whenever they wished, but they only had to take over when leaving the motorway. The system announced this by voice command and a blinking LED bar (changing from green to red depending on how urgent it was that the test driver took over). In case of any malfunction or system breakdown, the safety driver would intervene.
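The overtaking logic described above can be sketched as a small decision function. This is only an illustrative sketch, not the AdaptIVe implementation: the names `Scene`, `Action` and `decide` are hypothetical, and taking the lower of the system-predefined safe speed and the driver-selected speed as the reference speed is an assumption about how the two values are combined.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    FOLLOW = "keep lane and distance"
    OVERTAKE = "inform driver via display, then change lane"
    WAIT = "wait until the lane change is safe or no longer needed"


@dataclass
class Scene:
    lead_speed_kmh: Optional[float]  # None when no vehicle is ahead
    set_speed_kmh: float             # driver-selected speed
    safe_speed_kmh: float            # system-predefined safe speed
    adjacent_lane_clear: bool        # outcome of monitoring the side areas


def decide(scene: Scene) -> Action:
    """Return the next maneuver for one snapshot of the traffic scene."""
    # Assumption: the lower of the two speeds is the reference speed.
    reference = min(scene.set_speed_kmh, scene.safe_speed_kmh)
    if scene.lead_speed_kmh is None or scene.lead_speed_kmh >= reference:
        return Action.FOLLOW
    if scene.adjacent_lane_clear:
        return Action.OVERTAKE
    return Action.WAIT
```

For example, with a set speed of 120 km/h, a safe speed of 130 km/h and a lead vehicle at 100 km/h, `decide` returns `Action.OVERTAKE` when the adjacent lane is clear and `Action.WAIT` otherwise.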

The driver interface consisted of a screen in the instrument cluster, a navigation screen and a central status indicator screen, as well as an LED bar on the windscreen and voice response (see Figure 1). The automated driving mode could be activated by pressing the two dedicated buttons on the steering wheel at the same time, and the system could be deactivated by pressing either of the buttons again or by using the steering wheel or the gas or brake pedals. An emergency off-button allowed the driver to cut off power to the actuators in case of a system failure.
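The activation and deactivation rules above amount to a two-state switch. The sketch below is a hedged illustration of those rules only; the class `AutomationSwitch` and its method names are hypothetical and do not reflect the actual vehicle software, and the emergency off-button is modeled simply as another way of leaving automated mode.

```python
class AutomationSwitch:
    """Tracks whether the automated driving mode is active."""

    def __init__(self) -> None:
        self.active = False

    def press_buttons(self, left: bool, right: bool) -> None:
        if not self.active:
            # Activation requires both steering-wheel buttons at once.
            if left and right:
                self.active = True
        elif left or right:
            # Pressing either button again deactivates the system.
            self.active = False

    def driver_input(self, steering: bool = False, gas: bool = False,
                     brake: bool = False) -> None:
        # Using the steering wheel or either pedal deactivates the system.
        if steering or gas or brake:
            self.active = False

    def emergency_off(self) -> None:
        # Assumption: cutting actuator power also ends automated mode.
        self.active = False
```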

Figure 1. The interface elements in the test car: (A) instrument cluster, (B) central status indicator, (C) LED bar, (D) speech/sound.


Research questions

Based on the relevant issues in user-related evaluation of automated driving applications listed above and considering the available evaluation tools (i.e., a fully equipped vehicle), the following research questions were formulated:

  1. Driver behavior and performance issues: (a) Do drivers use the system as it is intended to be used? (b) Do drivers use the system in all situations for which it is available? (c) Do drivers retain the system settings suggested by the system? (d) Does driver behavior differ when driving with the system active compared to driving without automation?

  2. Effects of automation on the driving task: (a) Is stress or the mental workload of the driver affected by automation? (b) Do drivers engage more in secondary tasks when driving with automation compared to driving without automation?

  3. Regaining control: (a) Do drivers detect automation failures? (b) Do drivers act in a timely and appropriate way when a safety-critical event occurs?

  4. Trust and acceptance: (a) Do drivers trust the system? (b) Do drivers find the system useful and satisfactory?

Participants

The test drivers were recruited from a test driver pool. Twenty-seven persons took part in the study: 12 males and 15 females (see Table 1). As an incentive, the test drivers could choose between a voucher for internet shopping or an advent calendar valued at about 20 to 30 Euros.

Table 1. The number of test drivers by age and gender.

Materials

The test vehicle was equipped with data-logging facilities, and a number of variables, including speed, distance from the vehicle ahead and lane position, were logged.

The test drivers were observed by means of an in-car observation method developed and validated by Risser and Brandstatter (1985). Other validation work was done by Hjälmdahl and Várhelyi (2004b), who showed that drivers’ speed levels with observers in the car did not differ from their speed levels without observers in the car. During the observations, two specially trained observers rode in the car with the driver. One observer (called the coding observer) registered the number of events for each standardized variable, such as:

  • Adaptation of speed at ramps (late/inappropriate),

  • Distance keeping (too close to the vehicle in front, i.e., <1 second),

  • Lane keeping (drifting/lane crossing),

  • Lane change (wrong lane choice/too late/dangerous/hesitant),

  • Use of turning indicator (early/late/inappropriate/not at all).

The other observer made ‘free observations’ of distinctive events, e.g. handling the system (in critical situations) and interactions with other road users, which are hard to predict, let alone standardize. In detail, interaction processes are situations where other road users are present and the behavior of one person produces stimuli for the behavior of other persons. Positive (e.g. friendly) and negative processes (e.g. ruthlessness) can be observed. Interaction can take the form of hand gestures but also of behavior such as braking (to let other car drivers change lane) or accelerating (to show that the driver does not give way to others). Furthermore, conflicts according to the Swedish Traffic Conflicts Technique (Laureshyn & Várhelyi, 2018) were observed. A traffic conflict is an event where there is a collision course with a time to collision shorter than approximately 1 s, and where an accident is avoided thanks to the evasive action of at least one of the involved road users (or, in this study, also by the safety driver). Interaction processes and conflicts were noted during the test ride with short key words and written down in detail immediately after the test ride.
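The two time-based thresholds mentioned above (a distance to the vehicle in front of less than 1 second, and a time to collision shorter than about 1 s) can be illustrated with logged speed and gap data. This is a minimal sketch under stated assumptions, not the procedure the observers used: events in the study were coded by trained observers, the function names are hypothetical, and time to collision is simplified here to a car-following case in a single lane.

```python
import math
from typing import Iterable, Tuple


def time_headway_s(gap_m: float, speed_kmh: float) -> float:
    """Time gap to the vehicle ahead at the current own speed."""
    speed_ms = speed_kmh / 3.6
    return gap_m / speed_ms if speed_ms > 0 else math.inf


def time_to_collision_s(gap_m: float, own_kmh: float, lead_kmh: float) -> float:
    """Time until contact if both vehicles keep their current speeds."""
    closing_ms = (own_kmh - lead_kmh) / 3.6
    # No collision course when the gap is constant or growing.
    return gap_m / closing_ms if closing_ms > 0 else math.inf


def count_close_following_events(samples: Iterable[Tuple[float, float]],
                                 threshold_s: float = 1.0) -> int:
    """Count episodes in which the headway drops below the threshold.

    Each sample is (gap_m, speed_kmh); consecutive close samples
    are counted as one episode.
    """
    events = 0
    was_close = False
    for gap_m, speed_kmh in samples:
        is_close = time_headway_s(gap_m, speed_kmh) < threshold_s
        if is_close and not was_close:
            events += 1
        was_close = is_close
    return events
```

At 90 km/h (25 m/s), a gap of 25 m corresponds to a headway of about 1 s, so any shorter gap at that speed would fall under the ‘too close’ criterion.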

Questionnaires were used to assess the test drivers’ experience with the system, covering the following issues:

  • Subjective workload of the test drivers was assessed via the Raw Task Load indeX (RTLX) method (Byers et al., 1989). According to this method, the subjects rate six different workload aspects, namely mental demand, physical demand, time pressure, performance, effort and frustration level. Continuous scales ranging from ‘very low’ (0) to ‘very high’ (10) were used. After both the first and second driving tests, the test persons filled out a short questionnaire on the workload experienced during driving. The difference in workload between driving with the system on and off was calculated for each test driver.

  • To assess trust in the system, a 6-item scale proposed by Merritt (Citation2011) was used. The item responses were measured on a 5-point Likert-type scale ranging from 1 (strongly disagree) to 5 (strongly agree). The statements were as follows: (1) ‘I believe the system is a competent performer’; (2) ‘I trust the system’; (3) ‘I have confidence in the advice given by the system’; (4) ‘I can depend on the system’; (5) ‘I can rely on the system to behave in consistent ways’; (6) ‘I can rely on the system to do its best every time I take its advice’.

  • Usefulness and satisfaction were assessed based on the method proposed by Van der Laan et al. (Citation1997). According to the method, subjects assess nine components related to usefulness and satisfaction: ‘good – bad’, ‘pleasant – unpleasant’, ‘effective – superfluous’, ‘nice – annoying’, ‘likable – irritating’, ‘useful – useless’, ‘assisting – worthless’, ‘desirable – undesirable’, ‘raising alertness – sleep inducing’ on a continuous scale.

  • The test drivers were asked to compare their driving experiences regarding safety, attention, irritation, stress, enjoyment and comfort when using the system to their driving experiences without it on a continuous scale from ‘decreased greatly’ to ‘increased greatly’, where ‘neither’ represented the middle point.

  • The participants were asked to state to what extent they thought the system would give them benefits or disadvantages regarding some given items, such as risk of being involved in an accident, risk of getting a fine, travel time, fuel consumption, enjoyment when driving, comfort and image, if they were to use the system in everyday driving. The answers were given on a continuous scale from ‘very large disadvantage’ to ‘very large benefit’, with ‘neither’ representing the middle point.

  • Questions concerning Human Machine Interaction (HMI) issues regarded activation and deactivation buttons (their intuitiveness), as well as supportiveness, comprehensibility, trustworthiness, level of distraction, and location preferences. The answers were given on a continuous scale from ‘very bad’ to ‘very good’, where ‘neither’ represented the middle point.

An additional open question on user perceptions of the system was also asked.
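
The workload item in the list above describes six ratings per drive and a per-driver difference between the two drives. A minimal sketch of that bookkeeping follows. The overall score as an unweighted mean of the six ratings is the usual raw TLX convention; whether the authors aggregated the items this way is an assumption, and all ratings shown are hypothetical.

```python
# Item names follow the wording used in the text.
RTLX_ITEMS = (
    "mental demand", "physical demand", "time pressure",
    "performance", "effort", "frustration",
)


def rtlx_score(ratings):
    """Unweighted mean of the six workload ratings (each 0 to 10)."""
    missing = [item for item in RTLX_ITEMS if item not in ratings]
    if missing:
        raise ValueError(f"missing ratings: {missing}")
    return sum(ratings[item] for item in RTLX_ITEMS) / len(RTLX_ITEMS)


def workload_difference(with_system, without_system):
    """Per-item difference (with minus without) for one test driver."""
    return {item: with_system[item] - without_system[item]
            for item in RTLX_ITEMS}


# Hypothetical ratings for a single driver, not data from the study.
without = dict.fromkeys(RTLX_ITEMS, 3)
with_sys = dict(without, performance=2)

print(rtlx_score(without))                                   # 3.0
print(workload_difference(with_sys, without)["performance"])  # -1
```

Keeping the difference per item, rather than only per overall score, matches the way the results report each workload aspect separately.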

Design

The independent variable was the driving mode: with or without automation. The dependent variables are shown in Table 2.

Table 2. Dependent variables and data collection methods.

Traffic intensity was kept under control by choosing observation periods outside rush hours, and weather/road conditions were monitored by the observers in the car.

A within-group design was used. The test drivers drove twice along the test route. They served as their own controls. The order of driving was balanced in such a way that every other subject drove first with the system switched off and then with the system switched on. For the following subject, the order of driving was reversed. By doing this, the effects of biasing variables, such as becoming familiar with the test route, the observers and the test situation, though not eliminated, were spread evenly across the situations.
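
The alternating order described above can be sketched in a few lines. Which order the first subject received is not stated in the text, so starting with ‘off’ is an assumption, as are the names used.

```python
def assign_order(subject_index):
    """Return the order of the two drives for a subject (0-based index).

    Even-indexed subjects drive first with the system off, odd-indexed
    subjects first with it on. The starting order is an assumption.
    """
    return ("off", "on") if subject_index % 2 == 0 else ("on", "off")


# Twenty-one test drivers, as in the study.
orders = [assign_order(i) for i in range(21)]
print(orders.count(("off", "on")), orders.count(("on", "off")))  # 11 10
```

With an odd number of drivers the two orders cannot be perfectly balanced, so one order occurs once more than the other.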

Procedure

The test route was approximately 95 km long, comprising motorway sections outside Wolfsburg in Germany. The speed limit along the test route varied from 60 km/h to no limit. There were also variable speed limit signs along the test route that could dynamically change the limit due to weather or traffic conditions. It took, on average, 59–64 minutes to complete the drive. The test drives were carried out during three weeks between November 2016 and January 2017. On each day, two test persons were invited, one in the morning (9:00–12:00) and one after lunch (12:30–15:30). Both test periods were outside rush hours.

Before driving, the test drivers were informed that the trial concerned the system and not them as drivers, and that all collected data would be anonymous. The drivers were instructed to drive as they normally would. The ‘safety driver’ (personnel from Ingenieurgesellschaft Auto und Verkehr) explained the technical issues of the system, how it was activated and deactivated, and that he, the safety driver, would assume control in case of an unexpected critical situation. The test drivers got no specific instruction that they should take their hands off the steering wheel or could take their eyes off the road, but after the first activation of the system the safety driver always mentioned that they could now take their hands off the wheel and their feet off the pedal (this was needed in order not to deactivate the system). Before the observations began, every test person was given time (up to 10 minutes) to drive around and become familiar with the car, the system, and the situation. After the first test drive, the drivers answered a short workload questionnaire. After the second drive, a comprehensive questionnaire was filled out. For the flowchart of the procedure, see Figure 2.

Figure 2. A simple flowchart to illustrate the procedure.

Log-data variables were on a continuous scale; hence, to analyze differences between driving with and without automation for individual drivers, the paired-sample t-test was used, whereas for the whole group of test drivers, the sign test was employed. The standardized observation variables (driver performance, use of turning indicators, speed adaptation, lane use, lane change) were analyzed both individually and on an aggregated level. The Wilcoxon (paired) signed rank test was used to analyze differences in questionnaire answers between driving with and without automation. The variables registered by the free observations as well as the answers to the open questions were analyzed through categorization.

Results

Twenty-one test persons drove both times (with and without the system active) through the complete test route. Due to time constraints and technical problems, three participants took both rides on a shorter route, while three others took the first ride on the normal test route but the second ride (one time with the system active, two times without the system) on only the shortened route. During both test drives, the weather and road conditions were mostly similar.

Driving behavior

Complete log-data for both drives (with and without the system) were available for only 12 test drivers, due to data loss for various technical reasons. Driving speed was averaged for the whole test route for each test person when driving with and without the system. When driving without the system, the mean driving speed of the 12 individual test drivers was between 93.8 and 124.9 km/h, while it was between 97.2 and 115.6 km/h when driving with the system active (see Table 3). With the system active, the mean speed was lower for seven test drivers and higher for five test drivers compared to when they drove without the system, implying no statistically significant difference according to the sign test at the 0.05 level. The standard deviation of driving speed for the individual drivers was between 17.9 and 35.3 km/h when driving without the system, while it was between 18.6 and 32.7 km/h with the system active. The standard deviation was lower for 10 test drivers and higher for two test drivers compared to driving without the system, implying a statistically significant difference according to the sign test (p < 0.05). These findings indicate unchanged mean driving speeds but less variation in speed when driving with the system active. The maximum speed when driving with the system active was lower for 10 test drivers and higher for two test drivers compared to driving without the system, implying a statistically significant difference according to the sign test (p < 0.05). Maximum speeds when driving without the system were between 126.7 and 178.7 km/h, while they were between 126.6 and 151.2 km/h when driving with the system active.

Table 3. Mean speed, its standard deviation and maximum speed when driving with and without the system.
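
The group-level comparisons above rest on the sign test, which only counts how many drivers moved in each direction. A minimal sketch of an exact two-sided sign test is given below, applied to the counts reported in the text (7 lower versus 5 higher for mean speed, 10 lower versus 2 higher for the standard deviation). That the authors used the two-sided exact variant is an assumption, and the function name is hypothetical.

```python
from math import comb


def sign_test_p(n_pos, n_neg):
    """Exact two-sided sign test p-value.

    Under the null hypothesis each non-tied pair is equally likely to
    change in either direction, so the smaller count follows a
    Binomial(n, 0.5) distribution. Ties are assumed to be dropped
    before counting.
    """
    n = n_pos + n_neg
    k = min(n_pos, n_neg)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Counts reported in the text for the 12 drivers with complete log-data.
print(round(sign_test_p(7, 5), 3))   # 0.774  (mean speed)
print(round(sign_test_p(10, 2), 3))  # 0.039  (standard deviation)
```

Under these assumptions, the 7 to 5 split is far from significance, while the 10 to 2 split falls just below the 0.05 level, which agrees with the conclusions stated in the text.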

Relating the mean driving speed when driving without the system to the change in mean driving speed when driving with the system active, a high correlation between them can be seen (see Figure 3): the higher the mean driving speed when driving without the system, the larger the decrease in mean speed; conversely, the lower the mean driving speed when driving without the system, the larger the increase in mean speed when driving with the system active.

Figure 3. Relationship between mean driving speed when driving without the system and the change in mean driving speed when driving with the system active.

To illustrate the system’s effect on speeds, Figure 4 shows the distribution of speed when driving with and without the system for a test driver who drove at speeds up to 175 km/h without the system and who attained the highest speed of 130 km/h with the system (i.e., the system’s functional limit). An opposite example is shown in Figure 5, with speed distributions of a test driver whose highest speed was 125 km/h when driving without the system, but who, when driving with the system active, attained speeds between 125 and 130 km/h.

Figure 4. Speed distribution when driving with and without the system active. An example: one test driver who drove at speeds up to 175 km/h without the system.

Figure 5. Speed distribution when driving with and without the system active. An example: one test driver whose highest speed was 125 km/h when driving without the system.

The observers in the car registered fewer situations where inappropriate speeds were attained during rides with the system active (see Table 4). The system always accelerated smoothly and chose a speed according to the speed limit and traffic conditions. Without the system, the test persons drove at uneven speeds, sometimes far below the limit, other times faster than the limit.

Table 4. The number of events registered by the coding observer in the car.

In car-following situations, the mean distance to the vehicle ahead when driving without the system was between 35.3 and 41.4 meters, while it was between 36.4 and 41.4 meters when driving with the system active. The mean distance increased for five drivers and decreased for four drivers; hence, the sign test showed no statistically significant difference at the 0.05 level. The observers in the car registered distances too close to the vehicle ahead almost 7 times more often on rides without the system compared to driving with the system active (see Table 4).

Based on logged data, lane-keeping behavior can be described by lateral acceleration. The mean lateral acceleration decreased for eight drivers and increased for four drivers (see Table 5); hence, the sign test showed no statistically significant difference at the 0.05 level. The maximum lateral acceleration decreased for seven drivers and increased for five drivers; again, the sign test showed no statistically significant difference at the 0.05 level. The observers in the car registered a similar number of events regarding lane-keeping behavior on both rides (see Table 4).

Table 5. Mean lateral acceleration, its standard deviation, and maximum lateral acceleration when driving with and without the system.

In total, fewer situations involving problems with lane changes were observed on rides with the system active (see Table 4). Dangerous or hesitant lane changes (8 cases) were only noticed on rides without the system. Regarding lane choice, only on rides without the system was it noticed that the test persons drove for a long period in the middle lane or in the third (inner) lane, even when the right lane was empty (neglecting the prescribed use of the right lane).

Situations in which the turning indicator was not used correctly were observed three times more often on rides without the system (see Table 4). This includes, especially, situations in which the indicator was used too late (e.g., a situation in which the test person already started to make a lane change before indicating).

Conflict situations – situations in which the test person’s car was on a collision course with another vehicle and, at the last moment, the driver, the safety driver or the driver of another vehicle took evasive action – were observed during both rides. However, while on rides without the system, 3 conflict situations were caused by the test driver, 32 conflict situations occurred on rides with the system active (see ). Most of those conflicts involved lane changes in which the system did not recognize the indicator or lane-change behavior of the other vehicle. In three situations, the system lost road marking due to sunlight reflection and drove too far to the left. In four conflict situations, the test car attempted to make a lane change while other vehicles were in that lane; in four others, the system did not recognize the vehicle ahead and nearly caused a rear-end collision.

Comments by the test drivers during driving

After using the system for some time, the participants became more positive about it, stating that they liked it and that it would be advantageous in traffic jams or on longer rides; moreover, they felt it was easy to activate and that it used the right lane as often as possible. Some participants commented that they did not know what to do with their hands or with the ‘earned’ time; others mentioned having to build up trust (and that it was difficult to do so) and that it was good that the safety driver was sitting next to them. Some participants felt scared in some critical situations, stressing how they had to look around at all times.

In some cases, the participants felt they had to apologize (with hand gestures) to other drivers for the behavior of the system (not letting other drivers enter the lane). They also believed they were hindering others, and that others were likely wondering about their driving behavior.

There were other criticisms, as one still had to concentrate (even more – both on the traffic and the system) to check traffic signs and ensure the system was working correctly. One test driver mentioned that the driver was ultimately still responsible, and thus he continued to be careful.

Questionnaire answers

Workload

Subjective workload was in general not affected by the use of the system. The test drivers assessed only their own performance as decreasing statistically significantly (p < 0.05) when driving with the system active (see Table 6).

Table 6. The rates of subjective workload items and results from the Wilcoxon (paired) signed rank test between driving with and without the system.

Trust

Most participants had trust in the system. Over one-half agreed or strongly agreed that the system was a competent performer, that they could trust it, that they had confidence in its advice, and that it behaved in a consistent manner. Slightly less than one-half of the test persons agreed or strongly agreed that they could depend on the system and rely on it to do its best every time. Only one participant strongly disagreed with the statements. See Table 7 for details.

Table 7. Number of answers for each statement about test drivers’ trust in the system.

Usefulness and satisfaction

The system was perceived as useful (‘useful’, ‘good’, ‘effective’, ‘assisting’, but not ‘raising alertness’) and satisfactory (‘pleasant’, ‘nice’, ‘desirable’ and ‘likable’) (see Figure 6).

Figure 6. Mean ratings of individual items constituting the system’s usefulness and satisfactoriness.
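
The grouping of the nine items into a usefulness and a satisfaction subscale, as listed in the text above, can be sketched as follows. The sketch assumes that every rating has already been oriented so that higher values correspond to the positive pole of the item. The numeric range of the continuous scale is not stated in this section, so the values below are purely hypothetical.

```python
# Item grouping taken from the text; names are the positive poles.
USEFULNESS = ("useful", "good", "effective", "assisting", "raising alertness")
SATISFACTION = ("pleasant", "nice", "desirable", "likable")


def subscale_means(ratings):
    """Return (usefulness, satisfaction) as means of their item ratings."""
    usefulness = sum(ratings[i] for i in USEFULNESS) / len(USEFULNESS)
    satisfaction = sum(ratings[i] for i in SATISFACTION) / len(SATISFACTION)
    return usefulness, satisfaction


# Hypothetical ratings for one test driver, not data from the study.
example = {
    "useful": 2, "good": 2, "effective": 1, "assisting": 1,
    "raising alertness": -1,
    "pleasant": 1, "nice": 1, "desirable": 2, "likable": 0,
}
print(subscale_means(example))  # (1.0, 1.0)
```

A single negatively rated item, such as ‘raising alertness’ in this example, lowers the usefulness mean without making the subscale negative overall.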

Driving experiences

The test persons were asked to indicate whether they experienced any changes in a number of matters when driving with the system. They assessed these issues on a continuous scale from “Decreased greatly” (−50) to “Increased greatly” (+50). Their answers reveal that they did not experience any significant changes when driving with the system in any of the mentioned aspects (see Figure 7).

Figure 7. Mean ratings of all participants’ answers regarding their experienced effects of the system.

Expected benefits or disadvantages

The participants were asked whether or not they thought the system would give them any advantages or disadvantages in their daily driving. They were asked to put a mark on a continuous scale from “Very large disadvantage” (−50) to “Very large benefit” (+50). The main advantages according to the test drivers would be a lower risk of getting fines, enhanced comfort and better image (see Figure 8). The test drivers estimated only small changes with regard to the risk of being involved in an accident and fuel consumption, and they assessed hardly any changes in travel time or enjoyment when driving.

Figure 8. Mean ratings of all participants’ answers regarding expected benefits/disadvantages of the system in various aspects.

HMI issues

The participants were asked to rate how intuitive the buttons for activation and deactivation of the system were on a continuous scale from −50 to +50. The answers revealed that the buttons felt intuitive for activation of the system but less intuitive for deactivation (see Table 8).

Table 8. Test drivers’ rating of buttons for activation and deactivation (-50 = Very non-intuitive/+50 = Very intuitive).

The participants were of the opinion that the information displayed in the “instrument cluster” was useful, comprehensible and not distracting (see Table 9). The information given on the “central status indicator” was assessed as less supportive than the information given in the “instrument cluster”.

Table 9. Test drivers’ rating of the information displayed in the instrument cluster and in the central status indicator.

According to the test drivers, the most important information from the system was information about the status of the system, i.e., whether it was active or not (9 comments), and information on when the driver had to take over (9 comments). Information about the system’s next planned maneuver (8 comments), an overview of the maneuvers (8 comments), and a possible explanation of why the system was reacting in a certain way (8 comments) were also judged to be important. Furthermore, seven test persons thought that it was important to know when the system could be activated. Almost all test persons (24 persons) were of the opinion that the best place to display important information was the instrument cluster. As an alternative place, the head-up display was cited (7 persons). Speech was considered the best channel for information about maneuvers (see Table 10).

Table 10. Test drivers’ rating of channel for information about maneuvers (-50 = Not important at all/+50 = Very important).

Other experiences

Answering the open questions, some participants mentioned problems while overtaking other vehicles, as it took too long to accelerate and complete the overtaking process, thereby hindering the movement of other cars. It was also mentioned that the system behaved recklessly, as it did not allow other drivers to merge onto the motorway. These events could trouble the drivers, who felt guilty about something they could do nothing about.

Discussion

Regarding the research questions about driver behavior and performance, the following findings are of interest:

  1. The drivers used the system as it was intended to be used – no situations were observed in which the system was used in a way other than intended.

  2. The system was used in almost all situations when it offered to drive automatically. Nevertheless, some situations were observed in which the driver assumed control from the system due to impatience (e.g., long overtaking process) or a critical incident.

  3. The drivers retained the system settings suggested by the system – in other words, they did not interfere with the system settings. Only in situations in which the system did not recognize speed limit changes (variable overhead signs) did drivers manually adjust the speed limit.

  4. The test persons drove too fast given the situation significantly more often when driving without the system. This finding is in line with those of earlier field trials with ADAS providing speed support (Adell et al., 2008; Adell et al., 2011; Várhelyi et al., 2015). The system always accelerated smoothly, while some test persons maintained uneven speed behavior on rides without the system. The system maintained the correct distance to the vehicle ahead; in contrast, distance-keeping errors were registered on rides without the system. This result is also in line with those of earlier field trials with an in-car speed limiter (Várhelyi & Mäkinen, 2001), a speed limit support system (active accelerator pedal) (Hjälmdahl & Várhelyi, 2004a) and a safe distance-keeping support system (Adell et al., 2011). Fewer indicator errors were made on rides with the system active, since the system handled this task automatically. The system also helped avoid dangerous lane changes. Since the system drove as much as possible in the right lane, it had a clear advantage with regard to the prescribed use of the right lane. No differences were observed with regard to lane-keeping behavior. This result is in line with those of an earlier field trial with an ADAS providing distance keeping and speed support (Adell et al., 2011). The system did not react to other drivers attempting to make a lane change, especially in situations where they were merging onto the motorway. In these situations, the test persons reacted better by manually reducing speed or changing lane. This finding agrees with those of Adell et al. (2011), who examined the effects of system warnings of unsafe speeds or unsafe distances to the vehicle ahead and demonstrated that drivers showed worse facilitating behavior toward other drivers with the system active.
Due to the system’s functional speed limit of 130 km/h, the overtaking process required more time, leading to aborted maneuvers. In some cases, trailing cars were hindered and their drivers honked, and the test person had to assume control from the system to accelerate and complete the overtaking process.

Regarding the research questions about the effects of automation on the driving task, the following was found:

  1. Some of the test persons believed it was more comfortable and less stressful to drive with the system, while others held that the opposite was true. The main reason for this discrepancy was that some of the participants did not (fully) trust the system and needed more time to do so. Therefore, these participants felt they needed to observe more functions (both traffic and system functions) on rides with the system active; otherwise, they felt they had to be ready to assume control at any time, and experienced greater stress as a result. Subjective workload was in general not affected by use of the system. Only one’s own driving performance was assessed to decrease statistically significantly (p < 0.05) when driving with the system active. This finding is in line with earlier findings on workload when driving with ADAS (see e.g. Adell et al., 2008; Adell et al., 2011; Várhelyi et al., 2015), where drivers felt their driving was ‘supervised’ and they could not ‘perform’ as they would have without driving support.

  2. Concerning drivers’ engagement in secondary tasks when driving with automation, no differences between the two rides were found. The test persons talked with the safety driver on both rides, turning their heads toward him. Some test persons commented on having more time to, e.g., enjoy the view, whereas others were unclear about what to do with their ‘earned’ time.

Regarding the research questions concerning regaining control, the following was found:

  1. The test persons did detect system failures. However, in some cases they did not react (e.g., to sudden braking behind a truck wrongly recognized by the system as being an obstacle), and the safety driver assumed control by braking or steering.

  2. Concerning drivers’ decision making during a safety-critical event, the test persons on both rides reacted well in conflict situations and took evasive actions on their own. Nevertheless, some conflict situations were resolved only by intervention by the safety driver.

Regarding the research questions concerning drivers’ trust and acceptance, the following was found:

  1. Most participants exhibited trust in the system. Nevertheless, the answers to the open questions also demonstrated that trust had to be built first; otherwise, using the system was more stressful. Considering that the test persons were using the system for the first time and that problems with the system were encountered on nearly every ride (conflicts or critical situations), the test persons were observed to continuously check the system functions, checking whether lane changes were appropriate and speed limits were maintained. Some participants began to trust the system after some time, while others did not fully trust the system during the whole ride.

  2. The system was perceived as being both useful and satisfactory. In comparison, earlier studies of ADASs found them to be useful but their satisfactoriness was determined to not be statistically significantly different from zero – see e.g. Hjälmdahl and Várhelyi (2004a), who assessed a haptic speed limit support system; Adell et al. (2008), who assessed auditory and haptic systems for speed support; Adell et al. (2011), who assessed a system warning of unsafe speed or unsafe distance to the vehicle ahead; and Várhelyi et al. (2015), who assessed a system providing safe speed and distance support.

As a limitation of the study, it could be argued that log-data from only 12 persons is insufficient, but these data were only used to analyze driving speed, distance keeping and lane keeping. These, and other variables were also observed by the human observers in the car, and data from the observers were available for 21 test drivers driving both with and without automation. Moreover, the results from both data collection sources are aligned.

Conclusions

The system affected driving positively in several ways: it accelerated more smoothly and chose a speed in accordance with the speed limit and traffic conditions. It made fewer dangerous lane changes, kept the correct distance from other vehicles, and maintained the prescribed use of the right lane. No differences were observed with regard to lane-keeping behavior or subjective workload. The effects of the system on the observed variables are illustrated in Table 11. However, the system did not react when other drivers were attempting to make a lane change, especially when they were merging onto the motorway. The overtaking process occasionally took a long time, leading to aborted maneuvers. Conflicts and sudden braking were observed more often on rides with the system active.

Table 11. The effects of the system on the observed variables.

The system was perceived as both useful and satisfactory. Driving comfort was assessed to increase. Some test persons stated that it was more comfortable and less stressful to drive with the system, while others expressed feeling more stressed and having to observe even more functions (both traffic and system functions) to be prepared to assume control at any time.

The phenomenon of “feeling guilty” arose when the system behaved recklessly by not allowing other drivers to merge onto the motorway or by hindering cars behind during lengthy overtaking.

Further research and development needs for improving automated driving on motorways include improvements in sensor functions to make them more robust, greater sensing of the alignment with the driving lane in adverse environmental conditions, and better recognition of other vehicles’ intention to enter the system vehicle’s lane.

Recommendations with respect to future user-related assessments of automated driving functions include:

  • The system to be assessed in user-related tests should be flawless.

  • The final test should be in a real-life setting where naïve drivers can drive on public roads, but should also occur in an advanced driving simulator to allow staged situations wherein situation awareness and potential complacency can be studied.

  • Since this field trial only looked at short-term effects on driver behavior, and any possible behavioral adaptations would likely only appear after a familiarization period (Draskóczy, 1994), longer-term studies are needed to monitor such effects.

Acknowledgments

This paper was built on research financed by the European Commission and Seventh Framework Programme. Special thanks to Jens Lauer, Markus Robert and Nina Schulz for their valuable assistance during the data collection.

References