Research Articles

Smart Speech Systems: A Focus Group Study on Older Adult User and Non-User Perceptions of Speech Interfaces

Pages 1149-1161 | Received 04 Jun 2021, Accepted 20 Dec 2021, Published online: 20 Apr 2022

Abstract

Smart speech systems are becoming increasingly pervasive in society. At the same time, the number of older adults is rapidly growing. These simultaneous trends make it likely for older individuals to encounter and, in some cases, benefit from speech systems throughout later stages of life. To date, most research studies have examined older adult non-users’ opinions of speech systems, but not the sentiments of older users. To address this research gap, four focus groups were conducted to compare the perceptions and attitudes of seniors who voluntarily use and do not use speech systems across various devices. Findings suggest that older users and non-users are similar in their perception of the advantages provided by this technology, factors that (could) motivate their use, common challenges faced while using these systems, and barriers to using particular features or speech systems altogether. The two groups differed in their preferences for learning how to use these systems, perception of system cost, and global perception of technology. In addition, older adult users exclusively believed speech systems to be easy to use, but also expressed concerns about information transparency and privacy. Older non-users explained that the absence of age-related declines was a barrier to use. These results may guide designers and researchers in developing, evaluating, and refining smart technologies to be used by various senior populations.

1. Introduction

Speech systems are becoming increasingly pervasive throughout society. In 2017, 13% of all United States households had a smart speaker, and this percentage is projected to increase to 55% by the year 2022 (Hayllar & Coode, 2018). Generally, a speech system is a technology that utilizes verbal speech to execute a specified command or action (Karat et al., 2012), and aids in reducing physical workload, shortening task completion times, and enhancing convenience. One major benefit of these systems is their ability to carry out a wide variety of tasks that range from simple tasks, such as dictating a message/document, initiating a phone call, changing a television channel, retrieving factual information, or providing driving directions, to more complex activities, such as purchasing an item and controlling environmental comfort (e.g., lighting and temperature). Depending on the hardware design, some systems require pressing a button first to activate the recognition feature, while others allow users to simply speak phrases or commands. Examples of popular commercial off-the-shelf (COTS) virtual voice assistants that are accessible on various smart devices (e.g., smartphones and smart speakers) include Google Assistant™, Siri®, Alexa®, and Cortana®. It is important to note that the terms “speech” and “voice” are often used synonymously. The work described in this article focuses on speech software as opposed to the hardware/form factors of speech systems.

1.1. Speech systems and a growing aging population

At the same time as speech systems are becoming more widespread, the population aged 60 years and older is rapidly growing. In 2017, there were roughly 70 million adults in this age group in the U.S. (approximately 22% of the total population), and this number is expected to increase to roughly 108 million by 2050 (approximately 28%) (United Nations, 2017). Thus, it is likely that older adults will increasingly encounter and interact with speech systems throughout later stages of life. In fact, the older adult age demographic may benefit in many ways from this particular technology because aging increases the probability of various perceptual, cognitive, and physical challenges that can hinder one’s ability to perform instrumental activities of daily living (IADLs), such as using a phone, shopping, arranging transportation, and managing medications (Lawton & Brody, 1969). For example, speech systems can verbally present information to people who experience difficulties with vision. They can also support memory by automatically performing activities such as generating lists, scheduling, and reminding and, in turn, reduce the number of (recall or intermediary) steps needed to complete a task for individuals with cognitive challenges. Finally, interactive speech systems can carry out physical actions or system changes without manual input for those with limited mobility and other impairments that affect manual dexterity.

While there are a number of potential benefits to this technology, one challenge that could create inefficiencies in older adults’ usage of speech systems is the degree to which these systems can decipher their speech. This is because aging commonly alters physical structures that influence the production of speech in older individuals. For example, weakened muscles in the lungs and tongue, stiffened vocal cords, and reduced respiratory strength all result in imprecise pronunciation and high-pitched speech, which ultimately produce greater speech variability and slower speech rates (e.g., Kalasky et al., 1999; Ravichander et al., 2010; Vipperla, 2011; Vipperla et al., 2008). A recent literature review showed that these changes in speech production can result in significantly lower recognition accuracies, or higher word error rates (WERs), for older adults compared to younger adults (Werner, Huang, & Pitts, 2019). While the capabilities of speech technologies will continue to improve with time, little is known regarding how older adults perceive and feel about speech systems in their daily lives, given the many advantages and potential limitations associated with this technology in its current form. In addition, it is unclear which factors encourage or hinder their use of these systems as a whole.
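To make the word error rate metric concrete, the following is a minimal sketch of the standard definition, WER = (substitutions + deletions + insertions) / number of reference words, computed with a word-level edit distance. The function name and the example utterances are illustrative assumptions and are not taken from the cited review.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    `reference` is what the speaker actually said; `hypothesis` is what the
    recognizer produced. Both names are illustrative, not from the study.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words processed so far
    # (all but the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word was dropped
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical recognizer output for a five-word command.
    print(word_error_rate("please turn on the lights", "please turn the light"))
```

In this hypothetical case, one word is dropped and one is substituted out of five reference words, so a higher WER directly reflects more corrective effort for the speaker.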

A growing body of research has begun to examine older adults’ perspectives of speech technologies. Recent studies that have investigated speech systems and older adults focus mainly on mock-up voice user interfaces or embodied agents and/or the use of these technologies via a particular medium, e.g., computer/laptop, phone, tablet, or smart speaker (see review in Stigall, 2019). These studies also tend to either compare older adults’ preferences for voice systems to non-voice systems, such as manual input via keyboard (both virtual and non-virtual) or to robotic device input, or investigate older adults’ direct interactions with speech systems in order to address usability concerns and improve design/features (Stigall, 2019). Finally, previous work in this area has answered research questions with respect to older adults’ general perception of specific COTS speech systems across a spectrum of non-users, ranging from inexperienced, i.e., having no experience at all, to limited experience, i.e., having infrequent use since adoption, to abandoned, i.e., having used it once, but no longer a user.

1.2. Older adult non-users of speech systems

Five recently published studies examined older adult non-users’ general perceptions and attitudes towards specific COTS speech systems, and three of the five report common findings related to privacy concerns (Bonilla & Martin-Hammond, 2020; Cowan et al., 2017; Pradhan et al., 2020), speech/voice recognition issues (Cowan et al., 2017; Pradhan et al., 2020; Kim & Choudhury, 2021), and lack of knowledge of speech systems (Bonilla & Martin-Hammond, 2020; Kim & Choudhury, 2021; Pradhan et al., 2020).

Trajkova and Martin-Hammond (2020) mainly studied the perceptions of limited-experience older adult users who had an Echo™ smart speaker in their homes for at least one year or had abandoned this system. They cited perceived usability and accessibility concerns as a challenge for use, limited essential value as a common reason to abandon/limit use, and absence of disabilities as a barrier to use, as they preferred to complete tasks without assistance from systems. However, benefits to using this technology included the ability to support healthy living, connectivity between people and other resources, and support for people with disabilities. Cowan et al. (2017) also studied limited-experience older users (i.e., less than one month of usage) as well as those who had abandoned Siri®. This study reported several usage concerns, including inaccurate recognition of speech with or without accent(s), lack of trust regarding reliability and consistency of the system’s performance, data privacy concerns regarding sensitive information (i.e., health and banking information), and data permanency (or “uncertainty around what happens to the user’s data and overall lack of transparency”). In addition, the lack of integration with third-party apps, platforms, and systems, as well as embarrassment from use in public, were considered barriers to use (Cowan et al., 2017). Participants in this study also commented that Siri® was most useful in non-hands-free environments. In a different study, limited-experience older adult users (i.e., those who had purchased Alexa® or Google Home™ within the last year) identified privacy concerns resulting from experience, rumors, and a lack of understanding that the device might be hacked or “listening and recording conversations” as negatives, and also had a general lack of trust in these speech systems (Bonilla & Martin-Hammond, 2020).

In contrast to these previous studies, two different studies examined non-users’ exposure to speech systems over a period of time. Pradhan et al. (2020) compared the perceptions of technologically illiterate, no-experience users of the Echo™ smart system before use and also three weeks after exposure. Prior to interactions with Echo™, older adults had limited knowledge of system functionality and basic capabilities, as well as a lack of confidence in their ability to use Echo™. However, after a tutorial and a 3-week period of using Echo™, older participants perceived benefits to include ease of use, information-seeking utility, and ability to support memory and expedite tasks. They also reported challenges related to voice recognition, such as misrecognition of speech, lack of specific command recall, or device response timing issues, and privacy and security concerns regarding the safety of financial information and recording of conversations (Pradhan et al., 2020). Similarly, Kim and Choudhury (2021) investigated participants without prior experience with voice assistants, and recorded their perceptions and activities with Google Home™ over a 16-week period. Following initial interactions with Google Home™, participants felt that the technology was simple and easy to use, but also perceived themselves to have limited knowledge regarding how to interact with the device. However, throughout the 16 weeks, participants enjoyed the convenience of not needing to physically interact with the device, had little concern about personally committing errors while using the speech system, and appreciated the interactive digital companionship provided by the system. They also, however, expressed some concerns regarding the system’s lack of accuracy.

1.3. The current study

While these previous studies showcase overlapping themes in terms of how limited- and no-experience seniors view specific COTS smart speech systems, studies have not captured the sentiments of experienced older adult users, nor have they identified the factors influencing their voluntary use of such systems. Furthermore, the prior studies that focused on non-users of smart speech systems did not compare perceptions to a use group, but Selwyn (2006) and Satchell and Dourish (2009) highlight the need to collect data from both users and non-users of technology. They explain that both groups are equally important for learning about technology receptivity and usage in real-world applications. To address these research gaps, we conducted four focus groups that included older adults who both use and do not use personal speech systems for various purposes and across different devices. Fundamentally, our research question was to understand whether similarities and differences exist between older adult users and non-users of various speech systems. To this end, we examined the two groups’ general perceptions and attitudes of speech systems, and the underlying motivators and barriers between those who use versus do not use speech systems.

We hypothesized that the themes generated from the use and no-use groups would not be similar since, according to several models (e.g., Extended Technology Acceptance Model (TAM2), Unified Theory of Acceptance and Use of Technology (UTAUT), and Senior Technology Acceptance Model (STAM)), technology acceptance and adoption are often influenced by many factors that, to non-users, may not have enough weight to convince them to become users (Chen & Chan, 2014; Venkatesh & Davis, 2000; Venkatesh et al., 2003). For example, TAM2 and UTAUT posit that factors such as experience, expectations (e.g., effort, performance, output quality, and result demonstrability), task relevance, and demographics (e.g., age) partially explain an individual’s intent to use a technology (Venkatesh & Davis, 2000; Venkatesh et al., 2003). The STAM highlights additional factors specific to older adults and suggests that one’s willingness to use technology to complete tasks (i.e., gerontechnological self-efficacy and anxiety), a person’s health and well-being (i.e., cognitive abilities and physical functioning), social influences (i.e., relationships), and one’s attitude towards life could all determine usage behavior (Chen & Chan, 2014). Second, we hypothesized that the findings from our no-use group would be similar to those from other studies that investigated non-user perceptions of COTS speech technologies, given similarities in the methods employed and target populations. The knowledge generated from this research can be valuable to product and system designers who seek to gain greater insights into the perceptions, behaviors, and decisions of users and non-users of smart speech systems, and can help inform the design of future products.

2. Methods

2.1. Participants

A total of 18 self-rated healthy older adults (14 females, 4 males) between the ages of 60 and 82 years (M = 70.67, SD = 6.16) participated in the study. All were native English speakers, and all but one were Caucasian. Participants were recruited from independent and assisted-living communities as well as a community center in the greater Lafayette, Indiana area. Also, this study required volunteers to be at least 60 years of age and to have had some experience using speech systems in the past. Using self-reports of usage behavior, we categorized the 18 participants into “use” and “no-use” groups. Use group participants were those who willingly and regularly used personal speech systems. In particular, they voluntarily used a personal speech system at least once a day. In contrast, the no-use group consisted of individuals who did not willingly use speech systems for personal use with regularity. They might have used a personal speech system only once a month or a few times a year and/or a commercial speech system in the past if required by a transaction (such as those primarily associated with retail and/or customer service that appear in the form of automated (telephone) prompts and menus).
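The grouping rule described above can be sketched as a small function. This is only an illustration under stated assumptions: the function name, the yearly-frequency input, and the numeric cut-offs (365 uses per year for "at least once a day", 12 per year for "once a month") are hypothetical translations of the verbal criteria, and the text does not say how intermediate frequencies were handled, so the sketch leaves them unassigned.

```python
from enum import Enum


class Group(Enum):
    USE = "use"
    NO_USE = "no-use"
    UNSPECIFIED = "unspecified"  # frequency range not covered by the stated criteria


# Assumed numeric readings of the verbal criteria (not given in the study).
DAILY_USES_PER_YEAR = 365    # "at least once a day"
MONTHLY_USES_PER_YEAR = 12   # "only once a month or a few times a year"


def assign_group(voluntary_uses_per_year: float) -> Group:
    """Map self-reported voluntary personal use to a focus group category.

    Required commercial interactions (e.g., automated telephone menus) are
    deliberately excluded from the count, because only voluntary personal
    use defines the use group.
    """
    if voluntary_uses_per_year < 0:
        raise ValueError("usage frequency cannot be negative")
    if voluntary_uses_per_year >= DAILY_USES_PER_YEAR:
        return Group.USE
    if voluntary_uses_per_year <= MONTHLY_USES_PER_YEAR:
        return Group.NO_USE
    return Group.UNSPECIFIED


if __name__ == "__main__":
    for uses in (730, 12, 3, 52):
        print(uses, assign_group(uses).value)
```

Keeping an explicit unassigned category makes it visible when a self-report falls between the two definitions instead of silently forcing it into one group.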

Two separate use groups and two separate no-use groups consisted of eight and 10 participants, respectively. All groups were equally diverse in terms of education attainment, marital status, and working status, yet were different in terms of their usage of smart technology. In particular, use group participants reported using technologies such as smartphones, smart TVs, and smart speakers, whereas the no-use group reported using only smartphones. Speech systems utilized by the use group included Alexa®, Siri®, Google Assistant™, Cortana®, and other unidentified systems, such as those found in vehicles. While our aim was to examine a wide variety (i.e., no particular type) of speech systems, these four speech systems were the main systems mentioned by focus group participants in our study. Participants in the use group used these speech systems primarily for leisure (at home), but also for the purposes of wellness (for health) and efficiency/necessity (for work). Additional participant demographic information for each group is reported in Table 1 (questions adapted from Mitzner et al., 2010). This research study was approved by the Purdue University Institutional Review Board (IRB#: 1906022274).

Table 1. Demographics of participants in focus groups.

2.2. Procedure

We conducted the focus groups in easily accessible facilities, including a senior community center, an assisted-living community, and at Purdue University. Each focus group lasted approximately 1.5 hours and had a minimum of four participants, in accordance with focus group size research (Bender & Ewbank, 1994; Kitzinger, 1996; Krueger, 2014; Stewart & Shamdasani, 2014; Twinn, 1998). The focus groups were conducted sequentially and data collection ended once data saturation was reached.

Each focus group followed four phases: screening, introduction, discussion, and closing. The screening phase checked participants for study eligibility and divided them into use or no-use groups. The introduction phase included a statement of the purpose of the study and its relevance to society, informed consent (signed by each participant), and a demographics questionnaire. Participants filled out a questionnaire that contained several background questions including age, gender, education, work status, self-reported mental and physical health, and technologies used in daily life. Afterwards, there was a round of brief introductions. During the discussion phase, the moderator guided the conversation using a script of probing questions. We asked two different sets of questions to the use and no-use groups, given the differences in experiences between the two groups. For users of speech systems, questions aimed to understand older adults’ interactions with speech systems as well as to inquire about how and why they began using these devices. For non-users, given their limited experiences, questions focused on their general perceptions of speech systems in addition to the reasons why this group chose not to use them. However, some questions regarding basic impressions, attitudes, and understandings of speech systems were asked of both groups (i.e., “What does [the term] ‘speech systems’ mean to you?” and “What are your initial reactions or feelings towards ‘speech systems’?”). Both lists of questions were constructed based on those asked in previous focus group studies involving the perception of technology (i.e., Coughlin et al., 2007; Lees et al., 2005; Mitzner et al., 2008). Two audio recorders were used to collect responses to questions in the discussion phase. This phase was concluded when all the questions had been asked and no further comments were provided. Finally, in the closing phase of the study, participants were compensated $30 and were encouraged to ask any questions they might have had.

2.3. Data analysis

The data was first transcribed using a professional audio-to-text service. Then, thematic analysis (Braun & Clarke, 2006) was used to analyze the data, where two coders developed their own categories separate from previously developed categories (Hsieh & Shannon, 2005). Both coders read the transcribed data twice before conducting their own thematic analysis. Initially, coders analyzed the data by noting important terminology and first impressions. Once key ideas and thoughts were recorded, subthemes were developed based on the line-by-line data. Finally, subthemes were divided into relevant groupings, which allowed main themes to be developed, representative of grouped patterned data. Coders compared and discussed interpretations of data between the use and no-use groups by subthemes and themes generated, followed by final subtheme and theme category creation. Data saturation was reached when no new themes were developed by the second focus group for both the use and no-use groups (Hennink et al., 2017). All data was analyzed in the qualitative analysis software, NVivo version 10.
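The saturation criterion described above (a focus group that contributes no new themes) can be sketched as a simple set comparison. This is a simplified illustration, not the coders' actual procedure, which relied on manual thematic analysis in NVivo; the function name is hypothetical, and the per-group theme sets in the usage example are assumed for demonstration only.

```python
from __future__ import annotations


def first_saturated_group(themes_by_group: list[set[str]]) -> int | None:
    """Return the 1-based index of the first group that adds no new themes.

    `themes_by_group` lists the set of themes coded from each focus group,
    in the order the groups were conducted. Returns None if every group
    still contributes at least one previously unseen theme.
    """
    seen: set[str] = set()
    for index, themes in enumerate(themes_by_group, start=1):
        new_themes = themes - seen
        # The first group can never be "saturated": there is nothing to compare against.
        if index > 1 and not new_themes:
            return index
        seen |= themes
    return None


if __name__ == "__main__":
    # Assumed example: both sessions of one condition yield the same five themes.
    main_themes = {
        "advantages",
        "motivators and facilitators",
        "challenges and difficulties",
        "barriers",
        "perception of technology",
    }
    print(first_saturated_group([main_themes, set(main_themes)]))
```

Running the check separately for the use and no-use conditions mirrors the statement that saturation was assessed for both groups.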

3. Results

Five similar themes were developed from the data analysis of the no-use and use group transcripts: (a) (perceived) advantages, (b) motivators and facilitators, (c) (perceived) challenges and difficulties, (d) barriers, and (e) perception of technology. These five themes are described in Table 2.

Table 2. Descriptions of themes developed for the no-use and use groups.

With respect to the identified themes, it is important to note that since the use group voluntarily interacts with speech systems, themes (a) and (c) reflect their actual day-to-day experiences, whereas for the no-use group, these same two themes highlight their “perceived” impressions based on indirect experiences with these systems and/or their own direct limited experiences. Similarities and differences in the perception of speech systems between the no-use and use groups are showcased across the subthemes in Table 3. Supporting evidence of subthemes is reported using direct participant quotes. In particular, U or N indicate use or no-use groups, respectively, and P# and G# represent participant and group numbers, respectively. Also, in place of participant number, PF and PM indicate participant gender (i.e., female or male, respectively). Gender was used only in the few cases when the automated transcript could not distinguish between participant voices. Across all four groups, this occurred for only one group (i.e., UG2, the second use group, with a total of six people: four females and two males). Overall, while main themes for both groups were very similar, subthemes partially differed across the two groups either by (1) appearing as a slightly different subtheme, (2) being categorized under different main themes, or (3) being developed as a new subtheme altogether.
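For readers who want to process the quotes programmatically, the labeling convention above (P# or PF/PM, followed by G# in parentheses) can be parsed with a short regular expression. This is a minimal sketch: the `Speaker` type and `parse_label` function are hypothetical names, and the use/no-use condition is not part of the label itself, so it has to come from the section in which a quote appears.

```python
import re
from typing import NamedTuple, Optional

# Matches labels such as "P4 (G2)" or "PF (G2)".
LABEL_PATTERN = re.compile(r"^P(?P<who>\d+|[FM])\s*\(G(?P<group>\d+)\)")


class Speaker(NamedTuple):
    participant: Optional[int]  # participant number, if the transcript could identify it
    gender: Optional[str]       # "female" or "male" when only gender is known
    group: int                  # focus group number within the use or no-use condition


def parse_label(label: str) -> Speaker:
    """Parse a quote label like 'P4 (G2)' or 'PF (G2)' into its components."""
    match = LABEL_PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized quote label: {label!r}")
    who = match.group("who")
    group = int(match.group("group"))
    if who.isdigit():
        return Speaker(participant=int(who), gender=None, group=group)
    return Speaker(participant=None, gender={"F": "female", "M": "male"}[who], group=group)


if __name__ == "__main__":
    print(parse_label("P4 (G2)"))
    print(parse_label("PF (G2)"))
```

Returning a gender instead of a participant number for PF and PM labels preserves the distinction the transcript could and could not make.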

Table 3. No-use and use group themes and subthemes.

3.1. No-use group results

3.1.1. Perceived advantages

3.1.1.1. Assistive functions

Speech systems were commended by no-use participants for the variety of assistance they could provide to support daily activities. Examples include enabling communication with others, seeking information, converting verbal speech into written text (messages), aiding in navigation, controlling other devices, contacting entities during emergencies, and scheduling reminders. This diverse set of assistive functionalities was perceived as complex, but fascinating.

P5 (G1): “They can control devices that you have [them] hooked up to”

P4 (G2): “[You] can set reminders”

3.1.1.2. Benefit to society

No-use participants perceived speech systems to provide benefits to society in particular contexts. The group noted that individuals with limited mobility could use verbal commands instead of manual inputs. Also, speech systems were seen as a way to aid in security by using voice or speech recognition features as passcodes.

P4 (G2): “I think it would be handy for people with limited mobility, that might need assistance. Maybe they don't have really good use of their hands or arms because of a disability or arthritis, and they could speak into… [and] says, ‘Please turn on the lights’ or ‘Answer the phone’… I see that being a very beneficial thing to people with limited mobility or mobility issues”

P3 (G2): “I think it would be a good safety feature to put on equipment if you wanted to restrict access”

3.1.2. Motivators and facilitators

3.1.2.1. Social influence

No-use group participants discussed that seeing family and friends use speech systems provided them with indirect usage experiences and, to some extent, motivated them to want to use this technology. Positive observations of other people using speech systems included assisting with the spelling of words through spellcheck features, aiding with vehicle navigation, and seeking desired information.

P4 (G2): “My son… [he] can't type it if he doesn't know how to spell it, so he speaks it into his phone and goes, ‘Oh, that's how it's spelled’”

P1 (G1): “I saw [her] using her phone a lot to talk to … when we worked together, so I thought that was pretty impressive”

3.1.2.2. Desire to be taught

Those who currently do not use speech systems indicated that they would welcome the opportunity to be taught how to use them, which would encourage their use. They explained that instruction through a variety of learning approaches would motivate them, such as written text (e.g., CliffsNotes or shortened textbooks) and/or visual and auditory methods, such as one-on-one or video instruction.

P4 (G1): “I would love to learn more about this kind of stuff. I don't really know where I could have the opportunity to learn about it, but I would love to learn about it”

P2 (G2): “As I grow older and I need some help, I would learn more”

Participants also explained that the instructional methods should allow for repetition and reinforcement of the material, and should limit the extent to which they have to rely on other individuals for learning.

P1 (G1): “A source so you don’t have to go to a person, bother a person, and try to get them when they’re available… and they’re hardly ever available”

3.1.2.3. Saves time

No-use focus group participants perceived that speech systems help people save time, and this perception was identified as a motivator.

P4 (G2): “I think when you're trying to say something long winded in a text to someone, speaking it [instead of] having to type it for yourself is very handy. It saves time when you're doing a lengthy one”

3.1.3. Perceived challenges and difficulties

3.1.3.1. Lack of accuracy and reliability

No-use group participants held the perception that speech systems often incorrectly recognize speech and thus require the users to adopt corrective strategies, such as repeating words at a slower pace and/or with more proper pronunciations.

P4 (G2): “My husband tries to talk to [the speech system] that we got for Christmas and it never understands what he says”

P4 (G2): “Doesn't get your accent”

P3 (G1): “Sometimes… [it] don't understand what I said the first time, so I'll say it slower, and try to say it clearer”

P4 (G1): “Sometimes you get a system that doesn't seem to get it right away… but other times I can talk to one and it get it”

P4 (G1): “My son…[has a] speech impediment…[e]very once in a while he'll just hand the phone to me and say, ‘Can you say such and such’ because it won't recognize him saying it but if I say it, it's more clear and it'll type it”

3.1.3.2. Privacy and trust concerns

Given that speech systems often have access to personal information and history, privacy concerns were identified as a disadvantage of using this technology by no-use participants, but not a barrier to use.

P3 (G1): “I would want to know more about it before I put anything out there that I might not want shared. I would definitely want to check on privacy, but I won't say that that would stop me, necessarily”

3.1.4. Barriers

3.1.4.1. Lack of knowledge

Participants described that a lack of knowledge of the basic operations and functions of speech systems ultimately limited their ability to use them. While the no-use group expressed a desire to be taught how to use speech systems, they indicated that they currently have no knowledge of where to begin or how to recover from any errors that may occur while using the technology.

P3 (G1): “I think I would, but I don't know how to even start going about using it”

P2 (G1): “if anything goes wrong, I don’t know why it went wrong, where it went wrong, how it went wrong, or how to fix it”

3.1.4.2. Cost

The no-use group did not perceive the benefits of adopting this technology to outweigh the cost of speech systems.

P3 (G2): “I wouldn't spend money that way”

P4 (G2): “I think the cost of them is still a little up”

3.1.4.3. No significant age-related declines

The no-use group explained that they do not currently experience noticeable age-related declines and/or disabilities that hinder their daily activities, and that this absence of need kept them from using speech systems.

P3 (G2): “If my eyes or my hearing should decline that would be a mighty incentive”

P3 (G2): “If I had any kind of minor or major disability that would be mighty incentive”

3.1.5. Perception of technology

3.1.5.1. Awareness of increasing technological presence

No-use group participants acknowledged the increasing presence of technology in society. They reported not wanting to become dependent on technology, but did communicate that some technological awareness and literacy is needed as the world changes.

P1 (G1): “I understand it to be the future. So, then we need to know how to work with. [speech systems] and all that stuff”

P4 (G2): “All these young people that grew up with this technology, it's second-hand to them. We have to learn it”

3.1.5.2. Past technological experiences more negative

In general, no-use participants reported more negative past experiences with various types of technology such as smartphones, computers, and smart TVs. This perspective was formed based on their lack of knowledge, need for help in learning to use technology, lack of agency or ability to control technology, concerns about information security, and perception of technology interfaces being non-user friendly.

P2 (G1): “See you don’t know enough about it you don't know which button you need to push to get it turned over to where he can hear it. That's the problem when your technological illiterate, which I am”

3.2. Use group results

3.2.1. Advantages

3.2.1.1. Assistive functions

The use group discussed the numerous benefits that speech systems provide to their daily activities. These include enabling communication with others, helping them to seek information, controlling other devices, recording notes and lists, playing music, and aiding with vehicle navigation.

P3 (G1): “Coming home, I hit that snowy, rainy, sleet-y day, okay? And I wanted to get off on Highway 30… So I picked up the phone and went to [speech system] and it got me to roads I knew… And I am so impressed. It isn't me, it was the phone”

3.2.1.2. Benefit to society

Individuals who use speech systems explained that they have seen this technology help support people who have difficulty with writing or spelling, such as those with speech impediments or learning disabilities.

PF (G2): “[M]y girlfriend is terrible at spelling, I mean terrible. So, she's the one that first showed me several years ago about pressing the talk button for texting” now she has “correct spelling or even near correct”

3.2.2. Motivators and facilitators

3.2.2.1. Social influence

Use group participants explained that witnessing and receiving suggestions from family and friends who use speech systems initially encouraged their own personal usage.

PF (G2): “'Okay, if they can do it, I can do it, you know?' And I started using it more”

PF (G2): “'You mean to say you don't use that?' And suddenly being introduced to it was that easy”

PF (G2): “But that is, we see them doing some? ‘Oh, okay, I can learn that’”

P4 (G1): “My daughter told me; you might as well get a phone to practice with because that's the only way you're going to learn”

P2 (G1): “Wanting to be on top of everything because everybody else is”

Use group participants also reported that family and friends had helped them learn to use speech systems.

PF (G2): “'Hey, did you know you could do this?' And you know, then [you] starting to try it, you know? Yes, I knew it, but I just never bothered to try to learn it, I guess. So, a lot of it is where other people show someone else how they can make their life easier”

3.2.2.2. Desire to learn

Participants who currently use speech systems reported that they were self-motivated to learn this technology in the beginning. Their curiosity led them to actively seek information about and initiate interactions with speech systems to further their own knowledge and facilitate their ability to use various functions.

P4 (G1): “I was curious about that button and then it had that mic on there and I wanted to see what it was going to do”

P3 (G1): “Just wanted to learn technology”

In addition, participants associated using technology with appearing more youthful throughout later stages of life.

PF (G2): “Setting up my [speech system] for me, he said 'I'm surprised that you want to learn this and you want to do this at your age'… Yeah seriously, hey, you know I'm not dead. I'm still alive”

PF (G2): “But I'm just thinking in terms of our age group that I tend to think that in a way keeps us younger as we've learned newer things that we can communicate, do things that our grandsons do or our grandchildren”

3.2.2.3. Saves time

Use group participants explained that the ability of speech systems to find information faster than they could themselves was a significant motivator for using them.

PF (G2): “I like it on [speech systems] because you just say what you're looking for and it brings it up from heaven knows how many other places that you'd have to look at one group of information and page down and find what you're looking for but if you just say into. [speech system], it does all that”

3.2.2.4. Easy to use, convenient, and easily accessible

Participants expressed excitement about speech systems being easy to use and that they only needed to perform a few steps to achieve their goals. To them, pressing a button to activate the system and/or simply speaking to the system was reported as much less demanding compared to carrying out several physical interactions with it. Participants also reported that speech systems reduce memory requirements, as they no longer had to remember a series of manual steps to achieve their goal.

PF (G2): “I think just pushing one button is far easier than having to learn a series of steps to push several time…or the number…we don’t remember the station numbers”

3.2.2.5. Cost

The relatively low cost of purchasing and using speech systems was reported as a motivator that initiated and sustained the use of speech systems by the use group participants. Some commented that companies highlight their affordability by offering the systems as part of a packaged deal.

P3 (G1): “Friends and also I see where they're trying to sell the device, the package deal”

3.2.3. Challenges and difficulties

3.2.3.1. Lack of accuracy and reliability

Participants who use speech systems explained that they occasionally experience accuracy issues with the system. This was most common when using speech-to-text translation and the system failed to recognize command syntax, especially for longer dictations, such as text messages, compared to simple command phrases. In addition, participants mentioned that systems sometimes make inaccurate assumptions about spoken references.

PF (G2): “I noticed in West Lafayette [in the State of Indiana], if you're looking for something specific and you just put in Lafayette [in the State of Indiana] nine out of 10 times, you get [Lafayette in the State of] Louisiana”

Participants also expressed concerns about speech systems misbehaving as a result of software limitations or due to network and/or device connectivity issues.

PF (G2): “I connected my phone through Bluetooth and I had an address book in there as I could say call Dave, but the system started going haywire and the screen would blink and beep and then locked up my phone … I think they tried to integrate too much software for … too many vendors… [I]t’s software problems”

3.2.3.2. Privacy and trust concerns

Participants perceived speech systems to lack privacy and were concerned about the usage and storage of credit card numbers, social security numbers, and addresses. This lowered users’ trust of speech systems.

P2 (G1): “[I] [d]on't use dates, social security, nothing addresses, no numbers”

Participants also expressed a lack of trust, particularly in the navigation function of speech systems, and reported often having to reference virtual and/or physical maps to check for accuracy.

PM (G2): “My kids look at me and say, 'Dad, why do you have this map out?' I go 'Because I want to know where I'm going'”

3.2.3.3. Limited knowledge of particular features/functions

Use group participants communicated that their limited knowledge of basic and more complex features and functions of speech systems created some usability challenges. Specifically, limited knowledge of how the microphone icon operates led to inefficient, and sometimes frustrating, interactions. While this limited knowledge does affect usability, it is not a barrier that completely prevents usage.

P3 (G1): “Well, I wondered where [the speech system] lived. If she listening, for sure, where's she at? Where'd she come from? Is she in La-la or some place? But I had no idea”

P3 (G1): “[L]ittle microphone is there, and I don't have to text to push the buttons. Push that microphone. Okay, one of the problems …it only takes a few words and then I have to push that microphone again. A few words, and then I have to push it again. So, I think there's something screw-y in my phone”

3.2.4. Barriers

3.2.4.1. Lack of knowledge

Participants acknowledged that they lacked knowledge about troubleshooting and the operations of specific command functions, which created some barriers in using speech systems to their fullest potential.

P2 (G1): “I don't understand how to work it”

P3 (G1): “Okay, I thought for sure they didn't cut the lights on, I wasn't quite sure… Oh that would be nice, it's like those lamps where you clap your hands”

3.2.4.2. Limited transparency of information

Use group participants also conveyed that speech systems are not always transparent regarding their activities. This was a particular concern when participants used the system to purchase an item. The systems do not always reveal the cost of an item to the user prior to completing the transaction.

PF (G2): “And [System X] will say, ‘Would you like to subscribe?’ But I think there's a cost and so I'm afraid to say yes, because I know there are costs for some of this, there are podcasts that are free but the way she words it, I'm afraid if I say yes, suddenly I'll have a charge on my [account] but it doesn't tell me, she [,the speech system,] doesn't say specifically there will be a charge”

3.2.5. Perception of technology

3.2.5.1. Concerns about privacy invasions

Use group participants had direct past experiences with illegal privacy invasions associated with other technologies they use. As such, they were concerned about hacking attempts to steal their personal information while using social media sites, personal email, or smartphones.

PF (G2): “I've got a similar thing from somebody that sent me a private message on Facebook several weeks ago that said, ‘Hey, there's this, you can apply for this urban housing grant.’ And she said, ‘I just got $100,000 and if you'd like, you know, here's the link’”

P3 (G1): “I would love to text to buy something over the Internet” but “[i]f I give them my debit card, or credit card, is everybody else sitting there writing my debit and credit card down? That's the reason I won't do it. I would love to do it. But I'm afraid to”

P4 (G1): “Well, they've got this thing now that they say that they don't have these calls and people can call you up and call your number and it will have everything that you know now. Social security number, phone tapped, everything”

3.2.5.2. Past technological experiences more positive

Use group participants reported many positive experiences with technology in general, such as helping them to save time, retrieving information easily, and supporting a host of other daily activities. Also, participants reported feeling confident in learning (to use) new technologies and were more patient with the limitations of technology.

PF (G2): “My first phone was a flip phone… it was only for emergencies and then it evolved into a smart phone and I learned something almost every day”

4. Discussion

The goal of this paper was to investigate the perceptions and attitudes of older adults who use versus do not use various applications of smart speech systems. Use and no-use focus groups were conducted to answer this research question. Overall, both groups were similar in terms of their subjective opinions. However, several key differences were revealed that provide insight into the particular factors that either encourage or hinder usage of smart speech systems.

4.1. Similarities between users and non-users

Surprisingly, our study found that users and non-users perceived speech systems very similarly in that the majority of the subthemes reported between the two groups were the same. In particular, both groups held positive attitudes (in the advantages and motivators/facilitators themes) regarding the assistive functionality and various benefits to society provided by these systems, as well as their ability to save time and satisfy others who use the technology. Negative attitudes, on the other hand, (in the challenges/difficulties and barriers themes) expressed by both groups included concerns related to lack of privacy and trust, lack of accuracy and reliability, and lack of knowledge regarding particular functions/features or the system as a whole.

Given the lack of studies that have compared attitudes towards speech systems between older users and non-users, we first associate our findings with prior research that explored older adult user and non-user perceptions regarding combinations of various technologies, such as phones and computers. In general, many of these previous studies also report such similarities in the perception of users and non-users for their technologies of interest (e.g., Fausset et al., Citation2013; Heinz et al., Citation2013; Hill et al., Citation2015; Mitzner et al., Citation2010). For example, positive attitudes for both groups with respect to technology as a whole included supporting everyday tasks, aiding with physical limitations, such as limited mobility (Hill et al., Citation2015), containing useful features (Mitzner et al., Citation2010), and assisting with instrumental activities of daily living (IADL) (Heinz et al., Citation2013; Hill et al., Citation2015). In contrast, common negative attitudes reported for users and non-users in these studies encompassed concerns about security (Fausset et al., Citation2013; Hill et al., Citation2015; Mitzner et al., Citation2010) and usability issues (Heinz et al., Citation2013). The fact that our study findings also reflect these same general sentiments, which are independent of any particular application or device, reveals similarities in the ways that older adults think about technologies that provide specific utilities and/or offer support for various common tasks.

We also identified unique impressions not reported by previous studies. Particularly, social influence as a motivator/facilitator and lack of accuracy and reliability as a challenge/difficulty were identified as subthemes in both the use and no-use groups. While social influence has been cited as an important factor for technology acceptance (i.e., Senior Technology Acceptance Model (STAM), Extended Technology Acceptance Model (TAM2), and Unified Theory of Acceptance and Use of Technology (UTAUT)), this particular factor did not distinguish our use and no-use group participants (Chen & Chan, Citation2014; Venkatesh et al., Citation2003). One potential reason why social influence was mentioned by both groups could be that current speech systems represent a convenient and ubiquitous technology with increasing presence throughout society, as opposed to studies that investigated older adults’ general perceptions of technologies, as not all these technologies are pervasive. With respect to the accuracy and reliability subtheme, both older adult groups were concerned that speech systems were too sensitive to word pronunciations and enunciations (compared to other technologies). They believed that a system needs to achieve a minimum word accuracy rate (WAR) in order to execute a desired command. Although system errors are not unique to speech technology, the outcomes of speech systems can be significantly more (negatively) affected by a few inaccuracies compared to other types of systems that require multiple interactions (physical steps).

Finally, one particular view that was not expressed by any of our focus group participants, but which ongoing work on conversational speech interfaces continues to investigate (Clark et al., Citation2019; Doyle et al., Citation2019), relates to anthropomorphism. This concept refers to an individual’s perception of a technology possessing human attributes (Culley & Madhavan, Citation2013). Clark et al. (Citation2019) and Doyle et al. (Citation2019) suggest that developing speech agents that emulate humanness is difficult to achieve. Thus, one reason why anthropomorphism was not mentioned in our study could be explained by the finding that users rarely assign human attributes to speech systems, but instead view them as “more formal, fact based, impersonal, and less authentic” (Doyle et al., Citation2019).

4.2. Similarities among non-users

Although the literature lacks data on the comparison of attitudes towards speech systems between older adult users and non-users, particular sentiments expressed in our study’s no-use group have also been reported in previous studies on non-users of COTS smart speech systems (i.e., Bonilla & Martin-Hammond, Citation2020; Cowan et al., Citation2017; Pradhan et al., Citation2020; Trajkova & Martin-Hammond, Citation2020). Specifically, common perceived issues discussed in our study were privacy and trust concerns (cited as privacy concerns in Bonilla & Martin-Hammond, Citation2020; trust, data privacy, transparency, and data ownership in Cowan et al., Citation2017; and privacy and security in Pradhan et al., Citation2020), lack of accuracy and reliability (recorded as reliability and consistency of performance in Cowan et al., Citation2017 and voice recognition issues in Pradhan et al., Citation2020), and lack of knowledge (reported as lack of understanding in Bonilla & Martin-Hammond, Citation2020 and very limited knowledge in Pradhan et al., Citation2020). There was significant agreement between these current and previous study findings in that privacy and trust concerns and lack of accuracy and reliability were conveyed by all of our non-users, and six out of eight non-users also reported lack of knowledge as a barrier. These results, which distinguish users and non-users, could further be partially explained by the TAM2 and UTAUT, where expectations, in terms of outcome quality and performance, and prior experience are key factors that contribute to technology use (Venkatesh & Davis, Citation2000; Venkatesh et al., Citation2003). In other words, systems that lack reliability and accuracy and/or people who do not have adequate device knowledge could both be factors that potentially hinder use.

Other similarities between the current study’s no-use group and previous investigations of older adult non-user perceptions of speech systems were found with Trajkova and Martin-Hammond (Citation2020). In our no-use group, two particular negative perceptions were noted as barriers to use: (1) lack of significant age-related decline (reported as lack of system use due to the capacity to perform task independently of speech assistance in Trajkova & Martin-Hammond, Citation2020) and (2) cost (expensive) (cited as limited benefit and essential value in Trajkova & Martin-Hammond, Citation2020). Additional discussions regarding these particular results are provided in Section 4.2. With respect to the perceived positive aspects of speech systems, the assistive functions identified in the current study were reported as common assistive functions (most popular features mentioned included music/radio, timer/reminder/alarm clock/time, and weather) in Trajkova and Martin-Hammond (Citation2020). Also, our study participants commented on the benefit to society (i.e., people with and without disabilities), which was regarded as the ability to support people with and without disabilities in Trajkova and Martin-Hammond (Citation2020).

4.2. Differences between users and non-users

While there are many similarities in the opinions of users and non-users regarding COTS speech systems in our study, several differences also exist between the two groups that may explain voluntary use as well as avoidance behavior. In some cases, differences reflect opposite perceptions between users and non-users, while in other cases, perceptions are exclusive to only one of the two groups.

4.2.1. Opposing perceptions between users and non-users

There are several differences that could partially explain why some older adults choose to use speech systems and others do not. First, the two groups had different past experiences with, and perceptions of, technology in general. The use group had a more positive perception of technology, while the no-use group’s perception was more negative. The positive perception, which may have accumulated from collective past positive experiences with a wide variety of other types of technologies, such as smartphones and smart TVs, could have led the use group to initially start to use speech systems (e.g., Davidson & Jaccard, Citation1979; Weigel & Newman, Citation1976). On the other hand, those with negative perceptions of technology may not have been encouraged to explore this particular innovation (an application of the Adaptation-Level Theory; Helson, Citation1964). An alternative explanation for this finding could be that individuals with positive attitudes towards technology may attribute this positivity to their belief in their own technological self-efficacy (Petty et al., Citation2002). In other words, older adults who voluntarily began using speech systems might have believed that they could be successful interacting with a system, whereas those who do not use this technology on their own might not have had the same level of confidence.

A related finding was that both groups expressed being interested in knowing more about speech systems. However, the no-use group emphasized a desire to be taught, whereas the use group had a desire to learn. In fact, the use group described their willingness and ability to self-learn, which was also evident in their success with having learned various other technologies. In contrast, the no-use group had comparatively fewer self-learning experiences, and explained that they “don’t know how to even start going about using it [on their own]” and experienced difficulty obtaining aid. The demographic data revealed similar ranges in education level, working status, and mental and physical self-reported health across the use and no-use groups. This similarity in the groups’ composition further suggests that the older adults who elected to use speech systems were not more capable, but likely exhibited greater self-confidence in learning to use speech systems. This interpretation is supported by the STAM, in which self-efficacy is an important factor for determining technology use among older adults.

Interestingly, cost was identified as a subtheme for both groups, but was a motivator/facilitator for the use group and a barrier for the no-use group. Specifically, the cost of owning a personal speech system was perceived as cheap for the use group and too expensive for the no-use group. The underlying reasons for these differences are not completely apparent. Aside from those built into other technologies such as smartphones, the price of speech systems can range from as low as roughly $30 to a few hundred dollars and, in some cases, systems are packaged with other products and services. However, the no-use group was not aware of this cost range, nor of the fact that many systems do not require recurring subscriptions. Alternatively, they may assign less value to smart speech systems than the use group, as suggested by Trajkova and Martin-Hammond (Citation2020). For example, given that the no-use group perceived their own lack of knowledge to be a barrier to use, had a desire to be taught, and implied difficulty obtaining help, the benefits of using speech systems could simply not outweigh the value of the resources (i.e., time, energy, and money) that need to be invested into learning and using them.

4.2.2. Perceptions exclusive to users of speech systems

The use group perceived speech systems to be convenient, easy to use, and easily accessible, which motivated their use. In theory, speech systems are increasingly ubiquitous and simply require a person to speak a command, which could explain why these feelings were expressed. As mentioned, perceived ease of use significantly contributes to technology acceptance and usage by older adults (see STAM in Chen & Chan, Citation2014). In fact, the absence of this subtheme for the no-use group could point to their lack of knowledge regarding how these systems operate. They may believe that phrases spoken into speech systems must be precise and may not be aware of efforts to improve accuracy and extend flexibility via machine learning and natural language processing.

Use group volunteers also identified limited transparency of information as a barrier for use of particular features. They explained that speech systems often complete tasks without revealing the specific steps and information used. This was especially a concern when purchasing items. This lack of transparency, in terms of not explicitly stating the cost of a product, prevented them from using the speech systems for particular tasks, such as shopping. Similar sentiments were shared regarding the sources that speech technology uses to find information. For example, when a user asks a question to a speech system about factual information, the informational sources and/or sites used by the system are not always shared with the user.

Even though both groups articulated concerns about privacy and trust regarding speech systems, the use group exclusively had concerns about privacy invasions regarding technology as a whole. Previous studies have also highlighted older adults’ concerns about privacy, especially related to theft of personal information, such as identity and bank details, and fear of cyberattacks (Fausset et al., Citation2013; Hill et al., Citation2015). To users of speech systems, privacy invasions were mainly thought of as illegally obtaining one’s digital property, such as personal data. Participants mentioned having experienced privacy invasions personally via email and social media hacks. They had also heard stories from family and friends about breaches, and read about electronic burglaries in various news/media outlets. These experiences seem to perpetuate fears about the potential of future invasions, and have resulted in their heightened awareness about such technological invasions.

Also, both groups commented that their lack of knowledge was a barrier preventing usage of several features/functions of speech systems, but the use group exclusively acknowledged that some limited knowledge was a challenge, as opposed to a barrier, to use. This lack of knowledge and/or familiarity only inhibited users from exploiting particular features/functions and/or utilizing common features/functions to their maximum potential. For example, older adult users did not currently use speech systems to adjust environmental comfort, such as lighting or temperature, because they did not know about the required hardware, software, and programming components needed to enable these functions. Thus, they considered this a barrier. In contrast, they were able to use speech-to-text applications on their smartphones, even though they lacked proper knowledge associated with the location of the microphone, the use of the microphone icon, and the length of time required to press the microphone button. In this case, this limited knowledge was considered only a challenge or difficulty, since this dictation feature was still being used.

4.2.3. Perceptions exclusive to non-users of speech systems

Finally, current non-users felt that the lack of significant age-related declines (within themselves) was a barrier, as it discouraged them from using speech systems to some degree. But, in reality, both users and non-users self-rated their mental and physical health similarly (average use group ratings = 4.41 and 3.54, respectively; average no-use group ratings = 4.06 and 3.65, respectively, on a 5-point scale). Interestingly, STAM suggests that cognitive and physical health may distinguish older adult users and non-users of a particular technology (Chen & Chan, Citation2014), but that was not the case in our study. Given the similarity between the use and no-use groups in terms of self-reported mental and physical well-being, the sentiments conveyed by the no-use group suggest that they perceive speech systems to be intended for use by individuals with such age-related declines. In other words, current older non-users do not feel compelled to use speech systems, which in reality highlights their perception of the utility of speech technology as well as the types of tasks and activities they believe it is designed to support.

4.3. Limitations

Some limitations of this study should be noted. First, even though the sample size used in this study meets the minimum requirement for focus groups (Bender & Ewbank, Citation1994; Kitzinger, Citation1996; Krueger, Citation2014; Stewart & Shamdasani, Citation2014; Twinn, Citation1998), future studies may seek to recruit a larger number of participants to increase variability among participants.

With respect to demographics, more than 70% of participants in our study were females from the Midwestern United States. Future research should build upon this work by including segments of seniors that represent more diversity in national and international demographics as well as aim to balance gender, given that gender has been shown to moderate technology acceptance and usage (e.g., Gefen & Straub, Citation1997; Venkatesh & Morris, Citation2000). Relatedly, we recommend that a more comprehensive analysis of demographics (i.e., income, education, and occupation) be performed in future research to help more granularly explain the similarities and differences between the two groups. All participants were also native English speakers, but including those who speak different languages (especially given variations in accents/dialects) may highlight differences in expectations of speech systems. Our participants also self-reported having no significant impairments in visual and motor skills. However, people who do experience age-related perceptual and physical challenges likely have different perspectives on this topic, which also need to be captured in follow-up work. Finally, we focused on older adults, but future investigations may benefit from extending this work to younger populations to highlight similarities and differences in perceptions and attitudes between the different generations.

For methodology, future researchers should employ more in-depth subjective methodologies, such as unstructured interviews, and also consider the use of complementary task evaluation methods to assess user behavior (such as cognitive walkthroughs, usability tests, A/B testing, and task analyses), which can help to provide quantitative data and, in turn, a more holistic perspective and understanding of user experiences and interactions across different speech systems beyond subjective impressions.

5. Conclusion

In summary, findings from this study align well with those from pre-existing work on the attitudes of older adult users and non-users with respect to technology in general as well as those of non-users regarding smart speech systems, in particular. First, convergence of positive attitudes identified in pre-existing studies regarding various types of technologies as well as in our speech system study between older users and non-users included: assistive functionality, societal benefits, and ability to save time, while negative perceptions included: lack of accuracy/reliability, cost, and privacy concerns, as well as users’ knowledge and self-perception that hinder usage. Second, convergence was also observed between the sentiments shared by our non-user study participants and non-users in previous studies on speech systems regarding why older adults do not use them. Common converging findings that prevented or discouraged usage included privacy and trust concerns, lack of accuracy or reliability of the speech system, and their own lack of knowledge regarding the speech system.

Our key contribution in this paper is the delineation of differences in perceptions between users and non-users of smart speech systems. Particularly, opposing perspectives between the two groups included: the desire to learn (users) versus to be taught (non-users) how to use speech systems, more positive (users) versus negative (non-users) past technological experiences, and the perceived cost of speech systems as inexpensive (users) versus expensive (non-users). In addition, older adult users exclusively believed speech systems were easy to use and access, but also expressed concerns about limited information transparency and privacy vulnerabilities. Non-users were exclusively deterred from using speech systems due to the absence of significant age-related declines.

There were also a number of similarities between our users and non-users. Converging positive perceptions between these two groups included: assistive functionality, benefit to society, the ability to save time, and recognition of widespread social influence. On the other hand, converging negative perceptions included: lack of accuracy and reliability, privacy and trust concerns, as well as one’s own lack of knowledge that prevented successful usage of individual features/functions or speech systems altogether.

While additional research is needed to more comprehensively describe the similarities and differences between older adults who use versus do not use speech systems, the current findings have managerial implications for designers and researchers in the development, evaluation, and refinement of next-generation technologies to be used by senior populations, as well as for users of speech technology. In particular, for designers, the results of this research can help to inform the artificial intelligence (AI) algorithms used to improve recognition accuracy rates, support the development of protocols and mechanisms to better protect the privacy of users, and promote the creation of instructional approaches and advertisements that showcase various use cases for speech technology. For researchers, this work can assist in developing and modifying technology acceptance and adoption models for older populations to include additional factors identified in this study. Finally, for current (and future) users, this effort could help them to better understand common difficulties in and barriers to use, and guide them in developing self-employed strategies to overcome particular interaction challenges.

Acknowledgments

The authors would like to thank the Tippecanoe Senior Center and the Friendship House Community in Lafayette/West Lafayette, IN for assisting with participant recruitment and hosting the focus group sessions.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported, in part, by the Arvin Calspan Fellowship (recipient: Lauren Werner).

Notes on contributors

Lauren Werner

Lauren Werner is a master’s thesis candidate in the School of Industrial Engineering at Purdue University specializing in Human Factors. She received a Bachelor of Science in Brain and Behavioral Science/Psychology from Purdue University in 2019.

Gaojian Huang

Gaojian Huang is an assistant professor in the Department of Industrial and Systems Engineering at San Jose State University. He earned an M.S. in Cognitive Psychology and a Ph.D. in Industrial Engineering from Purdue University in 2020 and 2021, respectively.

Brandon J. Pitts

Brandon J. Pitts is an assistant professor in the School of Industrial Engineering and a faculty associate with the Center on Aging and the Life Course (CALC) at Purdue University. He earned a Ph.D. in Industrial and Operations Engineering from the University of Michigan in 2016.

References

  • Bender, D. E., & Ewbank, D. (1994). The focus group as a tool for health research: Issues in design and analysis. Health Transition Review: The Cultural, Social, and Behavioural Determinants of Health, 4(1), 63–80.
  • Bonilla, K., Martin-Hammond, A. (2020). Older adults’ perceptions of intelligent voice assistant privacy, transparency, and online privacy guidelines. In Sixteenth Symposium on Usable Privacy and Security (SOUPS 2020). USENIX.
  • Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77–101. [Database] https://doi.org/10.1191/1478088706qp063oa
  • Chen, K., & Chan, A. H. S. (2014). Gerontechnology acceptance by elderly Hong Kong Chinese: A senior technology acceptance model (STAM). Ergonomics, 57(5), 635–652. https://doi.org/10.1080/00140139.2014.895855
  • Clark, L., Pantidi, N., Cooney, O., Doyle, P., Garaialde, D., Edwards, J., Spillane, B., Gilmartin, E., Murad, C., Munteanu, C., Wade, V., Cowan, B. R. (2019). What makes a good conversation? Challenges in designing truly conversational agents. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1–12). ACM.
  • Coughlin, J. F., D'Ambrosio, L. A., Reimer, B., & Pratt, M. R. (2007). Older adult perceptions of smart home technologies: Implications for research, policy & market innovations in healthcare. In 2007 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (pp. 1810–1815). IEEE.
  • Cowan, B. R., Pantidi, N., Coyle, D., Morrissey, K., Clarke, P., Al-Shehri, S., Earley, D., Bandeira, N. (2017). “What can I help you with?” Infrequent users' experiences of intelligent personal assistants. In Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services (pp. 1–12). ACM.
  • Culley, K. E., & Madhavan, P. (2013). A note of caution regarding anthropomorphism in HCI agents. Computers in Human Behavior, 29(3), 577–579. https://doi.org/10.1016/j.chb.2012.11.023
  • Davidson, A. R., & Jaccard, J. J. (1979). Variables that moderate the attitude–behavior relation: Results of a longitudinal survey. Journal of Personality and Social Psychology, 37(8), 1364–1376. https://doi.org/10.1037/0022-3514.37.8.1364
  • Doyle, P. R., Edwards, J., Dumbleton, O., Clark, L., & Cowan, B. R. (2019). Mapping perceptions of humanness in intelligent personal assistant interaction. In Proceedings of the 21st International Conference on Human-Computer Interaction with Mobile Devices and Services (pp. 1–12). ACM.
  • Fausset, C. B., Harley, L., Farmer, S., & Fain, B. (2013). Older adults’ perceptions and use of technology: A novel approach. In International Conference on Universal Access in Human-Computer Interaction (pp. 51–58). Heidelberg.
  • Gefen, D., & Straub, D. W. (1997). Gender differences in the perception and use of e-mail: An extension to the technology acceptance model. MIS Quarterly, 21(4), 389–400. https://doi.org/10.2307/249720
  • Hayllar, W., & Coode, M. (2018). The talking shop (p. 3). OC&C Strategy Consultants.
  • Heinz, M., Martin, P., Margrett, J. A., Yearns, M., Franke, W., Yang, H. I., Wong, J., & Chang, C. K. (2013). Perceptions of technology among older adults. Journal of Gerontological Nursing, 39(1), 42–51.
  • Helson, H. (1964). Adaptation-level theory: An experimental and systematic approach to behavior. Harper and Row.
  • Hennink, M. M., Kaiser, B. N., & Marconi, V. C. (2017). Code saturation versus meaning saturation: How many interviews are enough? Qualitative Health Research, 27(4), 591–608.
  • Hill, R., Betts, L. R., & Gardner, S. E. (2015). Older adults' experiences and perceptions of digital technology: (Dis)empowerment, wellbeing, and inclusion. Computers in Human Behavior, 48, 415–423. https://doi.org/10.1016/j.chb.2015.01.062
  • Hsieh, H. F., & Shannon, S. E. (2005). Three approaches to qualitative content analysis. Qualitative Health Research, 15(9), 1277–1288. [Database] https://doi.org/10.1177/1049732305276687
  • Kalasky, M. A., Czaja, S. J., Sharit, J., Nair, S. N. (1999). Is speech recognition technology robust for older populations? In Proceedings of the 43rd Annual Meeting of the Human Factors and Ergonomics Society, Santa Monica, CA (pp. 123–127). SAGE. https://doi.org/10.1177/154193129904300206
  • Karat, C., Lai, J., Stewart, O., & Yankelovich, N. (2012). Speech and Language Interfaces, Applications, and Technologies. In J. A. Jacko (Ed.) The Human-Computer Interaction Handbook: Fundamentals, Evolving, Technologies, and Emerging Application (pp. 367–386). CRC press.
  • Kim, S., & Choudhury, A. (2021). Exploring older adults' perception and use of smart speaker-based voice assistants: A longitudinal study. Computers in Human Behavior, 124. https://doi.org/10.1016/j.chb.2021.106914
  • Kitzinger, J. (1996). Introducing focus groups. In N. Mays & C. Pope (Eds.), Qualitative research in health care (pp. 36–45). B. M. J. Publishing Group.
  • Krueger, R. A. (2014). Focus groups: A practical guide for applied research. SAGE publications.
  • Lawton, M. P., & Brody, E. M. (1969). Assessment of older people: Self-maintaining and instrumental activities of daily living. The Gerontologist, 9(3), 179–186.
  • Lees, F. D., Clark, P. G., Nigg, C. R., & Newman, P. (2005). Barriers to exercise behavior among older adults: A focus-group study. Journal of Aging and Physical Activity, 13(1), 23–33.
  • Mitzner, T. L., Boron, J. B., Fausset, C. B., Adams, A. E., Charness, N., Czaja, S. J., Dijkstra, K., Fisk, A. D., Rogers, W. A., & Sharit, J. (2010). Older adults talk technology: Technology usage and attitudes. Computers in Human Behavior, 26(6), 1710–1721.
  • Mitzner, T. L., Fausset, C. B., Boron, J. B., Adams, A. E., Dijkstra, K., Lee, C. C., Rogers, W. A., Fisk, A. D. (2008). Older adults' training preferences for learning to use technology. In Proceedings of the 52nd Annual Meeting of the Human Factors and Ergonomics Society, New York, NY (pp. 2047–2051). Human Factors & Ergonomics Society.
  • Petty, R. E., Briñol, P., & Tormala, Z. L. (2002). Thought confidence as a determinant of persuasion: The self-validation hypothesis. Journal of Personality and Social Psychology, 82(5), 722–741.
  • Pradhan, A., Lazar, A., & Findlater, L. (2020). Use of intelligent voice assistants by older adults with low technology use. ACM Transactions on Computer-Human Interaction, 27(4), 1–27. https://doi.org/10.1145/3373759
  • Ravichander, V., Steve, R., Joe, F. (2010). Ageing voices: The effect of changes in voice parameters on ASR performance. EURASIP Journal on Audio, Speech, and Music Processing.
  • Satchell, C., & Dourish, P. (2009, November). Beyond the user: use and non-use in HCI. In Proceedings of the 21st annual conference of the Australian computer-human interaction special interest group: Design: Open 24/7 (pp. 9–16).
  • Selwyn, N. (2006). Digital division or digital decision? A study of non-users and low-users of computers. Poetics, 34(4-5), 273–292. https://doi.org/10.1016/j.poetic.2006.05.003
  • Stewart, D. W., & Shamdasani, P. N. (2014). Focus groups: Theory and practice (Vol. 20). Sage Publications.
  • Stigall, B., Waycott, J., Baker, S., & Caine, K. (2019). Older adults' perception and use of voice user interfaces: A preliminary review of the computing literature. In Proceedings of the 31st Australian Conference on Human-Computer-Interaction (pp. 423–427). ACM.
  • Trajkova, M., Martin-Hammond, A. (2020). “Alexa is a toy”: Exploring older adults' reasons for using, limiting, and abandoning echo. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1–13). ACM.
  • Twinn, S. (1998). An analysis of the effectiveness of focus groups as a method of qualitative data collection with Chinese populations in nursing research. Journal of Advanced Nursing, 28(3), 654–661. https://doi.org/10.1046/j.1365-2648.1998.00708.x
  • United Nations. (2017). World population ageing 2017: Highlights. Department of Economic and Social Affairs, United Nations.
  • Venkatesh, V., & Davis, F. D. (2000). A theoretical extension of the technology acceptance model: Four longitudinal field studies. Management Science, 46(2), 186–204. https://doi.org/10.1287/mnsc.46.2.186.11926
  • Venkatesh, V., & Morris, M. G. (2000). Why don't men ever stop to ask for directions? Gender, social influence, and their role in technology acceptance and usage behavior. MIS Quarterly, 24(1), 115–139. https://doi.org/10.2307/3250981
  • Venkatesh, V., Morris, M. G., Davis, G. B., & Davis, F. D. (2003). User acceptance of information technology: Toward a unified view. MIS Quarterly: Management Information Systems, 27(3), 425–478. https://doi.org/10.2307/30036540
  • Vipperla, R. (2011). Automatic speech recognition for ageing voices [Doctoral dissertation]. University of Edinburgh.
  • Vipperla, R., Renals, S., Frankel, J. (2008). Longitudinal study of ASR performance on ageing voices. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 2550–2553). SJR.
  • Weigel, R. H., & Newman, L. S. (1976). Increasing attitude-behavior correspondence by broadening the scope of the behavioral measure. Journal of Personality and Social Psychology, 33(6), 793–802. https://doi.org/10.1037/0022-3514.33.6.793
  • Werner, L., Huang, G., Pitts, B. J. (2019). Automated speech recognition systems and older adults: A literature review and synthesis. In Proceedings of the 63rd Annual Meeting of the Human Factors and Ergonomics Society, Seattle, WA (pp. 42–46). HFES. https://doi.org/10.1177/1071181319631121