
Communications of the ACM


"Nobody Speaks that Fast!" An Empirical Study of Speech Rate in Conversational Agents for People with Vision Impairments

The number of people with vision impairments using Conversational Agents (CAs) has increased because of the technology's potential to support them. Because many visually impaired people are accustomed to understanding fast speech, most screen readers and voice assistant systems offer speech rate settings. However, current CAs are designed to interact at a human-like speech rate without considering accessibility. In this study, we sought to understand how people with vision impairments use CAs at fast speech rates. We conducted a 20-day in-home study that examined the CA use of 10 visually impaired people at default and fast speech rates. We investigated the differences in visually impaired people's CA use at different speech rates and their perceptions of the CA at each rate. Based on these findings, we suggest considerations for the future design of CA speech rates for people with visual impairments.
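
Speech rate settings like those mentioned above are commonly exposed to TTS engines through SSML's prosody element. The helper below is an illustrative sketch (the function name and default are invented here, not part of the study):

```python
def with_rate(text: str, rate: str = "fast") -> str:
    """Wrap text in SSML so a TTS engine speaks it at the given rate.

    SSML prosody rates include x-slow, slow, medium, fast, x-fast,
    or a relative percentage such as "150%".
    """
    return f'<speak><prosody rate="{rate}">{text}</prosody></speak>'

# e.g. with_rate("Here is the weather.", "150%")
```

Whether a given CA honors such markup, and which rates it accepts, depends on the platform.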


Effects of Persuasive Dialogues: Testing Bot Identities and Inquiry Strategies

Intelligent conversational agents, or chatbots, can take on various identities and are increasingly engaging in more human-centered conversations with persuasive goals. However, little is known about how identities and inquiry strategies influence a conversation's effectiveness. We conducted an online study in which 790 participants interacted with a chatbot that attempted to persuade them to donate to charity. We designed a 2 × 4 factorial experiment (two chatbot identities and four inquiry strategies) in which participants were randomly assigned to conditions. Findings showed that the perceived identity of the chatbot had significant effects on the persuasion outcome (i.e., donation) and on interpersonal perceptions (i.e., competence, confidence, warmth, and sincerity). Further, we identified interaction effects between perceived identities and inquiry strategies. We discuss the theoretical and practical implications of the findings for developing ethical and effective persuasive chatbots. Our published data, code, and analyses serve as a first step toward building competent, ethical persuasive chatbots.
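
Random assignment across a two-by-four factorial design can be sketched as follows; the identity and strategy labels are placeholders, since the abstract does not enumerate the actual conditions:

```python
import itertools
import random

# Placeholder condition labels; the study's actual identities and
# inquiry strategies are not enumerated in the abstract.
identities = ["identity_A", "identity_B"]
strategies = ["strategy_1", "strategy_2", "strategy_3", "strategy_4"]

# The 2 x 4 factorial design yields 8 experimental conditions.
conditions = list(itertools.product(identities, strategies))

def assign(participant_ids, seed=42):
    """Randomly assign each participant to one of the conditions."""
    rng = random.Random(seed)
    return {pid: rng.choice(conditions) for pid in participant_ids}

assignment = assign(range(790))
```

With uniform random choice each cell receives roughly 790 / 8 ≈ 99 participants in expectation; a real study might instead use blocked randomization to balance cell sizes exactly.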


Read Between the Lines: An Empirical Measurement of Sensitive Applications of Voice Personal Assistant Systems

Voice Personal Assistant (VPA) systems such as Amazon Alexa and Google Home are used by tens of millions of households. Recent work demonstrated proof-of-concept attacks against their voice interfaces to invoke unintended applications or operations. However, there is still a lack of empirical understanding of what types of third-party applications VPA systems support and what consequences these attacks may cause. In this paper, we perform an empirical analysis of the third-party applications of Amazon Alexa and Google Home to systematically assess the attack surfaces. A key part of our methodology is to characterize a given application by classifying the sensitive voice commands it accepts. We develop a natural language processing tool that classifies a given voice command along two dimensions: (1) whether the voice command is designed to inject an action or retrieve information; (2) whether the command is sensitive or nonsensitive. The tool combines a deep neural network and a keyword-based model, and uses Active Learning to reduce the manual labeling effort. The sensitivity classification is based on a user study (N=404) in which we measure the perceived sensitivity of voice commands. A ground-truth evaluation shows that our tool achieves over 95% accuracy for both types of classification. We apply this tool to analyze 77,957 Amazon Alexa applications and 4,813 Google Home applications (198,199 voice commands from Amazon Alexa, 13,644 voice commands from Google Home) over two years (2018-2019). In total, we identify 19,263 sensitive “action injection” commands and 5,352 sensitive “information retrieval” commands. These commands come from 4,596 applications (5.55% of all applications), most of which belong to the “smart home” category. While the percentage of sensitive applications is small, we show that it increased from 2018 to 2019.
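
The keyword-based component of such a two-dimensional classifier might look like the sketch below. The keyword lists are invented for illustration; the paper's deep neural network, active-learning loop, and actual lexicon are not reproduced here.

```python
# Invented keyword lists for illustration only.
ACTION_KEYWORDS = {"turn", "unlock", "open", "set", "start", "stop"}
RETRIEVAL_KEYWORDS = {"what", "when", "tell", "check", "read"}
SENSITIVE_KEYWORDS = {"door", "lock", "camera", "alarm", "payment"}

def classify(command: str) -> tuple[str, str]:
    """Classify a voice command along two dimensions:
    intent (action injection vs. information retrieval) and sensitivity."""
    words = set(command.lower().split())
    if words & ACTION_KEYWORDS:
        intent = "action"
    elif words & RETRIEVAL_KEYWORDS:
        intent = "retrieval"
    else:
        intent = "unknown"
    sensitivity = "sensitive" if words & SENSITIVE_KEYWORDS else "nonsensitive"
    return intent, sensitivity

print(classify("unlock the front door"))  # ('action', 'sensitive')
```

In practice a purely lexical model like this misses paraphrases, which is presumably why the authors pair it with a neural model.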


How Do We Create a Fantabulous Password?

Although pronounceability can improve password memorability, most existing password generation approaches have not properly integrated the pronounceability of passwords into their designs. In this work, we demonstrate several shortfalls of current pronounceable password generation approaches and then propose ProSemPass, a new method of generating passwords that are pronounceable and semantically meaningful. In our approach, users supply initial input words, and our system improves the pronounceability and meaning of the user-provided words by automatically creating a portmanteau. To measure the strength of our approach, we use attacker models in which attackers have complete knowledge of our password generation algorithms. We measure strength in guess numbers and compare them with those of other existing password generation approaches. In a large-scale IRB-approved user study with 1,563 Amazon MTurkers across 9 different conditions, our approach achieves 30% higher recall than current pronounceable password approaches and is stronger than the offline guessing attack limit.
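
A portmanteau blends two words, typically at a shared sound or letter overlap. The function below is a toy illustration of that general idea, not ProSemPass's actual algorithm, which also optimizes pronounceability and semantic meaning:

```python
def portmanteau(w1: str, w2: str) -> str:
    """Blend w1 and w2 at the longest suffix/prefix letter overlap;
    fall back to joining the first half of w1 with the last half of w2."""
    for k in range(min(len(w1), len(w2)), 0, -1):
        if w1.endswith(w2[:k]):
            return w1 + w2[k:]
    return w1[: len(w1) // 2 + 1] + w2[len(w2) // 2 :]

# e.g. portmanteau("spoon", "fork") -> "spork"
```

A real generator would also score candidate blends for pronounceability (e.g., against phoneme models) rather than blending on spelling alone.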


Designing self-monitoring data for chronic care

When a self-monitoring tool is developed and implemented in chronic care nurses' work, it changes the way nurses accomplish their work, creating new requirements. This article is based on a design ethnographic study that helps us understand the implications of these changes.


Interpretable subgroup discovery in treatment effect estimation with application to opioid prescribing guidelines

The dearth of prescribing guidelines for physicians is one key driver of the current opioid epidemic in the United States. In this work, we analyze medical and pharmaceutical claims data to draw insights on characteristics of patients who are more prone to adverse outcomes after an initial synthetic opioid prescription. Toward this end, we propose a generative model that allows discovery from observational data of subgroups that demonstrate an enhanced or diminished causal effect due to treatment. Our approach models these sub-populations as a mixture distribution, using sparsity to enhance interpretability, while jointly learning nonlinear predictors of the potential outcomes to better adjust for confounding. The approach leads to human interpretable insights on discovered subgroups, improving the practical utility for decision support.


A smart speaker performance measurement tool

Voice-controlled virtual assistants (VAs) in smart speakers and smartphones have recently become popular. Because a VA provides interactive services by executing complicated processes such as speech recognition, natural language understanding, service invocation, and TTS generation, its functions are performed in the cloud. However, we do not know why the response time of voice commands is slow or where the performance bottleneck of the VA service lies. In this paper, we present a comprehensive VA performance measurement framework that analyzes timing events and response times by processing audio, video, and packets. From experiments with 414 voice commands on five smart speakers and 178 commands on two smartphone VAs, we observed that 24.9% of voice commands are completed within two seconds and 63.2% within three seconds; the 36.8% of voice commands that take longer than three seconds result in poor user experiences. In particular, 96.2% of music commands and 66.7% of IoT control commands show slow response times of longer than three seconds. We found that our performance measurement tool is useful for identifying slow services, such as music and news, whose delays stem from extracting the user intent from the voice command, the content app startup delay, and the initial playback time. Our tool also shows that IoT control with a smart speaker produces slow response times.
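
Once per-command response times have been measured, summary statistics like the "within two seconds" and "within three seconds" fractions above reduce to a simple threshold count. A minimal sketch, with hypothetical measurements rather than the paper's data:

```python
def response_time_summary(times, thresholds=(2.0, 3.0)):
    """Return the fraction of commands completed within each
    threshold (in seconds)."""
    n = len(times)
    return {t: sum(x <= t for x in times) / n for t in thresholds}

# Hypothetical per-command response times in seconds, for illustration.
times = [1.2, 1.8, 2.4, 2.9, 3.5, 4.1, 2.1, 1.5]
summary = response_time_summary(times)
```

The harder part, which the framework addresses, is obtaining accurate start and end timestamps in the first place by correlating audio, video, and packet traces.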


Ergonomic Adaptation of Robotic Movements in Human-Robot Collaboration

Musculoskeletal Disorders (MSDs) are common occupational diseases. An interesting research question is whether collaborative robots can actively minimise the risk of MSDs during collaboration. In this work, ergonomic adaptation of robotic movements during human-robot collaboration is explored in a first test case, namely, adjustment of work surface height. Vision-based markerless posture estimation is used as input, in combination with ergonomic assessment methods, to adapt robotic movements in order to provide better ergonomic conditions for the human worker.
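
As a toy illustration of posture-driven adaptation (not the authors' method or their assessment model), one might nudge the work surface toward the worker's estimated elbow height, a common ergonomic guideline, in bounded steps:

```python
def adjust_height(current_mm: float, elbow_mm: float,
                  tolerance_mm: float = 30.0,
                  max_step_mm: float = 50.0) -> float:
    """Suggest the next work surface height so it converges toward
    elbow level, moving at most max_step_mm per adjustment."""
    delta = elbow_mm - current_mm
    if abs(delta) <= tolerance_mm:
        return current_mm  # already within tolerance: no adjustment
    step = max(-max_step_mm, min(max_step_mm, delta))
    return current_mm + step
```

Bounding the step size keeps robot-driven adjustments slow and predictable while the posture estimate, which is noisy frame to frame, settles.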


Assumptions Checked: How Families Learn About and Use the Echo Dot

Users of voice assistants often report that they fall into patterns of using their device for a limited set of interactions, like checking the weather and setting alarms. However, it's not clear if limited use is, in part, due to lack of learning about the device's functionality. We recruited 10 diverse families to participate in a one-month deployment study of the Echo Dot, enabling us to investigate: 1) which features families are aware of and engage with, and 2) how families explore, discover, and learn to use the Echo Dot. Through audio recordings of families' interactions with the device and pre- and post-deployment interviews, we find that families' breadth of use decreases steadily over time and that families learn about functionality through trial and error, asking the Echo Dot about itself, and through outside influencers such as friends and family. Formal outside learning influencers, such as manufacturer emails, are less influential. Drawing from diffusion of innovation theory, we describe how a home-based voice interface might be positioned as a near-peer to the user, and that by describing its own functionality using just-in-time learning, the home-based voice interface becomes a trustworthy learning influencer from which users can discover new functionalities.