
Communications of the ACM


Dangerous Skills Got Certified: Measuring the Trustworthiness of Skill Certification in Voice Personal Assistant Platforms

With the emergence of the voice personal assistant (VPA) ecosystem, third-party developers can build new voice-apps, called skills on the Amazon Alexa platform and actions on the Google Assistant platform, and publish them to the skills store, which greatly extends the functionality of VPAs. (For brevity, we use the term skills for voice-apps on both platforms, including Amazon skills and Google actions, unless we need to distinguish them.) Before a new skill becomes publicly available, it must pass a certification process that verifies it meets the necessary content and privacy policies. The trustworthiness of skill certification is of significant importance to platform providers, developers, and end users. Yet little is known about how difficult it is for a policy-violating skill to get certified and published on VPA platforms. In this work, we study the trustworthiness of skill certification on the Amazon Alexa and Google Assistant platforms to answer three key questions: 1) Is the skill certification process trustworthy in terms of catching policy violations in third-party skills? 2) Do policy-violating skills exist in the skills stores? 3) What are VPA users' perspectives on skill certification, and what vulnerable usage behaviors do they exhibit when interacting with VPA devices? Over a span of 15 months, we crafted and submitted for certification 234 Amazon Alexa skills and 381 Google Assistant actions that intentionally violate the content and privacy policies specified by the VPA platforms. Surprisingly, all 234 (100%) policy-violating Alexa skills and 148 (39%) policy-violating Google actions were certified. Our analysis demonstrates that policy-violating skills exist in the current skills stores, and thus users (children, in particular) are at risk when using VPA services.
We conducted a user study with 203 participants to understand users' misplaced trust in VPA platforms. Unfortunately, the skill certification in leading VPA platforms does not meet user expectations.


The Interactive Enactment of Care Technologies and its Implications for Human-Robot-Interaction in Care

Various technical innovations for the care sector, particularly robots, are being developed to cope with demographic change and to support nursing staff. A central issue for the successful integration of such technology into gerontological care practices has not yet been appropriately addressed from an HCI perspective. Here, we draw on observations of lifting devices used to move residents between beds and chairs. We found that this process is always moderated and facilitated by caregivers’ ‘interaction work’: the function(ing) of care technology is inseparable from the interactive practices of care staff enacting these functions and from the emotional labor inherent to care practice. The caregivers’ verbal, manual, and emotional actions, as well as the residents’ active cooperation in the process, are important factors for safe, fluid, and pleasant human-machine interaction. We propose understanding such technical care settings as a triadic interaction, and taking account of this in the future design of care technologies, in particular robotic solutions.


PrintARface: Supporting the Exploration of Cyber-Physical Systems through Augmented Reality

The increasing functionality and close integration of hardware and software in modern cyber-physical systems present users with distinct challenges in applying and, especially, appropriating those systems within their practices. Existing approaches to design for appropriation and the development of sociable technologies, which might support users seeking to understand how to make such technologies work in a specific practice, often lack appropriate user interfaces to explain the internal and environment-related behavior of a technology. Taking the example of 3D printing, we examine how augmented reality can be used as a novel human–machine interface to ease the way for hardware-related appropriation support. In this paper, we designed, implemented, and evaluated a prototype called PrintARface that extends a physical 3D printer with virtual components. Reflections on the evaluation of our prototype provide insights that foster the development of hardware-related appropriation support through augmented reality-based human–machine interfaces.


Beyond Health Literacy: Navigating Boundaries and Relationships During High-risk Pregnancies: Challenges and Opportunities for Digital Health in North-West India

Few studies in HCI4D have examined the lived experiences of women with pregnancy complications. We conducted a qualitative study with 15 pregnant women to gain an in-depth understanding of the context in which pregnancy takes place and of everyday experiences of living with complications in rural North-West India. To complement our interviews, we conducted six focus groups with three pregnant women, three community health workers, and three members of an NGO. Our study reveals insights into the challenges and experiences of pregnant women with complications as they navigate the physical, spatial, social, and emotional aspects of antenatal care within the complex and contradictory structures and settings of their everyday lives. We argue that the design of digital health in support of pregnancy care for the Global South must center on supporting the navigational work done by pregnant women and their families.

We summarize research in two areas: an overview of public health strategies and challenges to improve maternal health in India, and digital health in the Global South, with a focus on the Indian context.


Effects of Visual Locomotion and Tactile Stimuli Duration on the Emotional Dimensions of the Cutaneous Rabbit Illusion

In this study, we assessed the emotional dimensions (valence, arousal, and dominance) of the multimodal visual-cutaneous rabbit effect. Simultaneously with the tactile bursts on the forearm, visual silhouettes of saltatorial animals (rabbit, kangaroo, spider, grasshopper, frog, and flea) were projected on the left arm. Additionally, there were two locomotion conditions: taking off and landing. The results showed that the valence dimension (happy-unhappy) was affected only by the visual stimuli, with no effect of the tactile conditions or the locomotion phases. Arousal (excited-calm) showed a significant difference across the three tactile conditions, with an interaction effect with the locomotion condition. Arousal scores were higher when the taking-off condition was associated with the intermediate duration (24 ms) and when the landing condition was associated with either the shortest duration (12 ms) or the longest duration (48 ms). There was no effect for the dominance dimension. Consistent with our previous results, the valence dimension seems to be highly affected by visual information, reducing any effect of tactile information, while touch can modulate the arousal dimension. These findings can benefit the design of multimodal interfaces for virtual or augmented reality.


Enhancing Affect Detection in Game-Based Learning Environments with Multimodal Conditional Generative Modeling

Accurately detecting and responding to student affect is a critical capability for adaptive learning environments. Recent years have seen growing interest in modeling student affect with multimodal sensor data. A key challenge in multimodal affect detection is dealing with data loss due to noisy, missing, or invalid multimodal features. Because multimodal affect detection often requires large quantities of data, data loss can have a strong, adverse impact on affect detector performance. To address this issue, we present a multimodal data imputation framework that utilizes conditional generative models to automatically impute posture and interaction log data from student interactions with a game-based learning environment for emergency medical training. We investigate two generative models, a Conditional Generative Adversarial Network (C-GAN) and a Conditional Variational Autoencoder (C-VAE), that are trained using a modality that has undergone varying levels of artificial data masking. The generative models are conditioned on the corresponding intact modality, enabling the data imputation process to capture the interaction between the concurrent modalities. We examine the effectiveness of the conditional generative models on imputation accuracy and their impact on the performance of affect detection. Each imputation model is evaluated using varying amounts of artificial data masking to determine how data missingness impacts the performance of each imputation method. Results based on the modalities captured from students' interactions with the game-based learning environment indicate that deep conditional generative models within a multimodal data imputation framework yield significant benefits compared to baseline imputation techniques in terms of both imputation accuracy and affect detector performance.
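The conditioning idea above can be illustrated with a much simpler stand-in. In this sketch, one synthetic modality is artificially masked and then imputed from the intact modality with a linear fit; the paper uses C-GAN/C-VAE models, so the data, the 30% masking rate, and the linear model here are all illustrative assumptions, not the authors' method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic "modalities": A (intact, e.g. interaction logs) and
# B (partially masked, e.g. posture features), correlated by construction.
n = 1000
A = rng.normal(size=(n, 1))
B = 2.0 * A[:, 0] + 0.1 * rng.normal(size=n)   # B depends on A plus noise

# Artificial data masking: hide 30% of B, mimicking the evaluation setup.
mask = rng.random(n) < 0.3
B_obs = B.copy()
B_obs[mask] = np.nan

# Baseline imputation: fill masked entries with the mean of observed values.
baseline = B_obs.copy()
baseline[mask] = np.nanmean(B_obs)

# Conditional imputation: fit B ~ A on observed rows, predict masked rows.
# (A linear stand-in for conditioning a generative model on the intact modality.)
obs = ~mask
w, b = np.polyfit(A[obs, 0], B_obs[obs], 1)
conditional = B_obs.copy()
conditional[mask] = w * A[mask, 0] + b

rmse = lambda est: np.sqrt(np.mean((est[mask] - B[mask]) ** 2))
print(f"baseline RMSE:    {rmse(baseline):.3f}")
print(f"conditional RMSE: {rmse(conditional):.3f}")
```

Because the masked modality is strongly coupled to the intact one, conditioning on it recovers the missing values far more accurately than the unconditional baseline, which is the effect the paper reports for its deep generative imputers.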


Counterweight: Diversifying News Consumption

The bias of news articles can strongly affect the opinions and behaviors of readers, especially if they do not consume sets of articles that represent diverse political perspectives. To mitigate media bias and diversify news consumption, we developed Counterweight---a browser extension that presents different perspectives by recommending articles relevant to the current topic. We provide a platform to encourage a more diversified consumption of news and mitigate the negative effects of media bias.
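As a rough sketch of the counter-perspective selection such an extension might perform: given articles tagged with a topic and a bias score, recommend same-topic articles whose bias is farthest from the article being read. The catalog, bias scores, and selection heuristic below are hypothetical, not Counterweight's actual method:

```python
from dataclasses import dataclass

@dataclass
class Article:
    title: str
    topic: str
    bias: float   # -1.0 (left) .. +1.0 (right); scores are hypothetical

CATALOG = [
    Article("Tax plan praised", "economy", -0.8),
    Article("Tax plan criticized", "economy", 0.7),
    Article("Budget hearing recap", "economy", 0.1),
    Article("New stadium opens", "sports", 0.0),
]

def counterweight(current: Article, catalog: list, k: int = 2) -> list:
    """Recommend up to k same-topic articles, farthest from the current bias."""
    candidates = [a for a in catalog if a.topic == current.topic and a is not current]
    return sorted(candidates, key=lambda a: abs(a.bias - current.bias), reverse=True)[:k]

reading = CATALOG[0]
for rec in counterweight(reading, CATALOG):
    print(rec.title, rec.bias)
```

A real extension would additionally need topic matching on live pages and bias scores from an external source; this sketch only shows the selection step.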


Multi-Modal Repairs of Conversational Breakdowns in Task-Oriented Dialogs

A major problem in task-oriented conversational agents is the lack of support for the repair of conversational breakdowns. Prior studies have shown that current repair strategies for these kinds of errors are often ineffective due to: (1) the lack of transparency about the state of the system's understanding of the user's utterance; and (2) the system's limited capabilities to understand the user's verbal attempts to repair natural language understanding errors. This paper introduces SOVITE, a new multi-modal speech plus direct manipulation interface that helps users discover, identify the causes of, and recover from conversational breakdowns using the resources of existing mobile app GUIs for grounding. SOVITE displays the system's understanding of user intents using GUI screenshots, allows users to refer to third-party apps and their GUI screens in conversations as inputs for intent disambiguation, and enables users to repair breakdowns using direct manipulation on these screenshots. The results from a remote user study with 10 users using SOVITE in 7 scenarios suggested that SOVITE's approach is usable and effective.


Direction-of-Voice (DoV) Estimation for Intuitive Speech Interaction with Smart Devices Ecosystems

Future homes and offices will feature increasingly dense ecosystems of IoT devices, such as smart lighting, speakers, and domestic appliances. Voice input is a natural candidate for interacting with out-of-reach and often small devices that lack full-sized physical interfaces. However, at present, voice agents generally require wake-words and device names in order to specify the target of a spoken command (e.g., 'Hey Alexa, kitchen lights to full brightness'). In this research, we explore whether speech alone can be used as a directional communication channel, in much the same way visual gaze specifies a focus. Instead of a device's microphones simply receiving and processing spoken commands, we suggest they also infer the Direction of Voice (DoV). Our approach innately enables voice commands with addressability (i.e., devices know if a command was directed at them) in a natural and rapid manner. We quantify the accuracy of our implementation across users, rooms, spoken phrases, and other key factors that affect performance and usability. Taken together, we believe our DoV approach demonstrates feasibility and the promise of making distributed voice interactions much more intuitive and fluid.
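The abstract does not describe the DoV estimation method itself. As a simpler, related building block, the sketch below estimates a sound's direction of arrival from the time difference between two microphones via cross-correlation; the mic spacing, sample rate, and noise signal standing in for speech are all assumptions for illustration:

```python
import numpy as np

SPEED_OF_SOUND = 343.0    # m/s
MIC_SPACING = 0.1         # m between the two microphones (assumed geometry)
SAMPLE_RATE = 48_000      # Hz

def simulate_pair(signal, angle_deg):
    """Return (left, right) where the right channel is delayed by the extra
    travel time for a source at angle_deg (0 = directly facing the mic pair)."""
    tdoa = MIC_SPACING * np.sin(np.radians(angle_deg)) / SPEED_OF_SOUND
    shift = int(round(tdoa * SAMPLE_RATE))
    return signal, np.roll(signal, shift)

def estimate_angle(left, right):
    """Recover the angle from the lag that maximizes cross-correlation."""
    corr = np.correlate(right, left, mode="full")
    lag = np.argmax(corr) - (len(left) - 1)   # lag in samples
    tdoa = lag / SAMPLE_RATE
    sin_theta = np.clip(tdoa * SPEED_OF_SOUND / MIC_SPACING, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))

rng = np.random.default_rng(1)
speech = rng.normal(size=4800)    # 0.1 s of noise standing in for speech
left, right = simulate_pair(speech, angle_deg=30.0)
print(f"estimated angle: {estimate_angle(left, right):.1f} deg")
```

Note that direction of arrival (where the sound comes from) differs from the paper's direction of voice (whether the talker faces the device); the latter requires additional acoustic cues beyond inter-microphone delay.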


SurfaceFleet: Exploring Distributed Interactions Unbounded from Device, Application, User, and Time

Knowledge work increasingly spans multiple computing surfaces. Yet in status quo user experiences, content as well as tools, behaviors, and workflows are largely bound to the current device: running the current application, for the current user, and at the current moment in time. SurfaceFleet is a system and toolkit that uses resilient distributed programming techniques to explore cross-device interactions that are unbounded in these four dimensions of device, application, user, and time. As a reference implementation, we describe an interface built using SurfaceFleet that employs lightweight, semi-transparent UI elements known as Applets. Applets appear always on top of the operating system, application windows, and (conceptually) above the device itself. All connections and synchronized data are virtualized and made resilient through the cloud. For example, a sharing Applet known as a Portfolio allows a user to drag and drop unbound Interaction Promises into a document. Such promises can then be fulfilled with content asynchronously, at a later time (or multiple times), from another device, and by the same or a different user.