Communication is a vital part of being: a means to effect change in the world, speak with loved ones, or simply get our needs and wants met. However, it can be challenging for people with communication impairments, a growing proportion of the population who might experience difficulties because of a stroke, autism, or sensory challenges. A growing field intends to use AI to augment language to support these users; a key tension, however, is whether this can be done without reducing their autonomy. Here, I discuss a piece of recent research that explores how AI can use photos to support communication.
People with communication impairments are underserved due to the nature of their disability. It is important to note that communication is not solely verbal (for example, spoken or written) but is often augmented by a complex mixture of body language, gesture, tone of voice, and objects or devices in our environment. Innovations in recent decades have seen digital Augmentative and Alternative Communication (AAC) devices move into commonplace use, with prominent users such as the late astrophysicist Stephen Hawking. As communication has come to be supported by electronic means in the form of AAC devices, technologists have taken on an increasingly important role in supporting it, a responsibility we should take seriously.
Until recently, AAC devices were often bespoke to each user, relatively bulky, and prohibitively expensive. However, tablets and smartphones have become popular platforms of choice for those seeking to support their communication. Such devices have arguably ‘democratized’ augmented communication, allowing for similar results at a fraction of former costs. Additionally, features of these devices might be appropriated, for instance, to help a person tell a story about their day by showing the pictures they took. Bespoke apps that use interactive grids of symbols can likewise support verbal output. Consequently, there is a growing research field that aims to improve the way we design AAC devices.
The accompanying paper is the culmination of an exciting line of work that considers this intersection of photography and AAC. Specifically, the authors describe the Click AAC app, which generates vocabulary for use in an AAC device from photographs the user takes. The app generates a human-like description of each photo and produces description tags that serve as the material for the AAC to use. The app can then pull in graphics from icon sets that are publicly available in many languages to assist comprehension, with text-to-speech support.
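To make the flow from photo description to AAC vocabulary concrete, here is a minimal sketch in Python. It is not the Click AAC implementation: the caption is a hard-coded string standing in for the output of an image description model, and `STOP_WORDS`, `SYMBOLS`, `VocabularyItem`, `tags_from_caption`, and `build_vocabulary` are hypothetical names invented for illustration. Text-to-speech is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stop-word list; a real system would need a fuller one.
STOP_WORDS = {"a", "an", "the", "on", "in", "at", "of", "with", "and", "is", "are"}

# Hypothetical symbol table standing in for a publicly available icon set.
SYMBOLS = {"dog": "dog.png", "ball": "ball.png", "park": "park.png"}


@dataclass
class VocabularyItem:
    word: str
    symbol: Optional[str]  # None when no icon is available for the word


def tags_from_caption(caption: str) -> List[str]:
    """Return unique, lower-cased content words from a caption, in order."""
    seen = set()
    tags = []
    for raw in caption.lower().split():
        word = raw.strip(".,!?;:")
        if word and word not in STOP_WORDS and word not in seen:
            seen.add(word)
            tags.append(word)
    return tags


def build_vocabulary(caption: str) -> List[VocabularyItem]:
    """Pair each tag with an icon from the symbol table, if one exists."""
    return [VocabularyItem(word, SYMBOLS.get(word)) for word in tags_from_caption(caption)]


if __name__ == "__main__":
    # Assumed caption; in practice this would come from a captioning model.
    for item in build_vocabulary("A dog playing with a ball in the park."):
        print(item.word, item.symbol)
```

Words without a matching icon keep a `None` symbol rather than being dropped, so a therapist or user could still select them as text. That choice is an assumption of this sketch, not something the paper is reported to do.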
The critical contribution of this paper is the insights gained from Click AAC’s use in speech therapy sessions. In particular, the authors deploy the app in a range of settings with 20 professionals who support people with diverse communication needs. This study represents extensive data capture, with up to two months of data from continuous use by some participants. This is an impressive deployment and the authors set an example here for other researchers.
The extensive study and detailed qualitative analysis of interviews have allowed the authors to tease out nuanced challenges in using AI to support communication. They highlight the potential of immediate AAC vocabulary generation, which lets speech and language therapists support communication through AAC with previously unexplored immediacy. This allows therapists to explore unconventional topics more easily, topics for which it would otherwise have been complex and arduous to populate an AAC with dialogue. The authors also allude to how this work intersects with ongoing challenges in AI research. Their data uncovers the tensions of using AI to populate vocabulary. For instance, photographs considered ‘low quality’ did not represent talking topics well, resulting in challenges generating helpful vocabulary. This highlights how shortcomings or biases in models might propagate even into the construction of language. Expressing the importance of human-AI cooperation, the authors remind us that AI’s powerful, yet unpredictable, nature impacts seemingly every domain.
When considering the future of AAC, I hope we can be cautiously optimistic. AAC has the potential to transform the quality of life of many, yet it has a historically high abandonment rate. Work that seeks to integrate established practices (for example, photography) into communication contexts has the potential to be meaningful to its users. In future work, we must learn from the experts: the AAC users, the communication partners, and the speech and language experts. We must draw on their insight to establish both the possibilities and the tensions. There will be transformative innovations in AI that have the potential to impact our field, but we should proceed only with close engagement with the lived expertise of key stakeholders.