This is an early draft. The task force intends to add more research and improved discussion. We also intend to make significant editorial changes to be in line with our style guide, including for citations.
Please feel free to let us know any research we should be looking at, as well as other comments.
This document is part of a set of papers that describe accessibility issues for users with various cognitive or learning disabilities and mental health issues. See cognitive or learning disabilities issue papers for other modules.
This document is part of a set of related informative publications from the Cognitive and Learning Disabilities Accessibility Task Force (COGA TF), a joint task force of the Accessible Platform Architectures Working Group (APA WG) and the Accessibility Guidelines Working Group (AG WG) of the Web Accessibility Initiative.
Feedback on any aspect of the document is welcome. The Working Groups particularly seek feedback on the following questions:
To comment, file an issue in the W3C coga GitHub repository. You can also send an email to public-coga-comments@w3.org (comment archive). Comments are requested by 16 July 2024. In-progress updates to the document may be viewed in the publicly visible editors' draft.
In this issue paper, we address issues for users with cognitive disabilities using conversational voice systems. Conversational voice systems, including voice menu systems and voice user interfaces, are systems with bi-directional communication in which a user:
Conversational Voice Systems may include:
It is worth noting that many crucial systems integrate voice systems, including emergency notifications, healthcare scheduling, prescription refilling, and more. With this in mind, full accessibility needs to be supported.
An example use case of a voice system used in telephone self-service may be as follows:
An example of a use case for a voice user interface may be as follows:
Voice systems are often implemented with the W3C VoiceXML 2.0 standard and supporting standards from the Voice Browser Working Group.
VoiceXML 2.0 has an appendix on accessibility that briefly discusses use by users with hearing and speech disabilities, as well as WCAG and WAI specifications. Cognitive disability is mentioned once, with regard to allowing users to control the length of time before the system times out. See VoiceXML2.0#accessibility for more.
Do we need to include something about training AI for VUIs or is this covered elsewhere?
Voice technology can create a number of challenges for people with cognitive disabilities, due to its heavy demands on memory and on the ability to understand and produce speech in real time. In particular, voice systems and VUIs can be inaccessible to people with cognitive disabilities that affect:
General: The user needs to recall the information required to interact successfully with the system, such as activating phrases, as well as information presented by the system during the interaction.
Voice Menu Systems: Systems that rely on menus that present several choices at once may pose challenges for users with disabilities related to working memory. Such systems require users to hold multiple pieces of information, such as the number associated with an option, while processing the terms that follow. This is true of systems that require either a voice response or a key press.
Extensive lists that require strong working memory may result in users with cognitive disabilities making the wrong selection. Holding a list of 5-7 items (which is often considered the standard amount to use in working memory) may be difficult for someone with a cognitive or working memory impairment.
Voice User Interfaces: For VUIs, users may be required to remember key phrases (such as activating phrases like “Hey Google”) in order to operate the system successfully. Users who cannot recall such phrases because of long-term memory impairments may not be able to operate the system.
General: When interacting with a system, if a system response is too slow, a user with disabilities related to executive functioning may not know that their input has registered, and may press the key or speak again.
Voice Menu Systems: The user needs to be able to decide when to act on a menu choice. If the user does not know how many options will be presented or if the system presents them too slowly, they may make an incorrect choice based on partial information.
Voice Menu Systems: The user may need to compare similar options, such as "billing", "accounts", and "sales", and decide which service is best suited to solve the issue at hand. Without additional context or prior knowledge, the user is likely to select the wrong menu option.
General: If responses produced by a system are not provided in clear and accessible language, the user may have difficulty interpreting them, meaning that even if the request is appropriately processed by the system, the response may be inaccessible to the user.
General: Systems that time out may not give users with cognitive disabilities affecting processing speed sufficient time to interpret information and formulate a response. Advertisements and additional, unrequested information also increase the amount of processing required.
Voice Menu Systems: The user needs to focus on the different options and select the correct one. Having long or multi-level spoken menus without written counterparts, inserting advertisements, or otherwise including additional, unrequested information may make it harder to retain attention for users with attention-related disabilities.
General: The user needs to interpret the correct terms and match them to their needs within a certain time limit. This involves speech perception and language understanding: sounds of language are heard, interpreted and understood, within a given time.
Users with disabilities related to language and auditory perception may make mistakes in interpretation due to auditory-only input.
General: The user needs to be able to formulate a spoken response to the prompt before the system "times out" and generates another prompt.
Users who use augmentative and alternative communication (AAC) devices or speech-to-speech technologies may require additional time to respond before the system times out.
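One way to give such users more time is a per-user multiplier applied to a base no-input timeout. The following is a minimal sketch under stated assumptions: the class name `TimeoutPolicy`, the 5-second base, the 60-second cap, and the multiplier parameter are all hypothetical and are not taken from VoiceXML or any particular voice platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeoutPolicy:
    """Hypothetical no-input timeout policy; all values are illustrative."""

    base_seconds: float = 5.0   # assumed default wait before re-prompting
    max_seconds: float = 60.0   # assumed cap so a call does not wait forever

    def timeout_for(self, user_multiplier: float = 1.0) -> float:
        """Return the wait time in seconds, scaled by the user's preference."""
        if user_multiplier <= 0:
            raise ValueError("user_multiplier must be positive")
        # Scale the base timeout, but never exceed the configured cap.
        return min(self.base_seconds * user_multiplier, self.max_seconds)


policy = TimeoutPolicy()
default_wait = policy.timeout_for()       # user with no stated preference
extended_wait = policy.timeout_for(3.0)   # user who asked for three times longer
capped_wait = policy.timeout_for(100.0)   # very large request, limited by the cap
print(default_wait, extended_wait, capped_wait)
```

In a real deployment the multiplier would need to come from a stored user preference or an in-call request, and the appropriate base and cap values would have to be determined through testing with users.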
In directed dialog systems the user only needs to be able to speak a word or short phrase. However, increasingly, natural language systems allow the user to describe their issue in their own words. While this feature is an advantage for some users because it does not require them to remember menu options, it can be problematic for users with disabilities that impact their ability to produce spontaneous speech, such as people with aphasia, or autistic people for whom stress may impact spoken communication.
Speech recognition systems may not recognize and be responsive to inputs from users with disabilities that impact the intelligibility of their speech, such as people with Down syndrome.
Users may not be able to interact with a system that requires a verbal input but does not recognize their speech.
General: Mental health factors, such as anxiety levels, may also impact a user’s ability to interact with a conversational voice system. High cognitive load, negative experiences with conversational systems, and interruptions can exacerbate anxiety or frustration and decrease a user’s ability to interact with a system.
Voice Menu Systems: For users who are unable to use the automated system, it must be possible to reach a human through an easy transfer process (e.g., not by being directed to call another phone number).
For telephone self-service systems, there should be a reserved digit for requesting a human operator. The most common digit used for this purpose is "0"; however, if another digit is already in widespread use in a particular country, then that digit should always be available to reach a human agent. In particular, systems should not make it difficult for users to reach an agent by requiring complex digit combinations. This could be enforced by requiring that implementations not allow the reserved digit to mean anything other than going to an operator. Other digits could similarly be used for specific reserved functions, keeping in mind that too many reserved digits will be confusing and difficult to learn. Remembering more than one or two reserved digits may be problematic for some users, but repeated verbal recitals of the reserved digits will also be distracting.
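The enforcement idea above can be sketched as a check run on each menu definition before it is deployed. This is an illustration only: the function `validate_menu`, the action name `transfer_to_operator`, and the representation of a menu as a mapping from digits to action names are assumptions made for the example, and the reserved digit is set to "0" although it may differ by country.

```python
RESERVED_OPERATOR_DIGIT = "0"              # assumed; may differ by country
OPERATOR_ACTION = "transfer_to_operator"   # hypothetical action name


def validate_menu(menu: dict[str, str]) -> dict[str, str]:
    """Return a copy of the menu in which the reserved digit reaches an operator.

    Raises ValueError if the reserved digit is bound to any other action.
    """
    bound_action = menu.get(RESERVED_OPERATOR_DIGIT, OPERATOR_ACTION)
    if bound_action != OPERATOR_ACTION:
        raise ValueError(
            f"digit {RESERVED_OPERATOR_DIGIT!r} is reserved for the operator, "
            f"but is bound to {bound_action!r}"
        )
    checked = dict(menu)
    # Add the operator option even when the menu author left it out.
    checked[RESERVED_OPERATOR_DIGIT] = OPERATOR_ACTION
    return checked


menu = validate_menu({"1": "billing", "2": "sales"})
print(menu[RESERVED_OPERATOR_DIGIT])
```

Applying the same check to every menu level keeps the operator reachable from anywhere in the call flow without the user having to learn different digits for different menus.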
User-specific settings can be used to customize the voice user interface, keeping in mind that the available mechanisms for invoking user-specific settings are minimal in a voice interface (speech or DTMF tones). If it is difficult to set user preferences, they won't be used. Setting preferences by natural language is the most natural ("slow down!") but is not currently very common. Examples of customization include:
General:
Voice Menu Systems:
Voice User Interfaces:
Standard best practices in voice user interface design apply to users with cognitive disabilities and should be followed, such as those provided by the Association for Voice Interaction Design Wiki [AVIxD] or ETSI ETR 096. Some examples of generally accepted best practices in voice user interface design:
See the AVIxD wiki cited above for additional recommendations and detail.
Some specific best practices for users with cognitive disabilities include:
For example, the U.S. Telecommunications Act Section 255 Accessibility Guidelines [Section255], paragraph 1193.41 (Input, control, and mechanical functions), clauses (g), (h), and (i), apply to cognitive disabilities. They require that equipment be operable without time-dependent controls, without the ability to speak, and by persons with limited cognitive skills.
Recent technological developments may be helpful for users with cognitive disabilities.
Note: The above proposed solutions have been tested for users in the general population and have been shown to improve the usability of voice systems. Some of these solutions have been tested with users with cognitive disabilities, primarily in academic research contexts.
Currently, VoiceXML does not directly enforce accessibility for people with cognitive disabilities. However, a considerable body of literature on voice user interface design exists, and much of it is applicable to cognitive accessibility for voice systems. Developers must become aware of these resources and of the need to design systems with these users in mind.
Search terms in World Cat: