Assistive technologies, such as screen readers and text-to-speech (TTS) engines, play a vital role in converting visual interfaces into audible or tactile formats for users with visual impairments. All platforms, apps, and websites must support assistive technology so that it can create a narrated representation of the interface and its interactions.
2.2.1-A
All platforms, apps, and websites must include full support for screen readers and TTS engines to create a narrated, accessible representation of the interface. This includes clear labeling of all interactive elements (buttons, links, menus).
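As an illustration of the labeling requirement, the following is a minimal sketch of an audit that flags interactive elements with no accessible label. The `InteractiveElement` shape and the `findUnlabeled` helper are hypothetical names introduced here for illustration; they are not part of this standard or of any platform API.

```typescript
// Hypothetical, simplified model of an interactive UI element (assumption).
type Role = "button" | "link" | "menu";

interface InteractiveElement {
  id: string;
  role: Role;
  label?: string; // accessible name exposed to the screen reader or TTS engine
}

// Returns the ids of elements whose label is missing or contains only whitespace.
function findUnlabeled(elements: InteractiveElement[]): string[] {
  return elements
    .filter((el) => !el.label || el.label.trim() === "")
    .map((el) => el.id);
}
```

A check like this could run in automated tests so that an unlabeled button, link, or menu is caught before release.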
2.2.1-B
Assistive technologies must provide options for adjusting speed, voice, verbosity, and volume. These customization options ensure that users can tailor the auditory interface to meet their individual preferences.
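One way to model these customization options is a small settings object that is validated whenever the user changes a preference. The sketch below is illustrative only: the `SpeechSettings` type, the default values, and the rate range of 0.5 to 3.0 are assumptions, not values defined by this standard.

```typescript
// Hypothetical settings model; field names and ranges are assumptions.
interface SpeechSettings {
  rate: number; // speaking speed multiplier, 1.0 = normal
  volume: number; // 0.0 (silent) to 1.0 (full)
  voice: string; // identifier of the selected voice
  verbosity: "low" | "medium" | "high";
}

const DEFAULT_SETTINGS: SpeechSettings = {
  rate: 1.0,
  volume: 1.0,
  voice: "default",
  verbosity: "medium",
};

const RATE_MIN = 0.5; // assumed lower bound
const RATE_MAX = 3.0; // assumed upper bound

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Merges user changes into the current settings and keeps numeric values in range.
function applyPreferences(
  current: SpeechSettings,
  changes: Partial<SpeechSettings>
): SpeechSettings {
  const merged = { ...current, ...changes };
  return {
    ...merged,
    rate: clamp(merged.rate, RATE_MIN, RATE_MAX),
    volume: clamp(merged.volume, 0, 1),
  };
}
```

Clamping rather than rejecting out-of-range values keeps speech usable even if a stored preference is corrupted or comes from a device with different limits.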
2.2.1-C
If a platform lacks native screen reader or TTS support, it is the responsibility of the app or website to implement these features to comply with these standards. This requirement guarantees that users always have access to assistive tools, even if the platform does not offer them natively.
2.2.1-D
Visual elements that convey meaning, such as images, logos, and icons, must include alternative text descriptions to provide context for screen readers or TTS engines. Images must not be used to present long strings of text, and whatever is spoken via TTS should also be visible or otherwise conveyed on the screen.
2.2.1-E
For video platforms, apps, and websites with focusable images (like tiles or poster art) in swimlanes, search results, asset details pages, etc., images must include additional context for screen readers, such as the title.
2.2.1-F
When using a TTS engine on connected TV video platforms, only navigable or interactable images should receive focus (e.g., tiles in a swimlane). These images must provide context for the text-to-speech engine to speak on focus, including related text that is visible but not focusable, such as title, genre, or rating. When images are not navigable (e.g., background art on an asset details page), the interface must provide additional context to the text-to-speech engine when focus is on a related object or button.
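The focus behavior described in 2.2.1-E and 2.2.1-F can be sketched as a function that assembles the phrase spoken when a tile receives focus. The `Tile` type, both helper functions, and the comma-separated phrase format are assumptions made for illustration; how the resulting string is handed to the TTS engine depends on the platform and is not shown.

```typescript
// Hypothetical metadata for a focusable tile or an asset details page (assumption).
interface Tile {
  title: string;
  genre?: string;
  rating?: string;
}

// Builds the phrase to speak when a tile receives focus, combining the title
// with related visible text that cannot itself receive focus.
function describeTile(tile: Tile): string {
  return [tile.title, tile.genre, tile.rating]
    .filter((part): part is string => typeof part === "string" && part.trim() !== "")
    .join(", ");
}

// For non-navigable art (e.g., background art on an asset details page),
// the context is attached to a related focusable control instead.
function describeButton(buttonLabel: string, asset: Tile): string {
  return `${buttonLabel}, ${describeTile(asset)}`;
}
```

With sample data, `describeTile({ title: "Night Train", genre: "Thriller", rating: "PG-13" })` yields "Night Train, Thriller, PG-13", and `describeButton("Play", ...)` for the same asset yields "Play, Night Train, Thriller, PG-13".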