
Gameface’s TTS module spans three layers: a platform-level synthesizer managed by the integration team, the SpeechAPI JS library, which maintains a four-channel priority queue of speech requests, and the ARIA JS library, which translates DOM events into speech requests automatically. As a frontend developer, you work entirely within the JavaScript layer: loading the libraries, initializing the ARIA manager with the plugins you need, and annotating HTML with standard ARIA attributes.

Players with low vision rely on audio feedback to navigate menus and understand UI state during gameplay. Players who use a gamepad or keyboard exclusively have no pointer to hover over elements, making focus-based reads essential. Chat messages, mission objectives, and critical alerts can all be voiced automatically without per-feature custom logic, as long as the TTS architecture is in place from the start.


The TTS module is built from three distinct components:

  • SpeechAPI C++ library (speech_api_cpp_lib) - implements a cross-platform API for synthesizing text into audio data and playing it to the audio output. This is the engine-side concern; you do not interact with it directly.
  • SpeechAPI JS library (speech_api_js_lib) - manages a queue of speech requests across four priority channels. Runs in the JavaScript context of your UI. This is what you load and call.
  • ARIA JS library - a plugin system built on top of the SpeechAPI JS library that listens to DOM events and schedules speech requests automatically. You configure which plugins to activate and which DOM subtree to observe.

A single TTS instance supports a single voice channel. The four priority channels only order the requests: one request speaks at a time, and a new request either waits in the queue or interrupts the current one, depending on priority.
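The queueing behavior described above can be modeled in a few lines. This is a minimal sketch for building intuition, not the SpeechAPI JS library's actual implementation: the class name SketchSpeechQueue, its methods, the rule that a lower channel index means higher priority, and the choice to drop an interrupted request are all assumptions made for illustration.

```javascript
// Hypothetical model of a four-channel priority queue with a single voice.
// None of these names come from the SpeechAPI JS library.
class SketchSpeechQueue {
  constructor(channelCount = 4) {
    // Assumption: channels[0] is the highest priority.
    this.channels = Array.from({ length: channelCount }, () => []);
    this.current = null; // the request being spoken, or null when idle
    this.interrupted = []; // texts that were cut off, kept for inspection
  }

  // Schedule a request on a channel (0 .. channelCount - 1).
  schedule(text, channel) {
    const request = { text, channel };
    if (this.current === null) {
      // Nothing is speaking, so the request starts immediately.
      this.current = request;
    } else if (channel < this.current.channel) {
      // Strictly higher priority: cut off the current request.
      this.interrupted.push(this.current.text);
      this.current = request;
    } else {
      // Same or lower priority: wait in the channel's queue.
      this.channels[channel].push(request);
    }
  }

  // Called when the current request finishes speaking.
  // Promotes the next waiting request, highest priority channel first.
  finishCurrent() {
    this.current = null;
    for (const queue of this.channels) {
      if (queue.length > 0) {
        this.current = queue.shift();
        break;
      }
    }
    return this.current;
  }
}
```

In this model, a request scheduled on channel 0 while a channel 1 request is speaking takes over the voice at once, while requests on channels 2 and 3 stay queued until finishCurrent promotes them.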


Knowing the end-to-end flow helps you trace issues. A speech request passes through the following steps:

  1. The game process initializes the SpeechSynthesizerManager with the needed listeners and synthesizer. Default implementations are available under DefaultSynthesizers and DefaultAudioListeners in the package.
  2. When the view is ready for bindings, the engine enables the view to use the speech module.
  3. The view loads the ARIA and SpeechAPI JS libraries.
  4. Plugins inside the ARIA library detect DOM events (hover, focus change, live region mutation) and schedule speech requests in the SpeechAPI JS library.
  5. The SpeechAPI JS library uses engine binding calls to notify the controller that a new speech request is ready.
  6. The controller updates state and notifies the ProcessorListener that a task is available.
  7. The client invokes the synthesizer on a worker thread to process the request.
  8. Once processed, the worker thread asks the manager to handle audio output via the Processor.
  9. On platforms where the synthesizer writes directly to audio output, audio plays at this point. On others, the AudioListener receives callbacks.
  10. The AudioListener schedules the resulting PCM or WAVE data for playback through the game engine’s audio system.
  11. The SpeechAPI JS library uses the controller to manage request queues and check the state of the last scheduled request.

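Steps 1 and 2 are the integration team’s responsibility. Your work starts at step 3: steps 3 to 5 and step 11 run in the JavaScript layer you control, while steps 6 to 10 run engine-side and matter to you mainly when tracing issues.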


Script loading order is strict. The ARIA utilities must load before the plugin files, the SpeechAPI library must load before the ARIA plugins, and the manager entry point must load last. The files live in the Gameface package under uiresources/javascript/text-to-speech/. Copy that folder into your project and load the scripts in this sequence:

index.html
<!-- Step 1: ARIA infrastructure - base classes and utilities -->
<script src="../js/aria-js/cohtml-aria-utils.js"></script>
<script src="../js/aria-js/cohtml-aria-common.js"></script>
<script src="../js/aria-js/cohtml-aria-plugin.js"></script>
<!-- Step 2: SpeechAPI JS library - the priority queue manager -->
<script src="../js/speechAPI/cohtml-speech-api.js"></script>
<!-- Step 3: Individual plugins - include only the ones your UI uses -->
<script src="../js/aria-js/plugins/cohtml-aria-live-region.plugin.js"></script>
<script src="../js/aria-js/plugins/cohtml-aria-hover-read.plugin.js"></script>
<script src="../js/aria-js/plugins/cohtml-aria-focus-change.plugin.js"></script>
<!-- Step 4: ARIA observer and manager entry point - must come last, in this order -->
<script src="../js/aria-js/cohtml-aria-observer.js"></script>
<script src="../js/aria-js/cohtml-aria-manager.js"></script>

Including a plugin file makes it available to the manager but does not activate it. Activation happens at manager initialization, covered in the next section.
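Because the loading order is strict, a quick check before initialization can surface a missing or misordered script early. The sketch below is one possible approach, not part of the Gameface libraries: the helper findMissing and the ttsGlobalChecks table are hypothetical, and it assumes the class names used in the initialization example are visible in the global scope once their scripts have loaded.

```javascript
// Hypothetical sanity check; not part of the ARIA or SpeechAPI libraries.
// Each entry uses `typeof`, which is safe on undeclared names and also sees
// top-level `class` declarations that are not properties of `window`.
const ttsGlobalChecks = {
  CohtmlARIAManager: () => typeof CohtmlARIAManager,
  CohtmlARIALiveRegionsPlugin: () => typeof CohtmlARIALiveRegionsPlugin,
  CohtmlARIAFocusChangePlugin: () => typeof CohtmlARIAFocusChangePlugin,
  CohtmlARIAHoverReadPlugin: () => typeof CohtmlARIAHoverReadPlugin,
};

// Returns the names whose check reports "undefined".
function findMissing(checks) {
  return Object.keys(checks).filter((name) => checks[name]() === 'undefined');
}
```

Calling findMissing(ttsGlobalChecks) at the top of your initialization script and logging a non-empty result points directly at the script tag that is absent or out of order.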


After the scripts load, create a CohtmlARIAManager instance with the plugins you need, then call observe on the DOM subtree TTS should cover. The manager registers its event listeners during observe; nothing is spoken before that call.

accessibility.js
// Pass only the plugins your UI needs. Each plugin registers only the
// DOM event types it cares about, so a plugin you leave out adds no listeners.
const ariaManager = new CohtmlARIAManager([
  new CohtmlARIALiveRegionsPlugin(), // monitors aria-live regions
  new CohtmlARIAFocusChangePlugin(), // reads focused elements on change
  new CohtmlARIAHoverReadPlugin(), // reads hovered elements (aria-label first)
]);
// Observe the full document body to cover the entire UI.
// Pass a specific root element if only part of the UI needs TTS.
ariaManager.observe(document.body);

If your UI dynamically rebuilds its root (React unmounting and remounting a tree root, for example), call observe again with the new root after the rebuild. The ARIA library supports React SPAs with no special configuration; it needs only an explicit root to observe.
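One way to handle root rebuilds is a small wrapper that remembers the last observed root and calls observe only when the root actually changes. This is a sketch under assumptions: createRootObserver is a hypothetical helper, it relies only on the observe method shown above, and it does not assume anything about how the manager treats a previously observed root.

```javascript
// Hypothetical helper; relies only on the manager's observe(root) method.
function createRootObserver(manager) {
  let observedRoot = null;

  // Returns true when observe was called, false when the call was skipped.
  return function observeRoot(root) {
    if (root === null || root === undefined || root === observedRoot) {
      return false; // nothing to observe, or this root is already observed
    }
    manager.observe(root);
    observedRoot = root;
    return true;
  };
}
```

In a React application, observeRoot would typically be called from an effect that runs after the tree root mounts, passing the freshly mounted DOM node.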


The ARIA plugins consume standard HTML ARIA attributes. No Gameface-specific attributes are required.

CohtmlARIAHoverReadPlugin and CohtmlARIAFocusChangePlugin both read aria-label first. If it is absent, they fall back to the element’s visible text content. Use aria-label when the element has no text (icon buttons), when the visible text is a bare number without context, or when the spoken phrasing should differ from what is displayed.

hud.html
<!-- Icon button - aria-label is the only readable text -->
<button class="toolbar-btn toolbar-btn--equip" aria-label="Equip selected item">
  <img src="./icons/equip.svg" alt="" />
</button>
<!-- Value display - aria-label adds the unit context missing from the raw number.
     The label here is static, so keep it in sync when player.ammo changes. -->
<div
  class="stat-block stat-block--ammo"
  aria-label="Ammo: 24 rounds remaining"
  data-bind-value="player.ammo"
></div>
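The fallback rule described above (aria-label first, then visible text) can be expressed as a short function, which is handy for checking in a unit test what a given element would be read as. This is an illustrative sketch, not the plugins' actual code: resolveSpokenText is a hypothetical name, and treating an empty or whitespace-only aria-label as absent is an assumption.

```javascript
// Hypothetical illustration of the aria-label -> text content fallback.
// Works with any object exposing getAttribute() and textContent.
function resolveSpokenText(element) {
  const label = element.getAttribute('aria-label');
  // Assumption: an empty or whitespace-only label counts as absent.
  if (label !== null && label.trim() !== '') {
    return label.trim();
  }
  return (element.textContent || '').trim();
}
```

For the icon button in the example, this returns "Equip selected item"; for an element without aria-label, it returns the trimmed visible text.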

CohtmlARIALiveRegionsPlugin monitors elements marked with aria-live and speaks the element’s text content whenever it changes. The attribute value sets urgency:

Value      Behavior
polite     Waits for current speech to finish before reading the update
assertive  Interrupts any in-progress speech immediately

hud.html
<!-- Objective changes queue politely behind whatever is currently speaking -->
<div
  class="objective-tracker"
  aria-live="polite"
  data-bind-value="mission.currentObjective"
></div>
<!-- Critical combat alert - speaks immediately, interrupting anything else -->
<div
  class="critical-alert"
  aria-live="assertive"
  data-bind-value="alerts.criticalMessage"
></div>

Restrict assertive to genuinely urgent content (imminent danger, blocking errors, critical state changes). Overusing it makes the TTS experience feel hostile for players who depend on it. Mission updates, chat messages, and status changes belong on polite.
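If live regions are created from script, centralizing the urgency decision keeps assertive from spreading across the codebase. The sketch below is one possible convention: the message kinds in URGENT_KINDS and the function ariaLiveFor are hypothetical names chosen for illustration, and only the two attribute values come from the ARIA standard.

```javascript
// Hypothetical message kinds that justify interrupting speech.
const URGENT_KINDS = new Set(['imminent-danger', 'blocking-error', 'critical-state']);

// Returns the aria-live value to use for a given message kind.
// Anything not explicitly listed as urgent defaults to "polite".
function ariaLiveFor(kind) {
  return URGENT_KINDS.has(kind) ? 'assertive' : 'polite';
}
```

A region for chat would then be created with setAttribute('aria-live', ariaLiveFor('chat')), which resolves to polite, and promoting a new kind to assertive becomes a deliberate one-line change.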


The Gameface package includes a SampleTextToSpeech sample in the Samples folder. It covers the full setup end-to-end and is the most reliable reference for how the binding layer is wired on the game side. The JS integration examples live under uiresources/TextToSpeech/examples. Note that the sample is packaged only on supported platforms.