Abstract:
A voice to text model used by a voice-enabled electronic device is updated dynamically and in a context-sensitive manner to facilitate recognition of entities that a user may speak in a voice input directed to the device. The dynamic update may be performed, for example, based upon processing of a first portion of a voice input, e.g., upon detection of a particular type of voice action, and may be targeted to facilitate recognition of entities that may occur in a later portion of the same voice input, e.g., entities that are particularly relevant to one or more parameters associated with the detected type of voice action.
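The following is a minimal illustrative sketch of the idea, not the disclosed implementation. It assumes the voice input arrives as a sequence of already-tokenized words, and that a voice action type can be detected from a trigger word in the first portion of the input. The names `ACTION_TRIGGERS`, `CONTEXT_ENTITIES`, `VoiceToTextModel`, and `process_voice_input` are hypothetical and do not appear in the source.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from a trigger word to a voice action type.
ACTION_TRIGGERS = {"play": "play_media", "call": "call_contact"}

# Hypothetical context-sensitive entities relevant to each action type,
# e.g. media titles for a "play" action or contact names for a "call" action.
CONTEXT_ENTITIES = {
    "play_media": {"bohemian rhapsody", "abbey road"},
    "call_contact": {"alice", "bob"},
}


@dataclass
class VoiceToTextModel:
    """Toy stand-in for a voice to text model: a set of recognizable entities."""

    entities: set = field(default_factory=set)

    def update(self, new_entities):
        # Dynamically extend the model with additional entities.
        self.entities |= set(new_entities)

    def recognizes(self, phrase):
        return phrase.lower() in self.entities


def process_voice_input(tokens, model):
    """Scan tokens in order; on the first detected voice action, update the
    model with entities relevant to that action so that later tokens in the
    same input can be recognized. Returns the action type, or None."""
    for token in tokens:
        action = ACTION_TRIGGERS.get(token.lower())
        if action is not None:
            model.update(CONTEXT_ENTITIES[action])
            return action
    return None
```

In this sketch, calling `process_voice_input(["call", "alice"], model)` on an empty model detects the hypothetical `call_contact` action from the first token and adds only the contact names, so `alice` becomes recognizable while media titles remain absent.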
Abstract:
Online processing of a voice input directed to a voice-enabled electronic device is selectively aborted whenever it is determined that the voice input can be successfully processed locally by the device. Doing so may in some instances reduce the latency of responding to the voice input.
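One way to picture the selective abort is the sketch below, which is an illustration under stated assumptions rather than the disclosed method. It assumes the voice input has already been converted to text, simulates the online service with a short delay, and treats a fixed set of commands as locally processable. The names `LOCAL_COMMANDS`, `process_online`, `try_process_locally`, and `handle_voice_input` are hypothetical.

```python
import asyncio

# Hypothetical set of commands the device can handle without a server.
LOCAL_COMMANDS = {"set alarm", "turn on flashlight"}


async def process_online(text):
    # Simulated network round trip to an online service (assumption).
    await asyncio.sleep(0.2)
    return f"online:{text}"


def try_process_locally(text):
    # Returns a result if the device can handle the input itself, else None.
    return f"local:{text}" if text in LOCAL_COMMANDS else None


async def handle_voice_input(text):
    # Start online processing right away so no time is lost if it is needed.
    online_task = asyncio.create_task(process_online(text))

    local_result = try_process_locally(text)
    if local_result is not None:
        # Local processing succeeded: abort the online request.
        online_task.cancel()
        try:
            await online_task
        except asyncio.CancelledError:
            pass
        return local_result

    # Local processing was not possible: fall back to the online result.
    return await online_task
```

Here `asyncio.run(handle_voice_input("set alarm"))` returns the local result without waiting for the simulated network delay, while an input outside `LOCAL_COMMANDS` waits for the online task to complete.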