Abstract:
An offline semantic processor of a resource-constrained voice-enabled device, such as a mobile device, uses an offline grammar model with reduced resource requirements to parse voice-based queries received by the device. The offline grammar model may be generated from a larger, more comprehensive grammar model used by an online voice-based query processor. Its generation may be based upon query usage data collected from one or more users, so that a subset of the more popular voice-based queries from the online grammar model is incorporated into the offline grammar model. In addition, such a device may collect query usage data and upload it to an online service, which can then generate an updated offline grammar model for download back to the device, thereby enabling a dynamic update of the offline grammar model.
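The usage-based selection described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the online grammar is a mapping from rule identifiers to patterns and that query usage data is a list of the rule identifiers that matched past queries; the names `build_offline_grammar`, `online_grammar`, and `usage_log` are hypothetical and do not come from the abstract.

```python
from collections import Counter


def build_offline_grammar(online_grammar, usage_log, max_rules):
    """Keep only the most frequently used rules of the online grammar.

    Assumption: each usage_log entry is the id of the rule a past query matched.
    """
    # Count only rules that actually exist in the online grammar.
    counts = Counter(rule_id for rule_id in usage_log if rule_id in online_grammar)
    popular = [rule_id for rule_id, _ in counts.most_common(max_rules)]
    return {rule_id: online_grammar[rule_id] for rule_id in popular}


# Hypothetical online grammar and usage data.
online_grammar = {
    "set_alarm": "set an alarm for <time>",
    "call_contact": "call <contact>",
    "book_flight": "book a flight to <city>",
    "play_song": "play <song>",
}
usage_log = [
    "call_contact", "set_alarm", "call_contact", "play_song",
    "call_contact", "set_alarm", "unknown_rule",
]

offline_grammar = build_offline_grammar(online_grammar, usage_log, max_rules=2)
```

Here `max_rules` stands in for the device's resource budget; a real system would presumably measure model size rather than rule count.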
Abstract:
Data associated with a selectively offline capable voice action is locally persisted in a voice-enabled electronic device whenever such an action cannot be completed locally because the device is offline, so that the action can be completed later, after online connectivity has been restored. Once connectivity has been restored, synchronization with an online service and/or another electronic device, and/or retrieval of context sensitive data from an online service, may be performed to enable the voice action to be completed.
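One way to picture this persist-then-complete behavior is the sketch below. It assumes a JSON file as the local store and a caller-supplied `complete` callback standing in for the online step; `PendingActionStore`, `handle_voice_action`, and the action fields are hypothetical names chosen for illustration, not details from the abstract.

```python
import json
import os
import tempfile


class PendingActionStore:
    """File-backed list of voice actions waiting for connectivity (illustrative)."""

    def __init__(self, path):
        self.path = path

    def load(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path, "r", encoding="utf-8") as f:
            return json.load(f)

    def _save(self, actions):
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(actions, f)

    def persist(self, action):
        actions = self.load()
        actions.append(action)
        self._save(actions)

    def drain(self, complete):
        """Try to complete every stored action; keep the ones that still fail."""
        remaining = [a for a in self.load() if not complete(a)]
        self._save(remaining)
        return len(remaining)


def handle_voice_action(action, online, store, complete):
    """Complete the action now if online, otherwise persist it for later."""
    if online:
        return complete(action)
    store.persist(action)
    return False


# Hypothetical usage: the device is offline, then connectivity returns.
store = PendingActionStore(os.path.join(tempfile.mkdtemp(), "pending.json"))
completed = []


def complete(action):
    completed.append(action["type"])  # stand-in for synchronization with a service
    return True


handled = handle_voice_action(
    {"type": "create_reminder", "text": "buy milk"},
    online=False, store=store, complete=complete,
)
pending_before = len(store.load())
pending_after = store.drain(complete)  # called once connectivity is restored
```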
Abstract:
Online processing of a voice input directed to a voice-enabled electronic device is selectively aborted whenever it is determined that a voice input directed to the voice-enabled electronic device can be successfully processed locally by the device. Doing so may in some instances reduce the latency of responding to a voice input.
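The selective abort can be sketched as follows, assuming the online request runs on a worker thread and polls a shared abort flag. The functions `process_voice_input`, `local_parse`, and `online_parse` are hypothetical stand-ins, and the simulated 0.2 second wait represents a network round trip rather than any real service.

```python
import threading


def process_voice_input(text, local_parse, online_parse):
    """Start online processing, but abort it if local processing succeeds."""
    abort = threading.Event()
    online_result = {}

    def run_online():
        result = online_parse(text, abort)
        if result is not None:
            online_result["value"] = result

    worker = threading.Thread(target=run_online)
    worker.start()

    local = local_parse(text)
    if local is not None:
        abort.set()  # local handling succeeded, so the online request is dropped
        worker.join()
        return ("local", local)

    worker.join()
    return ("online", online_result.get("value"))


def local_parse(text):
    # Hypothetical local grammar that only understands alarm requests.
    return {"action": "set_alarm"} if text.startswith("set an alarm") else None


def online_parse(text, abort):
    # Simulated network round trip; returns early with None if aborted.
    if abort.wait(timeout=0.2):
        return None
    return {"action": "web_search", "query": text}


local_case = process_voice_input("set an alarm for 7", local_parse, online_parse)
online_case = process_voice_input("who won the game", local_parse, online_parse)
```

In the first call the response does not wait for the simulated round trip, which is the latency benefit the abstract refers to.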
Abstract:
A voice to text model used by a voice-enabled electronic device is updated dynamically and in a context-sensitive manner to facilitate recognition of entities that a user may speak in a voice input directed to the device. The dynamic update to the voice to text model may be performed, for example, based upon processing of a first portion of a voice input, e.g., based upon detection of a particular type of voice action, and may be targeted to facilitate the recognition of entities that may occur in a later portion of the same voice input, e.g., entities that are particularly relevant to one or more parameters associated with the detected type of voice action.
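The idea of updating the model partway through an utterance can be shown with a toy sketch. It assumes the recognizer produces ranked candidate words per token and prefers candidates found in its current vocabulary; `BASE_VOCAB`, `ENTITIES_BY_ACTION`, and `recognize` are hypothetical, and a real voice to text model would be far more involved than a vocabulary set.

```python
# Hypothetical base vocabulary and per-action entity lists.
BASE_VOCAB = {"call", "play", "the", "a"}
ENTITIES_BY_ACTION = {
    "call": {"siobhan", "xavier"},      # e.g. contact names
    "play": {"bohemian rhapsody"},      # e.g. song titles
}


def recognize(candidates_per_token):
    """Pick one word per token, extending the vocabulary once an action is heard."""
    vocab = set(BASE_VOCAB)
    transcript = []
    for candidates in candidates_per_token:
        # Prefer the first candidate the current vocabulary knows,
        # otherwise fall back to the top-ranked candidate.
        word = next((c for c in candidates if c in vocab), candidates[0])
        transcript.append(word)
        if word in ENTITIES_BY_ACTION:
            # Dynamic update: add entities relevant to the detected action
            # before the rest of the same input is processed.
            vocab |= ENTITIES_BY_ACTION[word]
    return transcript


with_action = recognize([["call"], ["shove on", "siobhan"]])
without_action = recognize([["the"], ["shove on", "siobhan"]])
```

With the action word "call" detected first, the later token resolves to the contact name; without it, the top-ranked but incorrect candidate is kept.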