Abstract:
A computer-implemented method for monitoring changes to the characteristics of items on a user's interest list and notifying the user of those changes. The method includes recording, within a memory of a user device, a specification of the user's interest list items and their characteristics, together with a specification of the user's criteria for notification of changes to those characteristics. The method further includes receiving at the user device update information including updated characteristics of the interest list items, comparing the update information with the recorded specifications, and determining that the user's notification criteria are satisfied, in response to which a notification of the changes is generated for the user. Finally, the recorded characteristics are replaced with the updated characteristics of the interest list items.
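As an illustrative sketch (not part of the abstract above), the record/compare/notify/replace cycle might look like the following Python; `WatchedItem`, `process_update`, and the per-characteristic predicate criteria are hypothetical names chosen for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class WatchedItem:
    """An interest-list item with recorded characteristics (e.g. price)."""
    item_id: str
    characteristics: dict
    # Notification criteria: characteristic name -> predicate(old, new) -> bool
    criteria: dict = field(default_factory=dict)

def process_update(items: dict, update: dict) -> list:
    """Compare update info against the recorded specifications, collect
    notifications for satisfied criteria, then replace the recorded
    characteristics with the updated ones."""
    notifications = []
    for item_id, new_chars in update.items():
        item = items.get(item_id)
        if item is None:
            continue
        for name, new_value in new_chars.items():
            old_value = item.characteristics.get(name)
            predicate = item.criteria.get(name)
            if predicate and predicate(old_value, new_value):
                notifications.append((item_id, name, old_value, new_value))
        item.characteristics.update(new_chars)  # replace recorded values
    return notifications
```

A user watching for a price drop, for instance, would register a criterion such as `lambda old, new: new < old` on the `price` characteristic.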
Abstract:
Various embodiments contemplate systems and methods for performing automatic speech recognition (ASR) and natural language understanding (NLU) that enable high accuracy recognition and understanding of freely spoken utterances which may contain proper names and similar entities. The proper name entities may contain, or consist entirely of, words that are not present in the vocabularies of these systems as normally constituted. Recognition of the other words in the utterances in question, e.g., words that are not part of the proper name entities, may occur at regular, high recognition accuracy. Various embodiments provide as output not only accurately transcribed running text of the complete utterance, but also a symbolic representation of the meaning of the input, including appropriate symbolic representations of proper name entities, adequate to allow a computer system to respond appropriately to the spoken request without further analysis of the user's input.
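As a simplified illustration of one small piece of such a system, the step of mapping tokens flagged as out-of-vocabulary onto a user-specific list of proper names might be sketched as fuzzy string matching. This stand-in (using Python's `difflib`) is far cruder than acoustic-level matching, and all names below are hypothetical:

```python
import difflib

def resolve_proper_names(tokens, name_list, cutoff=0.6):
    """tokens is a list of (token, is_oov) pairs from a recognizer.
    OOV tokens are matched case-insensitively against a user-specific
    name list; in-vocabulary tokens pass through unchanged. Returns the
    running text plus symbolic entity annotations."""
    lowered = {name.lower(): name for name in name_list}
    words, entities = [], []
    for tok, is_oov in tokens:
        if is_oov:
            match = difflib.get_close_matches(tok.lower(), list(lowered),
                                              n=1, cutoff=cutoff)
            if match:
                name = lowered[match[0]]
                words.append(name)
                entities.append({"type": "proper_name", "value": name})
                continue
        words.append(tok)
    return " ".join(words), entities
```

A real system would score candidates acoustically rather than orthographically, but the two-part output, running text plus a symbolic entity representation, mirrors the structure the abstract describes.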
Abstract:
Various embodiments contemplate systems, architectures and methods for extracting and selecting headshots of human or non-human entities from catalogs of images of such subjects. The methods described may find and extract faces from within groups of subjects; verify that the extracted faces correspond to the desired subject; determine cropping or masking regions, or both, of rectangular, circular, elliptical or some other geometry to provide an easily recognized image of the desired subject; expand the output image by synthesizing pixels as may be needed for a desired cropping or masking region; select preferred images from among a collection of images of the desired subject; and perform other useful functions. The resulting output images may be in direct form, reference form, or both.
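The cropping-region step might be sketched as follows, assuming a face detector has already produced a bounding box. The function name and `margin` parameter are hypothetical; the returned `pad` counts indicate where pixels would need to be synthesized (e.g. by mirroring) because the desired crop extends past the image edge:

```python
def square_crop_region(face_box, image_size, margin=0.3):
    """Given a detected face box (x, y, w, h) and the image size (w, h),
    return a square crop (x0, y0, side) centered on the face with the
    given relative margin, plus the per-edge pixel counts that lie
    outside the image and would have to be synthesized."""
    x, y, w, h = face_box
    img_w, img_h = image_size
    side = round(max(w, h) * (1 + 2 * margin))
    cx, cy = x + w / 2, y + h / 2
    x0, y0 = round(cx - side / 2), round(cy - side / 2)
    pad = {
        "left": max(0, -x0),
        "top": max(0, -y0),
        "right": max(0, x0 + side - img_w),
        "bottom": max(0, y0 + side - img_h),
    }
    return (x0, y0, side), pad
```

Circular or elliptical masks would be applied within this square region; the same padding bookkeeping tells the pixel-synthesis stage how much image to invent.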
Abstract:
A system (100) for enabling a user to select media content in an entertainment environment, comprising a remote control device (110) having a set of user-activated keys and a speech activation circuit adapted to enable a speech signal; a speech engine (160) comprising a speech recognizer (170); an application wrapper (180) configured to recognize substantive meaning in the speech signal; and a media content controller (190) configured to select media content. Every function that can be executed by activation of the user-activated keys can also be executed by the speech engine (160) in response to the recognized substantive meaning.
Abstract:
Efficient empirical determination, computation, and use of an acoustic confusability measure comprises: (1) an empirically derived acoustic confusability measure, comprising means for determining the acoustic confusability between any two textual phrases in a given language, where the measure of acoustic confusability is empirically derived from examples of the application of a specific speech recognition technology, where the procedure does not require access to the internal computational models of the speech recognition technology and does not depend upon any particular internal structure or modeling technique, and where the procedure is based upon iterative improvement from an initial estimate; (2) techniques for efficient computation of an empirically derived acoustic confusability measure, comprising means for efficient application of an acoustic confusability score, allowing practical application to very large-scale problems; and (3) a method for using acoustic confusability measures to make principled choices about which specific phrases to make recognizable by a speech recognition application.
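One family of such measures can be sketched as a weighted edit distance over phoneme sequences, normalized by length, assuming per-phoneme substitution costs have already been estimated empirically from observed recognition errors. The phoneme symbols and cost function below are hypothetical:

```python
def confusability(a, b, sub_cost):
    """Edit-distance-based acoustic confusability of two phoneme
    sequences a and b. sub_cost(p, q) is an empirically estimated cost
    of confusing phoneme p with q (0 for identical phonemes);
    insertions and deletions cost 1. Lower scores = more confusable."""
    m, n = len(a), len(b)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = float(i)
    for j in range(1, n + 1):
        d[0][j] = float(j)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + 1,                              # deletion
                d[i][j - 1] + 1,                              # insertion
                d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitution
            )
    return d[m][n] / max(m, n)
```

An application choosing which phrases to make recognizable could reject candidate phrases whose confusability with any already-accepted phrase falls below a threshold.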
Abstract:
A system and method are provided for improving efficiency of operation and convenience of access to a fleet of taxis, or other service vehicles, requiring rapid, on-demand dispatch to customer-determined locations. Automatic speech recognition (ASR) and/or radiolocation technology are used to automate the entry of the customer pickup location, and optionally the dropoff location and other relevant information as well. A customer speaks the pickup location into a cellular telephone which then digitizes and transmits it as a data communication to an ASR system. The ASR system decodes the digitized utterance into a pickup location which is passed to a vehicle matching and dispatch system. The vehicle matching and dispatch system matches a taxi and dispatches it to the pickup location. In one embodiment, the identified pickup location is transmitted to the customer's cellular telephone for confirmation or correction, before dispatch of the requested taxi.
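The vehicle-matching step might be sketched as a nearest-available search over the fleet using great-circle distance; the fleet data layout below is a hypothetical simplification:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))

def match_taxi(pickup, fleet):
    """Dispatch step: pick the available taxi nearest to the decoded
    pickup location. fleet maps taxi_id -> (lat, lon, available).
    Returns the chosen taxi_id, or None if no taxi is available."""
    candidates = [(haversine_km(pickup, (lat, lon)), tid)
                  for tid, (lat, lon, available) in fleet.items()
                  if available]
    if not candidates:
        return None
    return min(candidates)[1]
```

A production dispatcher would rank by estimated travel time over the road network rather than straight-line distance, but the structure, decode pickup location, filter available vehicles, minimize a cost, is the same.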
Abstract:
A global speech user interface (GSUI) comprises an input system to receive a user's spoken command, a feedback system along with a set of feedback overlays to give the user information on the progress of his spoken requests, a set of visual cues on the television screen to help the user understand what he can say, a help system, and a model for navigation among applications. The interface is extensible to make it easy to add new applications.