Abstract:
A system for providing navigational information to a vehicle driver. An on-board system is disposed on a vehicle and processes and transmits to a data center, via a wireless link, spoken requests from a vehicle driver requesting navigational information. The data center performs automated voice recognition on the received spoken requests to attempt recognition of destination components of the spoken requests, generates a list of possible destination components corresponding to the spoken requests, assigns a confidence score to each of the possible destination components on the list, determines whether the possible destination component with the highest confidence score has a confidence score above a selected threshold, and, if that highest confidence score is above the selected threshold, computer-generates a representation of the possible destination components for transmission to the on-board system via the wireless link for confirmation by the vehicle driver.
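The confidence-thresholding step described above can be sketched as follows. This is a minimal illustration only; the function name, the `(text, score)` candidate representation, and the default threshold are assumptions, not the patent's actual implementation:

```python
def best_destination(candidates, threshold=0.8):
    """Pick the destination candidate with the highest confidence score.

    candidates: list of (text, score) pairs produced by the recognizer.
    Returns the top candidate for driver confirmation if its score
    clears the threshold, otherwise None (signalling fallback handling,
    e.g. re-prompting the driver).
    """
    if not candidates:
        return None
    text, score = max(candidates, key=lambda c: c[1])
    return text if score >= threshold else None
```

A candidate list such as `[("Main St", 0.9), ("Maine St", 0.4)]` would yield `"Main St"`, while a list whose best score falls below the threshold yields `None`.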
Abstract:
An information presentation device includes an audio signal input unit configured to input an audio signal, an image signal input unit configured to input an image signal, an image display unit configured to display an image indicated by the image signal, a sound source localization unit configured to estimate direction information for each sound source based on the audio signal, a sound source separation unit configured to separate the audio signal into sound-source-classified audio signals for each sound source, an operation input unit configured to receive an operation input and to generate coordinate designation information indicating a part of a region of the image, and a sound source selection unit configured to select a sound-source-classified audio signal of a sound source associated with a coordinate which is included in a region indicated by the coordinate designation information, and which corresponds to the direction information.
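The sound-source-selection step above hinges on associating each source's estimated direction with image coordinates. A minimal sketch, assuming a simple linear mapping from azimuth to horizontal pixel position (the function names, the field-of-view parameter, and the one-dimensional region are illustrative assumptions):

```python
def select_sources(sources, region, image_width, fov_deg=90.0):
    """Select sound sources whose estimated direction falls inside the
    image region designated by the operation input.

    sources: list of (source_id, azimuth_deg) pairs from localization.
    region:  (x_min, x_max) pixel span from the coordinate designation.
    Azimuths are mapped linearly onto the camera field of view, with
    0 degrees at the image centre.
    """
    x_min, x_max = region

    def azimuth_to_x(azimuth_deg):
        return (azimuth_deg / fov_deg + 0.5) * image_width

    return [sid for sid, az in sources if x_min <= azimuth_to_x(az) <= x_max]
```

The selected source IDs would then index into the separated, sound-source-classified audio signals.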
Abstract:
Techniques to provide automatic speech recognition at a local device are described. An apparatus may include an audio input to receive audio data indicating a task. The apparatus may further include a local recognizer component to receive the audio data, to pass the audio data to a remote recognizer while receiving the audio data, and to recognize speech from the audio data. The apparatus may further include a federation component operative to receive one or more recognition results from the local recognizer and/or the remote recognizer, and to federate a plurality of recognition results to produce a most likely result. The apparatus may further include an application to perform the task indicated by the most likely result. Other embodiments are described and claimed.
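The federation component's merging of local and remote results can be sketched as below. This assumes scored `(text, score)` hypotheses and an illustrative weighting that favours the remote recognizer; the patent does not specify this scheme:

```python
def federate(local_results, remote_results, remote_weight=1.2):
    """Federate scored hypotheses from the local and remote recognizers
    and return the most likely result.

    Duplicate hypotheses keep their best score; remote scores are
    up-weighted here as one illustrative policy (a larger remote model
    is often more accurate).
    """
    pooled = {}
    for text, score in local_results:
        pooled[text] = max(pooled.get(text, 0.0), score)
    for text, score in remote_results:
        pooled[text] = max(pooled.get(text, 0.0), score * remote_weight)
    return max(pooled, key=pooled.get) if pooled else None
```

With a local hypothesis `("call mom", 0.6)` and a remote hypothesis `("call tom", 0.55)`, the weighted remote score (0.66) wins, so `"call tom"` is passed to the application.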
Abstract:
A voice processing apparatus includes: a voice receptor configured to collect a user voice, to convert the user voice into a first voice signal, and to output the first voice signal; an audio processor configured to process a sound output through a speaker to output an audio signal; a memory unit configured to store the first voice signal output from the voice receptor and the audio signal output from the audio processor; an echo canceler configured to remove an echo from the first voice signal to generate a second voice signal; and a first controller configured to control the echo canceler to generate the second voice signal based on the first voice signal and the audio signal stored in the memory unit.
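The echo canceler's role — subtracting an estimate of the speaker's audio signal from the microphone signal — is commonly realized with an adaptive filter. A minimal normalized-LMS (NLMS) sketch, not the patent's specific canceler; the tap count, step size, and sample-list representation are illustrative assumptions:

```python
def nlms_echo_cancel(mic, ref, taps=4, mu=0.5, eps=1e-8):
    """Adaptive NLMS echo canceller sketch.

    mic: microphone samples (first voice signal, containing echo).
    ref: speaker reference samples (stored audio signal).
    Returns the echo-reduced samples (the "second voice signal").
    """
    w = [0.0] * taps                 # adaptive filter weights
    padded = [0.0] * (taps - 1) + list(ref)
    out = []
    for n, m in enumerate(mic):
        x = padded[n:n + taps][::-1]              # recent ref samples
        est = sum(wi * xi for wi, xi in zip(w, x))  # estimated echo
        e = m - est                                 # echo-cancelled sample
        norm = sum(xi * xi for xi in x) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        out.append(e)
    return out
```

Because the filter adapts online, the residual echo shrinks over time as the weights converge toward the actual echo path.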
Abstract:
A speech processing method executed by a computer, the speech processing method including: extracting, based on speech recognition for input speech data, a plurality of word candidates including a first word candidate and a second word candidate from a memory, the plurality of word candidates being candidates for a word corresponding to the input speech data; determining at least one different part between the first word candidate and the second word candidate based on a comparison between the first word candidate and the second word candidate; and outputting the first word candidate with emphasis on the at least one different part.
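The compare-and-emphasize step can be sketched with a standard sequence comparison; bracketing the differing characters stands in for whatever emphasis (bold, highlight) the output device applies. The function name and bracket notation are assumptions:

```python
import difflib

def emphasize_difference(first, second):
    """Return `first` with the parts that differ from `second` marked.

    Differing character runs are wrapped in brackets as a stand-in for
    visual emphasis, helping a user spot where two candidates diverge.
    """
    sm = difflib.SequenceMatcher(None, first, second)
    out = []
    for op, i1, i2, _, _ in sm.get_opcodes():
        seg = first[i1:i2]
        if seg:
            out.append(seg if op == "equal" else f"[{seg}]")
    return "".join(out)
```

For candidates "recognise" and "recognize", the output emphasizes only the single differing character.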
Abstract:
Systems and methods to provide a set of dictionaries and highlighting lists for speech recognition and highlighting, where the speech recognition focuses only on the limited vocabulary present in a document. The systems and methods allow rapid and accurate matching of the utterance with the available text, and appropriately indicate the location in the text or signal any errors made during reading. Described herein is a system and method to create speech recognition systems focused on reading a fixed text and providing readers feedback on what they read, to improve literacy, aid those with disabilities, and make the reading experience more efficient and fun.
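The core matching loop — tracking a reader's position in a fixed text and flagging misread words — can be sketched as follows. This is a simplified word-level illustration; the function name and error representation are assumptions:

```python
def follow_reading(text_words, heard_words):
    """Track a reader through a fixed text.

    text_words:  the fixed text, as a list of words.
    heard_words: words recognized from the reader's speech, in order.
    Advances a cursor for each correctly read word; records each
    misread word with the position where it occurred.
    Returns (final_position, errors).
    """
    pos, errors = 0, []
    for heard in heard_words:
        if pos < len(text_words) and heard.lower() == text_words[pos].lower():
            pos += 1
        else:
            errors.append((pos, heard))
    return pos, errors
```

Restricting the recognizer's vocabulary to `text_words` is what makes the matching both rapid and accurate, since only in-document words can be hypothesized.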
Abstract:
Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to a list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.
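The parse-and-compare step against the carrier-phrase list can be sketched as a prefix match, with the remainder of the command becoming the action's argument. The dictionary shape and the `None` fallback (triggering a web search) are illustrative assumptions:

```python
def match_carrier_phrase(command, carrier_phrases):
    """Parse a spoken command against known carrier phrases.

    carrier_phrases maps a phrase (e.g. "navigate to") to an action
    name. Returns (action, argument) on a match; returns None when no
    carrier phrase matches, signalling that search results for the raw
    command should be presented instead.
    """
    words = command.lower().split()
    for phrase, action in carrier_phrases.items():
        p = phrase.lower().split()
        if words[:len(p)] == p:
            return action, " ".join(words[len(p):])
    return None
```

A learning step would then update `carrier_phrases` based on which presented action or search result the user actually selects.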
Abstract:
A method is described for user correction of speech recognition results. A speech recognition result for a given unknown speech input is displayed to a user. A user selection is received of a portion of the recognition result needing to be corrected. For each of multiple different recognition data sources, a ranked list of alternate recognition choices is determined which correspond to the selected portion. The alternate recognition choices are concatenated or interleaved together, and duplicate choices are removed, to form a single ranked output list of alternate recognition choices, which is displayed to the user. The method may be adaptive over time to derive preferences that can then be leveraged in the ordering of one choice list or across choice lists.
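The interleave-and-deduplicate step can be sketched as a round-robin merge that preserves each source's ranking; the function name and list-of-strings representation are assumptions:

```python
def merge_choice_lists(*ranked_lists):
    """Interleave ranked alternate-recognition lists from multiple
    sources into a single ranked output list, dropping duplicates.

    Choices are taken round-robin by rank, so each source's top
    choices surface early in the merged list.
    """
    merged, seen = [], set()
    for rank in range(max((len(l) for l in ranked_lists), default=0)):
        for lst in ranked_lists:
            if rank < len(lst) and lst[rank] not in seen:
                seen.add(lst[rank])
                merged.append(lst[rank])
    return merged
```

Learned user preferences could later reorder the merged list, per the adaptive variant the abstract mentions.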
Abstract:
The present invention provides an interface device for processing a user's voice, and a method thereof, which efficiently outputs various information so as to allow the user to contribute to voice recognition or automatic interpretation. For this purpose, the interface device includes an utterance input unit configured to input an utterance of a user, an utterance end recognizing unit configured to recognize the end of the input utterance, and an utterance result output unit configured to output at least one of a voice recognition result, a translation result, and an interpretation result of the ended utterance.
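The utterance-end recognizing unit's job is commonly implemented as energy-based endpointing: once speech has been observed, a sufficiently long run of low-energy frames marks the end of the utterance. A minimal sketch; the threshold, silence-run length, and frame-energy representation are illustrative assumptions:

```python
def detect_utterance_end(frame_energies, threshold=0.01, silence_frames=30):
    """Energy-based endpointing sketch.

    Returns the index of the first frame of the trailing silence run
    once speech has been heard and energy stays below `threshold` for
    `silence_frames` consecutive frames; returns None if the utterance
    has not yet ended.
    """
    spoke, quiet = False, 0
    for i, energy in enumerate(frame_energies):
        if energy >= threshold:
            spoke, quiet = True, 0
        elif spoke:
            quiet += 1
            if quiet >= silence_frames:
                return i - silence_frames + 1
    return None
```

The detected endpoint is what allows recognition, translation, or interpretation results to be output promptly for the completed utterance.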
Abstract:
Disclosed are a display apparatus, a voice acquiring apparatus and a voice recognition method thereof, the display apparatus including: a display unit which displays an image; a communication unit which communicates with a plurality of external apparatuses; and a controller which includes a voice recognition engine to recognize a user's voice, receives a voice signal from a voice acquiring unit, and controls the communication unit to receive candidate instruction words from at least one of the plurality of external apparatuses to recognize the received voice signal.
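Matching the received voice signal against candidate instruction words gathered from the external apparatuses can be sketched with a simple similarity score; the function name, the dictionary shape, and the use of a character-level ratio are assumptions, not the disclosed engine:

```python
import difflib

def recognize_with_candidates(voice_text, candidates_by_apparatus):
    """Match a recognized utterance against candidate instruction words
    collected from external apparatuses.

    candidates_by_apparatus: {apparatus_name: [instruction words]}.
    Returns the (apparatus, word) pair whose word best matches the
    utterance, or None if no candidates were provided.
    """
    best, best_score = None, 0.0
    for apparatus, words in candidates_by_apparatus.items():
        for word in words:
            score = difflib.SequenceMatcher(
                None, voice_text.lower(), word.lower()).ratio()
            if score > best_score:
                best, best_score = (apparatus, word), score
    return best
```

Restricting recognition to each apparatus's own candidate instruction words is what lets the controller resolve which external apparatus the utterance targets.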