Abstract:
A system and method for augmenting information in business directory databases and communicating with businesses are disclosed. Using the enriched business directory database and Web mining technology, customized email messages are sent inviting businesses to enter their enriched business information into the directory or even subscribe to other paid services provided by the directory service.
Abstract:
Disclosed are systems, methods, and computer readable media for comparing customer voice prints with a database of known fraudulent voice signatures and continually updating the database to decrease the risk of identity theft. The method embodiment comprises comparing a received voice signal against a database of known fraudulent voice signatures, denying the caller's transaction if the voice signal substantially matches the database of known fraudulent voice signatures, and adding the caller's voice signal to the database of known fraudulent voice signatures if the voice signal does not substantially match a separate speaker verification database and the received additional information is not verified.
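The decision flow in this abstract can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes voice signals are already reduced to fixed-length feature vectors, uses cosine similarity with a hypothetical 0.9 threshold as a stand-in for "substantially matches", and the names `FraudScreen`, `screen`, and the "flagged" outcome are invented for the example (the abstract does not say what happens to the transaction when a voice is added to the fraud database).

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class FraudScreen:
    # Known fraudulent voice signatures (assumed to be feature vectors).
    fraud_db: list = field(default_factory=list)
    # Separate speaker verification database: caller id -> enrolled vector.
    verified_db: dict = field(default_factory=dict)
    # Hypothetical similarity threshold for a "substantial" match.
    threshold: float = 0.9

    def screen(self, caller_id, voice, extra_info_verified):
        # Step 1: deny if the voice matches any known fraudulent signature.
        if any(cosine(voice, f) >= self.threshold for f in self.fraud_db):
            return "deny"
        # Step 2: check the voice against the caller's enrolled voice print.
        enrolled = self.verified_db.get(caller_id)
        matches_enrolled = (
            enrolled is not None and cosine(voice, enrolled) >= self.threshold
        )
        # Step 3: no enrolled match and unverified extra information means
        # the voice is added to the fraud database.
        if not matches_enrolled and not extra_info_verified:
            self.fraud_db.append(list(voice))
            return "flagged"  # assumed outcome label, not from the abstract
        return "allow"
```

Because flagged voices are appended to `fraud_db`, a second call with the same voice is denied at step 1, which mirrors the "continually updating" behavior the abstract describes.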
Abstract:
A method and apparatus for providing a click-to-talk service for advertisements carried over packet networks such as digital cable networks, Voice over Internet Protocol (VoIP) and Service over Internet Protocol (SoIP) networks are disclosed. For example, an enterprise customer subscribes to a service with a service provider that provides a click-to-talk feature with its advertisements on television channels. In one embodiment, the network service provider obtains meta-information from video content and transmits the meta-information and the video content to a set-top box. The network service provider also enables consumers, while viewing the advertisements, to click on their remote control to initiate a call to talk to the advertising enterprise entity. Thus, when the consumer clicks to talk to the enterprise entity, the network service provider enables the consumer to reach the enterprise entity immediately.
Abstract:
A method, a system and a machine-readable medium are provided for an on demand translation service. A translation module including at least one language pair module for translating a source language to a target language may be made available for use by a subscriber. The subscriber may be charged a fee for use of the requested on demand translation service or may be provided use of the on demand translation service for free in exchange for displaying commercial messages to the subscriber. A video signal may be received including information in the source language, which may be obtained as text from the video signal and may be translated from the source language to the target language by use of the translation module. Translated information, based on the translated text, may be added into the received video signal. The video signal including the translated information in the target language may be sent to a display device.
Abstract:
Disclosed herein are systems and methods to incorporate human knowledge when developing and using statistical models for natural language understanding. The disclosed systems and methods embrace a data-driven approach to natural language understanding which progresses seamlessly along the continuum of availability of annotated collected data, from when there is no available annotated collected data to when there is any amount of annotated collected data.
Abstract:
The invention provides for a system, method, and computer readable medium storing instructions related to controlling a presentation in a multimodal system. The method embodiment of the invention is a method for the retrieval of information on the basis of its content for real-time incorporation into an electronic presentation. The method comprises receiving from a presenter a content-based request for at least one segment of a first plurality of segments within a media presentation and while displaying the media presentation to an audience, displaying to the presenter a second plurality of segments in response to the content-based request. The computing device practicing the method receives a selection from the presenter of a segment from the second plurality of segments and displays to the audience the selected segment.
Abstract:
Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
Abstract:
Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating a model for use with automatic speech recognition. These principles can be implemented as part of a streamlined tool for automatic training and tuning of speech, or other, models with a fast turnaround and with limited human involvement. A system configured to practice the method receives, as part of a request to generate a model, input data and a seed model. The system receives a cost function indicating accuracy and at least one of speed and memory usage. The system processes the input data based on the seed model and based on parameters that optimize the cost function to yield an updated model, and outputs the updated model.
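One way to picture a cost function that trades accuracy against speed is the sketch below. It covers only the parameter-selection step and assumes an exhaustive grid search; the abstract does not specify the search strategy or the form of the cost function. The names `cost`, `tune`, `evaluate`, and `speed_weight`, and the linear combination of error rate and decoding time, are all hypothetical.

```python
from itertools import product


def cost(error_rate, seconds, speed_weight=0.1):
    """Assumed cost: recognition error plus a weighted decoding-time penalty."""
    return error_rate + speed_weight * seconds


def tune(evaluate, grid, speed_weight=0.1):
    """Return the parameter setting in `grid` with the lowest cost.

    evaluate: callable taking a parameter dict and returning
              (error_rate, seconds) measured on the input data.
    grid:     dict mapping each parameter name to its candidate values.
    """
    best_params, best_cost = None, float("inf")
    names = sorted(grid)
    # Try every combination of candidate values.
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error_rate, seconds = evaluate(params)
        current = cost(error_rate, seconds, speed_weight)
        if current < best_cost:
            best_params, best_cost = params, current
    return best_params, best_cost


def toy_evaluate(params):
    """Toy stand-in for decoding: a wider beam is more accurate but slower."""
    beam = params["beam"]
    return 1.0 / beam, 0.05 * beam


best, best_cost = tune(toy_evaluate, {"beam": [1, 2, 4, 8, 16]}, speed_weight=1.0)
print(best, best_cost)
```

In a real tool, `evaluate` would decode held-out input data with a model derived from the seed model; a memory-usage term could be added to `cost` in the same way as the time penalty.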
Abstract:
A method includes registering a voice of a party in order to provide voice verification for communications with an entity. A call is received from a party at a voice response system. The party is prompted for information and verbal communication spoken by the party is captured. A voice model associated with the party is created by processing the captured verbal communication spoken by the party and is stored. The identity of the party is verified and a previously stored voice model of the party, registered during a previous call from the party, is updated. The creation of the voice model is imperceptible to the party.
Abstract:
Disclosed herein are systems, computer-implemented methods, and computer-readable media for speech recognition. The method includes receiving speech utterances; assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1; for each received speech utterance, optimizing the pronunciation weight by (1) identifying word and phone alignments and corresponding likelihood scores, and (2) discriminatively adapting the pronunciation weight to minimize classification errors; and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function. The objective function can be maximum mutual information (MMI), maximum likelihood (MLE) training, minimum classification error (MCE) training, or other functions known to those of skill in the art. The speech utterances can be names and can be received as part of a multimodal search or input. The step of discriminatively adapting pronunciation weights can further include stochastically modeling pronunciations.
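The normalization constraint mentioned above (weights summing to 1 at the unit-of-speech level) can be illustrated with a short sketch. It assumes the unit of speech is a word and that each pronunciation variant starts with a non-negative raw score; the function name `normalize_weights`, the dictionary layout, and the uniform fallback for all-zero scores are illustrative choices, and the discriminative adaptation step is not shown.

```python
def normalize_weights(raw):
    """Scale pronunciation scores so each unit's weights sum to 1.

    raw: {unit: {pronunciation: non-negative score}}
    Returns a new dict of the same shape.
    """
    normalized = {}
    for unit, variants in raw.items():
        if not variants:
            normalized[unit] = {}
            continue
        total = sum(variants.values())
        if total <= 0:
            # Assumed fallback: uniform weights when no variant has any score.
            share = 1.0 / len(variants)
            normalized[unit] = {p: share for p in variants}
        else:
            normalized[unit] = {p: s / total for p, s in variants.items()}
    return normalized


raw_scores = {
    "tomato": {"t ah m ey t ow": 3.0, "t ah m aa t ow": 1.0},
    "data": {"d ey t ah": 0.0, "d ae t ah": 0.0},
}
weights = normalize_weights(raw_scores)
print(weights["tomato"])  # {'t ah m ey t ow': 0.75, 't ah m aa t ow': 0.25}
```

Any adaptation step that changes individual weights would need to re-apply this normalization so the per-unit sum stays at 1.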