Abstract:
The embodiments include a system, a computer readable medium, and a method for establishing a communication connection after searching the World Wide Web for relevant phone number information. The system can include a first communication device for forming at least one communication connection between the first communication device and a second communication device; search means adapted to accept a query; and access means adapted to (i) search for and identify relevant phone number information using the query, (ii) create at least one icon to link the first communication device to a relevant phone number included in the relevant phone number information identified by the query, and (iii) reformulate the query if no relevant phone numbers are identified during the search. The system also includes click-to-dial means adapted to establish at least one communication connection from the first communication device to the second communication device.
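The search, link, and reformulate steps can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the `search` callable standing in for a web search backend, the North American number pattern, the reformulation rule (appending "phone number" to the query), and the use of `tel:` URIs as the target of a clickable icon.

```python
import re
from typing import Callable, List

# Assumed pattern: 10-digit numbers written as 555-123-4567, 555.123.4567 or 555 123 4567.
PHONE_RE = re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b")


def find_phone_links(query: str,
                     search: Callable[[str], str],
                     max_reformulations: int = 1) -> List[str]:
    """Return tel: links for phone numbers found in the search results.

    `search` is a hypothetical callable that maps a query string to result text.
    If no number is found, the query is reformulated and searched again.
    """
    attempt = query
    for _ in range(max_reformulations + 1):
        numbers = PHONE_RE.findall(search(attempt))
        if numbers:
            # Strip separators so each link carries digits only.
            return ["tel:" + re.sub(r"\D", "", n) for n in numbers]
        # Assumed reformulation rule: make the intent explicit.
        attempt = f"{query} phone number"
    return []
```

A user interface could render each returned link as an icon, and selecting it would hand the number to whatever click-to-dial mechanism the device provides.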
Abstract:
Disclosed are systems, methods and computer-readable media for using a local communication network to generate a speech model. The method includes retrieving for an individual a list of numbers in a calling history, identifying a local neighborhood associated with each number in the calling history, truncating the local neighborhood associated with each number based on at least one parameter, retrieving a local communication network associated with each number in the calling history and each phone number in the local neighborhood, and creating a language model for the individual based on the retrieved local communication network. The generated language model may be used for improved automatic speech recognition for audible searches as well as other modules in a spoken dialog system.
Abstract:
A method and apparatus for automatically detecting and extracting information from dynamically generated web pages are disclosed. For example, the present method stores user provided information that is entered into a form interface of a web page for a first query. Responsive to the first query, a first response web page is received and stored. The present method then automatically generates a second query to acquire a second response web page that is responsive to the second query. Finally, the present method compares the first response web page and the second response web page. In one embodiment, the present invention extracts information that is dissimilar between the first response web page and the second response web page. This extracted information is deemed to be the pertinent information requested by the user.
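The comparison step can be sketched as a line-level diff between the two response pages. This is an illustrative approach under stated assumptions, not the method the abstract claims: it assumes the pages share a line-oriented template and uses Python's standard `difflib` to find the lines of the first page that have no counterpart in the second. The function name `extract_dissimilar` is hypothetical.

```python
import difflib
from typing import List


def extract_dissimilar(first_page: str, second_page: str) -> List[str]:
    """Return the lines of first_page that differ from second_page.

    Lines common to both pages are treated as template boilerplate;
    lines that were replaced or deleted are treated as the requested content.
    """
    first_lines = first_page.splitlines()
    second_lines = second_page.splitlines()
    matcher = difflib.SequenceMatcher(a=first_lines, b=second_lines, autojunk=False)

    extracted: List[str] = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            extracted.extend(first_lines[i1:i2])
    return extracted
```

In this sketch the automatically generated second query would be chosen so that its response uses the same page template but carries different (or no) results, which leaves the user's content as the only difference.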
Abstract:
A portable communication device has a touch screen display that receives tactile input and a microphone that receives audio input. The portable communication device initiates a query for media based at least in part on tactile input and audio input. The touch screen display is a multi-touch screen. The portable communication device sends an initiated query and receives a text response indicative of a speech to text conversion of the query. The portable communication device then displays video in response to tactile input and audio input.
Abstract:
A computerized method is disclosed for presenting advertising data extracted from a video data stream, the method including storing a plurality of advertising data items extracted from the video data stream at an end user device; and displaying a plurality of sorted advertising indicator data items at the end user device, wherein each of the advertising indicator data items indicates one of the plurality of stored advertising data items. A system is disclosed for performing the method. A data structure is disclosed providing a functional and structural interrelationship between a processor in the system and data in the data structure.
Abstract:
The embodiments include a system and a method for establishing a communication connection after searching the World Wide Web. The system can include a first communication device for forming at least one communication connection between the first communication device and a second communication device, search means adapted to accept a query, and access means. The access means can be adapted to search for and identify relevant phone number information using the query, create at least one icon to link the first communication device to a relevant phone number included in the relevant phone number information identified by the query, and reformulate the query if no relevant phone numbers are identified during the search. The system also includes click-to-dial means adapted to establish at least one communication connection from the first communication device to the second communication device.
Abstract:
The invention comprises a method and apparatus for predicting word accuracy. Specifically, the method comprises obtaining an utterance in speech data where the utterance comprises an actual word string, processing the utterance to generate an interpretation of the actual word string, processing the utterance to identify at least one utterance frame, and predicting a word accuracy associated with the interpretation according to at least one stationary signal-to-noise ratio and at least one non-stationary signal-to-noise ratio, wherein the at least one stationary signal-to-noise ratio and the at least one non-stationary signal-to-noise ratio are determined according to a frame energy associated with each of the at least one utterance frame.
Abstract:
In one embodiment, a semantic classifier input and a corresponding label attributed to the semantic classifier input may be obtained. A determination may be made whether the corresponding label is correct based on logged interaction data. An entry of an adaptation corpus may be generated based on a result of the determination. Operation of the semantic classifier may be adapted based on the adaptation corpus.
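One way to picture the corpus-building step is the sketch below. The abstract does not say what the logged interaction data contains or how correctness is decided, so the record fields (`user_confirmed`, `final_label`) and the decision rule are assumptions chosen only to make the flow concrete.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LoggedInteraction:
    """Hypothetical log record for one classified input."""
    utterance: str                      # the semantic classifier input
    predicted_label: str                # the label the classifier attributed to it
    user_confirmed: bool                # assumed signal: the user accepted the label
    final_label: Optional[str] = None   # assumed signal: where the interaction ended up


def build_adaptation_corpus(logs: List[LoggedInteraction]) -> List[Tuple[str, str]]:
    """Turn logged interactions into (input, label) entries for adaptation."""
    corpus: List[Tuple[str, str]] = []
    for log in logs:
        if log.user_confirmed:
            # Label judged correct: keep it as a positive example.
            corpus.append((log.utterance, log.predicted_label))
        elif log.final_label is not None:
            # Label judged incorrect and the log reveals a better one.
            corpus.append((log.utterance, log.final_label))
        # Otherwise the log gives no usable evidence, so no entry is generated.
    return corpus
```

The resulting entries could then be used to retrain or reweight the classifier; that adaptation step depends on the classifier and is left out of the sketch.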
Abstract:
Disclosed herein are systems, methods and non-transitory computer-readable media for performing speech recognition across different applications or environments without model customization or prior knowledge of the domain of the received speech. The disclosure includes recognizing received speech with a collection of domain-specific speech recognizers, determining a speech recognition confidence for each of the speech recognition outputs, selecting speech recognition candidates based on a respective speech recognition confidence for each speech recognition output, and combining selected speech recognition candidates to generate text based on the combination.
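The recognize, score, select, and combine sequence can be sketched as follows. The abstract does not specify how candidates are combined, so this example assumes a simple confidence-weighted vote over whole hypotheses. The recognizers are hypothetical callables that return a `(text, confidence)` pair, and the `min_confidence` threshold is an illustrative parameter.

```python
from collections import defaultdict
from typing import Callable, Dict, Tuple

# Hypothetical recognizer interface: audio bytes in, (text, confidence) out.
Recognizer = Callable[[bytes], Tuple[str, float]]


def recognize_across_domains(audio: bytes,
                             recognizers: Dict[str, Recognizer],
                             min_confidence: float = 0.5) -> str:
    """Run every domain-specific recognizer and combine the confident outputs."""
    outputs = [recognize(audio) for recognize in recognizers.values()]

    # Select candidates whose confidence clears the threshold;
    # fall back to the single best output if none does.
    selected = [(text, conf) for text, conf in outputs if conf >= min_confidence]
    if not selected:
        selected = [max(outputs, key=lambda out: out[1])]

    # Assumed combination rule: identical hypotheses pool their confidence.
    scores: Dict[str, float] = defaultdict(float)
    for text, conf in selected:
        scores[text] += conf
    return max(scores, key=scores.get)
```

With this rule, two moderately confident recognizers that agree can outvote a single recognizer with higher confidence, which is one plausible reading of "combining" but not the only one.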
Abstract:
Disclosed are systems, methods, and computer readable media for performing speech recognition. The method embodiment comprises (1) selecting a codebook from a plurality of codebooks with a minimal acoustic distance to a received speech sample, the plurality of codebooks generated by a process of (a) computing a vocal tract length for each of a plurality of speakers, (b) for each of the plurality of speakers, clustering speech vectors, and (c) creating a codebook for each speaker, the codebook containing entries for the respective speaker's vocal tract length, speech vectors, and an optional vector weight for each speech vector, (2) applying the respective vocal tract length associated with the selected codebook to normalize the received speech sample for use in speech recognition, and (3) recognizing the received speech sample based on the respective vocal tract length associated with the selected codebook.
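The codebook selection step can be sketched as a nearest-codebook search. The abstract does not define the acoustic distance, so this example assumes the mean Euclidean distance from each sample frame to its closest codebook vector, and it leaves out the optional vector weights. The `Codebook` class and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Codebook:
    """Hypothetical per-speaker codebook."""
    speaker: str
    vocal_tract_length: float        # value later applied to normalize the sample
    vectors: List[Sequence[float]]   # clustered speech vectors for this speaker


def acoustic_distance(frames: List[Sequence[float]], codebook: Codebook) -> float:
    """Assumed distance: mean distance from each frame to its nearest codebook vector."""
    nearest = [min(math.dist(frame, vec) for vec in codebook.vectors)
               for frame in frames]
    return sum(nearest) / len(nearest)


def select_codebook(frames: List[Sequence[float]],
                    codebooks: List[Codebook]) -> Codebook:
    """Pick the codebook with the minimal acoustic distance to the sample."""
    return min(codebooks, key=lambda cb: acoustic_distance(frames, cb))
```

The `vocal_tract_length` of the selected codebook would then drive the normalization of the received sample before recognition; the normalization itself is outside the scope of this sketch.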