Abstract:
A method of morphing speech from an original speaker into the speech of a second, target speaker without decomposing either speech into source and filter, and without the need to determine the formant positions by warping spectral envelopes.
Abstract:
A method of constructing a text message on a mobile communications device, the method involving: storing a plurality of text phrases; for each of the text phrases, storing a representation that is derived from that text phrase; receiving a spoken phrase from a user; from the received spoken phrase generating an acoustic representation thereof; based on the acoustic representation, searching among the stored representations to identify a stored text phrase that best matches the spoken phrase; and inserting into an electronic document the text phrase that is identified from searching.
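The store, match, and insert steps above can be sketched as follows. This is a minimal illustration only: `derive_representation`, `build_index`, `best_match`, and `insert_spoken_phrase` are hypothetical names, and a letters-only string compared with `difflib.SequenceMatcher` stands in for the acoustic representation and matching that the abstract leaves unspecified.

```python
from difflib import SequenceMatcher


def derive_representation(phrase: str) -> str:
    """Hypothetical stand-in for the stored representation: lowercase letters only."""
    return "".join(ch for ch in phrase.lower() if ch.isalpha())


def build_index(phrases):
    """Store each text phrase together with the representation derived from it."""
    return [(derive_representation(p), p) for p in phrases]


def best_match(acoustic_repr: str, index) -> str:
    """Return the stored text phrase whose representation is most similar."""
    def score(entry):
        return SequenceMatcher(None, acoustic_repr, entry[0]).ratio()
    return max(index, key=score)[1]


PHRASES = ["I am running late", "Call me later", "See you tomorrow"]
INDEX = build_index(PHRASES)


def insert_spoken_phrase(document: list, spoken: str) -> list:
    """Append the best-matching stored phrase to the document (a list of lines)."""
    # Assumption: the spoken phrase arrives as noisy text rather than audio.
    document.append(best_match(derive_representation(spoken), INDEX))
    return document
```

For example, `insert_spoken_phrase([], "call me latr")` appends the stored phrase "Call me later", because the match is made against the stored representations rather than by exact text comparison.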
Abstract:
A method including: providing a mobile communication device (e.g., a cellular phone) with a core engine for performing speech recognition; providing a plurality of sets of language-specific modules, each set of the plurality of sets for enabling the core engine to recognize a different language; selecting one set of language-specific modules among the plurality of sets of language-specific modules; and loading into memory within the mobile communication device the selected set of language-specific modules so as to enable the mobile communication device to recognize speech spoken in the language of the selected set.
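The separation between a language-neutral core engine and loadable language-specific modules can be sketched as below. The class names, the `vocabulary` field, and the two sample module sets are all assumptions made for illustration; the abstract does not say what a module set contains.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LanguageModules:
    """Hypothetical set of language-specific modules, reduced to a vocabulary."""
    language: str
    vocabulary: frozenset


@dataclass
class CoreEngine:
    """Language-neutral engine that can recognize only what is currently loaded."""
    loaded: Optional[LanguageModules] = None

    def load(self, modules: LanguageModules) -> None:
        # Loading replaces whatever module set was in memory before.
        self.loaded = modules

    def recognize(self, word: str) -> bool:
        if self.loaded is None:
            raise RuntimeError("no language-specific modules loaded")
        return word.lower() in self.loaded.vocabulary


# Assumed sample module sets, one per language.
AVAILABLE = {
    "en": LanguageModules("en", frozenset({"hello", "call", "yes"})),
    "es": LanguageModules("es", frozenset({"hola", "llamar", "si"})),
}


def select_and_load(engine: CoreEngine, language: str) -> CoreEngine:
    """Select one module set among those available and load it into the engine."""
    engine.load(AVAILABLE[language])
    return engine
```

After `select_and_load(CoreEngine(), "es")`, the engine accepts "hola" and rejects "hello"; the engine code itself does not change between languages.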
Abstract:
A method of transferring phone book information from one cell phone to another cell phone includes compiling the phone book information relating to one or more cell phone users into a data transmission package, and sending the data transmission package from the first cell phone to the second cell phone, via a communication channel native to the first and second cell phones.
Abstract:
Alphabetic filtering of the speech recognition of words uses a key press to indicate a desired character in an alphabetic filter string, where each key press represents two or more letters. The key presses can be disambiguated by recognizing a key-disambiguation utterance in association with a given key press. A user can select a desired recognition candidate from a choice list produced by such filtered word recognition. Ambiguous alphabetic filtering can be performed iteratively in response to the addition of successive ambiguous key presses. A user can select to re-recognize the utterance using filtering based on ambiguous key input after seeing the results of recognition without such filtering. Unambiguous alphabetic filtering can be performed by using multiple presses of an ambiguous key to disambiguate which letter is intended. A user can select between entering text by either large vocabulary speech recognition or by spelling text by pressing phone keys.
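Filtering a recognition choice list with ambiguous key presses can be sketched as follows. The sketch assumes the standard phone keypad letter layout and a candidate list already produced by a recognizer; `filter_candidates` and `multitap_letter` are illustrative names, not terms from the abstract.

```python
# Standard phone keypad layout: each key stands for several letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def matches_keys(word: str, keys: str) -> bool:
    """True if the word's leading letters are consistent with the key presses."""
    if len(word) < len(keys):
        return False
    return all(ch in KEYPAD.get(k, "") for ch, k in zip(word.lower(), keys))


def filter_candidates(candidates, keys: str):
    """Keep only the candidates consistent with the ambiguous filter string."""
    return [w for w in candidates if matches_keys(w, keys)]


def multitap_letter(key: str, presses: int) -> str:
    """Unambiguous entry: repeated presses of one key cycle through its letters."""
    letters = KEYPAD[key]
    return letters[(presses - 1) % len(letters)]


# Assumed choice list from a recognizer for a single utterance.
CANDIDATES = ["cat", "hat", "cap", "dog", "bat"]

# Iterative filtering: each added key press narrows the choice list.
after_one_key = filter_candidates(CANDIDATES, "2")       # ['cat', 'cap', 'bat']
after_three_keys = filter_candidates(CANDIDATES, "228")  # ['cat', 'bat']
```

Because each key covers several letters, the filter narrows the list without always reducing it to one word; the user still chooses among the remaining candidates, or disambiguates with `multitap_letter`, where three presses of "2" yield "c".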
Abstract:
A mobile device, such as a cellular telephone, includes a voice interface that includes one part that may not be specific to a particular carrier, and a second part that provides an interface to services that are specific to a carrier or to service or information providers that are not necessarily available with all carriers. A voice command interface provides easy access to the carrier services. The set of carrier services is optionally extendible by the carrier.
Abstract:
A method and a system for testing a voice enabled application on a target device, the method including conducting one or more interactions with the target device, at least some of the interactions including presenting an acoustic utterance in an acoustic environment to the target device, receiving an output of the target device in response to the acoustic utterance, and comparing the output to an output expected from the acoustic utterance.
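The present, receive, and compare loop described above might look like the sketch below. The target device is modeled as a plain callable, and `Interaction`, `run_test`, and `fake_device` are hypothetical names; a real harness would play audio into the device and capture its actual response.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Interaction:
    utterance: str    # identifier of the acoustic utterance to present
    environment: str  # label of the acoustic environment, e.g. "office"
    expected: str     # output expected from the device for this utterance


def run_test(device: Callable[[str, str], str],
             interactions: List[Interaction]) -> List[Tuple[Interaction, str, bool]]:
    """Present each utterance, collect the device output, compare with the expectation."""
    results = []
    for item in interactions:
        actual = device(item.utterance, item.environment)
        results.append((item, actual, actual == item.expected))
    return results


def fake_device(utterance: str, environment: str) -> str:
    """Hypothetical device under test that fails in a noisy environment."""
    if environment == "street":
        return "no match"
    return f"dialing {utterance}"


PLAN = [
    Interaction("home", "office", "dialing home"),
    Interaction("home", "street", "dialing home"),
]
REPORT = run_test(fake_device, PLAN)
```

Here the same utterance passes in the "office" environment and fails in the "street" environment, which is the kind of environment-dependent difference such a test is meant to expose.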
Abstract:
Statistics are measured from an initial portion of a speech utterance. Feature normalization parameters are estimated based on the measured statistics and a statistically derived mapping relating measured statistics and feature normalization parameters.
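One way to read this is sketched below, under strong assumptions: features are scalars, the measured statistics are the mean and standard deviation of the first few frames, and the "statistically derived mapping" is a linear map whose coefficients (`MEAN_MAP`, `STD_MAP`) are invented here for illustration. The abstract does not specify any of these choices.

```python
from statistics import fmean, pstdev

# Hypothetical (slope, intercept) pairs, imagined as fitted offline on training data.
MEAN_MAP = (0.9, 0.1)
STD_MAP = (1.1, 0.05)


def measure_statistics(frames, n_initial):
    """Measure mean and standard deviation over the initial portion only."""
    head = frames[:n_initial]
    return fmean(head), pstdev(head)


def estimate_parameters(stats):
    """Map the measured statistics to normalization parameters (assumed linear map)."""
    mean, std = stats
    return (MEAN_MAP[0] * mean + MEAN_MAP[1],
            STD_MAP[0] * std + STD_MAP[1])


def normalize(frames, n_initial=3):
    """Normalize every frame using parameters estimated from the first frames."""
    mean, std = estimate_parameters(measure_statistics(frames, n_initial))
    return [(x - mean) / std for x in frames]
```

The point of the design is that normalization can start after only `n_initial` frames, instead of waiting for statistics over the whole utterance.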
Abstract:
A method of operating a mobile communication device that includes a speaker independent recognizer and a memory storing a phonebook including a plurality of names, the method involving: generating a first voice signal from a first voice input received from a user, the first voice input specifying a selected one of the plurality of names; comparing the first voice signal to a plurality of voice tags that are stored in the device to identify the selected name in the phonebook; generating a second voice signal from a second voice input received from the user, the second voice input specifying a selected one of a plurality of phone number types; using the speaker independent recognizer to identify the selected phone number type; retrieving a phone number that is stored in association with the identified type for the identified name; and initiating a call to the phone number associated with the identified type for the identified name.
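The two-stage lookup (voice tag for the name, speaker independent recognizer for the number type) can be sketched as follows. Everything concrete here is assumed: voice signals are reduced to two-dimensional feature tuples, voice tags are matched by Euclidean distance, and the speaker independent recognizer is replaced by a text stub.

```python
import math

# Hypothetical phonebook: name -> {phone number type -> number}.
PHONEBOOK = {
    "Alice": {"mobile": "555-0101", "home": "555-0102"},
    "Bob": {"mobile": "555-0201", "work": "555-0203"},
}

# Hypothetical voice tags: one stored feature vector per name.
VOICE_TAGS = {"Alice": (0.1, 0.9), "Bob": (0.8, 0.2)}

NUMBER_TYPES = {"mobile", "home", "work"}


def match_voice_tag(signal) -> str:
    """Identify the name whose stored voice tag is closest to the first voice signal."""
    return min(VOICE_TAGS, key=lambda name: math.dist(signal, VOICE_TAGS[name]))


def recognize_type(word: str) -> str:
    """Stub for the speaker independent recognizer over a fixed set of type words."""
    normalized = word.strip().lower()
    if normalized not in NUMBER_TYPES:
        raise ValueError(f"unrecognized phone number type: {word!r}")
    return normalized


def voice_dial(first_signal, second_word: str) -> str:
    """Return the number to call for the identified name and number type."""
    name = match_voice_tag(first_signal)
    number_type = recognize_type(second_word)
    number = PHONEBOOK[name].get(number_type)
    if number is None:
        raise LookupError(f"{name} has no {number_type} number")
    return number
```

The split reflects the abstract: names are user-specific and matched against stored voice tags, while the small, fixed set of type words can be handled by a speaker independent recognizer with no per-user training.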
Abstract:
The present invention relates to speech recognition using selectable recognition modes. This includes innovations such as: large vocabulary speech recognition programming that supplies recognized words to an external program as they are recognized, and allows a user to select between large vocabulary recognition of an utterance with and without language context from the prior utterance, independently of the state of the external program; allowing a user to select between continuous and discrete speech recognition that use substantially the same vocabulary; allowing a user to select between continuous and discrete large-vocabulary speech recognition modes; allowing a user to select between at least two different alphabetic entry speech recognition modes; and allowing a user to select from among four or more of the following recognition modes when creating text: a large-vocabulary mode, an alphabetic entry mode, a number entry mode, and a punctuation entry mode.