Abstract:
A method is described for improving the accuracy of a transcription generated by an automatic speech recognition (ASR) engine. A personal vocabulary is maintained that includes replacement words. The replacement words in the personal vocabulary are obtained from personal data associated with a user. A transcription of an audio recording is received. The transcription is generated by an ASR engine using an ASR vocabulary and includes a transcribed word that represents a spoken word in the audio recording. Data associated with the transcribed word is received. A replacement word from the personal vocabulary is identified and is used to re-score the transcription and replace the transcribed word.
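The replacement step described above can be sketched as follows. This is a minimal illustration, not the claimed method: it assumes the "data associated with the transcribed word" is a per-word confidence value, and it uses plain string similarity from `difflib` as a stand-in for whatever acoustic or phonetic re-scoring an actual system would apply. The function names and thresholds are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (assumed stand-in for re-scoring)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def correct_transcription(words, personal_vocabulary,
                          low_confidence=0.6, min_similarity=0.7):
    """Replace low-confidence transcribed words with close personal-vocabulary words.

    words: list of (transcribed_word, confidence) pairs; the confidence value
           is an assumption about what data accompanies each word.
    personal_vocabulary: iterable of replacement words taken from the user's
           personal data (for example, contact names).
    """
    corrected = []
    for word, confidence in words:
        best_word, best_score = word, 0.0
        # Only words the ASR engine was unsure about are re-scored.
        if confidence < low_confidence:
            for candidate in personal_vocabulary:
                score = similarity(word, candidate)
                if score > best_score:
                    best_word, best_score = candidate, score
        # Replace only when the best candidate is similar enough.
        corrected.append(best_word if best_score >= min_similarity else word)
    return corrected


transcription = [("call", 0.95), ("Jon", 0.4), ("Smyth", 0.3)]
vocabulary = ["John", "Smith"]
result = correct_transcription(transcription, vocabulary)
```

Restricting replacement to low-confidence words keeps the personal vocabulary from overwriting words the engine already recognized reliably.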
Abstract:
A voicemail computer system transcribes a voicemail message into text that is presented to a calling party for approval. The calling party is able to approve, disapprove, or edit the voicemail message prior to delivery to one or more called parties. The voicemail computer system may analyze the voicemail message to detect errors, omissions, or potentially offensive words. The voicemail computer system may also analyze the voicemail message to make suggestions as to tone, content, or information contained within the voicemail message. The calling party can edit or approve the voicemail message before a notification is provided to the one or more called parties that they have received the voicemail message.
Abstract:
Techniques are described for performing actions for users based at least in part on spoken information, such as spoken voice-based information received from the users during telephone calls. The described techniques include categorizing spoken information obtained from a user in one or more ways, and performing actions on behalf of the user related to the categorized information. For example, in some situations, spoken information obtained from a user is analyzed to identify one or more spoken information items (e.g., words, phrases, sentences, etc.) supplied by the user, and to generate corresponding textual representations (e.g., via automated speech-to-text techniques). One or more actions may then be taken regarding the identified information items, including to categorize the items by adding textual representations of the spoken information items to one or more of multiple predefined lists or other collections of information that are specific to or otherwise available to the user.
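The categorization step can be illustrated with a small sketch that takes the textual representation of a spoken item and adds it to one of the user's predefined lists. The spoken command format ("add ... to my ... list"), the function name, and the list names are assumptions made for illustration; the abstract does not specify how an item is matched to a list.

```python
import re

# Hypothetical command shape: "add <item> to [my] <list name> list".
COMMAND = re.compile(
    r"^add (?P<item>.+) to (?:my )?(?P<list>.+?) list$", re.IGNORECASE
)


def categorize(transcript, user_lists):
    """Add the spoken item to the named predefined list.

    transcript: text produced by speech-to-text for one utterance.
    user_lists: dict mapping lowercase list names to lists of items.
    Returns the list name that received the item, or None if the
    utterance did not match or named an unknown list.
    """
    match = COMMAND.match(transcript.strip())
    if not match:
        return None
    list_name = match.group("list").lower()
    if list_name not in user_lists:
        return None
    user_lists[list_name].append(match.group("item"))
    return list_name


lists = {"shopping": [], "to-do": []}
target = categorize("Add milk to my shopping list", lists)
```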
Abstract:
Techniques are described for error correction using a history list that comprises at least one misrecognition and, for each misrecognition, associated correction information indicating how a user corrected it. The techniques include converting data input from a user to generate a text segment and determining whether at least a portion of the text segment appears in the history list as one of the misrecognitions. If it does, the correction information associated with that misrecognition is obtained, and the portion of the text segment is corrected based, at least in part, on the correction information.
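A minimal sketch of the history-list lookup is shown below. It assumes the history list can be modeled as a mapping from each misrecognized phrase to the text the user previously substituted for it, and that matching is done on whole words; both choices, and the function name, are illustrative assumptions.

```python
import re


def correct_with_history(text_segment, history):
    """Apply previously recorded user corrections to a recognized text segment.

    history: dict mapping a misrecognized phrase to the user's correction.
    """
    corrected = text_segment
    # Longer misrecognitions are tried first so they win over shorter overlaps.
    for misrecognition in sorted(history, key=len, reverse=True):
        pattern = r"\b" + re.escape(misrecognition) + r"\b"
        correction = history[misrecognition]
        # A function replacement avoids interpreting backslashes in the correction.
        corrected = re.sub(pattern, lambda _m, c=correction: c, corrected)
    return corrected


history_list = {"wreck a nice beach": "recognize speech"}
fixed = correct_with_history("please wreck a nice beach now", history_list)
```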
Abstract:
Some embodiments relate to a method of performing a search for content on the Internet, in which a user may speak a search query and speech recognition may be performed on the spoken query to generate a text search query to be provided to a plurality of search engines. This enables a user to speak the search query rather than having to type it, and also allows the user to provide the search query only once, rather than having to provide it separately to multiple different search engines.
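The fan-out from one recognized query to several search engines might look like the sketch below. The engine names and URL templates are placeholders on reserved example domains, not real services, and the speech recognition step is assumed to have already produced the text query.

```python
from urllib.parse import quote_plus

# Hypothetical engines; the URL templates are placeholders for illustration.
SEARCH_ENGINES = {
    "engine_a": "https://search-a.example.com/search?q={query}",
    "engine_b": "https://search-b.example.org/find?term={query}",
}


def build_search_urls(text_query, engines=SEARCH_ENGINES):
    """Turn a single recognized text query into one request URL per engine."""
    encoded = quote_plus(text_query)
    return {name: template.format(query=encoded)
            for name, template in engines.items()}


urls = build_search_urls("speech recognition patents")
```

The user speaks once; each engine receives the same text, encoded for its own URL format.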
Abstract:
In one aspect, a method for determining a validity of an identity asserted by a speaker using a voice print is provided. The method comprises acts of performing a first verification stage comprising comparing a first voice signal from the speaker uttering at least one first challenge utterance with at least a portion of the voice print, and performing a second verification stage if it is concluded in the first verification stage that the first voice signal was obtained from an utterance by the user. The second verification stage comprises adapting at least one parameter of the voice print based, at least in part, on the first voice signal to obtain an adapted voice print, and comparing a second voice signal from the speaker uttering at least one second challenge utterance with at least a portion of the adapted voice print.
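The two-stage flow can be sketched as below. This is an illustration under strong simplifying assumptions: the voice print and each voice signal are represented as plain feature vectors, comparison is cosine similarity against a fixed threshold, and adaptation is a linear interpolation toward the first signal. None of these choices comes from the abstract, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def adapt_voice_print(voice_print, signal, weight=0.2):
    """Move the voice print toward the accepted signal (assumed adaptation rule)."""
    return [(1 - weight) * p + weight * s for p, s in zip(voice_print, signal)]


def verify_speaker(voice_print, first_signal, second_signal, threshold=0.9):
    """Return True only if both verification stages accept the speaker."""
    # Stage 1: compare the first challenge utterance with the stored voice print.
    if cosine_similarity(first_signal, voice_print) < threshold:
        return False
    # Stage 2: adapt the voice print using the first signal, then compare the
    # second challenge utterance with the adapted voice print.
    adapted = adapt_voice_print(voice_print, first_signal)
    return cosine_similarity(second_signal, adapted) >= threshold


enrolled = [1.0, 0.0, 0.0]
accepted = verify_speaker(enrolled, [0.9, 0.1, 0.0], [0.95, 0.05, 0.0])
rejected_stage_one = verify_speaker(enrolled, [0.0, 1.0, 0.0], [0.95, 0.05, 0.0])
rejected_stage_two = verify_speaker(enrolled, [0.9, 0.1, 0.0], [0.0, 1.0, 0.0])
```

Adapting only after the first stage succeeds means an unverified signal never alters the stored voice print.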