Method and apparatus for processing spoken search queries
    2.
    Granted patent
    Legal status: In force

    Publication No.: US08666963B2

    Publication date: 2014-03-04

    Application No.: US13527500

    Filing date: 2012-06-19

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30976

    Abstract: Some embodiments relate to a method of performing a search for content on the Internet, in which a user may speak a search query and speech recognition may be performed on the spoken query to generate a text search query to be provided to a plurality of search engines. This enables a user to speak the search query rather than having to type it, and also allows the user to provide the search query only once, rather than having to provide it separately to multiple different search engines.
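
The flow in this abstract (recognize once, then send the same text query to several engines) can be illustrated with a minimal Python sketch. The engine names, URL templates, and the `recognize` stub are hypothetical placeholders, not taken from the patent; a real system would call an actual speech recognizer.

```python
from urllib.parse import quote_plus

# Hypothetical engine URL templates; illustrative only, not from the patent.
ENGINE_TEMPLATES = {
    "engine_a": "https://search-a.example/search?q={query}",
    "engine_b": "https://search-b.example/find?text={query}",
}


def recognize(audio):
    """Stand-in for a speech recognizer; always returns a fixed transcript."""
    return "weather in boston"


def build_requests(text_query, templates):
    """Build one request URL per engine from a single recognized text query."""
    encoded = quote_plus(text_query)
    return {name: tpl.format(query=encoded) for name, tpl in templates.items()}


# The user speaks once; the same text query is fanned out to every engine.
urls = build_requests(recognize(b""), ENGINE_TEMPLATES)
```

Each value in `urls` is a request that could be issued independently, which is what lets the user provide the query only once.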

    SPEAKER VERIFICATION METHODS AND APPARATUS
    3.
    Patent application
    Legal status: In force

    Publication No.: US20120239398A1

    Publication date: 2012-09-20

    Application No.: US13442170

    Filing date: 2012-04-09

    IPC classification: G10L17/00

    CPC classification: G10L17/24 G10L17/04 G10L17/20

    Abstract: In one aspect, a method for determining a validity of an identity asserted by a speaker using a voice print is provided. The method comprises acts of performing a first verification stage comprising comparing a first voice signal from the speaker uttering at least one first challenge utterance with at least a portion of the voice print, and performing a second verification stage if it is concluded in the first verification stage that the first voice signal was obtained from an utterance by the user. The second verification stage comprises adapting at least one parameter of the voice print based, at least in part, on the first voice signal to obtain an adapted voice print, and comparing a second voice signal from the speaker uttering at least one second challenge utterance with at least a portion of the adapted voice print.
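
The two-stage scheme can be sketched in Python as follows. This is only an illustration under strong assumptions: the voice print is modeled as a plain feature vector, comparison is cosine similarity, and adaptation is linear interpolation. The abstract does not specify any of these, and the threshold and weight values are arbitrary.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def adapt(voice_print, features, weight=0.2):
    """Move the voice print toward features accepted in stage one (assumed rule)."""
    return [(1 - weight) * v + weight * f for v, f in zip(voice_print, features)]


def verify(voice_print, first, second, threshold=0.9):
    """Return True only if both challenge utterances match."""
    # Stage 1: compare the first utterance with the stored voice print.
    if cosine(voice_print, first) < threshold:
        return False
    # Stage 2: adapt the print using the accepted first utterance,
    # then compare the second utterance with the adapted print.
    adapted = adapt(voice_print, first)
    return cosine(adapted, second) >= threshold
```

The second stage runs only after the first succeeds, so the voice print is never adapted toward a signal that failed verification.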

    Method and apparatus for processing spoken search queries
    4.
    Granted patent
    Legal status: In force

    Publication No.: US08239366B2

    Publication date: 2012-08-07

    Application No.: US12877549

    Filing date: 2010-09-08

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30976

    Abstract: Some embodiments relate to a method of performing a search for content on the Internet, in which a user may speak a search query and speech recognition may be performed on the spoken query to generate a text search query to be provided to a plurality of search engines. This enables a user to speak the search query rather than having to type it, and also allows the user to provide the search query only once, rather than having to provide it separately to multiple different search engines.

    System and method for modeless large vocabulary speech recognition
    5.
    Granted patent
    Legal status: In force

    Publication No.: US06292779B1

    Publication date: 2001-09-18

    Application No.: US09267925

    Filing date: 1999-03-09

    IPC classification: G10L15/14

    Abstract: A modeless large vocabulary continuous speech recognition system is provided that represents an input utterance as a sequence of input vectors. The system includes a common library of acoustic model states for arrangement in sequences that form acoustic models. Each acoustic model is composed of a sequence of segment models and each segment model is composed of a sequence of model states. An input processor compares each vector in a sequence of input vectors to a set of model states in the common library to produce a match score for each model state in the set, reflecting the likelihood that a state is represented by a vector. The system also includes a plurality of recognition modules and associated recognition grammars. The recognition modules operate in parallel and use the match scores with the acoustic models to determine at least one recognition result in each of the recognition modules. The recognition modules include a dictation module for producing at least one probable dictation recognition result, a select module for recognizing a portion of visually displayed text for processing with a command, and a command module for producing at least one probable command recognition result. An arbitrator uses an arbitration algorithm and a score-ordered queue of recognition results, together with their associated recognition modules, to compare the recognition results of the recognition modules to select at least one system recognition result.
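
The arbitration step can be pictured with a small Python sketch built on a score-ordered queue. The abstract does not disclose the arbitration algorithm itself, so the rule used here (highest score wins, with ties broken by an assumed module priority) is purely hypothetical, as are the example scores.

```python
import heapq

# Assumed tie-break order between modules; not specified in the abstract.
MODULE_PRIORITY = {"command": 0, "select": 1, "dictation": 2}


def arbitrate(results):
    """Pick one system result from (module, text, score) tuples.

    Returns (module, text, score) for the winner, or None if there are no results.
    """
    queue = []
    for module, text, score in results:
        # heapq is a min-heap, so the score is negated to pop the highest first.
        heapq.heappush(queue, (-score, MODULE_PRIORITY[module], module, text))
    if not queue:
        return None
    neg_score, _, module, text = heapq.heappop(queue)
    return module, text, -neg_score
```

Because every module's candidates share one queue, the user never has to switch between dictation and command modes: the arbitrator decides per utterance.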

    Detecting potential significant errors in speech recognition results
    6.
    Granted patent
    Legal status: In force

    Publication No.: US09064492B2

    Publication date: 2015-06-23

    Application No.: US13544215

    Filing date: 2012-07-09

    IPC classification: G10L15/22 G10L15/08 G10L15/24

    Abstract: In some embodiments, recognition results produced by a speech processing system (which may include two or more recognition results, including a top recognition result and one or more alternative recognition results) based on an analysis of a speech input, are evaluated for indications of potential errors. In some embodiments, the indications of potential errors may include discrepancies between recognition results that are meaningful for a domain, such as medically-meaningful discrepancies. The evaluation of the recognition results may be carried out using any suitable criteria, including one or more criteria that differ from criteria used by an ASR system in determining the top recognition result and the alternative recognition results from the speech input. In some embodiments, a recognition result may additionally or alternatively be processed to determine whether the recognition result includes a word or phrase that is unlikely to appear in a domain to which the speech input relates.
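
The last check in this abstract (flagging words unlikely to appear in the domain) might look like the Python sketch below. It assumes the domain is represented by simple word counts and that "unlikely" means a count below a threshold; both are illustrative choices, not the patented criteria.

```python
def unlikely_words(result, domain_counts, min_count=1):
    """Return words in a recognition result that are rare in the domain.

    domain_counts maps lowercase words to how often they occur in
    domain text (an assumed representation of the domain).
    """
    words = [w.strip(".,;:!?") for w in result.lower().split()]
    return [w for w in words if w and domain_counts.get(w, 0) < min_count]


# Toy counts standing in for a medical-domain corpus.
counts = {"patient": 120, "has": 300, "hypertension": 45}
flagged = unlikely_words("Patient has hyper tension.", counts)
```

Here `flagged` holds the two fragments "hyper" and "tension", which would prompt a review of that result.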

    Detecting potential significant errors in speech recognition results
    7.
    Granted patent
    Legal status: In force

    Publication No.: US08924213B2

    Publication date: 2014-12-30

    Application No.: US13544331

    Filing date: 2012-07-09

    IPC classification: G10L15/00 G10L15/04

    CPC classification: G10L15/08 G10L15/1815

    Abstract: In some embodiments, the recognition results produced by a speech processing system (which may include two or more recognition results, including a top recognition result and one or more alternative recognition results) based on an analysis of a speech input, are evaluated for indications of potential significant errors. In some embodiments, the recognition results may be evaluated using one or more sets of words and/or phrases, such as pairs of words/phrases that may include words/phrases that are acoustically similar to one another and/or that, when included in a result, would change a meaning of the result in a manner that would be significant for a domain. The recognition results may be evaluated using the set(s) of words/phrases to determine, when the top result includes a word/phrase from a set of words/phrases, whether any of the alternative recognition results includes any of the other, corresponding words/phrases from the set.
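
A minimal Python sketch of the set-based check: when the top result contains a word from a confusable set, look for the other members of that set in the alternatives. The sets shown are made-up examples, and matching on single whitespace-separated words is a simplification (the abstract also covers phrases).

```python
# Illustrative confusable sets; a real system would curate these per domain.
CONFUSABLE_SETS = [
    {"hypotension", "hypertension"},
    {"fifteen", "fifty"},
]


def flag_significant_errors(top, alternatives, sets=CONFUSABLE_SETS):
    """Return (word_in_top, competing_word) pairs that warrant review."""
    top_words = set(top.lower().split())
    flags = []
    for group in sets:
        for word in sorted(top_words & group):
            others = group - {word}
            for alt in alternatives:
                # A competing member of the same set appears in an alternative.
                for other in sorted(others & set(alt.lower().split())):
                    flags.append((word, other))
    return flags
```

A non-empty return value signals that the recognizer was torn between two words whose difference matters in the domain.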

    Voicemail preview and editing system
    8.
    Granted patent
    Legal status: In force

    Publication No.: US08913722B2

    Publication date: 2014-12-16

    Application No.: US13101909

    Filing date: 2011-05-05

    IPC classification: H04M1/64 H04M3/533

    Abstract: A voicemail computer system transcribes a voicemail message into text that is presented to a calling party for approval. A calling party is able to approve, disapprove or edit a voicemail message prior to delivery to one or more called parties. The voicemail computer system may analyze a voicemail message to detect errors, omissions, or potentially offensive words. The voicemail computer system may analyze a voicemail message to make suggestions as to tone, content or information contained within the voicemail message. The calling party can edit the voicemail message or approve it prior to providing a notification to one or more called parties that they have received the voicemail message.

    SYSTEMS AND METHODS FOR RECEIVING AND PROCESSING AUDIO SIGNALS CAPTURED USING MULTIPLE DEVICES
    9.
    Patent application
    Legal status: Under examination (published)

    Publication No.: US20130022189A1

    Publication date: 2013-01-24

    Application No.: US13187940

    Filing date: 2011-07-21

    IPC classification: H04M3/42

    Abstract: Systems, methods, and apparatus for using different interfaces to receive from different devices representations of at least one audio signal. In some embodiments, each representation may be generated using at least one microphone of the respective device during a meeting attended by a plurality of participants. In some further embodiments, a first representation may be received from a first device via a telephone network, while a second representation may be received from a second device via a data network. In yet some further embodiments, the first and second representations may be processed to obtain a processed representation of the at least one audio signal.

    PERFORMING ACTIONS FOR USERS BASED ON SPOKEN INFORMATION
    10.
    Patent application
    Legal status: In force

    Publication No.: US20110268260A1

    Publication date: 2011-11-03

    Application No.: US13101085

    Filing date: 2011-05-04

    IPC classification: H04M11/00

    Abstract: Techniques are described for performing actions for users based at least in part on spoken information, such as spoken voice-based information received from the users during telephone calls. The described techniques include categorizing spoken information obtained from a user in one or more ways, and performing actions on behalf of the user related to the categorized information. For example, in some situations, spoken information obtained from a user is analyzed to identify one or more spoken information items (e.g., words, phrases, sentences, etc.) supplied by the user, and to generate corresponding textual representations (e.g., via automated speech-to-text techniques). One or more actions may then be taken regarding the identified information items, including to categorize the items by adding textual representations of the spoken information items to one or more of multiple predefined lists or other collections of information that are specific to or otherwise available to the user.
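
The categorization step (adding a transcribed item to one of the user's predefined lists) can be sketched in Python. The convention assumed here, that the first spoken word names the target list and anything else falls back to a default `notes` list, is invented for illustration and is not described in the abstract.

```python
def add_spoken_item(transcript, user_lists, default="notes"):
    """File a transcribed utterance into one of the user's lists.

    Assumed convention: "<list name> <item text>". Returns the name of
    the list that received the item.
    """
    text = transcript.strip()
    first, _, rest = text.partition(" ")
    key = first.lower()
    if key in user_lists and rest:
        user_lists[key].append(rest)
        return key
    # No recognized list name: keep the whole utterance in the default list.
    user_lists.setdefault(default, []).append(text)
    return default


lists = {"shopping": [], "todo": []}
first_target = add_spoken_item("Shopping oat milk", lists)
second_target = add_spoken_item("call the dentist", lists)
```

After these two calls, `lists["shopping"]` contains the item "oat milk" and the unmatched utterance is kept whole under `notes`.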
