Building multi-language processes from existing single-language processes
    1.
    Granted patent
    Building multi-language processes from existing single-language processes (in force)

    Publication No.: US09098494B2

    Publication date: 2015-08-04

    Application No.: US13469078

    Filing date: 2012-05-10

    CPC classification: G06F17/289

    Abstract: Processes capable of accepting linguistic input in one or more languages are generated by re-using existing linguistic components associated with a different anchor language, together with machine translation components that translate between the anchor language and the one or more languages. Linguistic input is directed to machine translation components that translate such input from its language into the anchor language. Those existing linguistic components are then utilized to initiate responsive processing and generate output. Optionally, the output is directed through the machine translation components. A language identifier can initially receive linguistic input and identify the language within which such linguistic input is provided to select an appropriate machine translation component. A hybrid process, comprising machine translation components and linguistic components associated with the anchor language, can also serve as an initiating construct from which a single language process is created over time.
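
    The flow described in this abstract (identify the input language, translate into the anchor language, run the existing anchor-language components, optionally translate the output back) can be sketched as below. This is only an illustrative sketch: the anchor language "en", the tiny phrase tables, and the function names identify_language, translate, anchor_process and multi_language_process are all assumptions made for the example and do not come from the patent.

```python
from typing import Dict, Tuple

ANCHOR = "en"  # assumed anchor language for this sketch

# Hypothetical stand-ins for real machine translation components:
# tiny phrase tables keyed by (source, target) language pair.
PHRASES: Dict[Tuple[str, str], Dict[str, str]] = {
    ("es", "en"): {"hola": "hello"},
    ("en", "es"): {"hello there": "hola"},
}


def identify_language(text: str) -> str:
    """Toy language identifier: 'es' if the text is a known Spanish phrase."""
    return "es" if text.lower() in PHRASES[("es", "en")] else ANCHOR


def translate(text: str, source: str, target: str) -> str:
    """Toy machine translation: phrase lookup, identity when languages match."""
    if source == target:
        return text
    return PHRASES[(source, target)].get(text.lower(), text)


def anchor_process(text: str) -> str:
    """Existing single-language (anchor) component: answers a greeting."""
    return "hello there" if text.lower() == "hello" else "sorry?"


def multi_language_process(text: str, translate_output: bool = True) -> str:
    lang = identify_language(text)                 # select the MT component
    anchor_input = translate(text, lang, ANCHOR)   # input -> anchor language
    anchor_output = anchor_process(anchor_input)   # re-use anchor components
    if translate_output:                           # optional reverse translation
        return translate(anchor_output, ANCHOR, lang)
    return anchor_output
```

    Calling multi_language_process("hola") routes the Spanish input through the toy translator, the anchor-language component, and back; passing translate_output=False returns the anchor-language output unchanged.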

    BUILDING MULTI-LANGUAGE PROCESSES FROM EXISTING SINGLE-LANGUAGE PROCESSES
    2.
    Patent application
    BUILDING MULTI-LANGUAGE PROCESSES FROM EXISTING SINGLE-LANGUAGE PROCESSES (in force)

    Publication No.: US20130304451A1

    Publication date: 2013-11-14

    Application No.: US13469078

    Filing date: 2012-05-10

    IPC classification: G06F17/28 G10L15/26

    CPC classification: G06F17/289

    Abstract: Processes capable of accepting linguistic input in one or more languages are generated by re-using existing linguistic components associated with a different anchor language, together with machine translation components that translate between the anchor language and the one or more languages. Linguistic input is directed to machine translation components that translate such input from its language into the anchor language. Those existing linguistic components are then utilized to initiate responsive processing and generate output. Optionally, the output is directed through the machine translation components. A language identifier can initially receive linguistic input and identify the language within which such linguistic input is provided to select an appropriate machine translation component. A hybrid process, comprising machine translation components and linguistic components associated with the anchor language, can also serve as an initiating construct from which a single language process is created over time.

    System and method of providing an automated data-collection in spoken dialog systems
    3.
    Granted patent
    System and method of providing an automated data-collection in spoken dialog systems (in force)

    Publication No.: US08185399B2

    Publication date: 2012-05-22

    Application No.: US11029798

    Filing date: 2005-01-05

    IPC classification: G10L21/00 G10L19/00 G06F17/27

    Abstract: The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.
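
    One possible reading of the two-threshold routing in this abstract is sketched below. The numeric thresholds (0.7 and 0.3), the names Action, route and training_log, and the interpretation that a confidence between the two thresholds triggers a re-prompt are assumptions for illustration, not details stated in the patent.

```python
from enum import Enum
from typing import List, Tuple


class Action(Enum):
    ACCEPT = "accept"
    REPROMPT = "re-prompt"
    TRANSFER = "transfer to human"


# Every received and classified utterance is kept for later training.
training_log: List[Tuple[str, str, float]] = []


def route(utterance: str, call_type: str, confidence: float,
          accept_threshold: float = 0.7,
          reject_threshold: float = 0.3) -> Action:
    """Decide what to do with a classified utterance (assumed thresholds)."""
    training_log.append((utterance, call_type, confidence))
    if confidence >= accept_threshold:
        return Action.ACCEPT      # understood well enough to proceed
    if confidence >= reject_threshold:
        return Action.REPROMPT    # not understood: ask the user again
    return Action.TRANSFER        # likely task-specific: hand off to a human
```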

    Active labeling for spoken language understanding
    4.
    Granted patent
    Active labeling for spoken language understanding (in force)

    Publication No.: US07949525B2

    Publication date: 2011-05-24

    Application No.: US12485103

    Filing date: 2009-06-16

    IPC classification: G10L15/00 G10L15/06 G10L15/20

    CPC classification: G10L15/1822

    Abstract: A spoken language understanding method and system are provided. The method includes classifying a set of labeled candidate utterances based on a previously trained classifier, generating classification types for each candidate utterance, receiving confidence scores for the classification types from the trained classifier, sorting the classified utterances based on an analysis of the confidence score of each candidate utterance compared to a respective label of the candidate utterance, and rechecking candidate utterances according to the analysis. The system includes modules configured to control a processor in the system to perform the steps of the method.
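
    The sorting step in this abstract can be illustrated with a small sketch. The abstract does not say which analysis of confidence versus label is used; the "suspicion" score below (best score of any other class minus the score of the human label) is an assumed stand-in, and the function name recheck_order is hypothetical.

```python
from typing import Dict, List, Tuple

# Each item: (utterance, human label, classifier confidence per class).
Item = Tuple[str, str, Dict[str, float]]


def recheck_order(items: List[Item]) -> List[str]:
    """Return utterances ordered from most to least likely mislabeled."""
    def suspicion(item: Item) -> float:
        _, label, scores = item
        # Highest confidence the classifier gives to any class other
        # than the human-assigned label.
        best_other = max(
            (s for c, s in scores.items() if c != label), default=0.0
        )
        # Large positive value: the classifier strongly disagrees.
        return best_other - scores.get(label, 0.0)

    ranked = sorted(items, key=suspicion, reverse=True)
    return [utterance for utterance, _, _ in ranked]
```

    Utterances at the front of the returned list are the ones a human would recheck first.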

    Apparatus and method for spoken language understanding by using semantic role labeling
    5.
    Granted patent
    Apparatus and method for spoken language understanding by using semantic role labeling (in force)

    Publication No.: US07742911B2

    Publication date: 2010-06-22

    Application No.: US11095299

    Filing date: 2005-03-31

    IPC classification: G06F17/28

    Abstract: An apparatus and a method are provided for using semantic role labeling for spoken language understanding. A received utterance is semantically parsed by semantic role labeling. A predicate or at least one argument is extracted from the semantically parsed utterance. An intent is estimated based on the predicate or the at least one argument. In another aspect, a method is provided for training a spoken language dialog system that uses semantic role labeling. An expert is provided with a group of predicate/argument pairs. Ones of the predicate/argument pairs are selected as intents. Ones of the arguments are selected as named entities. Mappings from the arguments to frame slots are designed.
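
    The intent-estimation step can be sketched as a lookup from predicate/argument pairs to intents. The sketch assumes the semantic role labeling parse has already been produced upstream and arrives as a predicate plus a list of argument head words; the INTENTS table, its entries, and the name estimate_intent are hypothetical examples of what an expert might select.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical expert-selected mapping from predicate/argument pairs to intents.
INTENTS: Dict[Tuple[str, str], str] = {
    ("cancel", "subscription"): "CancelSubscription",
    ("pay", "bill"): "PayBill",
}


def estimate_intent(predicate: str, arguments: List[str]) -> Optional[str]:
    """Return the intent of the first matching predicate/argument pair."""
    for argument in arguments:
        intent = INTENTS.get((predicate, argument))
        if intent is not None:
            return intent
    return None  # no selected pair matches this parse
```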

    Active learning process for spoken dialog systems
    7.
    Granted patent
    Active learning process for spoken dialog systems (in force)

    Publication No.: US07562014B1

    Publication date: 2009-07-14

    Application No.: US11862008

    Filing date: 2007-09-26

    IPC classification: G06F17/27 G10L15/00

    Abstract: A large amount of human labor is required to transcribe and annotate a training corpus that is needed to create and update models for automatic speech recognition (ASR) and spoken language understanding (SLU). Active learning enables a reduction in the amount of transcribed and annotated data required to train ASR and SLU models. In one aspect of the present invention, an active learning ASR process and active learning SLU process are coupled, thereby enabling further efficiencies to be gained relative to a process that maintains an isolation of data in both the ASR and SLU domains.

    Unsupervised and active learning in automatic speech recognition for call classification
    8.
    Granted patent
    Unsupervised and active learning in automatic speech recognition for call classification (in force)

    Publication No.: US08818808B2

    Publication date: 2014-08-26

    Application No.: US11063910

    Filing date: 2005-02-23

    IPC classification: G10L15/06

    Abstract: Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on ones of the utterance data not having a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances not having a corresponding manual transcription are intelligently selected and manually transcribed. Ones of the automatically transcribed data as well as ones having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model is trained for call classification from the mined audio data to produce a language model.
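
    The abstract does not state how the predetermined number of utterances is "intelligently selected". A common active-learning choice, assumed here purely for illustration, is to pick the utterances whose automatic transcriptions have the lowest recognizer confidence. The name select_for_transcription and the confidence dictionary are hypothetical.

```python
import heapq
from typing import Dict, List


def select_for_transcription(asr_confidence: Dict[str, float],
                             k: int) -> List[str]:
    """Pick the k utterance IDs with the lowest ASR confidence.

    asr_confidence maps an utterance ID to the recognizer's confidence
    in its automatic transcription (assumed to lie in [0, 1]).
    """
    return heapq.nsmallest(k, asr_confidence, key=asr_confidence.get)
```

    The selected utterances would go to human transcribers, while the remaining automatic transcriptions stay in the training set as they are.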

    System and method of semi-supervised learning for spoken language understanding using semantic role labeling
    9.
    Granted patent
    System and method of semi-supervised learning for spoken language understanding using semantic role labeling (in force)

    Publication No.: US08321220B1

    Publication date: 2012-11-27

    Application No.: US11290859

    Filing date: 2005-11-30

    IPC classification: G10L15/00

    CPC classification: G10L15/063 G09B19/04

    Abstract: A system and method are disclosed for providing semi-supervised learning for a spoken language understanding module using semantic role labeling. The method embodiment relates to a method of generating a spoken language understanding module. Steps in the method comprise selecting at least one predicate/argument pair as an intent from a set of the most frequent predicate/argument pairs for a domain, labeling training data using mapping rules associated with the selected at least one predicate/argument pair, training a call-type classification model using the labeled training data, re-labeling the training data using the call-type classification model, and iteratively repeating several of the above steps until the training set labels converge.
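
    The label, train, re-label loop in this abstract can be sketched as below. Everything concrete here is an assumption for illustration: the RULES table stands in for mapping rules derived from predicate/argument pairs, and the word-count classifier stands in for the call-type classification model, which the abstract does not specify.

```python
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical mapping rules: words standing in for a predicate/argument pair.
RULES: Dict[str, str] = {"pay bill": "Billing", "cancel service": "Cancel"}


def label_with_rules(utterance: str) -> Optional[str]:
    """Label an utterance if every word of some rule occurs in it."""
    words = utterance.split()
    for pattern, intent in RULES.items():
        if all(w in words for w in pattern.split()):
            return intent
    return None  # no rule applies; left unlabeled


def train(utterances: List[str],
          labels: List[Optional[str]]) -> Dict[str, Counter]:
    """Toy call-type model: per-label word counts over labeled data only."""
    model: Dict[str, Counter] = {}
    for utterance, label in zip(utterances, labels):
        if label is not None:
            model.setdefault(label, Counter()).update(utterance.split())
    return model


def classify(model: Dict[str, Counter], utterance: str) -> str:
    """Pick the label whose word distribution best covers the utterance."""
    words = utterance.split()

    def score(label: str) -> float:
        counts = model[label]
        return sum(counts[w] for w in words) / sum(counts.values())

    return max(sorted(model), key=score)


def self_train(utterances: List[str],
               max_iters: int = 10) -> List[Optional[str]]:
    """Rule-label, train, re-label, and repeat until labels stop changing."""
    labels: List[Optional[str]] = [label_with_rules(u) for u in utterances]
    for _ in range(max_iters):
        model = train(utterances, labels)
        new_labels: List[Optional[str]] = [
            classify(model, u) for u in utterances
        ]
        if new_labels == labels:  # converged
            break
        labels = new_labels
    return labels
```

    In this sketch an utterance that no rule covers starts unlabeled and receives a label from the trained model on the first re-labeling pass.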

    Combining active and semi-supervised learning for spoken language understanding
    10.
    Granted patent
    Combining active and semi-supervised learning for spoken language understanding (in force)

    Publication No.: US08010357B2

    Publication date: 2011-08-30

    Application No.: US11033902

    Filing date: 2005-01-12

    IPC classification: G10L15/06

    Abstract: Active and semi-supervised learning are combined to reduce the amount of manual labeling required when training a spoken language understanding model classifier. The classifier may be trained with human-labeled utterance data. Ones of a group of unselected utterance data may be selected for manual labeling via active learning. The classifier may be changed, via semi-supervised learning, based on the selected ones of the unselected utterance data.
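
    One way to combine the two techniques, shown here only as an assumed sketch, is to split the pool of machine-classified utterances by classifier confidence: low-confidence utterances go to human labelers (active learning), and high-confidence ones keep their machine labels as extra training data (semi-supervised learning). The threshold of 0.6 and the name split_pool are hypothetical; the abstract does not specify the selection criterion.

```python
from typing import List, Tuple

# Each pool entry: (utterance, classifier's predicted label, confidence).
PoolItem = Tuple[str, str, float]


def split_pool(pool: List[PoolItem], threshold: float = 0.6
               ) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Split a classified pool into manual-labeling and auto-labeled sets."""
    # Active learning: uncertain utterances are sent for manual labeling.
    to_label = [u for u, _, conf in pool if conf < threshold]
    # Semi-supervised learning: confident predictions are kept as labels.
    auto_labeled = [(u, label) for u, label, conf in pool if conf >= threshold]
    return to_label, auto_labeled
```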
