METHODS AND SYSTEMS FOR LANGUAGE-AGNOSTIC MACHINE LEARNING IN NATURAL LANGUAGE PROCESSING USING FEATURE EXTRACTION
    1.
    Invention application
    Status: Pending (published)

    Publication No.: US20160162467A1

    Publication date: 2016-06-09

    Application No.: US14964525

    Filing date: 2015-12-09

    IPC classification: G06F17/27

    Abstract: Methods, apparatuses, and systems are presented for generating natural language models using a novel system architecture for feature extraction. A method for extracting features for natural language processing comprises: accessing one or more tokens generated from a document to be processed; receiving one or more feature types defined by a user; receiving selection of one or more feature types from a plurality of system-defined and user-defined feature types, wherein each feature type comprises one or more rules for generating features; receiving one or more parameters for the selected feature types, wherein the one or more rules for generating features are defined at least in part by the parameters; generating features associated with the document to be processed based on the selected feature types and the received parameters; and outputting the generated features in a format common among all feature types.
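
    The extraction steps in this abstract can be illustrated with a minimal sketch. It is not the applicant's implementation: the names FEATURE_TYPES, register_feature_type, extract_features, and the three example rules are hypothetical, and the "common format" is assumed here to be a flat mapping from a namespaced feature name to a numeric value.

```python
from typing import Callable, Dict, List, Tuple

# Assumed common output format: feature name -> numeric value.
Features = Dict[str, float]
# A rule turns tokens plus user-supplied parameters into features.
Rule = Callable[[List[str], dict], Features]


def ngram_rule(tokens: List[str], params: dict) -> Features:
    """System-defined type: count n-grams; `n` comes from the parameters."""
    n = params.get("n", 1)
    counts: Features = {}
    for i in range(len(tokens) - n + 1):
        key = " ".join(tokens[i:i + n])
        counts[key] = counts.get(key, 0.0) + 1.0
    return counts


def length_rule(tokens: List[str], params: dict) -> Features:
    """System-defined type: number of tokens in the document."""
    return {"num_tokens": float(len(tokens))}


# Registry holding system-defined types; user-defined types are added later.
FEATURE_TYPES: Dict[str, Rule] = {"ngram": ngram_rule, "length": length_rule}


def register_feature_type(name: str, rule: Rule) -> None:
    """Receive a feature type defined by the user."""
    FEATURE_TYPES[name] = rule


def extract_features(tokens: List[str],
                     selection: List[Tuple[str, dict]]) -> Features:
    """Apply each selected type with its parameters; merge into one format."""
    features: Features = {}
    for name, params in selection:
        for key, value in FEATURE_TYPES[name](tokens, params).items():
            features[f"{name}:{key}"] = value
    return features


def suffix_rule(tokens: List[str], params: dict) -> Features:
    """Hypothetical user-defined type: count tokens ending in a suffix."""
    return {"count": float(sum(t.endswith(params["suffix"]) for t in tokens))}


register_feature_type("suffix", suffix_rule)
features = extract_features(
    ["the", "cat", "sat"],
    [("ngram", {"n": 2}), ("suffix", {"suffix": "at"})],
)
```

    Prefixing every key with its feature type name is one simple way to keep system-defined and user-defined outputs in the same structure without name clashes.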

    INTELLIGENT SYSTEM THAT DYNAMICALLY IMPROVES ITS KNOWLEDGE AND CODE-BASE FOR NATURAL LANGUAGE UNDERSTANDING
    7.
    Invention application
    Status: In force

    Publication No.: US20160162466A1

    Publication date: 2016-06-09

    Application No.: US14964512

    Filing date: 2015-12-09

    IPC classification: G06F17/27

    Abstract: Systems, methods, and apparatuses are presented for a novel natural language tokenizer and tagger. In some embodiments, a method for tokenizing text for natural language processing comprises: generating from a pool of documents, a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receiving a set of rules comprising rules that identify character/letter sequences as valid tokens; transforming one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receiving a document to be processed; dividing the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and outputting the divided tokens for natural language processing.
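
    A toy sketch of the rule-plus-statistics tokenization described in this abstract follows. It rests on several assumptions that the abstract does not state: the statistical model is a relative-frequency table of whitespace-delimited sequences, rules are a plain set of valid token strings, "high likelihood" is a fixed threshold, and ambiguity is resolved by the highest summed log-probability. All function names are hypothetical.

```python
from collections import Counter
from math import log
from typing import Dict, List, Set


def build_model(pool: List[str]) -> Dict[str, float]:
    """Relative frequency of each character sequence in the document pool."""
    counts = Counter(seq for doc in pool for seq in doc.split())
    total = sum(counts.values())
    return {seq: c / total for seq, c in counts.items()}


def promote(model: Dict[str, float], rules: Set[str],
            threshold: float) -> Set[str]:
    """Turn high-likelihood model entries into new rules."""
    return rules | {seq for seq, p in model.items() if p >= threshold}


def segmentations(text: str, rules: Set[str]) -> List[List[str]]:
    """Every way to split `text` into tokens the rules accept.

    Exhaustive recursion: fine for a short example, exponential in general.
    """
    if not text:
        return [[]]
    found = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in rules:
            found += [[head] + rest
                      for rest in segmentations(text[end:], rules)]
    return found


def tokenize(text: str, rules: Set[str], model: Dict[str, float],
             floor: float = 1e-6) -> List[str]:
    """Use the rules alone when they are unambiguous, else the model."""
    candidates = segmentations(text, rules)
    if not candidates:
        return list(text)  # assumed fallback: single characters
    if len(candidates) == 1:
        return candidates[0]
    return max(candidates,
               key=lambda c: sum(log(model.get(t, floor)) for t in c))


pool = ["now here now here", "no where"]
model = build_model(pool)
rules = promote(model, {"no", "where"}, threshold=0.3)
tokens = tokenize("nowhere", rules, model)  # the rules allow two splits
```

    In this example the rules accept both "no" + "where" and "now" + "here", so the frequency table decides; "now" and "here" are more frequent in the pool, and that split is returned.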

    BELIEF TRACKING AND ACTION SELECTION IN SPOKEN DIALOG SYSTEMS
    9.
    Invention application
    Status: In force

    Publication No.: US20120053945A1

    Publication date: 2012-03-01

    Application No.: US13221155

    Filing date: 2011-08-30

    IPC classification: G10L15/18

    CPC classification: G10L15/22

    Abstract: An action is performed in a spoken dialog system in response to a user's spoken utterance. A policy which maps belief states of user intent to actions is retrieved or created. A belief state is determined based on the spoken utterance, and an action is selected based on the determined belief state and the policy. The action is performed, and in one embodiment, involves requesting clarification of the spoken utterance from the user. Creating a policy may involve simulating user inputs and spoken dialog system interactions, and modifying policy parameters iteratively until a policy threshold is satisfied. In one embodiment, a belief state is determined by converting the spoken utterance into text, assigning the text to one or more dialog slots associated with nodes in a probabilistic ontology tree (POT), and determining a joint probability based on probability distribution tables in the POT and on the dialog slot assignments.
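
    The belief-state computation and action selection in this abstract can be sketched on a two-node tree. Everything concrete here is an assumption for illustration: the slots "action" and "target", the probability values, and the fixed confidence threshold standing in for a policy. Speech recognition is omitted; slot assignments are taken as given.

```python
from typing import Dict, Tuple

# Hypothetical probabilistic ontology tree: root slot "action", child "target".
PRIOR: Dict[str, float] = {"find": 0.6, "call": 0.4}
# Conditional table P(target | action) attached to the child node.
CPT: Dict[str, Dict[str, float]] = {
    "find": {"restaurant": 0.7, "contact": 0.3},
    "call": {"restaurant": 0.2, "contact": 0.8},
}

Intent = Tuple[str, str]


def joint_probability(action: str, target: str) -> float:
    """Joint probability of a full slot assignment from the tree's tables."""
    return PRIOR[action] * CPT[action][target]


def belief_state(observed: Dict[str, str]) -> Dict[Intent, float]:
    """Normalized distribution over intents consistent with observed slots.

    Assumes every observed slot value appears in the tables above.
    """
    scores: Dict[Intent, float] = {}
    for action in PRIOR:
        for target in CPT[action]:
            if (observed.get("action", action) == action
                    and observed.get("target", target) == target):
                scores[(action, target)] = joint_probability(action, target)
    total = sum(scores.values())
    return {intent: p / total for intent, p in scores.items()}


def select_action(belief: Dict[Intent, float],
                  threshold: float = 0.7) -> Tuple[str, Intent]:
    """Assumed policy: act when confident, otherwise ask for clarification."""
    best, p = max(belief.items(), key=lambda kv: kv[1])
    return ("execute", best) if p >= threshold else ("clarify", best)


# Only the target slot was filled, so the action remains uncertain.
partial = belief_state({"target": "contact"})
decision = select_action(partial)
```

    With only the target observed, the most likely intent is ("call", "contact") at about 0.64, below the assumed threshold, so the sketch requests clarification. A policy learned by simulation, as the abstract mentions, would replace the fixed threshold.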

    Belief tracking and action selection in spoken dialog systems
    10.
    Granted invention patent
    Status: In force

    Publication No.: US08676583B2

    Publication date: 2014-03-18

    Application No.: US13221155

    Filing date: 2011-08-30

    IPC classification: G10L15/22

    CPC classification: G10L15/22

    Abstract: An action is performed in a spoken dialog system in response to a user's spoken utterance. A policy which maps belief states of user intent to actions is retrieved or created. A belief state is determined based on the spoken utterance, and an action is selected based on the determined belief state and the policy. The action is performed, and in one embodiment, involves requesting clarification of the spoken utterance from the user. Creating a policy may involve simulating user inputs and spoken dialog system interactions, and modifying policy parameters iteratively until a policy threshold is satisfied. In one embodiment, a belief state is determined by converting the spoken utterance into text, assigning the text to one or more dialog slots associated with nodes in a probabilistic ontology tree (POT), and determining a joint probability based on probability distribution tables in the POT and on the dialog slot assignments.
