Semantic frame identification with distributed word representations
    1.
    Granted patent
    Semantic frame identification with distributed word representations (status: in force)

    Publication No.: US09262406B1

    Publication date: 2016-02-16

    Application No.: US14271997

    Filing date: 2014-05-07

    Applicant: Google Inc.

    Abstract: A computer-implemented technique can include receiving, at a server, labeled training data including a plurality of groups of words, each group of words having a predicate word, each word having generic word embeddings. The technique can include extracting, at the server, the plurality of groups of words in a syntactic context of their predicate words. The technique can include concatenating, at the server, the generic word embeddings to create a high dimensional vector space representing features for each word. The technique can include obtaining, at the server, a model having a learned mapping from the high dimensional vector space to a low dimensional vector space and learned embeddings for each possible semantic frame in the low dimensional vector space. The technique can also include outputting, by the server, the model for storage, the model being configured to identify a specific semantic frame for an input.
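The inference side of the abstract above can be sketched in a few lines: concatenate the word embeddings from the predicate's syntactic context into one high-dimensional vector, apply the learned mapping to reach the low-dimensional space, and pick the closest frame embedding. This is only an illustrative sketch. The abstract does not say how frames are scored, so the linear map, the dot-product scoring, the toy numbers, and the names `MAPPING`, `FRAME_EMBEDDINGS`, and `identify_frame` are all assumptions, and the frame labels are used purely as examples.

```python
from typing import Dict, List, Sequence

Vector = List[float]

# Hypothetical "learned" parameters, hand-set here for illustration only.
# Two context slots (predicate, one dependent) with 2-d embeddings give a
# 4-d high-dimensional space; the mapping projects it down to 2 dimensions.
MAPPING: List[Vector] = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]
FRAME_EMBEDDINGS: Dict[str, Vector] = {
    "Commerce_buy": [1.0, 0.0],
    "Self_motion": [0.0, 1.0],
}


def concatenate(embeddings: Sequence[Vector]) -> Vector:
    """Join the per-slot word embeddings into one high-dimensional vector."""
    return [value for embedding in embeddings for value in embedding]


def project(mapping: Sequence[Vector], high: Vector) -> Vector:
    """Apply a linear map (one row per low-dimensional axis)."""
    return [sum(w * x for w, x in zip(row, high)) for row in mapping]


def identify_frame(
    mapping: Sequence[Vector],
    frame_embeddings: Dict[str, Vector],
    context_embeddings: Sequence[Vector],
) -> str:
    """Return the frame whose embedding has the largest dot product
    with the projected context vector (scoring rule is an assumption)."""
    low = project(mapping, concatenate(context_embeddings))

    def score(frame: str) -> float:
        return sum(a * b for a, b in zip(low, frame_embeddings[frame]))

    return max(frame_embeddings, key=score)


if __name__ == "__main__":
    # Toy embeddings for a predicate and one syntactic dependent.
    context = [[0.9, 0.1], [0.8, 0.2]]
    print(identify_frame(MAPPING, FRAME_EMBEDDINGS, context))
```

In a real system both the mapping and the frame embeddings would be learned jointly from the labeled training data; the sketch only shows how a stored model could be applied to a new input.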


    CROSS-LINGUAL DISCRIMINATIVE LEARNING OF SEQUENCE MODELS WITH POSTERIOR REGULARIZATION
    2.
    Patent application
    CROSS-LINGUAL DISCRIMINATIVE LEARNING OF SEQUENCE MODELS WITH POSTERIOR REGULARIZATION (status: in force)

    Publication No.: US20150169549A1

    Publication date: 2015-06-18

    Application No.: US14105973

    Filing date: 2013-12-13

    Applicant: Google Inc.

    CPC classification number: G06F17/289 G06F17/27 G06F17/2827

    Abstract: A computer-implemented method can include obtaining (i) an aligned bi-text for a source language and a target language, and (ii) a supervised sequence model for the source language. The method can include labeling a source side of the aligned bi-text using the supervised sequence model and projecting labels from the labeled source side to a target side of the aligned bi-text to obtain a labeled target side of the aligned bi-text. The method can include filtering the labeled target side based on a task of a natural language processing (NLP) system configured to utilize a sequence model for the target language to obtain a filtered target side of the aligned bi-text. The method can also include training the sequence model for the target language using posterior regularization with soft constraints on the filtered target side to obtain a trained sequence model for the target language.
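The projection and filtering steps in the abstract above can be illustrated with a small sketch: labels predicted on the source side are copied across word alignments to the target side, and poorly covered target sentences are dropped before training. The abstract only says that filtering depends on the task of the NLP system, so the coverage threshold used here is an assumed stand-in, and the names `project_labels`, `coverage`, and `filter_projected` are hypothetical. The posterior-regularization training step is not shown.

```python
from typing import List, Optional, Sequence, Tuple

Labels = List[Optional[str]]


def project_labels(
    source_labels: Sequence[str],
    alignment: Sequence[Tuple[int, int]],
    target_length: int,
) -> Labels:
    """Copy each source label to its aligned target position.

    `alignment` holds (source_index, target_index) pairs. Target tokens
    with no alignment stay unlabeled (None); if several source tokens
    align to one target token, the first alignment wins.
    """
    projected: Labels = [None] * target_length
    for source_index, target_index in alignment:
        if projected[target_index] is None:
            projected[target_index] = source_labels[source_index]
    return projected


def coverage(labels: Labels) -> float:
    """Fraction of target tokens that received a projected label."""
    if not labels:
        return 0.0
    return sum(label is not None for label in labels) / len(labels)


def filter_projected(
    sentences: Sequence[Labels], min_coverage: float = 0.8
) -> List[Labels]:
    """Keep only target sentences with enough projected labels
    (an assumed, task-independent filtering criterion)."""
    return [labels for labels in sentences if coverage(labels) >= min_coverage]


if __name__ == "__main__":
    source = ["DET", "NOUN", "VERB"]
    fully_aligned = project_labels(source, [(0, 0), (1, 1), (2, 2)], 3)
    sparsely_aligned = project_labels(source, [(0, 0)], 3)
    print(filter_projected([fully_aligned, sparsely_aligned]))
```

The surviving projected labels would then act as soft constraints on the target-language sequence model rather than as hard gold labels, which is what makes noisy projections tolerable.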


    Cross-lingual discriminative learning of sequence models with posterior regularization

    Publication No.: US09779087B2

    Publication date: 2017-10-03

    Application No.: US14105973

    Filing date: 2013-12-13

    Applicant: Google Inc.

    CPC classification number: G06F17/289 G06F17/27 G06F17/2827

    Abstract: A computer-implemented method can include obtaining (i) an aligned bi-text for a source language and a target language, and (ii) a supervised sequence model for the source language. The method can include labeling a source side of the aligned bi-text using the supervised sequence model and projecting labels from the labeled source side to a target side of the aligned bi-text to obtain a labeled target side of the aligned bi-text. The method can include filtering the labeled target side based on a task of a natural language processing (NLP) system configured to utilize a sequence model for the target language to obtain a filtered target side of the aligned bi-text. The method can also include training the sequence model for the target language using posterior regularization with soft constraints on the filtered target side to obtain a trained sequence model for the target language.

    SEMANTIC FRAME IDENTIFICATION WITH DISTRIBUTED WORD REPRESENTATIONS
    4.
    Patent application
    SEMANTIC FRAME IDENTIFICATION WITH DISTRIBUTED WORD REPRESENTATIONS (status: under examination, published)

    Publication No.: US20160239739A1

    Publication date: 2016-08-18

    Application No.: US15008794

    Filing date: 2016-01-28

    Applicant: Google Inc.

    Abstract: A computer-implemented technique can include receiving, at a server, labeled training data including a plurality of groups of words, each group of words having a predicate word, each word having generic word embeddings. The technique can include extracting, at the server, the plurality of groups of words in a syntactic context of their predicate words. The technique can include concatenating, at the server, the generic word embeddings to create a high dimensional vector space representing features for each word. The technique can include obtaining, at the server, a model having a learned mapping from the high dimensional vector space to a low dimensional vector space and learned embeddings for each possible semantic frame in the low dimensional vector space. The technique can also include outputting, by the server, the model for storage, the model being configured to identify a specific semantic frame for an input.

