ADVERSARIAL PRETRAINING OF MACHINE LEARNING MODELS

    Publication No.: US20210326751A1

    Publication Date: 2021-10-21

    Application No.: US16882296

    Application Date: 2020-05-22

    Abstract: This document relates to training of machine learning models. One example method involves providing a machine learning model having one or more mapping layers. The one or more mapping layers can include at least a first mapping layer configured to map components of pretraining examples into first representations in a space. The example method also includes performing a pretraining stage on the one or more mapping layers using the pretraining examples. The pretraining stage can include adding noise to the first representations of the components of the pretraining examples to obtain noise-adjusted first representations. The pretraining stage can also include performing a self-supervised learning process to pretrain the one or more mapping layers using at least the first representations and the noise-adjusted first representations of the pretraining examples.
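
    The abstract leaves the noise model and the exact self-supervised objective open. Below is a minimal sketch, assuming PyTorch, of how the pieces could fit together: an embedding layer stands in for the first mapping layer, additive Gaussian noise produces the noise-adjusted representations, and a denoising cross-entropy loss stands in for the self-supervised learning process. All names (pretraining_step, sigma, decoder) and dimensions are illustrative assumptions, not from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, dim = 30000, 256
embed = nn.Embedding(vocab_size, dim)  # stands in for the "first mapping layer"
decoder = nn.Linear(dim, vocab_size)   # illustrative self-supervised head

def pretraining_step(token_ids: torch.Tensor, sigma: float = 0.1) -> torch.Tensor:
    reps = embed(token_ids)                        # first representations
    noisy = reps + sigma * torch.randn_like(reps)  # noise-adjusted first representations
    logits = decoder(noisy)
    # Denoising objective: recover each component (token) of the pretraining
    # example from its noise-adjusted representation.
    return F.cross_entropy(logits.view(-1, vocab_size), token_ids.view(-1))

loss = pretraining_step(torch.randint(0, vocab_size, (8, 16)))  # 8 examples, 16 tokens each
loss.backward()
```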

    Representation Learning Using Multi-Task Deep Neural Networks

    Publication No.: US20170032035A1

    Publication Date: 2017-02-02

    Application No.: US14811808

    Application Date: 2015-07-28

    IPC Classification: G06F17/30 G06N3/08

    Abstract: A system may comprise one or more processors and memory storing instructions that, when executed, configure the one or more processors to perform a number of operations or tasks, such as receiving a query or a document, and mapping the query or the document into a lower-dimensional representation by applying at least one operational layer that is shared across at least two disparate tasks.

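    A minimal sketch of the shared-layer idea, assuming PyTorch; the class MultiTaskMapper, the choice of two heads (ranking and classification), and all dimensions are illustrative assumptions rather than the patented architecture. A single operational layer is shared across two disparate tasks, and both task heads consume its lower-dimensional output.

```python
import torch
import torch.nn as nn

class MultiTaskMapper(nn.Module):
    def __init__(self, in_dim: int = 10000, hidden: int = 300, n_classes: int = 5):
        super().__init__()
        self.shared = nn.Linear(in_dim, hidden)       # operational layer shared by both tasks
        self.rank_head = nn.Linear(hidden, hidden)    # e.g., query-document ranking task
        self.cls_head = nn.Linear(hidden, n_classes)  # e.g., query classification task

    def forward(self, bow: torch.Tensor, task: str) -> torch.Tensor:
        z = torch.tanh(self.shared(bow))  # lower-dimensional representation of query/document
        return self.rank_head(z) if task == "rank" else self.cls_head(z)

model = MultiTaskMapper()
logits = model(torch.rand(4, 10000), task="cls")  # (4, 5) classification logits
```

    Because the shared layer receives gradients from both tasks, the representation it learns is regularized by each task's signal, which is the usual motivation for multi-task representation learning.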

    GRAPH REPRESENTATIONS FOR IDENTIFYING A NEXT WORD

    Publication No.: US20190377792A1

    Publication Date: 2019-12-12

    Application No.: US16022001

    Application Date: 2018-06-28

    IPC Classification: G06F17/27 G06F17/30

    Abstract: Systems, methods, and computer-executable instructions for approximating a softmax layer are disclosed. A small world graph that includes a plurality of nodes is constructed for the vocabulary of a natural language model. A context vector is transformed. The small world graph is searched using the transformed context vector to identify the top-K hypotheses. A distance from the context vector is determined for each of the top-K hypotheses. The distances are transformed back to the original inner product space. A softmax distribution is computed for the softmax layer over the inner product space of the top-K hypotheses. The softmax layer is useful for determining the next word in speech recognition or machine translation.
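
    A rough sketch of the pipeline in NumPy. A brute-force scan stands in for the small world graph search (in practice an HNSW-style index), and the vector-augmentation trick that reduces maximum inner product search to nearest-neighbor search is one known way to realize the "transformed context vector" step; all names (M, K, approx_softmax) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d, K = 50000, 128, 100
W = rng.standard_normal((V, d)).astype(np.float32)  # output word embeddings

# Augment each embedding so the L2-nearest neighbors of the transformed
# context vector are exactly the maximum-inner-product words:
M = np.linalg.norm(W, axis=1).max()
W_aug = np.hstack([W, np.sqrt(M**2 - np.sum(W**2, axis=1, keepdims=True))])

def approx_softmax(h: np.ndarray):
    h_aug = np.concatenate([h, [0.0]])  # transformed context vector
    # A small world graph search would return the K nearest neighbors of h_aug;
    # brute force is used here to keep the sketch self-contained.
    dist2 = np.sum((W_aug - h_aug) ** 2, axis=1)
    top_k = np.argsort(dist2)[:K]
    # Transform distances back to the original inner product space:
    # ||h_aug - w_aug||^2 = ||h||^2 + M^2 - 2 * <h, w>
    scores = (h @ h + M**2 - dist2[top_k]) / 2.0
    probs = np.exp(scores - scores.max())  # softmax over the top-K hypotheses only
    return top_k, probs / probs.sum()

ids, probs = approx_softmax(rng.standard_normal(d).astype(np.float32))
```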

    Learning graph representations using hierarchical transformers for content recommendation

    Publication No.: US11676001B2

    Publication Date: 2023-06-13

    Application No.: US17093426

    Application Date: 2020-11-09

    IPC Classification: G06N3/045

    CPC Classification: G06N3/045

    Abstract: Knowledge graphs can greatly improve the quality of content recommendation systems. The domain contains a broad variety of knowledge graphs, including clicked user-ad graphs, clicked query-ad graphs, and keyword-display URL graphs. A hierarchical Transformer model learns entity embeddings in knowledge graphs. The model consists of two different Transformer blocks, where the bottom block generates relation-dependent embeddings for the source entity and its neighbors, and the top block aggregates the outputs of the bottom block to produce the target entity embedding. To balance the information from contextual entities and the source entity itself, a masked entity model (MEM) task is combined with a link prediction task in model training.
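
    A minimal sketch of the two-block structure, assuming PyTorch. The relation conditioning is reduced to adding relation embeddings, and the MEM and link-prediction training objectives are omitted; every name and dimension is an illustrative assumption, not the patented model. The bottom block encodes one (source, relation, neighbor) triple per neighbor into a relation-dependent embedding, and the top block aggregates those embeddings with the source entity to produce the target entity embedding.

```python
import torch
import torch.nn as nn

class HierarchicalGraphTransformer(nn.Module):
    def __init__(self, n_entities: int, n_relations: int, dim: int = 128, heads: int = 4):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)
        self.bottom = nn.TransformerEncoder(  # relation-dependent neighbor embeddings
            nn.TransformerEncoderLayer(dim, heads, batch_first=True), num_layers=2)
        self.top = nn.TransformerEncoder(     # aggregates the bottom block's outputs
            nn.TransformerEncoderLayer(dim, heads, batch_first=True), num_layers=2)

    def forward(self, src, rels, neighbors):
        # src: (B,) source entities; rels, neighbors: (B, N) neighbor relations/entities
        B, N = neighbors.shape
        triples = torch.stack([
            self.ent(src).unsqueeze(1).expand(B, N, -1),
            self.rel(rels),
            self.ent(neighbors)], dim=2)                # (B, N, 3, dim)
        ctx = self.bottom(triples.flatten(0, 1))[:, 0]  # one embedding per triple
        seq = torch.cat([self.ent(src).unsqueeze(1), ctx.view(B, N, -1)], dim=1)
        return self.top(seq)[:, 0]                      # target entity embedding, (B, dim)

model = HierarchicalGraphTransformer(n_entities=1000, n_relations=20)
emb = model(torch.randint(0, 1000, (4,)),
            torch.randint(0, 20, (4, 8)),
            torch.randint(0, 1000, (4, 8)))  # (4, 128)
```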