DUAL ENCODER RETRIEVAL EFFICIENCY WITH PARAMETER SHARING IN PROJECTION LAYER

    Publication number: US20240346290A1

    Publication date: 2024-10-17

    Application number: US18299841

    Filing date: 2023-04-13

    Applicant: Google LLC

    IPC classification: G06N3/0455

    CPC classification: G06N3/0455

    Abstract: Aspects of the technology provide systems and methods for implementing an asymmetric dual encoder architecture. The architecture includes a token embedder layer section having a first token embedding section associated with a first input and a second token embedding section associated with a second input, and an encoder layer section having a first encoder section receiving token embeddings from the first token embedding section and a second encoder section receiving token embeddings from the second token embedding section. A shared projection layer receives encodings from both the first and second encoder sections and generates a set of projections. An embedding space is configured, based on the set of projections, to generate a question embedding and an answer embedding, in which the question and answer embeddings are used in identifying a set of candidate answers to an input question.
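    Illustrative sketch: the PyTorch code below is one hypothetical reading of the abstract's architecture, not the patented implementation. All names (AsymmetricDualEncoder, shared_proj), layer types, depths, and sizes are assumptions; the asymmetry is modeled here as two towers of different depth with a common hidden width, so that a single projection layer's parameters can be shared across both towers.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class AsymmetricDualEncoder(nn.Module):
            def __init__(self, vocab_size=30000, d_model=256, d_proj=128):
                super().__init__()
                # Token embedder layer section: one embedding table per input.
                self.embed_q = nn.Embedding(vocab_size, d_model)
                self.embed_a = nn.Embedding(vocab_size, d_model)
                # Encoder layer section: asymmetric towers (depths are assumptions).
                def make_layer():
                    return nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
                self.enc_q = nn.TransformerEncoder(make_layer(), num_layers=2)
                self.enc_a = nn.TransformerEncoder(make_layer(), num_layers=6)
                # Shared projection layer: a single Linear whose parameters
                # are used for encodings from both encoder sections.
                self.shared_proj = nn.Linear(d_model, d_proj)

            def _tower(self, tokens, embed, enc):
                h = enc(embed(tokens))                 # (batch, seq, d_model)
                pooled = h.mean(dim=1)                 # mean-pool token encodings
                return F.normalize(self.shared_proj(pooled), dim=-1)

            def forward(self, q_tokens, a_tokens):
                q_emb = self._tower(q_tokens, self.embed_q, self.enc_q)
                a_emb = self._tower(a_tokens, self.embed_a, self.enc_a)
                return q_emb, a_emb

        # Usage: score candidate answers against an input question by dot product
        # in the shared embedding space.
        model = AsymmetricDualEncoder()
        q = torch.randint(0, 30000, (1, 12))      # one question, 12 tokens
        a = torch.randint(0, 30000, (100, 40))    # 100 candidate answers
        q_emb, a_emb = model(q, a)
        scores = q_emb @ a_emb.T                  # (1, 100) similarity scores
        top5 = scores.topk(5).indices             # indices of top candidates

    Because both towers emit embeddings of the same dimension after the shared projection, retrieval reduces to a nearest-neighbor search over precomputed answer embeddings.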

    Sentence compression using recurrent neural networks

    Publication number: US10229111B1

    Publication date: 2019-03-12

    Application number: US15423852

    Filing date: 2017-02-03

    Applicant: Google LLC

    IPC classification: G06N3/04 G06F17/21 G06F17/27

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a sentence summary. In one aspect, the method includes actions of tokenizing the sentence into a plurality of tokens; processing data representative of each token in a first order using a first LSTM neural network to initialize an internal state of a second LSTM neural network; and processing data representative of each token in a second order using the second LSTM neural network, comprising, for each token in the sentence, processing the data representative of the token using the second LSTM neural network in accordance with a current internal state of the second LSTM neural network to (i) generate an LSTM output for the token and (ii) update the current internal state of the second LSTM neural network; and generating a summarized version of the sentence using the outputs of the second LSTM neural network for the tokens.
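    Illustrative sketch: a minimal PyTorch reading of the two-LSTM method described above. Two assumptions not stated in the abstract are made explicit in the comments: the "first order" is taken to be the reverse of the second, and each per-token output is interpreted as a keep/drop logit (deletion-based compression). All names and sizes are hypothetical.

        import torch
        import torch.nn as nn

        class SentenceCompressor(nn.Module):
            def __init__(self, vocab_size=30000, d_embed=128, d_hidden=256):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, d_embed)
                # First LSTM: reads the tokens in a first order; its final
                # (hidden, cell) state initializes the second LSTM.
                self.lstm1 = nn.LSTM(d_embed, d_hidden, batch_first=True)
                # Second LSTM: reads the tokens in a second order, emitting
                # one output per token.
                self.lstm2 = nn.LSTM(d_embed, d_hidden, batch_first=True)
                # Per-token head; keep/drop logits are an assumption
                # consistent with deletion-based compression.
                self.keep_head = nn.Linear(d_hidden, 1)

            def forward(self, tokens):
                x = self.embed(tokens)                  # (batch, seq, d_embed)
                # Pass 1: assumed reversed order (the abstract leaves it open).
                _, state = self.lstm1(torch.flip(x, dims=[1]))
                # Pass 2: original order, starting from the transferred state;
                # each step updates that state and yields a per-token output.
                outs, _ = self.lstm2(x, state)          # (batch, seq, d_hidden)
                return self.keep_head(outs).squeeze(-1) # per-token keep logits

        # Usage: keep the tokens whose logit is positive to form the summary.
        tokens = torch.randint(0, 30000, (1, 10))
        logits = SentenceCompressor()(tokens)
        summary = tokens[logits > 0]            # 1-D tensor of kept token ids

    The state handoff is the key step: the first pass gives the second LSTM a whole-sentence context before it scores any individual token.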