Systems and methods for machine-learned prediction of semantic similarity between documents

    公开(公告)号:US12210837B2

    公开(公告)日:2025-01-28

    申请号:US18321424

    申请日:2023-05-22

    Applicant: Google LLC

    Abstract: Systems and methods of the present disclosure are directed to a method for predicting semantic similarity between documents. The method can include obtaining a first document and a second document. The method can include parsing the first document into a plurality of first textual blocks and the second document into a plurality of second textual blocks. The method can include processing each of the plurality of first textual blocks and the second textual blocks with a machine-learned semantic document encoding model to obtain a first document encoding and a second document encoding. The method can include determining a similarity metric descriptive of a semantic similarity between the first document and the second document based on the first document encoding and the second document encoding.

    Processing large-scale textual inputs using neural networks

    公开(公告)号:US12182509B2

    公开(公告)日:2024-12-31

    申请号:US17336093

    申请日:2021-06-01

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a tuple of respective input sequences to generate an output. In one aspect, one of the systems includes a neural network comprising a plurality of encoder neural networks and a head neural network, each encoder neural network configured to: receive a respective input sequence from the tuple; process the respective input sequence using one or more encoder network layers to generate an encoded representation comprising a sequence of tokens; and process each of some or all of the tokens in the sequence of tokens using a projection layer to generate a lower-dimensional representation, and the head neural network configured to: receive lower-dimensional representations of a respective proper subset of the sequence of tokens generated by the encoder neural network; and process the lower-dimensional representations to generate the output.

    PROCESSING LARGE-SCALE TEXTUAL INPUTS USING NEURAL NETWORKS

    公开(公告)号:US20210374345A1

    公开(公告)日:2021-12-02

    申请号:US17336093

    申请日:2021-06-01

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a tuple of respective input sequences to generate an output. In one aspect, one of the systems includes a neural network comprising a plurality of encoder neural networks and a head neural network, each encoder neural network configured to: receive a respective input sequence from the tuple; process the respective input sequence using one or more encoder network layers to generate an encoded representation comprising a sequence of tokens; and process each of some or all of the tokens in the sequence of tokens using a projection layer to generate a lower-dimensional representation, and the head neural network configured to: receive lower-dimensional representations of a respective proper subset of the sequence of tokens generated by the encoder neural network; and process the lower-dimensional representations to generate the output.

    SORTING ATTENTION NEURAL NETWORKS

    公开(公告)号:US20210248450A1

    公开(公告)日:2021-08-12

    申请号:US17169718

    申请日:2021-02-08

    Applicant: Google LLC

    Abstract: A system for performing a machine learning task on a network input is described. The system includes one or more computers and one or more storage devices storing instructions that, when executed by the one or more computers, cause the one or more computers to implement (i) multiple sorting networks in which each sorting network is configured to sort vector blocks in a sequence of vector blocks to generate a sorted sequence of vector blocks; and (ii) a sorting attention neural network configured to perform the machine learning task on the input sequence by executing multiple sorting attention mechanisms using the sorting networks.

    Systems and Methods for Machine-Learned Prediction of Semantic Similarity Between Documents

    公开(公告)号:US20220129638A1

    公开(公告)日:2022-04-28

    申请号:US17078569

    申请日:2020-10-23

    Applicant: Google LLC

    Abstract: Systems and methods of the present disclosure are directed to a method for predicting semantic similarity between documents. The method can include obtaining a first document and a second document. The method can include parsing the first document into a plurality of first textual blocks and the second document into a plurality of second textual blocks. The method can include processing each of the plurality of first textual blocks and the second textual blocks with a machine-learned semantic document encoding model to obtain a first document encoding and a second document encoding. The method can include determining a similarity metric descriptive of a semantic similarity between the first document and the second document based on the first document encoding and the second document encoding.

Patent Agency Ranking