SYSTEMS AND METHODS FOR MULTILINGUAL SENTENCE EMBEDDINGS

    Publication No.: US20220067279A1

    Publication date: 2022-03-03

    Application No.: US17008569

    Filing date: 2020-08-31

    IPC classification: G06F40/263

    Abstract: Disclosed embodiments relate to natural language processing. Techniques can include obtaining an encoding model, obtaining a first sentence in a first language and a label associated with the first sentence, obtaining a second sentence in a second language, encoding the first sentence and the second sentence using the encoding model, determining the intent of the first encoded sentence, determining the language of the first encoded sentence and the language of the second encoded sentence, and updating the encoding model based on the determined intent of the first encoded sentence, the label, the determined language of the first encoded sentence, and the determined language of the second encoded sentence.
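
The abstract does not say how the intent and language determinations are combined when the encoding model is updated. The following is a minimal sketch of one possible reading, assuming an adversarial setup in which the encoder is rewarded for correct intent prediction and for making the two languages hard to tell apart. The names `encoder_objective`, `intent_head`, `language_head`, and `adversarial_weight` are hypothetical, as is the assumption that the true language of each sentence is known.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the target class."""
    return -math.log(probs[target_index])


def linear(weights, vector):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def encoder_objective(encode, intent_head, language_head,
                      first_sentence, intent_label, first_lang,
                      second_sentence, second_lang,
                      adversarial_weight=0.1):
    """Return (objective, intent_loss, language_loss) for one sentence pair.

    Assumption: the encoder minimizes intent loss while *maximizing* the
    language classifier's loss, so the language term enters with a minus sign.
    """
    first_vec = encode(first_sentence)
    second_vec = encode(second_sentence)

    # Intent is determined only for the first (labeled) sentence.
    intent_probs = softmax(linear(intent_head, first_vec))
    intent_loss = cross_entropy(intent_probs, intent_label)

    # Language is determined for both encoded sentences.
    language_loss = (
        cross_entropy(softmax(linear(language_head, first_vec)), first_lang)
        + cross_entropy(softmax(linear(language_head, second_vec)), second_lang)
    )

    objective = intent_loss - adversarial_weight * language_loss
    return objective, intent_loss, language_loss
```

In a real system the objective would be minimized by gradient descent over the encoder's parameters; the sketch only shows which quantities from the abstract feed into that update.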

SYSTEMS AND METHODS FOR UNSUPERVISED PARAPHRASE MINING

    Publication No.: US20220067298A1

    Publication date: 2022-03-03

    Application No.: US17008563

    Filing date: 2020-08-31

    Abstract: Disclosed embodiments relate to aligning pairs of sentences. Techniques can include receiving a plurality of sentences; generating a graph for each of at least two sentences of the plurality of sentences, wherein generating a graph for each sentence of the at least two sentences comprises: identifying one or more tokens for the sentence; and connecting via edges the one or more tokens; generating a combined graph for the at least two sentences, wherein generating the combined graph comprises: aligning the identified tokens of the at least two sentences of the plurality of sentences; identifying matching and non-matching tokens between the at least two sentences based on the alignment; and merging matching tokens into a combined graph node.
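
The graph construction described in the abstract can be illustrated with a small sketch. It assumes whitespace tokenization, a simple chain of edges between adjacent tokens, and alignment via Python's `difflib.SequenceMatcher`; none of these choices are stated in the abstract, and the function names `sentence_graph` and `combined_graph` are hypothetical.

```python
from difflib import SequenceMatcher


def sentence_graph(sentence):
    """Identify tokens and connect adjacent tokens with edges (a chain)."""
    tokens = sentence.lower().split()
    edges = [(i, i + 1) for i in range(len(tokens) - 1)]
    return tokens, edges


def combined_graph(sentence_a, sentence_b):
    """Align two sentences and merge matching tokens into shared nodes.

    Returns (labels, edges): labels maps node id -> token, and edges is a
    set of (node id, node id) pairs. Node ids are ("ab", i) for merged
    tokens and ("a", i) / ("b", j) for tokens unique to one sentence.
    """
    tokens_a, edges_a = sentence_graph(sentence_a)
    tokens_b, edges_b = sentence_graph(sentence_b)

    # Start with every token as its own, non-matching node.
    node_of_a = {i: ("a", i) for i in range(len(tokens_a))}
    node_of_b = {j: ("b", j) for j in range(len(tokens_b))}

    # Align the token sequences; each matching block yields merged nodes.
    matcher = SequenceMatcher(a=tokens_a, b=tokens_b, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            merged = ("ab", block.a + k)
            node_of_a[block.a + k] = merged
            node_of_b[block.b + k] = merged

    labels = {}
    for i, token in enumerate(tokens_a):
        labels[node_of_a[i]] = token
    for j, token in enumerate(tokens_b):
        labels[node_of_b[j]] = token

    # Re-express each sentence's edges in terms of combined-graph nodes.
    edges = {(node_of_a[u], node_of_a[v]) for u, v in edges_a}
    edges |= {(node_of_b[u], node_of_b[v]) for u, v in edges_b}
    return labels, edges
```

For the pair "the room was very clean" and "the room was spotless", the shared prefix collapses into three merged nodes, while "very clean" and "spotless" remain as separate branches leaving the merged "was" node. Branches of this kind are natural candidates for paraphrase spans.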