Publication No.: US20200125673A1
Publication date: 2020-04-23
Application No.: US16167552
Filing date: 2018-10-23
Inventors: Ranit Aharonov, Liat Ein Dor, Alon Halfon, Yosi Mass, Ilya Shnayderman, Noam Slonim, Elad Venezian
Abstract: A method of estimating thematic similarity of sentences, comprising: receiving a corpus of a plurality of documents describing a plurality of topics, where each document comprises a plurality of sentences arranged in a plurality of sections; constructing sentence triplets for at least some of the sentences, each sentence triplet comprising a respective sentence, a respective positive sentence selected randomly from the section comprising the respective sentence, and a respective negative sentence selected randomly from another section; training a first neural network with the sentence triplets to identify sentence-sentence vectors mapping each sentence with a shorter distance to its respective positive sentence than to its respective negative sentence; and outputting the first neural network for estimating thematic similarity between a pair of sentences by computing a distance between the sentence-sentence vectors produced for each sentence of the pair by the first neural network.
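
The triplet construction and the training objective described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the names `build_triplets`, `euclidean`, and `triplet_loss` are hypothetical, the negative sentence is assumed to come from another section of the same document, the distance is assumed to be Euclidean, and the margin-based hinge loss is one common way to push a sentence closer to its positive than to its negative. The abstract itself does not specify a distance, a loss, or a network architecture, and no neural network is trained here.

```python
import math
import random


def build_triplets(documents, rng):
    """Build (sentence, positive, negative) triplets.

    documents: list of documents; each document is a list of sections;
    each section is a list of sentence strings.
    Assumption: the negative is drawn from another section of the same document.
    """
    triplets = []
    for sections in documents:
        for i, section in enumerate(sections):
            # All sentences of the document that lie outside section i.
            others = [s for j, sec in enumerate(sections) if j != i for s in sec]
            if len(section) < 2 or not others:
                continue  # need a distinct positive and at least one negative
            for k, sentence in enumerate(section):
                # Positive: random sentence from the same section, excluding position k.
                positive = rng.choice(section[:k] + section[k + 1:])
                # Negative: random sentence from a different section.
                negative = rng.choice(others)
                triplets.append((sentence, positive, negative))
    return triplets


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor_vec, pos_vec, neg_vec, margin=1.0):
    """Hinge-style triplet loss (assumed objective).

    It is zero once the positive is closer to the anchor than the negative
    by at least `margin`.
    """
    gap = euclidean(anchor_vec, pos_vec) - euclidean(anchor_vec, neg_vec)
    return max(0.0, gap + margin)


# Toy corpus: one document with two sections of two sentences each.
docs = [[["a1", "a2"], ["b1", "b2"]]]
triplets = build_triplets(docs, random.Random(0))
```

In a full system the three vectors passed to `triplet_loss` would be produced by the network being trained. At inference time, thematic similarity between two sentences would then be estimated from the distance between their vectors, as the abstract describes.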