-
1.
Publication No.: US20230376779A1
Publication Date: 2023-11-23
Application No.: US18316744
Filing Date: 2023-05-12
Applicant: Hongyu LI, Bin DONG, Shanshan JIANG, Lei DING
Inventor: Hongyu LI, Bin DONG, Shanshan JIANG, Lei DING
IPC Classification: G06N3/09, G06F40/30, G06F40/166, G06F40/40
CPC Classification: G06N3/09, G06F40/30, G06F40/166, G06F40/40
Abstract: A method and an apparatus for training a machine reading comprehension model, and a non-transitory computer-readable recording medium are provided. A training process is repeatedly performed using a training sample set to obtain a machine reading comprehension model. The training process includes inputting a sample article and a sample question into the machine reading comprehension model, generating a first predicted answer, and calculating a first loss between the first predicted answer and a sample answer; replacing the sample question with a mask to obtain a mask question, inputting the sample article and the mask question into the machine reading comprehension model, generating a second predicted answer corresponding to the mask question, and calculating a second loss between the second predicted answer and the sample answer; and updating the machine reading comprehension model so as to minimize a total loss.
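The two-loss training step in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the cross-entropy loss, the weighted-sum form of the total loss, the `mask_weight` parameter, and the `[MASK]` token are all hypothetical, since the abstract only says that a total loss is minimized.

```python
import math


def mask_question(question_tokens, mask_token="[MASK]"):
    """Replace every question token with a mask token (assumed masking scheme)."""
    return [mask_token] * len(question_tokens)


def cross_entropy(answer_probs, gold_index):
    """Negative log-probability that the model assigns to the sample answer."""
    return -math.log(answer_probs[gold_index])


def total_loss(probs_with_question, probs_with_mask, gold_index, mask_weight=1.0):
    """Combine the first loss (real question) and the second loss (mask question).

    The weighted sum is an assumption; the abstract does not give the formula.
    """
    first_loss = cross_entropy(probs_with_question, gold_index)
    second_loss = cross_entropy(probs_with_mask, gold_index)
    return first_loss + mask_weight * second_loss
```

In a real training loop the two probability lists would come from two forward passes of the same model, one with the sample question and one with the output of `mask_question`.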
-
2.
Publication No.: US20190251164A1
Publication Date: 2019-08-15
Application No.: US16242365
Filing Date: 2019-01-08
Applicant: Lei DING, Yixuan TONG, Bin DONG, Shanshan JIANG, Yongwei ZHANG
Inventor: Lei DING, Yixuan TONG, Bin DONG, Shanshan JIANG, Yongwei ZHANG
CPC Classification: G06F17/2765, G06F16/313, G06F16/353, G06K9/6256
Abstract: A method, an apparatus and an electronic device for performing entity linking, and a non-transitory computer-readable recording medium are provided. The method includes constructing training data including a plurality of sets of labeled data using an existing unambiguous entity database where unambiguous entities corresponding to respective entity words are stored, each set of the labeled data including a text having an entity word and an unambiguous entity linked with the entity word; training an unambiguous entity recognition model whose output is a matching probability between an entity word in a text and an unambiguous entity using the training data; and inputting a text having an entity word to be recognized into the unambiguous entity recognition model, and determining an unambiguous entity linked with the entity word to be recognized based on an output result of the unambiguous entity recognition model.
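The data construction and linking steps can be illustrated with a small sketch. Everything here is hypothetical: the shape of `entity_db`, the candidate dictionaries with `id` and `description` keys, and especially `toy_match_probability`, which is a token-overlap stand-in for the trained recognition model that the abstract describes.

```python
def build_training_data(entity_db):
    """Turn {entity_word: [(entity_id, example_text), ...]} into labeled records."""
    return [
        {"text": text, "entity_word": word, "entity": entity_id}
        for word, entries in entity_db.items()
        for entity_id, text in entries
    ]


def link_entity(text, entity_word, candidates, match_probability):
    """Return the candidate entity with the highest matching probability."""
    return max(candidates, key=lambda entity: match_probability(text, entity_word, entity))


def toy_match_probability(text, entity_word, entity):
    """Stand-in for the trained model: Jaccard overlap with the entity description."""
    text_tokens = set(text.lower().split())
    desc_tokens = set(entity["description"].lower().split())
    return len(text_tokens & desc_tokens) / len(text_tokens | desc_tokens)
```

Passing the scoring function as a parameter keeps the selection rule (take the highest matching probability) separate from whatever model produces the probabilities.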
-
3.
Publication No.: US20240220523A1
Publication Date: 2024-07-04
Application No.: US18390496
Filing Date: 2023-12-20
Applicant: Rui CHENG, Bin DONG, Shanshan JIANG, Lu LUO, Lei DING
Inventor: Rui CHENG, Bin DONG, Shanshan JIANG, Lu LUO, Lei DING
CPC Classification: G06F16/3347, G06N3/045
Abstract: Disclosed are a semantic matching and retrieval method and apparatus. The semantic matching and retrieval method includes steps of obtaining both the vector representation of a query text and the vector representation of a document text; obtaining the final vector representation of the query text; obtaining the final vector representation of the document text; calculating, based on the final vector representation of the query text and the final vector representation of the document text, the similarity score between the query text and the document text; and selecting, based on the similarity scores between the query text and a plurality of document texts, a document text matching the query text from the plurality of document texts.
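The final scoring and selection steps can be sketched as below. The abstract does not name the similarity function or explain how the final vector representations are derived, so cosine similarity and the plain-list vectors used here are assumptions for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_match(query_vec, doc_vecs):
    """Score every document against the query and return (best index, all scores)."""
    scores = [cosine_similarity(query_vec, doc) for doc in doc_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```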
-
4.
Publication No.: US20210027178A1
Publication Date: 2021-01-28
Application No.: US16934112
Filing Date: 2020-07-21
Applicant: Lei DING, Yixuan TONG, Bin DONG, Shanshan JIANG, Yongwei ZHANG
Inventor: Lei DING, Yixuan TONG, Bin DONG, Shanshan JIANG, Yongwei ZHANG
IPC Classification: G06N5/04, G06F16/9535, G06K9/62, G06N3/04, G06F40/289
Abstract: A recommendation method and a recommendation apparatus based on deep reinforcement learning, and a non-transitory computer-readable recording medium are provided. In the method, entity semantic information representation vectors of products are generated based on a product knowledge graph; browsing context information representation vectors of the products are generated based on historical browsing behavior of a user with respect to products; the entity semantic information representation vectors and the browsing context information representation vectors of the respective products are merged to obtain vectors of the products; a recommendation model based on deep reinforcement learning is constructed, and the recommendation model based on the deep reinforcement learning is offline-trained using historical behavior data of the user to obtain the offline-trained recommendation model, where the products in the historical behavior data of the user are represented by the vectors of the products; and products are online-recommended using the offline-trained recommendation model.
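The vector-merging step can be illustrated in a few lines. The abstract does not say how the two representations are merged, so concatenation is an assumption here, as are the dictionary inputs keyed by a hypothetical product id.

```python
def merge_product_vectors(semantic_vecs, context_vecs):
    """Concatenate the two representations for each product id present in both maps."""
    return {
        product_id: list(semantic_vecs[product_id]) + list(context_vecs[product_id])
        for product_id in semantic_vecs
        if product_id in context_vecs
    }
```

The merged vectors would then stand in for the products when the user's historical behavior is fed to the reinforcement learning model during offline training.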
-
5.
Publication No.: US20200242302A1
Publication Date: 2020-07-30
Application No.: US16750182
Filing Date: 2020-01-23
Applicant: Liang LIANG, Lei DING, Bin DONG, Shanshan JIANG, Yixuan TONG
Inventor: Liang LIANG, Lei DING, Bin DONG, Shanshan JIANG, Yixuan TONG
IPC Classification: G06F40/216, G06F40/30, G06F40/211, G06F40/56, G06F16/242
Abstract: An intention identification method includes generating a heterogeneous text network based on a language material sample; using a graph embedding algorithm to perform learning with respect to the heterogeneous text network and obtain a vector representation of the language material sample and a word, and determining keywords of the language material sample based on a similarity in terms of a vector between the language material sample and the word in the language material sample; training an intention identification model until a predetermined training termination condition is satisfied, by using the keywords of the language material samples, and obtaining the trained intention identification model; and receiving a language material query, and using the trained intention identification model to identify an intention of the language material query.
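The keyword selection step can be sketched as a top-k search over the words of a sample. The dot product as the similarity measure, the `top_k` cutoff, and the toy vectors are assumptions; the abstract only states that keywords are chosen by vector similarity between the sample and its words.

```python
import heapq


def select_keywords(sample_vec, word_vecs, top_k=2):
    """Return the top_k words whose vectors are most similar to the sample vector."""

    def similarity(word):
        # Dot product between the sample embedding and the word embedding.
        return sum(a * b for a, b in zip(sample_vec, word_vecs[word]))

    return heapq.nlargest(top_k, word_vecs, key=similarity)
```

With embeddings learned from the heterogeneous text network, the selected keywords would then serve as the training input of the intention identification model.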
-
6.
Publication No.: US20220164536A1
Publication Date: 2022-05-26
Application No.: US17455967
Filing Date: 2021-11-22
Applicant: Yixuan TONG, Yongwei ZHANG, Bin DONG, Shanshan JIANG, Jiashi ZHANG
Inventor: Yixuan TONG, Yongwei ZHANG, Bin DONG, Shanshan JIANG, Jiashi ZHANG
IPC Classification: G06F40/295
Abstract: A method and an apparatus for sequence labeling on an entity text, and a non-transitory computer-readable recording medium are provided. In the method, a start position of an entity text within a target text is determined. Then, a first matrix is generated based on the start position of the entity text. Elements in the first matrix indicate focusable weights of each word with respect to other words in the target text. Then, a named entity recognition model is generated using the first matrix. The named entity recognition model is obtained by training using first training data, the first training data includes word embeddings corresponding to respective texts in a training text set, and the texts are texts whose entity label has been labeled. Then, the target text is input to the named entity recognition model, and a probability distribution of the entity label is output.
-
7.
Publication No.: US20210390454A1
Publication Date: 2021-12-16
Application No.: US17343955
Filing Date: 2021-06-10
Applicant: Tianxiong XIAO, Yixuan TONG, Bin DONG, Shanshan JIANG, Jiashi ZHANG
Inventor: Tianxiong XIAO, Yixuan TONG, Bin DONG, Shanshan JIANG, Jiashi ZHANG
Abstract: Disclosed is an apparatus for training a machine reading comprehension model. The apparatus includes a distance calculation part configured to calculate, based on a position of each word within a training text and a position of an answer label within the training text, a distance between that word and the answer label; a label smoothing part configured to input the distance between that word and the answer label into a smooth function to obtain a probability value corresponding to that word, outputted from the smooth function; and a model training part configured to make the probability value corresponding to that word serve as a smoothed label of that word so as to train the machine reading comprehension model.
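The distance-based label smoothing can be sketched as follows. The abstract does not specify the smooth function, so the Gaussian kernel, its `sigma` width, and the normalization to a probability distribution are assumptions made for illustration.

```python
import math


def smoothed_labels(num_words, answer_pos, sigma=1.0):
    """Map each word's distance to the answer label onto a soft label.

    A Gaussian of the distance is used as the (assumed) smooth function, and the
    values are normalized so that they sum to 1 over the training text.
    """
    weights = [
        math.exp(-((position - answer_pos) ** 2) / (2 * sigma ** 2))
        for position in range(num_words)
    ]
    total = sum(weights)
    return [weight / total for weight in weights]
```

Compared with a one-hot label, words near the answer position receive a small but non-zero target, so a prediction that is off by one position is penalized less than one that is far away.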
-
8.
Publication No.: US20200311353A1
Publication Date: 2020-10-01
Application No.: US16809844
Filing Date: 2020-03-05
Applicant: Yihan LI, Boyan LIU, Shanshan JIANG, Yixuan TONG, Bin DONG
Inventor: Yihan LI, Boyan LIU, Shanshan JIANG, Yixuan TONG, Bin DONG
Abstract: A method and an apparatus for processing word vectors of a neural machine translation model, and a non-transitory computer-readable recording medium are provided. In the method, word vectors that are input to an encoder and a decoder of a neural machine translation model are updated using semantic information among head representations at the same time and semantic information among head representations at different times, and the model is trained or translation is performed using the updated word vectors, thereby improving the model performance of the neural machine translation model.
-
9.
Publication No.: US20190198014A1
Publication Date: 2019-06-27
Application No.: US16218693
Filing Date: 2018-12-13
Applicant: Yihan LI, Yixuan TONG, Shanshan JIANG, Bin DONG
Inventor: Yihan LI, Yixuan TONG, Shanshan JIANG, Bin DONG
CPC Classification: G10L15/1815, G06F16/3347, G10L15/063, G10L15/14
Abstract: A method and an apparatus for ranking responses of a dialog model, and a non-transitory computer-readable recording medium are provided. The dialog model is trained based on a sample data set. The method includes obtaining, from the sample data set, at least one similar dialog whose content is semantically similar to content of a target dialog; obtaining a probability of at least one target response generated by the dialog model when inputting the target dialog, and obtaining a probability of a target response generated by the dialog model when inputting the similar dialog; statistically analyzing, based on the probabilities of the respective generated target responses, scores of the target responses, the scores of the target responses being positively correlated with the probabilities of the target responses; and ranking the target responses in a descending order of the scores.
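The scoring and ranking steps can be sketched as below. Summing probabilities is only one way to obtain scores that are positively correlated with the probabilities, and it is an assumption here; the input format (one response-to-probability dictionary per dialog) is also hypothetical.

```python
from collections import defaultdict


def rank_responses(probability_tables):
    """Rank responses by the sum of their probabilities across all dialogs.

    probability_tables holds one {response: probability} dict for the target
    dialog and one for each similar dialog.
    """
    scores = defaultdict(float)
    for table in probability_tables:
        for response, probability in table.items():
            scores[response] += probability
    return sorted(scores, key=lambda response: scores[response], reverse=True)
```

A response that is moderately likely for both the target dialog and its similar dialogs can therefore outrank one that is likely for the target dialog alone.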
-
10.
Publication No.: US20230073746A1
Publication Date: 2023-03-09
Application No.: US17821227
Filing Date: 2022-08-22
Applicant: Tianxiong XIAO, Rui CHENG, Bin DONG, Shanshan JIANG, Jiashi ZHANG
Inventor: Tianxiong XIAO, Rui CHENG, Bin DONG, Shanshan JIANG, Jiashi ZHANG
IPC Classification: G06F40/47, G06F40/30, G06F40/205
Abstract: A method and an apparatus for machine reading comprehension, and a non-transitory computer-readable recording medium are provided. In the method, a paragraph-question pair is obtained, and subword vectors corresponding to subwords in the paragraph-question pair are generated. Then, for each subword, relative positions of the subword with respect to the other subwords are determined based on distances, and self-attention information of the subword in a first part and mutual attention information of the subword in a second part are calculated by using the relative positions and the subword vector. Then, a fusion vector of the subword is generated based on the self-attention information and the mutual attention information. Then, the fusion vectors of the subwords are input to a decoder of a machine reading comprehension model so as to obtain an answer predicted by the decoder.
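The relative-position step can be illustrated with a small sketch. The abstract says only that relative positions are determined from distances, so the signed offset and the clipping to a hypothetical `max_distance` are assumptions borrowed from common relative-position attention schemes, not details taken from the patent.

```python
def relative_positions(num_subwords, max_distance=4):
    """Build a matrix whose entry [i][j] is the clipped signed distance from i to j."""

    def clip(distance):
        return max(-max_distance, min(max_distance, distance))

    return [
        [clip(j - i) for j in range(num_subwords)]
        for i in range(num_subwords)
    ]
```

Each entry would typically index a learned embedding that is combined with the subword vector when the self-attention and mutual attention information are computed.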