-
1.
Publication No.: US20220164536A1
Publication Date: 2022-05-26
Application No.: US17455967
Filing Date: 2021-11-22
Applicant: Yixuan TONG, Yongwei ZHANG, Bin DONG, Shanshan JIANG, Jiashi ZHANG
Inventor: Yixuan TONG, Yongwei ZHANG, Bin DONG, Shanshan JIANG, Jiashi ZHANG
IPC Classification: G06F40/295
Abstract: A method and an apparatus for sequence labeling on an entity text, and a non-transitory computer-readable recording medium are provided. In the method, a start position of an entity text within a target text is determined. Then, a first matrix is generated based on the start position of the entity text. Elements in the first matrix indicate focusable weights of each word with respect to other words in the target text. Then, a named entity recognition model is generated using the first matrix. The named entity recognition model is obtained by training using first training data; the first training data includes word embeddings corresponding to respective texts in a training text set, and the texts are texts whose entity labels have been labeled. Then, the target text is input to the named entity recognition model, and a probability distribution of the entity label is output.
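The abstract does not say how the first matrix is derived from the start positions, so the following is only an illustrative sketch under a stated assumption: each entity start position opens a new segment, and a word may focus (weight 1.0) only on words in its own segment. The function name `build_focus_matrix` and the segment rule are hypothetical and are not taken from the patent.

```python
from bisect import bisect_right


def build_focus_matrix(num_words, entity_starts):
    """Return a num_words x num_words matrix of focusable weights.

    Assumption (not from the patent): every entity start position opens a
    new segment, and word i may focus on word j only when both words fall
    in the same segment.
    """
    starts = sorted(set(entity_starts))
    # Segment index of word i = number of entity starts at or before i.
    segment = [bisect_right(starts, i) for i in range(num_words)]
    return [
        [1.0 if segment[i] == segment[j] else 0.0 for j in range(num_words)]
        for i in range(num_words)
    ]


# Five words, with one entity starting at word index 2.
matrix = build_focus_matrix(5, [2])
for row in matrix:
    print(row)
```

In a self-attention model, a matrix of this kind would typically be applied as a mask or a multiplicative weight on the attention scores; which of the two the patent uses is not stated in the abstract.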
-
2.
Publication No.: US20210303777A1
Publication Date: 2021-09-30
Application No.: US17215068
Filing Date: 2021-03-29
Applicant: Yixuan TONG, Yongwei ZHANG, Bin DONG, Shanshan JIANG, Jiashi ZHANG
Inventor: Yixuan TONG, Yongwei ZHANG, Bin DONG, Shanshan JIANG, Jiashi ZHANG
IPC Classification: G06F40/166, G06K9/62, G06N3/08
Abstract: A method and an apparatus for fusing position information, and a non-transitory computer-readable recording medium are provided. In the method, words of an input sentence are segmented to obtain a first sequence of words in the input sentence, and absolute position information of the words in the first sequence is generated. Then, subwords of the words in the first sequence are segmented to obtain a second sequence including subwords, and position information of the subwords in the second sequence is generated based on the absolute position information of the words in the first sequence to which the respective subwords belong. Then, the position information of the subwords in the second sequence is fused into a self-attention model to perform model training or model prediction.
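The two segmentation steps can be sketched as follows. This is a minimal illustration, assuming whitespace word segmentation, a toy fixed-width subword splitter (`segment_subwords`, hypothetical), and the simplest possible rule that each subword inherits the absolute position of the word it belongs to. The actual segmenters and position rule used in the patent are not given in the abstract.

```python
def segment_subwords(word, size=3):
    """Toy subword splitter (assumption): cut a word into fixed-width pieces."""
    return [word[i:i + size] for i in range(0, len(word), size)]


def subword_positions(sentence):
    """Return (subwords, positions) for a sentence.

    positions[k] is the absolute position, in the word sequence, of the
    word that subword k was split from.
    """
    words = sentence.split()  # first sequence: words
    subwords, positions = [], []
    for word_pos, word in enumerate(words):
        for piece in segment_subwords(word):  # second sequence: subwords
            subwords.append(piece)
            positions.append(word_pos)
    return subwords, positions


subwords, positions = subword_positions("deep learning")
print(subwords)
print(positions)
```

The resulting `positions` list is what would be turned into position encodings and fed to the self-attention model alongside the subword embeddings.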
-
3.
Publication No.: US20210081788A1
Publication Date: 2021-03-18
Application No.: US17015560
Filing Date: 2020-09-09
Applicant: Lei DING, Yixuan TONG, Jiashi ZHANG, Shanshan JIANG, Yongwei ZHANG
Inventor: Lei DING, Yixuan TONG, Jiashi ZHANG, Shanshan JIANG, Yongwei ZHANG
Abstract: A method and an apparatus for generating sample data, and a non-transitory computer-readable recording medium are provided. In the method, at least two weak supervision recommendation models of a recommendation system are generated; a dependency relation between the at least two weak supervision recommendation models is learned by training a neural network model; and the sample data is re-labelled using the trained neural network model to obtain updated sample data.
-
4.
Publication No.: US20210390454A1
Publication Date: 2021-12-16
Application No.: US17343955
Filing Date: 2021-06-10
Applicant: Tianxiong XIAO, Yixuan TONG, Bin DONG, Shanshan JIANG, Jiashi ZHANG
Inventor: Tianxiong XIAO, Yixuan TONG, Bin DONG, Shanshan JIANG, Jiashi ZHANG
Abstract: Disclosed is an apparatus for training a machine reading comprehension model. The apparatus includes a distance calculation part configured to calculate, based on a position of each word within a training text and a position of an answer label within the training text, a distance between that word and the answer label; a label smoothing part configured to input the distance between that word and the answer label into a smooth function to obtain a probability value corresponding to that word, outputted from the smooth function; and a model training part configured to make the probability value corresponding to that word serve as a smoothed label of that word so as to train the machine reading comprehension model.
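The distance calculation and label smoothing parts can be sketched in a few lines. The abstract does not name the smooth function, so the Gaussian kernel below, the `sigma` parameter, and the function name `smoothed_labels` are assumptions chosen only for illustration.

```python
import math


def smoothed_labels(num_words, answer_pos, sigma=1.0):
    """Return one smoothed label per word position.

    Distance is the absolute offset between a word and the answer label.
    The smooth function is assumed to be a Gaussian kernel, so the answer
    position itself gets 1.0 and the value decays with distance.
    """
    labels = []
    for word_pos in range(num_words):
        distance = abs(word_pos - answer_pos)
        labels.append(math.exp(-(distance ** 2) / (2 * sigma ** 2)))
    return labels


# Five-word text whose answer label sits at word index 2.
labels = smoothed_labels(5, answer_pos=2)
print([round(value, 3) for value in labels])
```

Compared with a one-hot label, words adjacent to the answer position receive a non-zero target, so near-miss predictions are penalized less during training than distant ones.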
-
5.
Publication No.: US20190251164A1
Publication Date: 2019-08-15
Application No.: US16242365
Filing Date: 2019-01-08
Applicant: Lei DING, Yixuan TONG, Bin DONG, Shanshan JIANG, Yongwei ZHANG
Inventor: Lei DING, Yixuan TONG, Bin DONG, Shanshan JIANG, Yongwei ZHANG
CPC Classification: G06F17/2765, G06F16/313, G06F16/353, G06K9/6256
Abstract: A method, an apparatus and an electronic device for performing entity linking, and a non-transitory computer-readable recording medium are provided. The method includes constructing training data including a plurality of sets of labeled data using an existing unambiguous entity database where unambiguous entities corresponding to respective entity words are stored, each set of the labeled data including a text having an entity word and an unambiguous entity linked with the entity word; training an unambiguous entity recognition model whose output is a matching probability between an entity word in a text and an unambiguous entity using the training data; and inputting a text having an entity word to be recognized into the unambiguous entity recognition model, and determining an unambiguous entity linked with the entity word to be recognized based on an output result of the unambiguous entity recognition model.
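The first step, constructing labeled data from an unambiguous entity database, might look like the sketch below. The database layout (a dictionary keyed by a hypothetical entity identifier, each entry holding an entity word and example texts) and the function name `build_training_data` are assumptions; the abstract does not describe the storage format.

```python
def build_training_data(entity_db):
    """Build (text, entity_word, entity_id) triples from an entity database.

    Assumed layout: entity_db maps an unambiguous entity id to a dict with
    the surface "word" and a list of "texts" associated with that entity.
    Texts that do not actually contain the entity word are skipped.
    """
    samples = []
    for entity_id, record in entity_db.items():
        for text in record["texts"]:
            if record["word"] in text:
                samples.append((text, record["word"], entity_id))
    return samples


# Hypothetical database: one ambiguous word, two unambiguous entities.
entity_db = {
    "Apple_(company)": {"word": "Apple", "texts": ["Apple released a phone."]},
    "Apple_(fruit)": {"word": "Apple", "texts": ["Apple pie is sweet.", "No mention here."]},
}
samples = build_training_data(entity_db)
for sample in samples:
    print(sample)
```

Each triple pairs a text containing the entity word with the unambiguous entity it links to, which is the form of labeled data the abstract describes as input for training the recognition model.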
-
6.
Publication No.: US20170337182A1
Publication Date: 2017-11-23
Application No.: US15597501
Filing Date: 2017-05-17
Applicant: Shanshan JIANG, Bin DONG, Jichuan ZHENG, Jiashi ZHANG, Yixuan TONG
Inventor: Shanshan JIANG, Bin DONG, Jichuan ZHENG, Jiashi ZHANG, Yixuan TONG
IPC Classification: G06F17/27
CPC Classification: G06F17/2785, G06F17/2765, G06F17/2818
Abstract: A method, an apparatus and a system for recognizing an evaluation element are provided. The method includes receiving an input text; performing, using a first conditional random field model, first recognition for the input text to obtain a first recognition result, the first recognition result including a pre-evaluation element that is recognized by using the first conditional random field model; performing, using a second conditional random field model, second recognition for the input text to obtain a second recognition result, the second recognition result including a false positive evaluation element that is recognized by using the second conditional random field model, the false positive evaluation element being an element erroneously detected as an evaluation element; and recognizing, based on the first recognition result and the second recognition result, an evaluation element in the input text.
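The final combination step can be sketched without the two conditional random field models themselves. Here both recognition results are assumed to be lists of `(start, end)` token spans, and the combination rule is assumed to be a simple filter that drops any pre-evaluation element the second model flags as a false positive. The abstract only says the final result is based on both results, so this rule and the name `combine_results` are illustrative.

```python
def combine_results(pre_elements, false_positives):
    """Keep pre-evaluation elements that were not flagged as false positives.

    Both arguments are lists of (start, end) token spans. The order of the
    first recognition result is preserved.
    """
    flagged = set(false_positives)
    return [span for span in pre_elements if span not in flagged]


# Hypothetical outputs of the first and second recognition passes.
first_result = [(0, 2), (5, 7), (9, 10)]
second_result = [(5, 7)]
evaluation_elements = combine_results(first_result, second_result)
print(evaluation_elements)
```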
-
7.
Publication No.: US20210027178A1
Publication Date: 2021-01-28
Application No.: US16934112
Filing Date: 2020-07-21
Applicant: Lei DING, Yixuan TONG, Bin DONG, Shanshan JIANG, Yongwei ZHANG
Inventor: Lei DING, Yixuan TONG, Bin DONG, Shanshan JIANG, Yongwei ZHANG
IPC Classification: G06N5/04, G06F16/9535, G06K9/62, G06N3/04, G06F40/289
Abstract: A recommendation method and a recommendation apparatus based on deep reinforcement learning, and a non-transitory computer-readable recording medium are provided. In the method, entity semantic information representation vectors of products are generated based on a product knowledge graph; browsing context information representation vectors of the products are generated based on historical browsing behavior of a user with respect to products; the entity semantic information representation vectors and the browsing context information representation vectors of the respective products are merged to obtain vectors of the products; a recommendation model based on deep reinforcement learning is constructed, and the recommendation model is offline-trained using historical behavior data of the user to obtain the offline-trained recommendation model, where the products in the historical behavior data of the user are represented by the vectors of the products; and products are online-recommended using the offline-trained recommendation model.
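The merging step can be illustrated as below. The abstract does not specify the merge operation, so concatenation is assumed here; the function name `merge_product_vectors` and the toy vectors are hypothetical.

```python
def merge_product_vectors(semantic_vecs, context_vecs):
    """Merge two per-product vector tables by concatenation (assumption).

    Only products that have both an entity semantic vector and a browsing
    context vector receive a merged vector.
    """
    merged = {}
    for product_id in semantic_vecs.keys() & context_vecs.keys():
        merged[product_id] = list(semantic_vecs[product_id]) + list(context_vecs[product_id])
    return merged


# Toy vectors: "p2" has no browsing context, so it is left out.
semantic_vecs = {"p1": [0.1, 0.2], "p2": [0.3, 0.4]}
context_vecs = {"p1": [0.9]}
product_vecs = merge_product_vectors(semantic_vecs, context_vecs)
print(product_vecs)
```

The merged vectors would then stand in for the products when the user's historical behavior data is replayed to train the reinforcement learning model offline.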
-
8.
Publication No.: US20200311353A1
Publication Date: 2020-10-01
Application No.: US16809844
Filing Date: 2020-03-05
Applicant: Yihan LI, Boyan LIU, Shanshan JIANG, Yixuan TONG, Bin DONG
Inventor: Yihan LI, Boyan LIU, Shanshan JIANG, Yixuan TONG, Bin DONG
Abstract: A method and an apparatus for processing word vectors of a neural machine translation model, and a non-transitory computer-readable recording medium are provided. In the method, word vectors that are input to an encoder and a decoder of a neural machine translation model are updated using semantic information among head representations at the same time and semantic information among head representations at different times, and the model is trained or translation is performed using the updated word vectors, thereby improving the model performance of the neural machine translation model.
-
9.
Publication No.: US20190198014A1
Publication Date: 2019-06-27
Application No.: US16218693
Filing Date: 2018-12-13
Applicant: Yihan LI, Yixuan TONG, Shanshan JIANG, Bin DONG
Inventor: Yihan LI, Yixuan TONG, Shanshan JIANG, Bin DONG
CPC Classification: G10L15/1815, G06F16/3347, G10L15/063, G10L15/14
Abstract: A method and an apparatus for ranking responses of a dialog model, and a non-transitory computer-readable recording medium are provided. The dialog model is trained based on a sample data set. The method includes obtaining, from the sample data set, at least one similar dialog whose content is semantically similar to content of a target dialog; obtaining a probability of at least one target response generated by the dialog model when inputting the target dialog, and obtaining a probability of a target response generated by the dialog model when inputting the similar dialog; statistically analyzing, based on the probabilities of the respective generated target responses, scores of the target responses, the scores of the target responses being positively correlated with the probabilities of the target responses; and ranking the target responses in a descending order of the scores.
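The scoring and ranking steps can be sketched as follows, with the dialog model replaced by precomputed probabilities. The score is assumed to be the plain sum of a response's probability under the target dialog and under each similar dialog, which satisfies the stated positive correlation but is not necessarily the statistic the patent uses. The name `rank_responses` and the sample values are hypothetical.

```python
def rank_responses(target_probs, similar_probs_list):
    """Rank target responses by summed generation probability.

    target_probs maps each target response to its probability given the
    target dialog. similar_probs_list holds one such mapping per similar
    dialog; a response missing from a mapping contributes 0.0.
    """
    scores = dict(target_probs)
    for probs in similar_probs_list:
        for response in scores:
            scores[response] += probs.get(response, 0.0)
    # Descending order of score, as described in the abstract.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical probabilities for two candidate responses.
target_probs = {"Sure, here it is.": 0.5, "Could you clarify?": 0.4}
similar_probs_list = [{"Sure, here it is.": 0.1, "Could you clarify?": 0.5}]
ranked = rank_responses(target_probs, similar_probs_list)
for response, score in ranked:
    print(response, round(score, 2))
```

In this toy case the response that is slightly less likely for the target dialog alone ends up ranked first, because it is much more likely under the similar dialog.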
-
10.
Publication No.: US20200242302A1
Publication Date: 2020-07-30
Application No.: US16750182
Filing Date: 2020-01-23
Applicant: Liang LIANG, Lei DING, Bin DONG, Shanshan JIANG, Yixuan TONG
Inventor: Liang LIANG, Lei DING, Bin DONG, Shanshan JIANG, Yixuan TONG
IPC Classification: G06F40/216, G06F40/30, G06F40/211, G06F40/56, G06F16/242
Abstract: An intention identification method includes generating a heterogeneous text network based on a language material sample; using a graph embedding algorithm to perform learning with respect to the heterogeneous text network and obtain a vector representation of the language material sample and a word, and determining keywords of the language material sample based on a similarity in terms of a vector between the language material sample and the word in the language material sample; training an intention identification model until a predetermined training termination condition is satisfied, by using the keywords of the language material samples, and obtaining the trained intention identification model; and receiving a language material query, and using the trained intention identification model to identify an intention of the language material query.
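The keyword determination step can be sketched once the vectors are available. In the sketch below the graph embedding output is replaced by hand-written toy vectors, the similarity is assumed to be cosine similarity, and keywords are assumed to be the top `k` most similar words. The abstract fixes neither the similarity measure nor the selection rule, and the names `cosine` and `top_keywords` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_keywords(sample_vec, word_vecs, k=2):
    """Return the k words whose vectors are most similar to the sample vector."""
    ranked = sorted(
        word_vecs,
        key=lambda word: cosine(sample_vec, word_vecs[word]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings standing in for the graph embedding output.
sample_vec = [1.0, 0.0]
word_vecs = {"refund": [1.0, 0.1], "the": [0.0, 1.0], "order": [0.8, 0.3]}
keywords = top_keywords(sample_vec, word_vecs, k=2)
print(keywords)
```

The selected keywords, rather than the full text of each sample, are then what the abstract describes as the training input for the intention identification model.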