-
1.
Publication No.: US20230290130A1
Publication date: 2023-09-14
Application No.: US18165459
Filing date: 2023-02-07
Applicant: Lei DING, Bin Dong, Shanshan Jiang, Jiashi Zhang, Yongwei Zhang
Inventor: Lei DING, Bin Dong, Shanshan Jiang, Jiashi Zhang, Yongwei Zhang
IPC classification: G06V10/80, G06V30/414, G06V30/413, G06V10/82, G06V10/77, G06V30/18
CPC classification: G06V10/806, G06V30/414, G06V30/413, G06V10/82, G06V10/7715, G06V30/18162
Abstract: Disclosed are a table recognition method and apparatus. The table recognition method includes the steps of obtaining an image vision feature and a character content feature of a table image; fusing the image vision feature and the character content feature of the table image to acquire a first fusion feature, and carrying out recognition based on the first fusion feature to acquire a table structure; and performing, based on the table structure, character recognition on the table image to acquire table character contents.
-
2.
Publication No.: US20230394240A1
Publication date: 2023-12-07
Application No.: US18326292
Filing date: 2023-05-31
Applicant: Yongwei ZHANG, Bin Dong, Shanshan Jiang, Lei Ding, Jiashi Zhang
Inventor: Yongwei ZHANG, Bin Dong, Shanshan Jiang, Lei Ding, Jiashi Zhang
IPC classification: G06F40/295, G06F40/40
CPC classification: G06F40/295, G06F40/40
Abstract: A method and an apparatus for named entity recognition, and a non-transitory computer-readable recording medium are provided. In the method, text elements are traversed according to a text span to obtain candidate entity words. Then, a class to which each candidate entity word belongs is recognized. The recognizing of the class includes generating a prompt template corresponding to the candidate entity word, and concatenating the text to be recognized and the prompt template to obtain a concatenated text; generating vector representations of the text elements in the concatenated text; generating the vector representation of the candidate entity word according to the vector representations of the text elements of each candidate entity word in the concatenated text, and the vector representation of the text element of the mask word; and classifying the vector representation of the candidate entity word to obtain the class of the candidate entity word.
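The span traversal and prompt concatenation described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the `[SEP]` separator, the `[MASK]` token, and the template wording are all assumptions made for illustration, and the encoding and classification steps are omitted.

```python
from typing import List, Tuple

MASK = "[MASK]"  # hypothetical mask token; real models define their own


def candidate_spans(tokens: List[str], max_span: int) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for every span of 1..max_span tokens."""
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_span, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


def build_prompt(tokens: List[str], span: Tuple[int, int]) -> str:
    """Concatenate the text to be recognized with a prompt template for one candidate."""
    text = " ".join(tokens)
    candidate = " ".join(tokens[span[0]:span[1]])
    # Hypothetical template wording; the abstract does not specify it.
    return f"{text} [SEP] {candidate} is a {MASK} entity."


tokens = ["Alice", "visited", "Tokyo"]
for span in candidate_spans(tokens, max_span=2):
    print(build_prompt(tokens, span))
```

In a full system, each concatenated text would be encoded, and the vectors of the candidate's tokens and of the mask token would be combined and classified; those steps depend on the model and are left out here.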
-
3.
Publication No.: US11562123B2
Publication date: 2023-01-24
Application No.: US17215068
Filing date: 2021-03-29
Applicant: Yixuan Tong, Yongwei Zhang, Bin Dong, Shanshan Jiang, Jiashi Zhang
Inventor: Yixuan Tong, Yongwei Zhang, Bin Dong, Shanshan Jiang, Jiashi Zhang
IPC classification: G06F40/166, G06K9/62, G06N3/08
Abstract: A method and an apparatus for fusing position information, and a non-transitory computer-readable recording medium are provided. In the method, words of an input sentence are segmented to obtain a first sequence of words in the input sentence, and absolute position information of the words in the first sequence is generated. Then, subwords of the words in the first sequence are segmented to obtain a second sequence including subwords, and position information of the subwords in the second sequence is generated, based on the absolute position information of the words in the first sequence to which the respective subwords belong. Then, the position information of the subwords in the second sequence is fused into a self-attention model to perform model training or model prediction.
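A minimal Python sketch of the first two steps in this abstract follows: word-level absolute positions are computed, then each subword takes the position of the word it belongs to. Direct inheritance is the simplest reading of "based on" and is an assumption here, as are the function names and the toy fixed-size chunker that stands in for a real subword tokenizer. Fusing the positions into a self-attention model is not shown.

```python
from typing import List, Tuple


def split_subwords(word: str, size: int = 3) -> List[str]:
    """Toy stand-in for a subword tokenizer: fixed-size character chunks."""
    return [word[i:i + size] for i in range(0, len(word), size)]


def subword_positions(sentence: str) -> List[Tuple[str, int]]:
    """Pair every subword with the absolute position of its parent word."""
    words = sentence.split()  # first sequence: words
    result = []
    for word_pos, word in enumerate(words):  # absolute word position
        for piece in split_subwords(word):  # second sequence: subwords
            result.append((piece, word_pos))
    return result


print(subword_positions("position fusion"))
```

All subwords of one word share a position index in this sketch, so the attention layer would see word-level distances rather than subword-level ones.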
-
4.
Publication No.: US11907661B2
Publication date: 2024-02-20
Application No.: US17455967
Filing date: 2021-11-22
Applicant: Yixuan Tong, Yongwei Zhang, Bin Dong, Shanshan Jiang, Jiashi Zhang
Inventor: Yixuan Tong, Yongwei Zhang, Bin Dong, Shanshan Jiang, Jiashi Zhang
IPC classification: G06F40/279, G06F40/295
CPC classification: G06F40/295
Abstract: A method and an apparatus for sequence labeling on an entity text, and a non-transitory computer-readable recording medium are provided. In the method, a start position of an entity text within a target text is determined. Then, a first matrix is generated based on the start position of the entity text. Elements in the first matrix indicate focusable weights of each word with respect to other words in the target text. Then, a named entity recognition model is generated using the first matrix. The named entity recognition model is obtained by training using first training data, the first training data includes word embeddings corresponding to respective texts in a training text set, and the texts are texts whose entity labels have been labeled. Then, the target text is input to the named entity recognition model, and a probability distribution of the entity label is output.
-
5.
Publication No.: US11270212B2
Publication date: 2022-03-08
Application No.: US15919355
Filing date: 2018-03-13
Applicant: Lei Ding, Yixuan Tong, Bin Dong, Shanshan Jiang, Yongwei Zhang
Inventor: Lei Ding, Yixuan Tong, Bin Dong, Shanshan Jiang, Yongwei Zhang
IPC classification: G06N5/02, G06N3/04, G06N3/08, G06F16/28, G06F16/31, G06F16/951, G06F16/22, G06F16/901, G06F40/279, G06F40/295
Abstract: A knowledge graph processing method and device are disclosed. The method includes the steps of obtaining an entity set containing a first entity, a second entity, and relation information; acquiring text information and image information related to the first entity and the second entity; generating a first structural information vector of the first entity and a second structural information vector of the second entity, and creating a first text information vector of the first entity, a first image information vector of the first entity, a second text information vector of the second entity, and a second image information vector of the second entity; and building a joint loss function so as to attain a first target vector of the first entity, a second target vector of the second entity, and a target relation vector of the relation information between the first entity and the second entity.
-
6.
Publication No.: US10909319B2
Publication date: 2021-02-02
Application No.: US16242365
Filing date: 2019-01-08
Applicant: Lei Ding, Yixuan Tong, Bin Dong, Shanshan Jiang, Yongwei Zhang
Inventor: Lei Ding, Yixuan Tong, Bin Dong, Shanshan Jiang, Yongwei Zhang
Abstract: A method, an apparatus and an electronic device for performing entity linking, and a non-transitory computer-readable recording medium are provided. The method includes constructing training data including a plurality of sets of labeled data using an existing unambiguous entity database where unambiguous entities corresponding to respective entity words are stored, each set of the labeled data including a text having an entity word and an unambiguous entity linked with the entity word; training, using the training data, an unambiguous entity recognition model whose output is a matching probability between an entity word in a text and an unambiguous entity; and inputting a text having an entity word to be recognized into the unambiguous entity recognition model, and determining an unambiguous entity linked with the entity word to be recognized based on an output result of the unambiguous entity recognition model.
-
7.
Publication No.: US20180341863A1
Publication date: 2018-11-29
Application No.: US15919355
Filing date: 2018-03-13
Applicant: Lei Ding, Yixuan Tong, Bin Dong, Shanshan Jiang, Yongwei Zhang
Inventor: Lei Ding, Yixuan Tong, Bin Dong, Shanshan Jiang, Yongwei Zhang
Abstract: A knowledge graph processing method and device are disclosed. The method includes the steps of obtaining an entity set containing a first entity, a second entity, and relation information; acquiring text information and image information related to the first entity and the second entity; generating a first structural information vector of the first entity and a second structural information vector of the second entity, and creating a first text information vector of the first entity, a first image information vector of the first entity, a second text information vector of the second entity, and a second image information vector of the second entity; and building a joint loss function so as to attain a first target vector of the first entity, a second target vector of the second entity, and a target relation vector of the relation information between the first entity and the second entity.
-
8.
Publication No.: US10282420B2
Publication date: 2019-05-07
Application No.: US15597501
Filing date: 2017-05-17
Applicant: Shanshan Jiang, Bin Dong, Jichuan Zheng, Jiashi Zhang, Yixuan Tong
Inventor: Shanshan Jiang, Bin Dong, Jichuan Zheng, Jiashi Zhang, Yixuan Tong
Abstract: A method, an apparatus and a system for recognizing an evaluation element are provided. The method includes receiving an input text; performing, using a first conditional random field model, first recognition for the input text to obtain a first recognition result, the first recognition result including a pre-evaluation element that is recognized by using the first conditional random field model; performing, using a second conditional random field model, second recognition for the input text to obtain a second recognition result, the second recognition result including a false positive evaluation element that is recognized by using the second conditional random field model, the false positive evaluation element being an element erroneously detected as an evaluation element; and recognizing, based on the first recognition result and the second recognition result, an evaluation element in the input text.
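One plausible way to combine the two recognition results in this abstract is to keep the pre-evaluation elements that the second model does not flag as false positives. The abstract does not state the combination rule, so the filtering below is an assumption, as are the `Span` representation and the function name; the two conditional random field models themselves are not implemented.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # hypothetical (start, end) token offsets of an element


def combine_results(pre_elements: List[Span], false_positives: List[Span]) -> List[Span]:
    """Keep first-model candidates that the second model did not flag as false positives."""
    rejected = set(false_positives)
    return [span for span in pre_elements if span not in rejected]


# Illustrative outputs standing in for the two CRF models.
first_result = [(0, 2), (5, 6), (8, 10)]
second_result = [(5, 6)]
print(combine_results(first_result, second_result))
```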
-
9.
Publication No.: US20240338523A1
Publication date: 2024-10-10
Application No.: US18623332
Filing date: 2024-04-01
Applicant: Yuming ZHANG, Bin Dong, Shanshan Jiang, Yongwei Zhang
Inventor: Yuming ZHANG, Bin Dong, Shanshan Jiang, Yongwei Zhang
IPC classification: G06F40/295, G06F40/284, G06N20/00
CPC classification: G06F40/295, G06F40/284, G06N20/00
Abstract: A method and an apparatus are provided for training a named entity recognition (NER) model. Tag annotations are constructed for the tags, and each tag annotation contains information indicating the position, within a named entity, of the tokens corresponding to that tag. In the process of training the NER model, this lets the model better understand the different positions of different tokens in the same named entity, so that the trained NER model can recognize named entities more accurately.
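As a minimal sketch of what a position-aware tag annotation could look like, the Python below turns BIO-style tags into short descriptions that name both the entity type and the token's position in the entity. The BIO scheme, the wording of the annotations, and the names `POSITION_WORDS` and `tag_annotation` are assumptions for illustration; the abstract does not specify the tag set or the annotation text.

```python
from typing import Dict

# Hypothetical position wording for BIO-style tag prefixes.
POSITION_WORDS = {"B": "beginning", "I": "inside"}


def tag_annotation(tag: str, type_names: Dict[str, str]) -> str:
    """Build an annotation stating the entity type and the token's position in it."""
    if tag == "O":
        return "token outside any named entity"
    prefix, _, entity_type = tag.partition("-")
    return f"{POSITION_WORDS[prefix]} token of a {type_names[entity_type]} entity"


type_names = {"PER": "person", "LOC": "location"}
for tag in ["B-PER", "I-PER", "B-LOC", "O"]:
    print(tag, "->", tag_annotation(tag, type_names))
```

During training, such annotations could be encoded alongside the input so that tags for different positions in the same entity get distinct representations.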
-
10.
Publication No.: US20200242486A1
Publication date: 2020-07-30
Application No.: US16739311
Filing date: 2020-01-10
Applicant: Liang LIANG, Lei Ding, Bin Dong, Shanshan Jiang, Yixuan Tong
Inventor: Liang LIANG, Lei Ding, Bin Dong, Shanshan Jiang, Yixuan Tong
IPC classification: G06N5/02, G06K9/62, G06N20/10, G06F16/9032
Abstract: A method and an apparatus for recognizing an intention, and a non-transitory computer-readable recording medium are provided. The method includes learning vectors of knowledge base elements in corpus samples, and converting the corpus samples into row vectors composed of the vectors of the knowledge base elements in a knowledge base; extracting feature vectors from respective pooling windows in the corpus samples by hierarchical pooling, determining weights positively correlated with similarities between texts within the respective pooling windows and the respective corpus samples, performing weighting on the extracted feature vectors to obtain feature vectors of the respective pooling windows, and obtaining feature vectors of the respective corpus samples composed of the feature vectors of the pooling windows; training a vector-based intention recognition classifier, based on the feature vectors of the corpus samples; and recognizing an intention in querying a corpus, using the trained intention recognition classifier.
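The weighted window pooling step in this abstract can be sketched in plain Python. The sketch assumes mean pooling inside each sliding window and uses cosine similarity between the window vector and the whole-sample mean vector as the weight; the abstract only requires weights positively correlated with similarity, so both choices, along with the function names, are assumptions. It also assumes the window size does not exceed the number of rows.

```python
import math
from typing import List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def weighted_window_features(rows: List[Vector], window: int) -> List[Vector]:
    """Mean-pool each sliding window, then scale it by its similarity to the sample."""
    sample_vec = mean(rows)
    features = []
    for start in range(len(rows) - window + 1):
        window_vec = mean(rows[start:start + window])
        weight = cosine(window_vec, sample_vec)  # assumed similarity measure
        features.append([weight * x for x in window_vec])
    return features


rows = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # toy element vectors of one sample
print(weighted_window_features(rows, window=2))
```

The resulting window vectors would then be combined into one feature vector per corpus sample and passed to the classifier, which is outside the scope of this sketch.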