-
Publication No.: US20230073994A1
Publication date: 2023-03-09
Application No.: US17988107
Application date: 2022-11-16
Inventor: Han LIU, Teng Hu, Shikun Feng, Yongfeng Chen
Abstract: A method for extracting text information includes: acquiring a text to be extracted and a target field name; extracting candidate text information matching the target field name from the text to be extracted based on the text to be extracted and the target field name; and acquiring target text information matching fusion semantics of the text to be extracted, the target field name and the candidate text information by filtering the candidate text information based on the fusion semantics. Therefore, when the candidate text information matching the target field name is extracted from the text to be extracted, the candidate text information is filtered based on the fusion semantics of the text to be extracted, the target field name and the candidate text information, which improves the accuracy of extracting text information.
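Illustration (not part of the patent record): the two-stage flow in this abstract, candidate extraction followed by filtering, can be sketched roughly as below. This is a toy under stated assumptions, not the claimed method. The regex table FIELD_PATTERNS, the word-overlap score standing in for "fusion semantics", the 20-character context window, and the 0.75 threshold are all hypothetical choices made for the example; the abstract does not specify how the fusion semantics are computed.

```python
import re

# Hypothetical lookup: which surface pattern a field name suggests (assumption).
FIELD_PATTERNS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "amount": re.compile(r"\d+\.\d{2}"),
}

def extract_candidates(text, field_name):
    """Stage 1: collect (value, start) spans whose shape fits the field name."""
    for key, pattern in FIELD_PATTERNS.items():
        if key in field_name.lower():
            return [(m.group(), m.start()) for m in pattern.finditer(text)]
    return []

def fusion_score(text, field_name, candidate, window=20):
    """Toy stand-in for fusion semantics: the share of field-name words that
    appear in the text just before the candidate. The candidate contributes
    through its position in the text."""
    _, start = candidate
    context = text[max(0, start - window):start].lower()
    field_words = set(re.findall(r"[a-z]+", field_name.lower()))
    context_words = set(re.findall(r"[a-z]+", context))
    if not field_words:
        return 0.0
    return len(field_words & context_words) / len(field_words)

def extract_field(text, field_name, threshold=0.75):
    """Stage 2: keep only candidates whose score reaches the threshold."""
    candidates = extract_candidates(text, field_name)
    return [value for value, start in candidates
            if fusion_score(text, field_name, (value, start)) >= threshold]

TEXT = "Invoice date: 2022-11-16. Due date: 2022-12-16."
print(extract_field(TEXT, "due date"))      # ['2022-12-16']
print(extract_field(TEXT, "invoice date"))  # ['2022-11-16']
```

Both dates are candidates for either field; only the filtering step, which looks at the text, the field name and the candidate together, separates them.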
-
Publication No.: US20230177359A1
Publication date: 2023-06-08
Application No.: US18063348
Application date: 2022-12-08
Inventor: Sijin WU, Han LIU, Teng HU, Shikun FENG, Yongfeng CHEN
IPC: G06N5/022, G06F40/174, G06F40/205
CPC classification number: G06N5/022, G06F40/174, G06F40/205, G06F40/30
Abstract: The present disclosure provides a method and apparatus for training a document information extraction model and a method and apparatus for extracting document information, and relates to the field of artificial intelligence, and more particularly to the field of natural language processing. A specific implementation solution is: acquiring a document information extraction model and training data labeled with an answer corresponding to a preset question, the training data including layout document training data and streaming document training data; extracting at least one feature from the training data; fusing the at least one feature to obtain a fused feature; inputting the preset question, the fused feature and the training data into the document information extraction model to obtain a predicted result; and adjusting network parameters of the document information extraction model based on the predicted result and the answer.
-
Publication No.: US20230097986A1
Publication date: 2023-03-30
Application No.: US18058640
Application date: 2022-11-23
Inventor: Han LIU, Teng HU, Yongfeng CHEN
IPC: G06F40/279, G06F40/30
Abstract: A data processing method is provided. The method includes: determining fusion information based on a text to be processed and a plurality of reference text fragments; executing the following matching operation for each of the plurality of reference text fragments: determining a first coefficient of each feature vector of the fusion information respectively; determining a second coefficient of each feature vector of the fusion information respectively; determining a result feature vector of the reference text fragment using each feature vector included in the fusion information and a weight of the feature vector; and determining a matching degree of the reference text fragment and the text to be processed based on the result feature vector.
-
Publication No.: US20230005283A1
Publication date: 2023-01-05
Application No.: US17577531
Application date: 2022-01-18
Inventor: Han LIU, Teng HU, Yongfeng CHEN
IPC: G06V30/18, G06V30/19, G06V30/262, G06F40/20
Abstract: The present disclosure provides an information extraction method and apparatus, an electronic device and a readable storage medium, and relates to the field of natural language processing technologies. The information extraction method includes: acquiring a to-be-extracted text; acquiring a sample set, the sample set including a plurality of sample texts and labels of sample characters in the plurality of sample texts; determining a prediction label of each character in the to-be-extracted text according to a semantic feature vector of each character in the to-be-extracted text and a semantic feature vector of each sample character in the sample set; and extracting, according to the prediction label of each character, a character meeting a preset requirement from the to-be-extracted text as an extraction result of the to-be-extracted text. The present disclosure can simplify steps of information extraction, reduce costs of information extraction and improve flexibility and accuracy of information extraction.
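Illustration (not part of the patent record): one way to read the per-character labeling step is as a nearest-neighbor lookup, where each character of the to-be-extracted text takes the label of the most similar sample character. The sketch below assumes exactly that. The embed function is a hypothetical three-dimensional stand-in for a real semantic feature vector, cosine similarity is an assumed similarity measure, and the label names "O" and "NUM" are invented for the example; none of these come from the abstract.

```python
import math

def embed(ch):
    """Hypothetical stand-in for a semantic feature vector (assumption):
    a one-hot over digit / letter / other."""
    return [float(ch.isdigit()), float(ch.isalpha()), float(not ch.isalnum())]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_labels(text, samples):
    """Give each character the label of its most similar sample character."""
    pool = [(embed(c), label)
            for sample_text, labels in samples
            for c, label in zip(sample_text, labels)]
    return [max(pool, key=lambda item: cosine(embed(ch), item[0]))[1]
            for ch in text]

def extract(text, samples, wanted="NUM"):
    """Keep the characters whose predicted label meets the preset requirement."""
    labels = predict_labels(text, samples)
    return "".join(ch for ch, label in zip(text, labels) if label == wanted)

# Sample set: one sample text with a label per character (labels are invented).
SAMPLES = [("Room 12", ["O", "O", "O", "O", "O", "NUM", "NUM"])]

print(extract("Gate 47", SAMPLES))  # 47
```

No model is trained on the new text here; changing the sample set changes what gets extracted, which is the kind of flexibility the abstract refers to.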
-
Publication No.: US20230073550A1
Publication date: 2023-03-09
Application No.: US17988065
Application date: 2022-11-16
Inventor: Han LIU, Teng Hu, Shikun Feng, Yongfeng Chen
IPC: G06F40/30, G06F40/40, G06F40/284
Abstract: A method for extracting text information includes: acquiring a text to be extracted and a target field name; extracting candidate text information matching the target field name from the text to be extracted based on the text to be extracted and the target field name; and acquiring target text information matching fusion semantics of the text to be extracted, the target field name and the candidate text information by filtering the candidate text information based on the fusion semantics. Therefore, when the candidate text information matching the target field name is extracted from the text to be extracted, the candidate text information is filtered based on the fusion semantics of the text to be extracted, the target field name and the candidate text information, which improves the accuracy of extracting text information.
-