-
Publication No.: US20230177821A1
Publication date: 2023-06-08
Application No.: US18063564
Filing date: 2022-12-08
Inventor: Qiming PENG, Bin LUO, Yuhui CAO, Shikun FENG, Yongfeng CHEN
CPC classification number: G06V10/82, G06V30/19147, G06V30/1444
Abstract: A neural network training method and a document image understanding method are provided. The neural network training method includes: acquiring text comprehensive features of a plurality of first texts in an original image; replacing at least one original region in the original image to obtain a sample image including a plurality of first regions and a ground truth label indicating whether each first region is a replaced region; acquiring image comprehensive features of the plurality of first regions; inputting the text comprehensive features of the plurality of first texts and the image comprehensive features of the plurality of first regions into a neural network model together to obtain text representation features of the plurality of first texts; determining a predicted label based on the text representation features of the plurality of first texts; and training the neural network model based on the ground truth label and the predicted label.
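The region-replacement and labeling steps in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: regions are stood in for by strings, the neural network model is omitted, and the names `make_replaced_sample`, `binary_cross_entropy`, `replacement_pool`, and `replace_prob` are hypothetical, as is the choice of binary cross-entropy as the training loss.

```python
import math
import random
from typing import List, Sequence, Tuple


def make_replaced_sample(
    regions: Sequence[str],
    replacement_pool: Sequence[str],
    replace_prob: float = 0.3,
    seed: int = 0,
) -> Tuple[List[str], List[int]]:
    """Replace some regions; return the sample regions and 0/1 ground truth labels.

    A label of 1 marks a replaced region. The replacement probability and the
    pool of replacement regions are assumptions made for illustration.
    """
    if not regions or not replacement_pool:
        raise ValueError("regions and replacement_pool must be non-empty")
    rng = random.Random(seed)
    sample: List[str] = []
    labels: List[int] = []
    for region in regions:
        if rng.random() < replace_prob:
            sample.append(rng.choice(replacement_pool))
            labels.append(1)
        else:
            sample.append(region)
            labels.append(0)
    # The abstract requires at least one replaced region, so force one if needed.
    if 1 not in labels:
        idx = rng.randrange(len(sample))
        sample[idx] = rng.choice(replacement_pool)
        labels[idx] = 1
    return sample, labels


def binary_cross_entropy(predicted: Sequence[float], labels: Sequence[int]) -> float:
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    eps = 1e-12  # clamp to avoid log(0)
    total = 0.0
    for p, y in zip(predicted, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)
```

In a real pipeline the predicted probabilities would come from the model's text representation features; here they would be supplied by the caller.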
-
2.
Publication No.: US20220382991A1
Publication date: 2022-12-01
Application No.: US17883908
Filing date: 2022-08-09
Inventor: Qiming PENG, Bin LUO, Yuhui CAO, Shikun FENG, Yongfeng CHEN
IPC: G06F40/30, G06V30/414, G06V30/14
Abstract: The present disclosure provides a training method and apparatus for a document processing model, a device, a storage medium and a program, which relate to the field of artificial intelligence, and in particular to technologies such as deep learning, natural language processing and text recognition. The specific implementation includes: acquiring a first sample document; determining, according to the first sample document, element features of a plurality of document elements in the first sample document and positions corresponding to M position types of each document element, where each document element corresponds to a character or a document area in the first sample document; and training a basic model according to the element features of the plurality of document elements and the positions corresponding to the M position types of each document element to obtain the document processing model.
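The idea of giving each document element a position for each of M position types can be sketched as follows. The abstract does not say which position types are used; the three below (reading-order index, horizontal coordinate, vertical coordinate, so M = 3) are assumptions for illustration, and `DocumentElement`, `POSITION_TYPES`, and `positions_by_type` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DocumentElement:
    """A character or document area, with an assumed top-left coordinate."""
    content: str
    x: int
    y: int


# Assumed position types (M = 3); the patent abstract does not enumerate them.
POSITION_TYPES = ("sequence", "horizontal", "vertical")


def positions_by_type(elements: List[DocumentElement]) -> List[Dict[str, int]]:
    """Return, for each element, one position value per position type."""
    return [
        {"sequence": index, "horizontal": element.x, "vertical": element.y}
        for index, element in enumerate(elements)
    ]


elements = [DocumentElement("A", 10, 20), DocumentElement("B", 30, 20)]
positions = positions_by_type(elements)
```

Each dictionary could then be paired with the element's features as input to the basic model being trained.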
-