-
1.
Publication No.: US20240021000A1
Publication date: 2024-01-18
Application No.: US18113178
Application date: 2023-02-23
Inventor: Xiameng QIN, Yulin LI, Xiaoqiang ZHANG, Ju HUANG, Qunyi XIE, Kun YAO
IPC: G06V30/19, G06V30/148
CPC classification number: G06V30/1918, G06V30/15, G06V30/19127, G06V30/19147
Abstract: There is provided an image-based information extraction model, method, and apparatus, a device, and a storage medium, which relates to the field of artificial intelligence (AI) technologies, specifically to fields of deep learning, image processing, computer vision technologies, and is applicable to optical character recognition (OCR) and other scenarios. A specific implementation solution involves: acquiring a to-be-extracted first image and a category of to-be-extracted information; and inputting the first image and the category into a pre-trained information extraction model to perform information extraction on the first image to obtain text information corresponding to the category.
-
Publication No.: US20230048495A1
Publication date: 2023-02-16
Application No.: US17974183
Application date: 2022-10-26
Inventor: Qunyi XIE, Xiameng QIN, Mengyi EN, Dongdong ZHANG, Ju HUANG, Yangliu XU, Yi CHEN, Kun YAO
IPC: G06V30/413, G06V10/764, G06V10/24, G06V10/75, G06V30/414
Abstract: A method and a platform of generating a document, an electronic device, and a storage medium are provided, which relate to a field of an artificial intelligence technology, in particular to fields of computer vision and deep learning technologies, and may be applied to a text recognition scenario and other scenarios. The method includes: performing a category recognition on a document picture to obtain a target category result; determining a target structured model matched with the target category result; and performing, by using the target structured model, a structure recognition on the document picture to obtain a structure recognition result, so as to generate an electronic document based on the structure recognition result, wherein the structure recognition result includes a field attribute recognition result and a field position recognition result.
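The three steps named in this abstract (category recognition, selection of a matching structured model, structure recognition) can be pictured as a simple dispatch. The following is a minimal sketch under stated assumptions, not the patented implementation: the names `Field`, `recognize_category`, `STRUCTURED_MODELS`, and `generate_document`, the keyword-based classifier, the dictionary used as a document picture, and the hard-coded field positions are all hypothetical stand-ins for trained models and real image data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumed position format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Field:
    attribute: str  # field attribute recognition result, e.g. "invoice_no"
    position: Box   # field position recognition result


def recognize_category(picture: dict) -> str:
    """Hypothetical category recognizer; a real system would run a trained classifier."""
    return "invoice" if "invoice" in picture["hint"].lower() else "receipt"


def invoice_model(picture: dict) -> List[Field]:
    """Placeholder structured model for invoices (returns fixed fields)."""
    return [Field("invoice_no", (10, 10, 200, 40)), Field("total", (10, 300, 200, 330))]


def receipt_model(picture: dict) -> List[Field]:
    """Placeholder structured model for receipts (returns fixed fields)."""
    return [Field("merchant", (5, 5, 150, 30))]


# Registry that maps a category result to its matching structured model.
STRUCTURED_MODELS: Dict[str, Callable[[dict], List[Field]]] = {
    "invoice": invoice_model,
    "receipt": receipt_model,
}


def generate_document(picture: dict) -> dict:
    category = recognize_category(picture)   # step 1: category recognition
    model = STRUCTURED_MODELS[category]      # step 2: pick the matching model
    fields = model(picture)                  # step 3: structure recognition
    return {"category": category, "fields": [(f.attribute, f.position) for f in fields]}
```

Keeping the structured models in a registry keyed by category means a new document type could be supported by registering one more function, without touching the dispatch logic.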
-
Publication No.: US20220392242A1
Publication date: 2022-12-08
Application No.: US17819838
Application date: 2022-08-15
Abstract: A method for training a text positioning model includes: obtaining a sample image, where the sample image contains a sample text to be positioned and a text marking box for the sample text; inputting the sample image into a text positioning model to be trained to position the sample text, and outputting a prediction text box for the sample image; obtaining a sample prior anchor box corresponding to the sample image; and adjusting model parameters of the text positioning model based on the sample prior anchor box, the text marking box and the prediction text box, and continuing training the adjusted text positioning model based on a next sample image until model training is completed, to generate a target text positioning model.
-
Publication No.: US20220148324A1
Publication date: 2022-05-12
Application No.: US17581047
Application date: 2022-01-21
Inventor: Xiameng QIN, Yulin LI, Ju HUANG, Qunyi XIE, Chengquan ZHANG, Kun YAO, Jingtuo LIU, Junyu HAN
IPC: G06V30/18, G06V30/24, G06V30/148, G06V30/19
Abstract: Provided are a method and apparatus for extracting information about a negotiable instrument, an electronic device and a storage medium. The method includes inputting a to-be-recognized negotiable instrument into a pretrained deep learning network and obtaining a visual image corresponding to the to-be-recognized negotiable instrument through the deep learning network; matching the visual image corresponding to the to-be-recognized negotiable instrument with a visual image corresponding to each negotiable-instrument template in a preconstructed base template library; and in response to the visual image corresponding to the to-be-recognized negotiable instrument successfully matching a visual image corresponding to one negotiable-instrument template in the base template library, extracting structured information of the to-be-recognized negotiable instrument by using the negotiable-instrument template.
-
Publication No.: US20230401828A1
Publication date: 2023-12-14
Application No.: US17905965
Application date: 2022-04-08
Inventor: Meina QIAO, Shanshan LIU, Xiameng QIN, Chengquan ZHANG, Kun YAO
IPC: G06V10/774, G06V30/14, G06V10/764
CPC classification number: G06V10/774, G06V10/764, G06V30/1444
Abstract: A method for training an image recognition model includes: obtaining a training data set, in which the training data set includes first text images of each vertical category in a non-target scene and second text images of each vertical category in a target scene, and a type of text content involved in the first text images is the same as a type of text content involved in the second text image; training an initial recognition model by using the first text images, to obtain a basic recognition model; and modifying the basic recognition model by using the second text images, to obtain an image recognition model corresponding to the target scene.
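The two-stage scheme in this abstract (train a basic model on non-target-scene data, then modify it with target-scene data) follows the general pretrain-then-fine-tune pattern. The toy sketch below illustrates only that pattern and is not the patented method: it swaps the image recognition model for a one-parameter linear model `y = w * x` trained by plain gradient descent, and the names `mse` and `train`, the learning rate, the epoch counts, and both data sets are assumptions made for illustration.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (input, label); a stand-in for (text image, text content)


def mse(w: float, data: List[Sample]) -> float:
    """Mean squared error of the toy model y = w * x on a data set."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train(w: float, data: List[Sample], lr: float = 0.05, epochs: int = 200) -> float:
    """Gradient descent on the MSE, starting from the given weight."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


# Stage 1: train an initial model on (made-up) non-target-scene data.
non_target = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
base_w = train(0.0, non_target)

# Stage 2: modify the basic model with a few (made-up) target-scene samples,
# starting from the stage-1 weight instead of from scratch.
target = [(1.0, 2.5), (2.0, 5.0)]
tuned_w = train(base_w, target, epochs=20)
```

Starting the second stage from `base_w` rather than from zero is the point of the pattern: the target-scene data only has to correct the basic model, so fewer samples and fewer epochs are needed.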
-
Publication No.: US20220101642A1
Publication date: 2022-03-31
Application No.: US17545765
Application date: 2021-12-08
Inventor: Qunyi XIE, Yangliu XU, Xiameng QIN, Chengquan ZHANG
Abstract: The disclosure discloses a method for character recognition, an electronic device, and a storage medium. The technical solution includes: obtaining a test sample image and a test sample character both corresponding to a test task; performing fine-tuning on a trained meta-learning model based on the test sample image and the test sample character to obtain a test task model; obtaining a test image corresponding to the test task; and generating a test character corresponding to the test image by inputting the test image into the test task model.
-
Publication No.: US20210192696A1
Publication date: 2021-06-24
Application No.: US17151783
Application date: 2021-01-19
Inventor: Qunyi XIE, Xiameng QIN, Yulin LI, Junyu HAN, Shengxian ZHU
Abstract: Embodiments of the present disclosure provide a method and apparatus for correcting a distorted document image, where the method for correcting a distorted document image includes: obtaining a distorted document image; and inputting the distorted document image into a correction model, and obtaining a corrected image corresponding to the distorted document image; where the correction model is a model obtained by training with a set of image samples as inputs and a corrected image corresponding to each image sample in the set of image samples as an output, and the image samples are distorted. By inputting the distorted document image to be corrected into the correction model, the corrected image corresponding to the distorted document image can be obtained through the correction model, which realizes document image correction end-to-end, improves accuracy of the document image correction, and extends application scenarios of the document image correction.
-
8.
Publication No.: US20230196805A1
Publication date: 2023-06-22
Application No.: US18168089
Application date: 2023-02-13
Inventor: Ju HUANG, Xiaoqiang ZHANG, Xiameng QIN, Chengquan ZHANG, Kun YAO
Abstract: The present disclosure provides a character detection method and apparatus, a model training method and apparatus, a device and a storage medium. The specific implementation is: acquiring a training sample, where the training sample includes a sample image and a marked image, and the marked image is an image obtained by marking a text instance in the sample image; inputting the sample image into a character detection model, to obtain segmented images and image types of the segmented images output by the character detection model, where the image type indicates that the segmented image includes a text instance, or the segmented image does not include a text instance; and adjusting a parameter of the character detection model according to the segmented images, the image types of the segmented images and the marked image.
-
Publication No.: US20220027611A1
Publication date: 2022-01-27
Application No.: US17498226
Application date: 2021-10-11
Inventor: Yuechen YU, Chengquan ZHANG, Yulin LI, Xiaoqiang ZHANG, Ju HUANG, Xiameng QIN, Kun YAO, Jingtuo LIU, Junyu HAN, Errui DING
Abstract: Provided are an image classification method and apparatus, an electronic device and a storage medium, relating to the field of artificial intelligence and, in particular, to computer vision and deep learning. The method includes inputting a to-be-classified document image into a pretrained neural network and obtaining a feature submap of each text box of the to-be-classified document image by use of the neural network; inputting the feature submap of each text box, a semantic feature corresponding to preobtained text information of each text box and a position feature corresponding to preobtained position information of each text box into a pretrained multimodal feature fusion model and fusing, by use of the multimodal feature fusion model, the three into a multimodal feature corresponding to each text box; and classifying the to-be-classified document image based on the multimodal feature corresponding to each text box.
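The data flow in this abstract (three features per text box fused into one multimodal feature, then a document-level classification) can be sketched as follows. This is an illustration under stated assumptions, not the patented model: concatenation stands in for the pretrained multimodal feature fusion model, mean pooling and a linear score stand in for the classifier, and the names `fuse_text_box`, `mean_pool`, `classify_document`, and the `class_weights` table are hypothetical.

```python
from typing import Dict, List, Sequence


def fuse_text_box(visual: Sequence[float],
                  semantic: Sequence[float],
                  position: Sequence[float]) -> List[float]:
    """Fuse the three per-box features.

    Assumption: plain concatenation; the abstract describes a pretrained
    multimodal feature fusion model instead.
    """
    return [*visual, *semantic, *position]


def mean_pool(features: List[List[float]]) -> List[float]:
    """Average the per-box multimodal features into one document feature."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def classify_document(boxes: List[dict],
                      class_weights: Dict[str, Sequence[float]]) -> str:
    """Score each class with a dot product and return the best label."""
    fused = [fuse_text_box(b["visual"], b["semantic"], b["position"]) for b in boxes]
    doc_feature = mean_pool(fused)
    scores = {
        label: sum(w * x for w, x in zip(weights, doc_feature))
        for label, weights in class_weights.items()
    }
    return max(scores, key=scores.get)
```

Each entry of `boxes` is assumed to be a dictionary with `visual`, `semantic`, and `position` lists of equal length across boxes, and every weight vector in `class_weights` must have the same length as the fused feature.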
-
Publication No.: US20210406592A1
Publication date: 2021-12-30
Application No.: US17182987
Application date: 2021-02-23
Inventor: Yulin LI, Xiameng QIN, Ju HUANG, Qunyi XIE, Junyu HAN
IPC: G06K9/62, G06K9/46, G06F40/279
Abstract: The present disclosure provides a method for visual question answering. The method includes: acquiring an input image and an input question; constructing a visual graph based on the input image, wherein the visual graph comprises a first node feature and a first edge feature; constructing a question graph based on the input question, wherein the question graph comprises a second node feature and a second edge feature; performing a multimodal fusion on the visual graph and the question graph to obtain an updated visual graph and an updated question graph; determining a question feature based on the input question; determining a fusion feature based on the updated visual graph, the updated question graph and the question feature; and generating a predicted answer for the input image and the input question. The present disclosure further provides an apparatus for visual question answering, a computer device and a medium.