-
Publication No.: US20210390294A1
Publication date: 2021-12-16
Application No.: US17139403
Application date: 2020-12-31
Inventor: Xiangkai Huang, Qiaoyi Li, Yulin Li, Ju Huang, Duohao Qin, Xiameng Qin, Minghao Liu, Junyu Han, Jiangliang Guo
Abstract: Embodiments of the present disclosure disclose an image table extraction method and apparatus, an electronic device, a storage medium, and a training method for a table extraction model, which relate to the fields of artificial intelligence and cloud computing technologies. The method includes: acquiring an image to be processed; generating a table of the image to be processed according to a table extraction model, where the table extraction model is obtained according to a field position feature, an image feature, and a text feature of a sample image; and filling text information of the image to be processed into the table.
-
2.
Publication No.: US11687704B2
Publication date: 2023-06-27
Application No.: US17207179
Application date: 2021-03-19
Inventor: Qiaoyi Li, Xiangkai Huang, Yulin Li, Ju Huang, Xiameng Qin, Duohao Qin, Minghao Liu, Junyu Han
CPC classification number: G06F40/174, G06F16/93, G06V30/19013, G06V30/19173, G06V30/40, G06V30/10
Abstract: Disclosed are a method, apparatus and electronic device for annotating information of a structured document. A specific implementation is: obtaining a template image of a structured document and at least one piece of annotation information of a field to be filled in the template image, where the annotation information includes an attribute value and historical content of the field to be filled, and a historical position of the field to be filled in the template image; generating, according to the attribute value of the field to be filled, the historical content of the field to be filled and the historical position of the field to be filled in the template image, target filling information of the field to be filled; and obtaining, according to the target filling information of the field to be filled, an image of an annotated structured document.
-
3.
Publication No.: US20230123327A1
Publication date: 2023-04-20
Application No.: US18068149
Application date: 2022-12-19
Inventor: Chengquan Zhang, Pengyuan Lv, Kun Yao, Junyu Han, Jingtuo Liu
Abstract: A method for recognizing text includes: obtaining an image sequence feature of an image to be recognized; obtaining a full text string of the image to be recognized by decoding the image sequence feature; obtaining a text sequence feature by performing a semantic enhancement process on the full text string, in which the image sequence feature, the full text string and the text sequence feature are of the same length; and determining text content of the image to be recognized based on the full text string and the text sequence feature.
-
Publication No.: US20230213388A1
Publication date: 2023-07-06
Application No.: US17998881
Application date: 2020-10-14
Inventor: Haocheng Feng, Haixiao Yue, Keyao Wang, Gang Zhang, Yanwen Fan, Xiyu Yu, Junyu Han, Jingtuo Liu, Errui Ding, Haifeng Wang
IPC: G01J5/00
CPC classification number: G01J5/0025, G01J2005/0077
Abstract: Provided are a method and an apparatus for measuring temperature, and a computer-readable storage medium. The method includes: detecting a target position of an object in an input image; determining key points of the target position and weight information of each key point based on a detection result of the target position, in which the weight information is configured to indicate a probability of each key point being covered; acquiring temperature information of each key point; and determining a temperature of the target position at least based on the temperature information and the weight information of each key point.
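The final step of this abstract, combining per-key-point temperatures with their weight information, can be illustrated with a small weighted average. This is only a sketch under stated assumptions: the abstract does not give the weighting rule, so the function name `fuse_temperature` and the choice to weight each key point by one minus its cover probability are hypothetical.

```python
from typing import Sequence


def fuse_temperature(temps: Sequence[float], cover_probs: Sequence[float]) -> float:
    """Combine per-key-point temperatures into a single reading.

    Assumption (not stated in the abstract): a key point that is likely to be
    covered should contribute less, so each point is weighted by
    (1 - probability of being covered).
    """
    if len(temps) != len(cover_probs):
        raise ValueError("temps and cover_probs must have the same length")
    weights = [1.0 - p for p in cover_probs]
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("every key point is fully covered; no usable reading")
    # Weighted mean of the key-point temperatures.
    return sum(w * t for w, t in zip(weights, temps)) / total
```

With this rule, a key point whose cover probability is 1.0 is ignored entirely, and equally uncertain key points reduce to a plain average.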
-
Publication No.: US20230065675A1
Publication date: 2023-03-02
Application No.: US17982616
Application date: 2022-11-08
Inventor: Tianshu Hu, Shengyi He, Junyu Han, Zhibin Hong
IPC: G06T13/40
Abstract: Provided are a method of processing an image, a method of training a model, an electronic device and a medium, which relate to the field of artificial intelligence technology, in particular to deep learning, computer vision and other technical fields. A solution includes: generating a first face image, wherein a definition difference and an authenticity difference between the first face image and a reference face image are within a set range; adjusting, according to a target voice used to drive the first face image, facial action information related to pronunciation in the first face image to generate a second face image with a facial tissue position conforming to a pronunciation rule of the target voice; and determining the second face image as a face image driven by the target voice.
-
Publication No.: US20220148324A1
Publication date: 2022-05-12
Application No.: US17581047
Application date: 2022-01-21
Inventor: Xiameng Qin, Yulin Li, Ju Huang, Qunyi Xie, Chengquan Zhang, Kun Yao, Jingtuo Liu, Junyu Han
IPC: G06V30/18, G06V30/24, G06V30/148, G06V30/19
Abstract: Provided are a method and apparatus for extracting information about a negotiable instrument, an electronic device and a storage medium. The method includes inputting a to-be-recognized negotiable instrument into a pretrained deep learning network and obtaining a visual image corresponding to the to-be-recognized negotiable instrument through the deep learning network; matching the visual image corresponding to the to-be-recognized negotiable instrument with a visual image corresponding to each negotiable-instrument template in a preconstructed base template library; and in response to the visual image corresponding to the to-be-recognized negotiable instrument successfully matching a visual image corresponding to one negotiable-instrument template in the base template library, extracting structured information of the to-be-recognized negotiable instrument by using the negotiable-instrument template.
-
Publication No.: US11768876B2
Publication date: 2023-09-26
Application No.: US17161466
Application date: 2021-01-28
Inventor: Xiameng Qin, Yulin Li, Qunyi Xie, Ju Huang, Junyu Han
IPC: G06F16/9032, G06F16/583, G06F16/532, G06F40/279, G06N3/04, G06N3/088, G06F18/213, G06F18/25, G06V10/25, G06V10/764, G06V10/80, G06V10/82, G06V10/44
CPC classification number: G06F16/90332, G06F16/532, G06F16/583, G06F18/213, G06F18/253, G06F40/279, G06N3/04, G06N3/088, G06V10/25, G06V10/454, G06V10/764, G06V10/806, G06V10/82, G06V2201/07
Abstract: The present disclosure provides a method for visual question answering, which relates to the fields of computer vision and natural language processing. The method includes: acquiring an input image and an input question; constructing a Visual Graph based on the input image, wherein the Visual Graph comprises a Node Feature and an Edge Feature; updating the Node Feature by using the Node Feature and the Edge Feature to obtain an updated Visual Graph; determining a question feature based on the input question; fusing the updated Visual Graph and the question feature to obtain a fused feature; and generating a predicted answer for the input image and the input question based on the fused feature. The present disclosure further provides an apparatus for visual question answering, a computer device and a non-transitory computer-readable storage medium.
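The step "updating the Node Feature by using the Node Feature and the Edge Feature" resembles one round of message passing on a graph. The sketch below illustrates that general idea only; the abstract does not specify the update rule, so the function name `update_nodes`, the use of scalar edge weights, the normalized aggregation and the residual addition are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def update_nodes(nodes: List[Vector], edges: List[List[float]]) -> List[Vector]:
    """One round of message passing over a visual graph.

    nodes[i] is the feature vector of node i; edges[i][j] is a scalar weight
    for the edge from node j to node i. Scalar edges and weight-normalized
    aggregation are simplifying assumptions, not the patented method.
    """
    n = len(nodes)
    dim = len(nodes[0])
    updated = []
    for i in range(n):
        # Avoid division by zero for a node with no incoming edge weight.
        norm = sum(edges[i]) or 1.0
        message = [
            sum(edges[i][j] * nodes[j][d] for j in range(n)) / norm
            for d in range(dim)
        ]
        # Residual update: keep the node's own feature and add the message.
        updated.append([nodes[i][d] + message[d] for d in range(dim)])
    return updated
```

In a learned model the aggregation would typically pass through trainable layers; the fixed arithmetic here only shows how node and edge features jointly determine the updated node features.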
-
Publication No.: US11756170B2
Publication date: 2023-09-12
Application No.: US17151783
Application date: 2021-01-19
Inventor: Qunyi Xie, Xiameng Qin, Yulin Li, Junyu Han, Shengxian Zhu
CPC classification number: G06T5/006, G06N3/08, G06T5/30, G06T2207/20081, G06T2207/20084, G06T2207/30176
Abstract: Embodiments of the present disclosure provide a method and apparatus for correcting a distorted document image, where the method for correcting a distorted document image includes: obtaining a distorted document image; and inputting the distorted document image into a correction model, and obtaining a corrected image corresponding to the distorted document image; where the correction model is a model obtained by training with a set of image samples as inputs and a corrected image corresponding to each image sample in the set of image samples as an output, and the image samples are distorted. By inputting the distorted document image to be corrected into the correction model, the corrected image corresponding to the distorted document image can be obtained through the correction model, which realizes document image correction end-to-end, improves accuracy of the document image correction, and extends application scenarios of the document image correction.
-
Publication No.: US20230215203A1
Publication date: 2023-07-06
Application No.: US18168759
Application date: 2023-02-14
Inventor: Pengyuan Lv, Chengquan Zhang, Shanshan Liu, Meina Qiao, Yangliu Xu, Liang Wu, Xiaoyan Wang, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang, Tian Wu, Haifeng Wang
IPC: G06V30/19
CPC classification number: G06V30/19147, G06V30/19167
Abstract: The present disclosure provides a character recognition model training method and apparatus, a character recognition method and apparatus, a device and a medium, relating to the technical field of artificial intelligence, and specifically to the technical fields of deep learning, image processing and computer vision, which can be applied to scenarios such as character detection and recognition technology. The specific implementing solution is: partitioning an untagged training sample into at least two sub-sample images; dividing the at least two sub-sample images into a first training set and a second training set; where the first training set includes a first sub-sample image with a visible attribute, and the second training set includes a second sub-sample image with an invisible attribute; performing self-supervised training on a to-be-trained encoder by taking the second training set as a tag of the first training set, to obtain a target encoder.
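The first two steps of this abstract, partitioning a sample into sub-sample images and dividing them into a visible set and an invisible set, can be sketched as follows. The names `partition` and `split_visible_invisible`, the square non-overlapping patches, the random split and the `visible_ratio` parameter are assumptions for illustration; the abstract does not say how the partition or the split is chosen.

```python
import random
from typing import List, Tuple

Image = List[List[int]]  # rows of pixel values


def partition(image: Image, patch: int) -> List[Image]:
    """Cut an image into non-overlapping patch x patch sub-images, row-major."""
    h, w = len(image), len(image[0])
    return [
        [row[x:x + patch] for row in image[y:y + patch]]
        for y in range(0, h - patch + 1, patch)
        for x in range(0, w - patch + 1, patch)
    ]


def split_visible_invisible(
    patches: List[Image], visible_ratio: float, seed: int = 0
) -> Tuple[List[Image], List[Image]]:
    """Randomly assign sub-images to a visible set and an invisible set."""
    if len(patches) < 2:
        raise ValueError("need at least two sub-sample images")
    rng = random.Random(seed)
    idx = list(range(len(patches)))
    rng.shuffle(idx)
    # Keep at least one patch on each side of the split.
    n_visible = max(1, min(len(patches) - 1, round(len(patches) * visible_ratio)))
    visible = sorted(idx[:n_visible])
    invisible = sorted(idx[n_visible:])
    return [patches[i] for i in visible], [patches[i] for i in invisible]
```

In the self-supervised step described in the abstract, an encoder would see only the visible set and be trained to predict the invisible set; that training loop is outside the scope of this sketch.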
-
Publication No.: US20230120985A1
Publication date: 2023-04-20
Application No.: US18083313
Application date: 2022-12-16
Inventor: Yanwen Fan, Xiyu Yu, Gang Zhang, Jingtuo Liu, Haifeng Wang, Errui Ding, Junyu Han
IPC: G06V10/774, G06V40/16, G06V10/26, G06V10/77
Abstract: A method for training a face recognition model includes: acquiring a plurality of first training images being uncovered face images, and acquiring a plurality of covering object images; generating a plurality of second training images by separately fusing the plurality of covering object images with the uncovered face images; and training the face recognition model by inputting the plurality of first training images and the plurality of second training images into the face recognition model.
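The step of fusing covering object images with uncovered face images can be illustrated with simple alpha blending on grayscale arrays. This is a sketch only: the abstract does not describe the fusion method, and the function name `fuse_cover`, the per-pixel `alpha` mask and the `top`/`left` placement parameters are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image, values in [0, 1]


def fuse_cover(face: Image, cover: Image, alpha: Image, top: int, left: int) -> Image:
    """Alpha-blend a covering-object image onto a copy of a face image.

    alpha[y][x] is the opacity of the cover pixel (1.0 hides the face pixel).
    Pixels of the cover that fall outside the face image are skipped.
    """
    out = [row[:] for row in face]  # leave the uncovered original untouched
    for y, cover_row in enumerate(cover):
        for x, c in enumerate(cover_row):
            fy, fx = top + y, left + x
            if 0 <= fy < len(out) and 0 <= fx < len(out[0]):
                a = alpha[y][x]
                out[fy][fx] = a * c + (1.0 - a) * out[fy][fx]
    return out
```

Because the input face image is copied rather than modified, both the uncovered image and its fused counterpart remain available, which matches the abstract's use of the first and second training images together.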
-