Patent search ap:("Beijing Baidu Netcom Science Technology Co., Ltd.") AND inv:"Jingtuo LIU"

11.

Invention application
TRAINING METHOD OF TEXT RECOGNITION MODEL, TEXT RECOGNITION METHOD, AND APPARATUS (Status: in force)

Publication (announcement) No.: US20220415071A1

Publication (announcement) date: 2022-12-29

Application No.: US17899712

Application date: 2022-08-31

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Chengquan ZHANG, Pengyuan LV, Shanshan LIU, Meina QIAO, Yangliu XU, Liang WU, Jingtuo LIU, Junyu HAN, Errui DING, Jingdong WANG

IPC: G06V30/19, G06V30/18, G06T9/00, G06V30/262, G06N20/00

Abstract: The present disclosure provides a training method of a text recognition model, a text recognition method, and an apparatus, relating to the technical field of artificial intelligence, and specifically, to the technical field of deep learning and computer vision, which can be applied in scenarios such as optical character recognition. The specific implementation solution is: performing mask prediction on visual features of an acquired sample image, to obtain a predicted visual feature; performing mask prediction on semantic features of acquired sample text, to obtain a predicted semantic feature, where the sample image includes text; determining a first loss value of the text of the sample image according to the predicted visual feature; determining a second loss value of the sample text according to the predicted semantic feature; and training, according to the first loss value and the second loss value, to obtain the text recognition model.

12.

Invention application
IMAGE PROCESSING METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM (Status: in force)

Publication (announcement) No.: US20220253631A1

Publication (announcement) date: 2022-08-11

Application No.: US17501221

Application date: 2021-10-14

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Yulin LI, Ju HUANG, Qunyi XIE, Xiameng QIN, Chengquan ZHANG, Jingtuo LIU

IPC: G06K9/00, G06K9/62, G06N3/04, G06F40/30

Abstract: The present disclosure discloses an image processing method, an electronic device and a storage medium, and relates to the field of artificial intelligence technologies, and particularly to the fields of computer vision technologies, deep learning technologies, or the like. The image processing method includes: acquiring a multi-modal feature of each of at least one text region in an image, the multi-modal feature including features in plural dimensions; performing a global attention processing operation on the multi-modal feature of each text region to obtain a global attention feature of each text region; determining a category of each text region based on the global attention feature of each text region; and constructing structured information based on text content and the category of each text region.
