Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Chengquan ZHANG"

11.

Invention application
TEXT RECOGNITION METHOD, ELECTRONIC DEVICE, AND NON-TRANSITORY STORAGE MEDIUM [In force]

Publication (announcement) No.: US20230050079A1

Publication (announcement) date: 2023-02-16

Application No.: US17974630

Application date: 2022-10-27

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Pengyuan LV , Xiaoyan WANG , Liang WU , Shanshan LIU , Yuechen YU , Meina QIAO , Jie LU , Chengquan ZHANG , Kun YAO

IPC: G06V30/18 , G06V30/148

Abstract: Provided are a text recognition method, an electronic device, and a non-transitory computer-readable storage medium, which are applicable in an OCR scenario. In the particular solution, a text image to be recognized is acquired. Feature extraction is performed on the text image, to obtain an image feature corresponding to the text image, where a height-wise feature and a width-wise feature of the image feature each have a dimension greater than 1. According to the image feature, sampling features corresponding to multiple sampling points in the text image are determined. According to the sampling features corresponding to the multiple sampling points, a character recognition result corresponding to the text image is determined.
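The sampling step in the abstract above can be illustrated with a minimal sketch. It assumes bilinear interpolation over a small two-dimensional feature map; the abstract does not say how sampling points are chosen or how features are read out, so the function names, the interpolation scheme and the example points are all hypothetical.

```python
from typing import List, Tuple

# Feature map indexed as [row][col][channel]; height and width are both > 1.
FeatureMap = List[List[List[float]]]


def sample_feature(fmap: FeatureMap, y: float, x: float) -> List[float]:
    """Bilinearly interpolate the feature vector at a fractional (y, x).

    Bilinear interpolation is an assumption made for illustration only.
    """
    h, w = len(fmap), len(fmap[0])
    # Clamp the point so it stays inside the feature map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    channels = len(fmap[0][0])
    return [
        (1 - dy) * (1 - dx) * fmap[y0][x0][c]
        + (1 - dy) * dx * fmap[y0][x1][c]
        + dy * (1 - dx) * fmap[y1][x0][c]
        + dy * dx * fmap[y1][x1][c]
        for c in range(channels)
    ]


def sample_points(fmap: FeatureMap,
                  points: List[Tuple[float, float]]) -> List[List[float]]:
    """Return one sampling feature per (y, x) sampling point."""
    return [sample_feature(fmap, y, x) for y, x in points]


# A 2 x 3 feature map with a single channel (hypothetical values).
fmap = [[[0.0], [1.0], [2.0]],
        [[10.0], [11.0], [12.0]]]
features = sample_points(fmap, [(0.0, 0.0), (0.5, 1.5), (1.0, 2.0)])
```

In a full recognizer, each sampled feature vector would then be passed to a classifier to produce one character; that stage is not specified in the abstract and is omitted here.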

12.

Invention application
METHOD FOR RECOGNIZING TEXT, ELECTRONIC DEVICE AND STORAGE MEDIUM [In force]

Publication (announcement) No.: US20230010031A1

Publication (announcement) date: 2023-01-12

Application No.: US17946464

Application date: 2022-09-16

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Pengyuan LYU , Sen FAN , Xiaoyan WANG , Yuechen YU , Chengquan ZHANG , Kun YAO , Junyu HAN

IPC: G06V10/77 , G06V20/62 , G06V10/74

Abstract: A method for recognizing text, an electronic device and a storage medium are provided. An implementation of the method comprises: obtaining a multi-dimensional first feature map of a to-be-recognized image; performing, based on feature values in the first feature map, feature enhancement processing on each feature value in the first feature map; and performing text recognition on the to-be-recognized image based on the first feature map after the enhancement processing.

13.

Invention application
TABLE GENERATING METHOD AND APPARATUS, ELECTRONIC DEVICE, STORAGE MEDIUM AND PRODUCT [In force]

Publication (announcement) No.: US20220301334A1

Publication (announcement) date: 2022-09-22

Application No.: US17832735

Application date: 2022-06-06

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Yuechen YU , Yulin LI , Chengquan ZHANG , Kun YAO

IPC: G06V30/416 , G06F40/18 , G06V30/413

Abstract: The present disclosure provides a table generating method and apparatus, an electronic device, a storage medium and a product. A specific implementation is: recognizing at least one table object in a to-be-recognized image and obtaining a table property respectively corresponding to the at least one table object, where the table property of any table object includes a cell property or a non-cell property; determining at least one target object with the cell property in the at least one table object; determining a cell region respectively corresponding to the at least one target object to obtain cell position information respectively corresponding to the at least one target object; and generating a spreadsheet corresponding to the to-be-recognized image according to the cell position information respectively corresponding to the at least one target object.
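The flow in the abstract above (keep the objects with the cell property, take their positions, emit a spreadsheet) can be sketched roughly as follows. This is an illustration under strong assumptions, not the claimed method: the dictionary keys `property`, `box` and `text` are hypothetical, cells are assumed to sit on an exact axis-aligned grid with no merged cells, and CSV stands in for the spreadsheet format.

```python
import csv
import io


def build_grid(objects):
    """Place the text of every cell-property object into a 2-D grid.

    Each object is a dict with hypothetical keys:
      "property": "cell" or "non-cell"
      "box": (left, top, right, bottom) in pixels
      "text": recognized content of the object
    """
    # Keep only the target objects that carry the cell property.
    cells = [o for o in objects if o["property"] == "cell"]
    # Distinct top and left coordinates define the rows and columns.
    row_keys = sorted({o["box"][1] for o in cells})
    col_keys = sorted({o["box"][0] for o in cells})
    grid = [["" for _ in col_keys] for _ in row_keys]
    for o in cells:
        r = row_keys.index(o["box"][1])
        c = col_keys.index(o["box"][0])
        grid[r][c] = o["text"]
    return grid


def to_csv(grid):
    """Serialize the grid as CSV text, one table row per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(grid)
    return buf.getvalue()


# Hypothetical detections: one caption (non-cell) and four cells.
objects = [
    {"property": "non-cell", "box": (0, 0, 200, 20), "text": "Table 1"},
    {"property": "cell", "box": (0, 30, 100, 50), "text": "Name"},
    {"property": "cell", "box": (100, 30, 200, 50), "text": "Qty"},
    {"property": "cell", "box": (0, 50, 100, 70), "text": "Pen"},
    {"property": "cell", "box": (100, 50, 200, 70), "text": "3"},
]
grid = build_grid(objects)
sheet = to_csv(grid)
```

Real detections rarely align to the pixel, so a practical version would cluster coordinates with a tolerance and handle cells that span several rows or columns.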

14.

Invention application
METHOD AND APPARATUS FOR VISUAL QUESTION ANSWERING, COMPUTER DEVICE AND MEDIUM [In force]

Publication (announcement) No.: US20210406619A1

Publication (announcement) date: 2021-12-30

Application No.: US17169112

Application date: 2021-02-05

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Pengyuan LV , Xiaoqiang ZHANG , Shanshan LIU , Chengquan ZHANG , Qiming PENG , Sijin WU , Hua LU , Yongfeng CHEN

IPC: G06K9/72 , G06T7/70 , G06F40/30 , G06K9/46 , G06K9/00 , G06K9/32 , G06K9/20 , G06K9/62 , G06N20/00 , G06N5/04

Abstract: The present disclosure provides a method for visual question answering, which relates to fields of computer vision and natural language processing. The method includes: acquiring an input image and an input question; detecting visual information and position information of each of at least one text region in the input image; determining semantic information and attribute information of each of the at least one text region based on the visual information and the position information; determining a global feature of the input image based on the visual information, the position information, the semantic information, and the attribute information; determining a question feature based on the input question; and generating a predicted answer for the input image and the input question based on the global feature and the question feature. The present disclosure further provides a device for visual question answering, a computer device and a medium.

15.

Invention publication
TRAINING METHOD, METHOD OF DISPLAYING TRANSLATION, ELECTRONIC DEVICE AND STORAGE MEDIUM [Under examination, published]

Publication (announcement) No.: US20240282024A1

Publication (announcement) date: 2024-08-22

Application No.: US18041206

Application date: 2022-04-22

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Liang WU , Shanshan LIU , Chengquan ZHANG , Kun YAO

IPC: G06T11/60 , G06F40/58 , G06N3/094 , G06T3/02 , G06V10/774

CPC classification number: G06T11/60 , G06F40/58 , G06N3/094 , G06T3/02 , G06V10/774

Abstract: A method of training a text erasure model, a method of displaying a translation, an electronic device, and a storage medium are provided. The training method includes: processing a set of original text block images by using a generator of a generative adversarial network model to obtain a set of simulated text block-erased images; alternately training the generator and a discriminator of the generative adversarial network model by using a set of real text block-erased images and the set of simulated text block-erased images, so as to obtain a trained generator and a trained discriminator; and determining the trained generator as the text erasure model, wherein a pixel value of a text-erased region in a real text block-erased image contained in the set of real text block-erased images is determined based on a pixel value of another region in the real text block-erased image other than the text-erased region.

16.

Invention publication
METHOD OF TRAINING TEXT DETECTION MODEL, METHOD OF DETECTING TEXT, AND DEVICE [Under examination, published]

Publication (announcement) No.: US20240265718A1

Publication (announcement) date: 2024-08-08

Application No.: US18041370

Application date: 2022-04-22

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Xiaoqiang ZHANG , Xiameng QIN , Chengquan ZHANG , Kun YAO

IPC: G06V30/19 , G06V10/77 , G06V10/82

CPC classification number: G06V30/19127 , G06V10/7715 , G06V10/82

Abstract: A method of training a text detection model and a method of detecting text are provided. The training method includes: inputting a sample image into a text feature extraction sub-model of a text detection model to obtain a text feature of a text in the sample image, the sample image having a label indicating actual position information and an actual category; inputting a predetermined text vector into a text encoding sub-model of the text detection model to obtain a text reference feature; inputting the text feature and the text reference feature into a decoding sub-model of the text detection model to obtain a text sequence vector; inputting the text sequence vector into an output sub-model of the text detection model to obtain predicted position information and a predicted category; and training the text detection model based on the predicted and actual categories and the predicted and actual position information.

17.

Invention publication
Method and Apparatus for Recognizing Document Image, Storage Medium and Electronic Device [Under examination, published]

Publication (announcement) No.: US20230260306A1

Publication (announcement) date: 2023-08-17

Application No.: US17884264

Application date: 2022-08-09

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Yuechen YU , Chengquan ZHANG , Kun YAO

IPC: G06V30/413 , G06V30/414 , G06V30/416 , G06V30/18

CPC classification number: G06V30/413 , G06V30/414 , G06V30/416 , G06V30/18143

Abstract: A method and an apparatus for recognizing a document image, a storage medium and an electronic device are provided, relating to the technical field of artificial intelligence recognition, and in particular to the technical fields of deep learning and computer vision. The method includes: transforming a document image to be recognized into an image feature map, where the document image includes at least one text box and text information including multiple characters; predicting a first recognition content of the document image to be recognized based on the image feature map, the multiple characters and the text box; recognizing the document image to be recognized based on an optical character recognition algorithm to obtain a second recognition content; and matching the first recognition content with the second recognition content to obtain a target recognition content.

18.

Invention publication
CHARACTER RECOGNITION MODEL TRAINING METHOD AND APPARATUS, CHARACTER RECOGNITION METHOD AND APPARATUS, DEVICE AND STORAGE MEDIUM [Under examination, published]

Publication (announcement) No.: US20230215203A1

Publication (announcement) date: 2023-07-06

Application No.: US18168759

Application date: 2023-02-14

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Pengyuan LV , Chengquan ZHANG , Shanshan LIU , Meina QIAO , Yangliu XU , Liang WU , Xiaoyan WANG , Kun YAO , Junyu HAN , Errui DING , Jingdong WANG , Tian WU , Haifeng WANG

IPC: G06V30/19

CPC classification number: G06V30/19147 , G06V30/19167

Abstract: The present disclosure provides a character recognition model training method and apparatus, a character recognition method and apparatus, a device and a medium, relating to the technical field of artificial intelligence, and specifically to the technical fields of deep learning, image processing and computer vision, which can be applied to scenarios such as character detection and recognition technology. The specific implementing solution is: partitioning an untagged training sample into at least two sub-sample images; dividing the at least two sub-sample images into a first training set and a second training set; where the first training set includes a first sub-sample image with a visible attribute, and the second training set includes a second sub-sample image with an invisible attribute; performing self-supervised training on a to-be-trained encoder by taking the second training set as a tag of the first training set, to obtain a target encoder.
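The partition-and-split step in the abstract above can be sketched as follows. This is a minimal illustration assuming non-overlapping rectangular patches and a uniformly random choice of which patches are invisible; the abstract fixes neither, and the names `partition`, `split_visible_invisible` and `mask_ratio` are hypothetical.

```python
import random


def partition(image, patch_h, patch_w):
    """Cut a 2-D image (list of rows) into non-overlapping patches.

    Patches are returned in row-major order; any remainder at the right
    or bottom edge that does not fill a whole patch is dropped.
    """
    rows, cols = len(image), len(image[0])
    patches = []
    for top in range(0, rows - patch_h + 1, patch_h):
        for left in range(0, cols - patch_w + 1, patch_w):
            patches.append(
                [row[left:left + patch_w] for row in image[top:top + patch_h]]
            )
    return patches


def split_visible_invisible(patches, mask_ratio, seed=0):
    """Split patches into a visible set and an invisible set.

    Each entry keeps its original index so the invisible patches can
    later serve as the training target for the visible ones. Random
    selection with a fixed ratio is an assumption for illustration.
    """
    rng = random.Random(seed)
    indices = list(range(len(patches)))
    rng.shuffle(indices)
    hidden = set(indices[:int(len(patches) * mask_ratio)])
    visible = [(i, p) for i, p in enumerate(patches) if i not in hidden]
    invisible = [(i, p) for i, p in enumerate(patches) if i in hidden]
    return visible, invisible


# A 4 x 8 toy "image" whose pixel values encode their position.
image = [[r * 8 + c for c in range(8)] for r in range(4)]
patches = partition(image, patch_h=4, patch_w=2)
visible, invisible = split_visible_invisible(patches, mask_ratio=0.5)
```

In self-supervised training of this kind, an encoder would see only the visible set and be trained to predict the invisible set; the encoder and its loss are outside the scope of this sketch.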

19.

Invention publication
METHOD FOR RECOGNIZING TEXT, DEVICE, AND STORAGE MEDIUM [Under examination, published]

Publication (announcement) No.: US20230206667A1

Publication (announcement) date: 2023-06-29

Application No.: US18147806

Application date: 2022-12-29

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Pengyuan LV , Liang WU , Shanshan LIU , Meina QIAO , Chengquan ZHANG , Kun YAO , Junyu HAN

IPC: G06V30/19 , G06V30/16

CPC classification number: G06V30/19127 , G06V30/16

Abstract: A method for recognizing text includes: obtaining a first feature map of an image; for each target feature unit, performing a feature enhancement process on a plurality of feature values of the target feature unit respectively based on the plurality of feature values of the target feature unit, in which the target feature unit is a feature unit in the first feature map along a feature enhancement direction; and performing a text recognition process on the image based on the first feature map after the feature enhancement process.

20.

Invention application
TRAINING METHOD OF TEXT RECOGNITION MODEL, TEXT RECOGNITION METHOD, AND APPARATUS [In force]

Publication (announcement) No.: US20220415071A1

Publication (announcement) date: 2022-12-29

Application No.: US17899712

Application date: 2022-08-31

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Chengquan ZHANG , Pengyuan LV , Shanshan LIU , Meina QIAO , Yangliu XU , Liang WU , Jingtuo LIU , Junyu HAN , Errui DING , Jingdong WANG

IPC: G06V30/19 , G06V30/18 , G06T9/00 , G06V30/262 , G06N20/00

Abstract: The present disclosure provides a training method of a text recognition model, a text recognition method, and an apparatus, relating to the technical field of artificial intelligence, and specifically to the technical fields of deep learning and computer vision, which can be applied in scenarios such as optical character recognition. The specific implementation solution is: performing mask prediction on visual features of an acquired sample image, to obtain a predicted visual feature; performing mask prediction on semantic features of acquired sample text, to obtain a predicted semantic feature, where the sample image includes text; determining a first loss value of the text of the sample image according to the predicted visual feature; determining a second loss value of the sample text according to the predicted semantic feature; and training, according to the first loss value and the second loss value, to obtain the text recognition model.
