Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Kun YAO"

11.

Invention application
TEXT DETECTION METHOD, TEXT RECOGNITION METHOD AND APPARATUS (Granted)

Publication (announcement) No.: US20230045715A1

Publication (announcement) date: 2023-02-09

Application No.: US17966112

Application date: 2022-10-14

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Chengquan ZHANG, Pengyuan LV, Sen FAN, Kun YAO, Junyu HAN, Jingtuo LIU

IPC: G06V30/19, G06V30/16, G06V30/14

Abstract: The present disclosure provides a text detection method, a text recognition method and an apparatus, which relate to the field of artificial intelligence technology, in particular to the field of deep learning and computer vision technologies, and can be applied to scenarios such as optical character recognition. The text detection method includes: acquiring an image feature of a text strip in a to-be-recognized image; performing visual enhancement processing on the to-be-recognized image to obtain an enhanced feature map of the to-be-recognized image; and comparing the image feature of the text strip with the enhanced feature map for similarity to obtain a target bounding box of the text strip on the enhanced feature map.
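The similarity comparison in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the cosine measure, the 0.9 threshold, the nested-list feature map and the function names `cosine` and `match_bounding_box` are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_bounding_box(strip_feature, feature_map, threshold=0.9):
    """Return (row_min, col_min, row_max, col_max) covering every cell whose
    feature vector is similar to strip_feature, or None if no cell matches.

    Hypothetical helper: the threshold and the box construction are assumptions.
    """
    hits = [
        (r, c)
        for r, row in enumerate(feature_map)
        for c, cell in enumerate(row)
        if cosine(strip_feature, cell) >= threshold
    ]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))


# Toy 3x3 "enhanced feature map" with 2-dimensional features per cell.
feature_map = [
    [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0], [1.0, 0.1]],
    [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]],
]
box = match_bounding_box([1.0, 0.0], feature_map)
print(box)
```

In this toy input only the two right-hand cells of the middle row resemble the text-strip feature, so the returned box spans those two cells.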

12.

Invention application
IMAGE CLASSIFICATION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM (Granted)

Publication (announcement) No.: US20220027611A1

Publication (announcement) date: 2022-01-27

Application No.: US17498226

Application date: 2021-10-11

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Yuechen YU, Chengquan ZHANG, Yulin LI, Xiaoqiang ZHANG, Ju HUANG, Xiameng QIN, Kun YAO, Jingtuo LIU, Junyu HAN, Errui DING

IPC: G06K9/00, G06K9/62, G06N3/08

Abstract: Provided are an image classification method and apparatus, an electronic device and a storage medium, relating to the field of artificial intelligence and, in particular, to computer vision and deep learning. The method includes inputting a to-be-classified document image into a pretrained neural network and obtaining a feature submap of each text box of the to-be-classified document image by use of the neural network; inputting the feature submap of each text box, a semantic feature corresponding to preobtained text information of each text box and a position feature corresponding to preobtained position information of each text box into a pretrained multimodal feature fusion model and fusing, by use of the multimodal feature fusion model, the three into a multimodal feature corresponding to each text box; and classifying the to-be-classified document image based on the multimodal feature corresponding to each text box.

13.

Invention publication
IMAGE-BASED INFORMATION EXTRACTION MODEL, METHOD, AND APPARATUS, DEVICE, AND STORAGE MEDIUM (Under examination, published)

Publication (announcement) No.: US20240021000A1

Publication (announcement) date: 2024-01-18

Application No.: US18113178

Application date: 2023-02-23

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Xiameng QIN, Yulin LI, Xiaoqiang ZHANG, Ju HUANG, Qunyi XIE, Kun YAO

IPC: G06V30/19, G06V30/148

CPC classification number: G06V30/1918, G06V30/15, G06V30/19127, G06V30/19147

Abstract: There is provided an image-based information extraction model, method, and apparatus, a device, and a storage medium, which relates to the field of artificial intelligence (AI) technologies, specifically to fields of deep learning, image processing, computer vision technologies, and is applicable to optical character recognition (OCR) and other scenarios. A specific implementation solution involves: acquiring a to-be-extracted first image and a category of to-be-extracted information; and inputting the first image and the category into a pre-trained information extraction model to perform information extraction on the first image to obtain text information corresponding to the category.

14.

Invention publication
METHOD FOR TEXT RECOGNITION (Under examination, published)

Publication (announcement) No.: US20230186664A1

Publication (announcement) date: 2023-06-15

Application No.: US18169032

Application date: 2023-02-14

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Shanshan LIU, Meina QIAO, Liang WU, Pengyuan LV, Sen FAN, Chengquan ZHANG, Kun YAO

IPC: G06V30/19, G06V30/30

CPC classification number: G06V30/19173, G06V30/19147, G06V30/30

Abstract: A method for text recognition is disclosed. The method includes obtaining a whole-image scenario for an image to be processed and a text image in the image to be processed. The method further includes determining a first text recognition model corresponding to the whole-image scenario. The method further includes performing text recognition on the text image according to the first text recognition model to obtain text information.
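The scenario-based model selection described here resembles a simple dispatch table. The sketch below is only an illustration under stated assumptions: the scenario names, the stand-in recognizer functions and the fallback to a generic model are hypothetical and do not come from the patent.

```python
from typing import Callable, Dict

Recognizer = Callable[[str], str]


# Stand-in "models": each just tags its input so the dispatch is visible.
def receipt_model(text_image: str) -> str:
    return f"receipt:{text_image}"


def street_sign_model(text_image: str) -> str:
    return f"street_sign:{text_image}"


def generic_model(text_image: str) -> str:
    return f"generic:{text_image}"


# Hypothetical mapping from whole-image scenario to a text recognition model.
MODELS: Dict[str, Recognizer] = {
    "receipt": receipt_model,
    "street_sign": street_sign_model,
}


def recognize(whole_image_scenario: str, text_image: str) -> str:
    """Pick the model for the scenario, then run it on the text image.

    Falling back to generic_model for unknown scenarios is an assumption.
    """
    model = MODELS.get(whole_image_scenario, generic_model)
    return model(text_image)


print(recognize("receipt", "crop_01"))
print(recognize("unknown", "crop_01"))
```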

15.

Invention application
TEXT RECOGNITION METHOD, ELECTRONIC DEVICE, AND NON-TRANSITORY STORAGE MEDIUM (Granted)

Publication (announcement) No.: US20230050079A1

Publication (announcement) date: 2023-02-16

Application No.: US17974630

Application date: 2022-10-27

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Pengyuan LV, Xiaoyan WANG, Liang WU, Shanshan LIU, Yuechen YU, Meina QIAO, Jie LU, Chengquan ZHANG, Kun YAO

IPC: G06V30/18, G06V30/148

Abstract: Provided are a text recognition method, an electronic device, and a non-transitory computer-readable storage medium, which are applicable in an OCR scenario. In the particular solution, a text image to be recognized is acquired. Feature extraction is performed on the text image, to obtain an image feature corresponding to the text image, where a height-wise feature and a width-wise feature of the image feature each have a dimension greater than 1. According to the image feature, sampling features corresponding to multiple sampling points in the text image are determined. According to the sampling features corresponding to the multiple sampling points, a character recognition result corresponding to the text image is determined.

16.

Invention application
METHOD FOR RECOGNIZING TEXT, ELECTRONIC DEVICE AND STORAGE MEDIUM (Granted)

Publication (announcement) No.: US20230010031A1

Publication (announcement) date: 2023-01-12

Application No.: US17946464

Application date: 2022-09-16

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Pengyuan LYU, Sen FAN, Xiaoyan WANG, Yuechen YU, Chengquan ZHANG, Kun YAO, Junyu HAN

IPC: G06V10/77, G06V20/62, G06V10/74

Abstract: A method for recognizing a text, an electronic device and a storage medium. An implementation of the method comprises: obtaining a multi-dimensional first feature map of a to-be-recognized image; performing, based on feature values in the first feature map, feature enhancement processing on each feature value in the first feature map; and performing a text recognition on the to-be-recognized image based on the first feature map after the enhancement processing.

17.

Invention application
METHOD FOR TRAINING IMAGE RECOGNITION MODEL BASED ON SEMANTIC ENHANCEMENT (Granted)

Publication (announcement) No.: US20220392205A1

Publication (announcement) date: 2022-12-08

Application No.: US17892669

Application date: 2022-08-22

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Yipeng SUN, Rongqiao AN, Xiang WEI, Longchao WANG, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING

IPC: G06V10/80, G06V10/77

Abstract: Embodiments of the present disclosure provide a method and apparatus for training an image recognition model based on a semantic enhancement, a method and apparatus for recognizing an image, an electronic device, and a computer readable storage medium. The method for training an image recognition model based on a semantic enhancement comprises: extracting, from an inputted first image being unannotated and having no textual description, a first feature representation of the first image; calculating a first loss function based on the first feature representation; extracting, from an inputted second image being unannotated and having an original textual description, a second feature representation of the second image; calculating a second loss function based on the second feature representation, and training an image recognition model based on a fusion of the first loss function and the second loss function.
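The abstract does not say how the two loss functions are fused. One common choice, assumed here purely for illustration, is a convex weighted sum; the function name `fuse_losses` and the `weight` parameter are hypothetical.

```python
def fuse_losses(first_loss: float, second_loss: float, weight: float = 0.5) -> float:
    """Combine two scalar losses as weight * first + (1 - weight) * second.

    The weighted sum is an assumed fusion rule, not taken from the patent.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * first_loss + (1.0 - weight) * second_loss


# first_loss: from the image without a textual description;
# second_loss: from the image that has an original textual description.
total = fuse_losses(2.0, 4.0, weight=0.25)
print(total)
```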

18.

Invention application
TABLE GENERATING METHOD AND APPARATUS, ELECTRONIC DEVICE, STORAGE MEDIUM AND PRODUCT (Granted)

Publication (announcement) No.: US20220301334A1

Publication (announcement) date: 2022-09-22

Application No.: US17832735

Application date: 2022-06-06

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Yuechen YU, Yulin LI, Chengquan ZHANG, Kun YAO

IPC: G06V30/416, G06F40/18, G06V30/413

Abstract: The present disclosure provides a table generating method and apparatus, an electronic device, a storage medium and a product. A specific implementation is: recognizing at least one table object in a to-be-recognized image and obtaining a table property respectively corresponding to the at least one table object, where the table property of any table object includes a cell property or a non-cell property; determining at least one target object with the cell property in the at least one table object; determining a cell region respectively corresponding to the at least one target object to obtain cell position information respectively corresponding to the at least one target object; generating a spreadsheet corresponding to the to-be-recognized image according to the cell position information respectively corresponding to the at least one target object.
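The steps in this abstract (filter objects with the cell property, take their positions, emit a spreadsheet) can be sketched as follows. This is a simplified illustration, not the patented method: the dictionary layout, the `(x0, y0, x1, y1)` box format, the assumption that cell edges align exactly, and the CSV output are all choices made for the example.

```python
import csv
import io


def build_grid(table_objects):
    """Place the text of each object with the 'cell' property into a 2D grid.

    Rows and columns are derived from the distinct top and left edges of the
    cell boxes, which assumes perfectly aligned cells (an idealization).
    """
    cells = [obj for obj in table_objects if obj["property"] == "cell"]
    row_keys = sorted({obj["box"][1] for obj in cells})
    col_keys = sorted({obj["box"][0] for obj in cells})
    grid = [["" for _ in col_keys] for _ in row_keys]
    for obj in cells:
        x0, y0, _, _ = obj["box"]
        grid[row_keys.index(y0)][col_keys.index(x0)] = obj["text"]
    return grid


def to_csv(grid):
    """Serialize the grid as CSV text, one line per table row."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(grid)
    return buffer.getvalue()


table_objects = [
    {"property": "non-cell", "text": "Price list", "box": (0, 0, 100, 10)},
    {"property": "cell", "text": "Item", "box": (0, 10, 50, 20)},
    {"property": "cell", "text": "Price", "box": (50, 10, 100, 20)},
    {"property": "cell", "text": "Tea", "box": (0, 20, 50, 30)},
    {"property": "cell", "text": "3", "box": (50, 20, 100, 30)},
]
grid = build_grid(table_objects)
print(to_csv(grid))
```

The non-cell object (the table caption in this toy input) is dropped before the grid is built, mirroring the cell versus non-cell distinction in the abstract.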

19.

Invention publication
TRAINING METHOD, METHOD OF DISPLAYING TRANSLATION, ELECTRONIC DEVICE AND STORAGE MEDIUM (Under examination, published)

Publication (announcement) No.: US20240282024A1

Publication (announcement) date: 2024-08-22

Application No.: US18041206

Application date: 2022-04-22

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Liang WU, Shanshan LIU, Chengquan ZHANG, Kun YAO

IPC: G06T11/60, G06F40/58, G06N3/094, G06T3/02, G06V10/774

CPC classification number: G06T11/60, G06F40/58, G06N3/094, G06T3/02, G06V10/774

Abstract: A method of training a text erasure model, a method of displaying a translation, an electronic device, and a storage medium. The training method includes: processing a set of original text block images by using a generator of a generative adversarial network model to obtain a set of simulated text block-erased images; alternately training the generator and a discriminator of the generative adversarial network model by using a set of real text block-erased images and the set of simulated text block-erased images, so as to obtain a trained generator and a trained discriminator; and determining the trained generator as the text erasure model, wherein a pixel value of a text-erased region in a real text block-erased image contained in the set of real text block-erased images is determined based on a pixel value of another region in the real text block-erased image other than the text-erased region.
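The final clause, where the erased region takes its pixel value from the rest of the image, can be illustrated with a small sketch. It assumes a grayscale image stored as nested lists and fills the masked region with the mean of the unmasked pixels; the patent does not specify this rule, and `erase_text_region` is a hypothetical name.

```python
def erase_text_region(image, mask):
    """Return a copy of image where masked pixels are replaced by the rounded
    mean of all unmasked pixels.

    image: rows of grayscale integers; mask: rows of booleans (True = text).
    The mean-fill rule is an assumption made for illustration.
    """
    background = [
        pixel
        for row, mask_row in zip(image, mask)
        for pixel, is_text in zip(row, mask_row)
        if not is_text
    ]
    if not background:
        raise ValueError("mask covers the whole image; no background to sample")
    fill = round(sum(background) / len(background))
    return [
        [fill if is_text else pixel for pixel, is_text in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


image = [
    [200, 200, 200],
    [200, 10, 200],
]
mask = [
    [False, False, False],
    [False, True, False],
]
erased = erase_text_region(image, mask)
print(erased)
```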

20.

Invention publication
METHOD OF TRAINING TEXT DETECTION MODEL, METHOD OF DETECTING TEXT, AND DEVICE (Under examination, published)

Publication (announcement) No.: US20240265718A1

Publication (announcement) date: 2024-08-08

Application No.: US18041370

Application date: 2022-04-22

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Xiaoqiang ZHANG, Xiameng QIN, Chengquan ZHANG, Kun YAO

IPC: G06V30/19, G06V10/77, G06V10/82

CPC classification number: G06V30/19127, G06V10/7715, G06V10/82

Abstract: A method of training a text detection model and a method of detecting a text. The training method includes: inputting a sample image into a text feature extraction sub-model of a text detection model to obtain a text feature of a text in the sample image, the sample image having a label indicating actual position information and an actual category; inputting a predetermined text vector into a text encoding sub-model of the text detection model to obtain a text reference feature; inputting the text feature and the text reference feature into a decoding sub-model of the text detection model to obtain a text sequence vector; inputting the text sequence vector into an output sub-model of the text detection model to obtain predicted position information and a predicted category; and training the text detection model based on the predicted and actual categories and the predicted and actual position information.
