Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Kun YAO"

21.

Invention publication
PRE-TRAINING METHOD, IMAGE AND TEXT RETRIEVAL METHOD FOR A VISION AND SCENE TEXT AGGREGATION MODEL, ELECTRONIC DEVICE, AND STORAGE MEDIUM [Under examination, published]

Publication number: US20230386168A1

Publication date: 2023-11-30

Application number: US18192393

Application date: 2023-03-29

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Yipeng SUN, Mengjun CHENG, Longchao WANG, Xiongwei ZHU, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING, Jingdong WANG, Haifeng WANG

IPC: G06V10/42, G06F16/583, H04N19/176

CPC classification number: G06V10/42, G06F16/5846, H04N19/176

Abstract: A pre-training method for a Vision and Scene Text Aggregation model includes: acquiring a sample image-text pair; extracting a sample scene text from a sample image; inputting a sample text into a text encoding network to obtain a sample text feature; inputting the sample image and an initial sample aggregation feature into a visual encoding subnetwork and inputting the initial sample aggregation feature and the sample scene text into a scene encoding subnetwork to obtain a global image feature of the sample image and a learned sample aggregation feature; and pre-training the Vision and Scene Text Aggregation model according to the sample text feature, the global image feature of the sample image, and the learned sample aggregation feature.

22.

Invention publication
Method and Apparatus for Recognizing Document Image, Storage Medium and Electronic Device [Under examination, published]

Publication number: US20230260306A1

Publication date: 2023-08-17

Application number: US17884264

Application date: 2022-08-09

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Yuechen YU, Chengquan ZHANG, Kun YAO

IPC: G06V30/413, G06V30/414, G06V30/416, G06V30/18

CPC classification number: G06V30/413, G06V30/414, G06V30/416, G06V30/18143

Abstract: A method and an apparatus for recognizing a document image, a storage medium, and an electronic device are provided, relating to the technical field of artificial intelligence recognition, and in particular to the technical fields of deep learning and computer vision. The method includes: transforming a document image to be recognized into an image feature map, where the document image includes at least one text box and text information including multiple characters; predicting a first recognition content of the document image to be recognized based on the image feature map, the multiple characters, and the text box; recognizing the document image to be recognized based on an optical character recognition algorithm to obtain a second recognition content; and matching the first recognition content with the second recognition content to obtain a target recognition content.

23.

Invention publication
CHARACTER RECOGNITION MODEL TRAINING METHOD AND APPARATUS, CHARACTER RECOGNITION METHOD AND APPARATUS, DEVICE AND STORAGE MEDIUM [Under examination, published]

Publication number: US20230215203A1

Publication date: 2023-07-06

Application number: US18168759

Application date: 2023-02-14

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Pengyuan LV, Chengquan ZHANG, Shanshan LIU, Meina QIAO, Yangliu XU, Liang WU, Xiaoyan WANG, Kun YAO, Junyu HAN, Errui DING, Jingdong WANG, Tian WU, Haifeng WANG

IPC: G06V30/19

CPC classification number: G06V30/19147, G06V30/19167

Abstract: The present disclosure provides a character recognition model training method and apparatus, a character recognition method and apparatus, a device, and a medium, relating to the technical field of artificial intelligence, specifically to the technical fields of deep learning, image processing, and computer vision, and applicable to scenarios such as character detection and recognition. The specific implementation is: partitioning an untagged training sample into at least two sub-sample images; dividing the at least two sub-sample images into a first training set and a second training set, where the first training set includes a first sub-sample image with a visible attribute and the second training set includes a second sub-sample image with an invisible attribute; and performing self-supervised training on a to-be-trained encoder by taking the second training set as a tag of the first training set, to obtain a target encoder.
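
The partition-and-split step in the abstract above might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how sub-sample images are formed or how they are assigned to the two sets, so the non-overlapping grid partition, the random assignment, and the names `partition_into_patches`, `split_visible_invisible`, and `invisible_ratio` are all hypothetical. The encoder training itself is not shown.

```python
import random


def partition_into_patches(image, patch_h, patch_w):
    """Split a 2D image (list of rows) into non-overlapping patches, row-major.

    Assumption: a regular grid partition; the abstract only says "partitioning".
    """
    height, width = len(image), len(image[0])
    if height % patch_h or width % patch_w:
        raise ValueError("image size must be a multiple of the patch size")
    patches = []
    for top in range(0, height, patch_h):
        for left in range(0, width, patch_w):
            rows = image[top:top + patch_h]
            patches.append([row[left:left + patch_w] for row in rows])
    return patches


def split_visible_invisible(patches, invisible_ratio=0.5, seed=0):
    """Assign patch indices to a visible (first) set and an invisible (second) set.

    Assumption: random assignment with a fixed ratio; both sets stay non-empty.
    """
    if len(patches) < 2:
        raise ValueError("need at least two sub-sample images")
    rng = random.Random(seed)
    indices = list(range(len(patches)))
    rng.shuffle(indices)
    n_invisible = round(len(patches) * invisible_ratio)
    n_invisible = min(len(patches) - 1, max(1, n_invisible))
    invisible = sorted(indices[:n_invisible])
    visible = sorted(indices[n_invisible:])
    return visible, invisible


# Toy 4x8 "image" whose pixel values encode their position.
image = [[r * 8 + c for c in range(8)] for r in range(4)]
patches = partition_into_patches(image, 2, 2)
visible_idx, invisible_idx = split_visible_invisible(patches, invisible_ratio=0.5)
```

In a self-supervised setup of this kind, the visible patches would be fed to the encoder and the invisible patches would serve as the reconstruction target, which is one reading of "taking the second training set as a tag of the first training set".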

24.

Invention publication
METHOD FOR RECOGNIZING TEXT, DEVICE, AND STORAGE MEDIUM [Under examination, published]

Publication number: US20230206667A1

Publication date: 2023-06-29

Application number: US18147806

Application date: 2022-12-29

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Pengyuan LV, Liang WU, Shanshan LIU, Meina QIAO, Chengquan ZHANG, Kun YAO, Junyu HAN

IPC: G06V30/19, G06V30/16

CPC classification number: G06V30/19127, G06V30/16

Abstract: A method for recognizing text includes: obtaining a first feature map of an image; for each target feature unit, performing a feature enhancement process on a plurality of feature values of the target feature unit respectively based on the plurality of feature values of the target feature unit, in which the target feature unit is a feature unit in the first feature map along a feature enhancement direction; and performing a text recognition process on the image based on the first feature map after the feature enhancement process.

25.

Invention application
METHOD OF PROCESSING TASK, ELECTRONIC DEVICE, AND STORAGE MEDIUM [Granted]

Publication number: US20230134615A1

Publication date: 2023-05-04

Application number: US18146839

Application date: 2022-12-27

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Qunyi XIE, Dongdong ZHANG, Xiameng QIN, Mengyi EN, Yangliu XU, Yi CHEN, Ju HUANG, Kun YAO

IPC: G06F9/48, G06F40/205, G06F9/50

Abstract: A method of processing a task, an electronic device, and a storage medium are provided, which relate to the field of artificial intelligence, in particular to the fields of deep learning and computer vision, and may be applied to optical character recognition (OCR) and other scenarios. The method includes: parsing labeled data to be processed according to a task type identification to obtain task labeled data, where tag information of the task labeled data matches the task type identification, and the task labeled data includes first task labeled data and second task labeled data; training a model using the first task labeled data to obtain a plurality of candidate models, where the model is determined according to the task type identification; and determining a target model from the candidate models according to a performance evaluation result obtained by evaluating the candidate models using the second task labeled data.
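
The parse, train, and select flow described in the abstract above could look something like the sketch below. Everything concrete here is an assumption made for illustration: the record layout (`tag`, `split`, `x`, `y`), the one-dimensional threshold classifier standing in for "the model", the `bias` hyperparameter that distinguishes candidates, and accuracy as the performance metric are not taken from the patent.

```python
def parse_by_task_type(labeled_data, task_type):
    """Keep only records whose tag matches the task type identification."""
    return [rec for rec in labeled_data if rec["tag"] == task_type]


def train(train_records, bias):
    """Fit a toy threshold classifier (hypothetical stand-in for a real model).

    The threshold is the midpoint of the two class means, shifted by `bias`.
    """
    pos = [r["x"] for r in train_records if r["y"] == 1]
    neg = [r["x"] for r in train_records if r["y"] == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2 + bias
    return lambda x: 1 if x >= threshold else 0


def accuracy(model, records):
    """Fraction of records the model classifies correctly."""
    return sum(model(r["x"]) == r["y"] for r in records) / len(records)


def select_target_model(labeled_data, task_type, biases):
    """Train one candidate per bias on the first data, pick the best on the second."""
    task_data = parse_by_task_type(labeled_data, task_type)
    first = [r for r in task_data if r["split"] == "train"]
    second = [r for r in task_data if r["split"] == "eval"]
    candidates = {b: train(first, b) for b in biases}
    scores = {b: accuracy(m, second) for b, m in candidates.items()}
    best = max(scores, key=scores.get)
    return best, candidates[best], scores


labeled_data = [
    {"tag": "ocr", "split": "train", "x": 1.0, "y": 0},
    {"tag": "ocr", "split": "train", "x": 2.0, "y": 0},
    {"tag": "ocr", "split": "train", "x": 8.0, "y": 1},
    {"tag": "ocr", "split": "train", "x": 9.0, "y": 1},
    {"tag": "ocr", "split": "eval", "x": 3.0, "y": 0},
    {"tag": "ocr", "split": "eval", "x": 4.0, "y": 0},
    {"tag": "ocr", "split": "eval", "x": 7.0, "y": 1},
    {"tag": "layout", "split": "train", "x": 5.0, "y": 1},
]
best_bias, target_model, scores = select_target_model(
    labeled_data, "ocr", biases=[-3.0, 0.0, 3.0]
)
```

Keeping the evaluation records separate from the training records is the point of the first/second split: candidates are ranked on data they were not fitted to.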
