Patent search ap:("Beijing Baidu Netcom Science Technology Co., Ltd.") AND inv:"Qiming PENG"

1.

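Invention publication
DOCUMENT IMAGE UNDERSTANDING (Status: under examination, published)

Publication (announcement) number: US20230177821A1

Publication (announcement) date: 2023-06-08

Application number: US18063564

Application date: 2022-12-08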

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Qiming PENG, Bin LUO, Yuhui CAO, Shikun FENG, Yongfeng CHEN

IPC: G06V10/82, G06V30/19, G06V30/14

CPC classification number: G06V10/82, G06V30/19147, G06V30/1444

Abstract: A neural network training method and a document image understanding method are provided. The neural network training method includes: acquiring text comprehensive features of a plurality of first texts in an original image; replacing at least one original region in the original image to obtain a sample image including a plurality of first regions and a ground truth label indicating whether each first region is a replaced region; acquiring image comprehensive features of the plurality of first regions; inputting the text comprehensive features of the plurality of first texts and the image comprehensive features of the plurality of first regions together into a neural network model to obtain text representation features of the plurality of first texts; determining a predicted label based on the text representation features of the plurality of first texts; and training the neural network model based on the ground truth label and the predicted label.
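
Illustrative sketch (not taken from the patent): the region-replacement step and the label comparison described in this abstract might look roughly like the following. The function names, the `replace_prob` parameter, the use of donor regions from other images, and the choice of binary cross-entropy as the training loss are all assumptions made for illustration; the abstract only states that at least one region is replaced, that a per-region ground truth label is produced, and that training uses the ground truth and predicted labels.

```python
import math
import random


def make_replaced_sample(regions, donor_regions, replace_prob=0.3, seed=0):
    """Build a sample image (as a list of regions) and per-region labels.

    labels[i] is 1 if region i was replaced and 0 otherwise.
    `donor_regions` (hypothetical) supplies the replacement content.
    """
    rng = random.Random(seed)
    sample, labels = [], []
    for region in regions:
        if rng.random() < replace_prob:
            sample.append(rng.choice(donor_regions))
            labels.append(1)
        else:
            sample.append(region)
            labels.append(0)
    # The abstract requires at least one replaced region.
    if regions and not any(labels):
        i = rng.randrange(len(regions))
        sample[i] = rng.choice(donor_regions)
        labels[i] = 1
    return sample, labels


def binary_cross_entropy(pred_probs, labels, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and labels.

    Assumed loss; the abstract does not name a specific loss function.
    """
    total = 0.0
    for p, y in zip(pred_probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


regions = ["r0", "r1", "r2", "r3"]
donors = ["d0", "d1"]
sample, labels = make_replaced_sample(regions, donors)
```

In a real system the predicted probabilities would come from the neural network model's text representation features; here they would be supplied by the caller.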

2.

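Invention application
TRAINING METHOD AND APPARATUS FOR DOCUMENT PROCESSING MODEL, DEVICE, STORAGE MEDIUM AND PROGRAM (Status: granted)

Publication (announcement) number: US20220382991A1

Publication (announcement) date: 2022-12-01

Application number: US17883908

Application date: 2022-08-09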

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Qiming PENG, Bin LUO, Yuhui CAO, Shikun FENG, Yongfeng CHEN

IPC: G06F40/30, G06V30/414, G06V30/14

Abstract: The present disclosure provides a training method and apparatus for a document processing model, a device, a storage medium and a program, which relate to the field of artificial intelligence and, in particular, to technologies such as deep learning, natural language processing and text recognition. A specific implementation includes: acquiring a first sample document; determining, according to the first sample document, element features of a plurality of document elements in the first sample document and positions corresponding to M position types of each document element, where each document element corresponds to a character or a document area in the first sample document; and training a basic model according to the element features of the plurality of document elements and the positions corresponding to the M position types of each document element to obtain the document processing model.
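
Illustrative sketch (not taken from the patent): one way to picture "positions corresponding to M position types" is a small record per document element. The abstract does not say which position types are used; the example below assumes M = 3 with a reading-order index, a horizontal coordinate and a vertical coordinate, and the `Element` class and `positions_for` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """A document element: a character or a document area (hypothetical type)."""
    text: str
    box: tuple  # (x0, y0, x1, y1) in page coordinates


def positions_for(elements):
    """Return, for each element, its position under three assumed position types."""
    # Assumed reading order: top-to-bottom, then left-to-right.
    order = sorted(
        range(len(elements)),
        key=lambda i: (elements[i].box[1], elements[i].box[0]),
    )
    rank = {idx: r for r, idx in enumerate(order)}
    return [
        {"sequence": rank[i], "horizontal": e.box[0], "vertical": e.box[1]}
        for i, e in enumerate(elements)
    ]


elements = [
    Element("b", (50, 0, 60, 10)),
    Element("a", (0, 0, 10, 10)),
    Element("c", (0, 20, 10, 30)),
]
positions = positions_for(elements)
```

A training pipeline would then feed each element's features together with these per-type positions into the basic model; that step is omitted here.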

3.

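Invention application
METHOD AND APPARATUS FOR VISUAL QUESTION ANSWERING, COMPUTER DEVICE AND MEDIUM (Status: granted)

Publication (announcement) number: US20210406619A1

Publication (announcement) date: 2021-12-30

Application number: US17169112

Application date: 2021-02-05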

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Pengyuan LV, Xiaoqiang ZHANG, Shanshan LIU, Chengquan ZHANG, Qiming PENG, Sijin WU, Hua LU, Yongfeng CHEN

IPC: G06K9/72, G06T7/70, G06F40/30, G06K9/46, G06K9/00, G06K9/32, G06K9/20, G06K9/62, G06N20/00, G06N5/04

Abstract: The present disclosure provides a method for visual question answering, which relates to fields of computer vision and natural language processing. The method includes: acquiring an input image and an input question; detecting visual information and position information of each of at least one text region in the input image; determining semantic information and attribute information of each of the at least one text region based on the visual information and the position information; determining a global feature of the input image based on the visual information, the position information, the semantic information, and the attribute information; determining a question feature based on the input question; and generating a predicted answer for the input image and the input question based on the global feature and the question feature. The present disclosure further provides a device for visual question answering, a computer device and a medium.
