Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Xiameng Qin"

1.

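Granted invention patent
Method and apparatus for visual question answering, computer device and medium (Active)

Publication (announcement) number: US11775574B2

Publication (announcement) date: 2023-10-03

Application number: US17182987

Application date: 2021-02-23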

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Yulin Li, Xiameng Qin, Ju Huang, Qunyi Xie, Junyu Han

IPC: G06F16/00, G06F16/36, G06F40/279, G06F18/25, G06V10/764, G06V10/80, G06V10/82, G06V10/44, G06V10/426, G06N3/02

CPC classification number: G06F16/367, G06F18/253, G06F40/279, G06V10/426, G06V10/454, G06V10/764, G06V10/811, G06V10/82, G06N3/02

Abstract: A method for visual question answering, a computer device implementing the method and a medium for storing instructions on performing the method are provided. The method includes: acquiring an input image and an input question; constructing a visual graph based on the input image, wherein the visual graph comprises a first node feature and a first edge feature; constructing a question graph based on the input question, wherein the question graph comprises a second node feature and a second edge feature; performing a multimodal fusion on the visual graph and the question graph to obtain an updated visual graph and an updated question graph; determining a question feature based on the input question; determining a fusion feature based on the updated visual graph, the updated question graph and the question feature; and generating a predicted answer for the input image and the input question.

2.

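Granted invention patent
Method, apparatus and electronic device for annotating information of structured document (Active)

Publication (announcement) number: US11687704B2

Publication (announcement) date: 2023-06-27

Application number: US17207179

Application date: 2021-03-19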

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Qiaoyi Li, Xiangkai Huang, Yulin Li, Ju Huang, Xiameng Qin, Duohao Qin, Minghao Liu, Junyu Han

IPC: G06F7/02, G06F16/00, G06F40/174, G06F16/93, G06V30/40, G06V30/19, G06V30/10

CPC classification number: G06F40/174, G06F16/93, G06V30/19013, G06V30/19173, G06V30/40, G06V30/10

Abstract: Disclosed are a method, apparatus and electronic device for annotating information of a structured document. A specific implementation is: obtaining a template image of a structured document and at least one piece of annotation information of a field to be filled in the template image, where the annotation information includes an attribute value and historical content of the field to be filled, and a historical position of the field to be filled in the template image; generating, according to the attribute value of the field to be filled, the historical content of the field to be filled and the historical position of the field to be filled in the template image, target filling information of the field to be filled; and obtaining, according to the target filling information of the field to be filled, an image of an annotated structured document.

3.

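Invention application
METHOD FOR TRAINING MODEL, DEVICE, AND STORAGE MEDIUM (Active)

Publication (announcement) number: US20230042234A1

Publication (announcement) date: 2023-02-09

Application number: US17972253

Application date: 2022-10-24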

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Yangliu Xu, Qunyi Xie, Yi Chen, Xiameng Qin, Chengquan Zhang, Kun Yao

IPC: G06N3/08

Abstract: A method for training a model includes: obtaining a scene image, second actual characters in the scene image and a second construct image; obtaining first features and first recognition characters of characters obtained by performing character recognition on the scene image using the model to be trained; obtaining second features of characters obtained by performing character recognition on the second construct image using the training auxiliary model; and obtaining a character recognition model by adjusting model parameters of the model to be trained based on the first recognition characters, the second actual characters, the first features and the second features.

4.

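Invention application
Image Table Extraction Method And Apparatus, Electronic Device, And Storage Medium (Active)

Publication (announcement) number: US20210390294A1

Publication (announcement) date: 2021-12-16

Application number: US17139403

Application date: 2020-12-31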

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Xiangkai Huang, Qiaoyi Li, Yulin Li, Ju Huang, Duohao Qin, Xiameng Qin, Minghao Liu, Junyu Han, Jiangliang Guo

IPC: G06K9/00, G06N3/04, G06N3/08

Abstract: Embodiments of the present disclosure disclose an image table extraction method and apparatus, an electronic device, a storage medium, and a training method for a table extraction model, which relate to the field of artificial intelligence technologies and cloud computing technologies, including: acquiring an image to be processed; generating a table of the image to be processed according to a table extraction model, where the table extraction model is obtained according to a field position feature, an image feature, and a text feature of a sample image; and filling text information of the image to be processed into the table.

5.

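Granted invention patent
Method and device for visual question answering, computer apparatus and medium (Active)

Publication (announcement) number: US11768876B2

Publication (announcement) date: 2023-09-26

Application number: US17161466

Application date: 2021-01-28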

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Xiameng Qin, Yulin Li, Qunyi Xie, Ju Huang, Junyu Han

IPC: G06F16/9032, G06F16/583, G06F16/532, G06F40/279, G06N3/04, G06N3/088, G06F18/213, G06F18/25, G06V10/25, G06V10/764, G06V10/80, G06V10/82, G06V10/44

CPC classification number: G06F16/90332, G06F16/532, G06F16/583, G06F18/213, G06F18/253, G06F40/279, G06N3/04, G06N3/088, G06V10/25, G06V10/454, G06V10/764, G06V10/806, G06V10/82, G06V2201/07

Abstract: The present disclosure provides a method for visual question answering, which relates to a field of computer vision and natural language processing. The method includes: acquiring an input image and an input question; constructing a Visual Graph based on the input image, wherein the Visual Graph comprises a Node Feature and an Edge Feature; updating the Node Feature by using the Node Feature and the Edge Feature to obtain an updated Visual Graph; determining a question feature based on the input question; fusing the updated Visual Graph and the question feature to obtain a fused feature; and generating a predicted answer for the input image and the input question based on the fused feature. The present disclosure further provides an apparatus for visual question answering, a computer device and a non-transitory computer-readable storage medium.

6.

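Granted invention patent
Method and apparatus for correcting distorted document image (Active)

Publication (announcement) number: US11756170B2

Publication (announcement) date: 2023-09-12

Application number: US17151783

Application date: 2021-01-19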

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Qunyi Xie, Xiameng Qin, Yulin Li, Junyu Han, Shengxian Zhu

IPC: G06T5/00, G06N3/08, G06T5/30

CPC classification number: G06T5/006, G06N3/08, G06T5/30, G06T2207/20081, G06T2207/20084, G06T2207/30176

Abstract: Embodiments of the present disclosure provide a method and apparatus for correcting a distorted document image, where the method for correcting a distorted document image includes: obtaining a distorted document image; and inputting the distorted document image into a correction model, and obtaining a corrected image corresponding to the distorted document image; where the correction model is a model obtained by training with a set of image samples as inputs and a corrected image corresponding to each image sample in the set of image samples as an output, and the image samples are distorted. By inputting the distorted document image to be corrected into the correction model, the corrected image corresponding to the distorted document image can be obtained through the correction model, which realizes document image correction end-to-end, improves accuracy of the document image correction, and extends application scenarios of the document image correction.

7.

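Invention application
METHOD AND DEVICE FOR TRAINING IMAGE RECOGNITION MODEL, EQUIPMENT AND MEDIUM (Active)

Publication (announcement) number: US20220092353A1

Publication (announcement) date: 2022-03-24

Application number: US17540207

Application date: 2021-12-01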

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Ruixue Liu, Xiameng Qin, Mengyi En, Kun Yao, Chengquan Zhang, Shengxian Zhu, Yunhao Li, Junyu Han, Hao Sun

IPC: G06K9/62, G06V30/30, G06V30/14

Abstract: A computer-implemented method includes: acquiring training data, the training data includes training images for a preset vertical type, and the training images include a first training image containing real data of the preset vertical type and a second training image containing virtual data of the preset vertical type; building a basic model, the basic model includes a deep learning network, and the deep learning network is configured to recognize the training images to extract text data in the training image; and training the basic model by using the training data to obtain the image recognition model.
