Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Feng HE"

1.

Invention publication
METHOD AND APPARATUS FOR TRAINING QUESTION SOLVING MODEL, QUESTION SOLVING METHOD AND APPARATUS [Under examination, published]

Publication (announcement) number: US20240354658A1

Publication (announcement) date: 2024-10-24

Application number: US18745529

Application date: 2024-06-17

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Feng HE, Jianhua WANG, Junjie OU, Pingxuan HUANG, Zhifan FENG, Xiaopeng CUI, Qiaoqiao SHE, Hua WU

IPC: G06N20/00, G06N5/04

CPC classification number: G06N20/00, G06N5/04

Abstract: A method and apparatus for training a question solving model, a question solving method and apparatus, an electronic device and a readable storage medium are disclosed. The method for training a question solving model includes: acquiring a first sample question; inputting the first sample question and a solving step grabbing template into a large language model to obtain a first sample solving step; inputting the first sample question, the first sample solving step and an answer grabbing template into the large language model to obtain a first sample answer; pre-training a step planning model according to the first sample question and the first sample solving step; pre-training the large language model according to the first sample question, the first sample solving step and the first sample answer; and acquiring the question solving model according to the step planning model and the large language model obtained by pre-training. The question solving method includes: acquiring a to-be-solved question; inputting the to-be-solved question into a step planning model to obtain a solving step; and inputting the to-be-solved question and the solving step into a large language model to obtain an answer.
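The data flow in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the template wording, the function names `build_sample` and `solve`, and the toy stand-ins for the step planning model and the large language model are all assumptions, and the pre-training steps themselves are not shown.

```python
from typing import Callable, Tuple

# Hypothetical prompt templates; the abstract does not give their wording.
STEP_TEMPLATE = "Question: {question}\nList the steps needed to solve it."
ANSWER_TEMPLATE = "Question: {question}\nSteps: {steps}\nGive the final answer."


def build_sample(question: str, llm: Callable[[str], str]) -> Tuple[str, str, str]:
    """Query the LLM twice to obtain a (question, steps, answer) training triple."""
    steps = llm(STEP_TEMPLATE.format(question=question))
    answer = llm(ANSWER_TEMPLATE.format(question=question, steps=steps))
    return question, steps, answer


def solve(question: str,
          planner: Callable[[str], str],
          llm: Callable[[str], str]) -> str:
    """Inference: the step planning model proposes steps, then the LLM answers."""
    steps = planner(question)
    return llm(ANSWER_TEMPLATE.format(question=question, steps=steps))


# Toy stand-ins for the two models (assumptions, for illustration only).
def toy_llm(prompt: str) -> str:
    return "5" if "Steps:" in prompt else "add the two numbers"


def toy_planner(question: str) -> str:
    return "add the two numbers"


sample = build_sample("What is 2 + 3?", toy_llm)
prediction = solve("What is 2 + 3?", toy_planner, toy_llm)
```

In this sketch the (question, steps) part of each triple would feed the step planning model's pre-training, and the full triple would feed the large language model's pre-training.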

2.

Invention application
VIDEO CLASSIFICATION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM [In force]

Publication (announcement) number: US20220284218A1

Publication (announcement) date: 2022-09-08

Application number: US17502173

Application date: 2021-10-15

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Hu YANG, Feng HE, Qi WANG, Zhifan FENG, Chunguang CHAI, Yong ZHU

IPC: G06K9/00, G06K9/62, G06K9/32, G10L15/08

Abstract: The present disclosure discloses a video classification method, an electronic device and a storage medium, and relates to the field of computer technologies, and particularly to the field of artificial intelligence technologies, such as knowledge graph technologies, computer vision technologies, deep learning technologies, or the like. The video classification method includes: extracting a keyword in a video according to multi-modal information of the video; acquiring background knowledge corresponding to the keyword, and determining a text to be recognized according to the keyword and the background knowledge; and classifying the text to be recognized to obtain a class of the video.
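The last two steps of the abstract above (joining keywords with background knowledge, then classifying the resulting text) might look like the sketch below. Keyword extraction from the video's multi-modal information is assumed to have already happened, the knowledge lookup is a plain dictionary, and the term-counting classifier is a stand-in chosen for brevity; none of these details come from the patent.

```python
from typing import Dict, List


def build_text(keywords: List[str], knowledge: Dict[str, str]) -> str:
    """Concatenate each keyword with its background knowledge, if any."""
    parts = []
    for keyword in keywords:
        background = knowledge.get(keyword, "")
        parts.append(f"{keyword} {background}".strip())
    return " ".join(parts)


def classify(text: str, class_terms: Dict[str, List[str]]) -> str:
    """Toy text classifier: pick the class whose terms occur most often."""
    tokens = text.lower().split()
    scores = {
        label: sum(tokens.count(term) for term in terms)
        for label, terms in class_terms.items()
    }
    return max(scores, key=scores.get)


# Hypothetical inputs for illustration.
keywords = ["Messi"]
knowledge = {"Messi": "Argentine football player"}
class_terms = {"sports": ["football", "player"], "music": ["singer", "album"]}

text_to_recognize = build_text(keywords, knowledge)
video_class = classify(text_to_recognize, class_terms)
```

The point of the example is that the keyword alone carries no class signal for this classifier, while the attached background knowledge does.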

3.

Invention application
MULTIMODAL DATA PROCESSING [In force]

Publication (announcement) number: US20230010160A1

Publication (announcement) date: 2023-01-12

Application number: US17945415

Application date: 2022-09-15

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Shuai CHEN, Qi WANG, Hu YANG, Feng HE, Zhifan FENG, Chunguang CHAI, Yong ZHU

IPC: G06V10/82, G06V10/80, G06N3/08

Abstract: Disclosed are a method for processing multimodal data using a neural network, a device, and a medium, and relates to the field of artificial intelligence and, in particular to multimodal data processing, video classification, and deep learning. The neural network includes: an input subnetwork configured to receive the multimodal data to output respective first features of a plurality of modalities; a plurality of cross-modal feature subnetworks, each of which is configured to receive respective first features of two corresponding modalities to output a cross-modal feature corresponding to the two modalities; a plurality of cross-modal fusion subnetworks, each of which is configured to receive at least one cross-modal feature corresponding to a corresponding target modality and other modalities to output a second feature of the target modality; and an output subnetwork configured to receive respective second features of the plurality of modalities to output a processing result of the multimodal data.
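To make the wiring of the subnetworks in the abstract above concrete, here is a small data-flow sketch. The operations are placeholders and purely assumptions: an elementwise product stands in for each cross-modal feature subnetwork, a mean for each cross-modal fusion subnetwork, and concatenation for the output subnetwork. The first features are taken as given, and at least two modalities are required.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Vector = List[float]


def cross_modal_features(first: Dict[str, Vector]) -> Dict[Tuple[str, str], Vector]:
    """One cross-modal feature per pair of modalities (elementwise product here)."""
    return {
        (a, b): [x * y for x, y in zip(first[a], first[b])]
        for a, b in combinations(sorted(first), 2)
    }


def fuse(target: str, cross: Dict[Tuple[str, str], Vector]) -> Vector:
    """Second feature of `target`: mean of the cross-modal features involving it."""
    related = [vec for pair, vec in cross.items() if target in pair]
    return [sum(column) / len(related) for column in zip(*related)]


def forward(first: Dict[str, Vector]) -> Vector:
    """Cross-modal features, then per-modality fusion, then a concatenated output."""
    cross = cross_modal_features(first)
    second = {modality: fuse(modality, cross) for modality in sorted(first)}
    return [value for modality in sorted(second) for value in second[modality]]


# Hypothetical first features, as the input subnetwork might emit them.
features = {"image": [1.0, 2.0], "text": [3.0, 4.0], "audio": [5.0, 6.0]}
result = forward(features)
```

With three modalities there are three pairwise cross-modal features, and each target modality's second feature is built from the two pairs that include it.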

4.

Invention application
METHOD FOR TRAINING CROSS-MODAL RETRIEVAL MODEL, ELECTRONIC DEVICE AND STORAGE MEDIUM [In force]

Publication (announcement) number: US20220284246A1

Publication (announcement) date: 2022-09-08

Application number: US17502385

Application date: 2021-10-15

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Feng HE, Qi WANG, Zhifan FENG, Hu YANG, Chunguang CHAI

IPC: G06K9/62

Abstract: The present disclosure discloses a method for training a cross-modal retrieval model, an electronic device and a storage medium, and relates to the field of computer technologies, and particularly to the field of artificial intelligence technologies, such as knowledge graph technologies, computer vision technologies, deep learning technologies, or the like. The method for training a cross-modal retrieval model includes: determining similarity of a cross-modal sample pair according to the cross-modal sample pair, the cross-modal sample pair including a sample of a first modal and a sample of a second modal, and the first modal being different from the second modal; determining a soft margin based on the similarity, and determining a soft margin loss function based on the soft margin; and determining a total loss function based on the soft margin loss function, and training a cross-modal retrieval model according to the total loss function.
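The abstract above does not state how the soft margin is derived from the similarity, so the following is only one plausible reading: a hinge-style ranking loss whose margin shrinks as the sample pair becomes more similar. The linear mapping, the `max_margin` value, and the function names are assumptions.

```python
def soft_margin(pair_similarity: float, max_margin: float = 0.2) -> float:
    """Assumed mapping: the more similar the pair, the smaller the margin."""
    return max_margin * (1.0 - pair_similarity)


def soft_margin_loss(sim_pos: float, sim_neg: float, margin: float) -> float:
    """Hinge loss: penalize a negative pair scoring within `margin` of the positive."""
    return max(0.0, margin - sim_pos + sim_neg)


# Hypothetical similarities between a video and a matching / non-matching text.
margin = soft_margin(0.4)
loss = soft_margin_loss(sim_pos=0.5, sim_neg=0.4, margin=margin)

# Other loss terms, if any, would be added here to form the total loss.
total_loss = loss
```

Compared with a fixed margin, a similarity-dependent margin penalizes near-duplicate negatives less harshly than clearly unrelated ones.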

5.

Invention application
METHOD FOR TRAINING IMAGE-TEXT MATCHING MODEL, COMPUTING DEVICE, AND STORAGE MEDIUM [In force]

Publication (announcement) number: US20230005284A1

Publication (announcement) date: 2023-01-05

Application number: US17943458

Application date: 2022-09-13

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Feng HE, Qi WANG, Hu YANG, Shuai CHEN, Zhifan FENG, Chunguang CHAI

IPC: G06V30/19, G06F16/583

Abstract: A computer-implemented method is provided. The method includes: obtaining a sample text and a sample image corresponding to the sample text; labeling a true semantic tag for the sample text according to a first preset rule; obtaining a text feature representation of the sample text and a predicted semantic tag output by a text coding sub-model; obtaining an image feature representation of the sample image output by an image coding sub-model; calculating a first loss based on the true semantic tag and the predicted semantic tag; calculating a contrast loss based on the text feature representation of the sample text and the image feature representation of the sample image; adjusting parameters of the text coding sub-model based on the first loss and the contrast loss; and adjusting parameters of the image coding sub-model based on the contrast loss.
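The two losses in the abstract above, and which sub-model each one updates, can be illustrated as follows. The abstract does not specify the loss formulas; cross-entropy for the first loss and an InfoNCE-style contrast loss are assumptions, as are the numeric inputs.

```python
import math
from typing import List


def cross_entropy(probs: List[float], true_index: int) -> float:
    """First loss: negative log-probability assigned to the true semantic tag."""
    return -math.log(probs[true_index])


def contrast_loss(sim_matrix: List[List[float]]) -> float:
    """InfoNCE-style loss: the image matching text i sits in column i of row i."""
    total = 0.0
    for i, row in enumerate(sim_matrix):
        log_denominator = math.log(sum(math.exp(s) for s in row))
        total += log_denominator - row[i]
    return total / len(sim_matrix)


# Hypothetical model outputs for illustration.
first_loss = cross_entropy([0.7, 0.2, 0.1], true_index=0)
contrast = contrast_loss([[2.0, 0.0], [0.0, 2.0]])

# The text coding sub-model is adjusted using both losses;
# the image coding sub-model is adjusted using the contrast loss only.
text_encoder_objective = first_loss + contrast
image_encoder_objective = contrast
```

The asymmetry is the notable design point: the semantic-tag supervision reaches only the text side, while both sides share the contrast signal.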
