Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Guodong Guo"

1.

Invention application
METHOD AND APPARATUS FOR TRAINING SEMANTIC SEGMENTATION MODEL, AND METHOD AND APPARATUS FOR PERFORMING SEMANTIC SEGMENTATION ON VIDEO (in force)

Publication No.: US20230079275A1

Publication date: 2023-03-16

Application No.: US17985000

Filing date: 2022-11-10

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Tianyi Wu, Yu Zhu, Guodong Guo

IPC: G06V20/40, G06V10/26, G06V10/75, G06V10/62

Abstract: The present disclosure provides a method and apparatus for training a semantic segmentation model and a method and apparatus for performing a semantic segmentation on a video. The method comprises: acquiring a training sample set, wherein a training sample in the training sample set comprises at least one sample video stream and a pixel-level annotation result of the sample video stream; modeling a spatiotemporal context between video frames in the sample video stream using an initial semantic segmentation model to obtain a context representation of the sample video stream; calculating a temporal contrastive loss based on the context representation of the sample video stream and the pixel-level annotation result of the sample video stream; and updating a parameter of the initial semantic segmentation model based on the temporal contrastive loss to obtain a trained semantic segmentation model.
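
The abstract does not say how the temporal contrastive loss is computed. The following is a minimal sketch only, assuming a supervised InfoNCE-style loss in which pixels of the same annotated class in different frames are treated as positives. The function name, the temperature parameter and the choice of positives are assumptions made for illustration, not details taken from the patent.

```python
import math


def _normalize(v):
    # L2-normalize; fall back to 1.0 to avoid dividing by zero.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def temporal_contrastive_loss(features, labels, frame_ids, temperature=0.1):
    """Supervised InfoNCE over pixel embeddings sampled from several frames.

    features:  per-pixel embedding vectors (stand-in for the context representation)
    labels:    class label of each pixel (stand-in for the pixel-level annotation)
    frame_ids: index of the frame each pixel was sampled from
    """
    feats = [_normalize(f) for f in features]
    n = len(feats)
    total, count = 0.0, 0
    for i in range(n):
        # Assumed positives: same class, different frame (the "temporal" part).
        positives = [j for j in range(n)
                     if labels[j] == labels[i] and frame_ids[j] != frame_ids[i]]
        if not positives:
            continue
        # Contrast the anchor against every other sampled pixel.
        logits = {j: _dot(feats[i], feats[j]) / temperature
                  for j in range(n) if j != i}
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values()))
        for j in positives:
            total += log_denom - logits[j]
            count += 1
    return total / count if count else 0.0
```

In this sketch the loss is small when same-class pixels from different frames have similar embeddings and large when they do not, which is the behaviour a parameter update would be expected to encourage.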

2.

Invention application
VIDEO GENERATION METHOD, APPARATUS, ELECTRONIC DEVICE, STORAGE MEDIUM AND PROGRAM PRODUCT (in force)

Publication No.: US20220392493A1

Publication date: 2022-12-08

Application No.: US17887179

Filing date: 2022-08-12

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Jimin Pi, Xin Wang, Guodong Guo

IPC: G11B27/036, G06V30/416, G10L13/08, G06F16/735

Abstract: This disclosure provides a video generation method, a video generation apparatus, an electronic device, a storage medium and a program product, and relates to the field of artificial intelligence technology, and in particular to the field of computer vision technology and deep learning technology. A specific implementation includes: obtaining document content information of a document; extracting, from the document content information, populating information for multiple scenes in a preset video template; populating the populating information for the multiple scenes into corresponding scenes in the preset video template, respectively, to obtain image information of the multiple scenes; generating audio information of the multiple scenes according to the populating information for the multiple scenes; generating a video of the document based on the image information and audio information of the multiple scenes.
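
The steps listed in the abstract (extract populating information, fill a preset template per scene, produce audio per scene, combine) can be illustrated with a toy sketch. Everything here is hypothetical: the two-scene template, the rule that the first line of the document is the title, and the strings that stand in for rendered images and synthesized speech are assumptions chosen only to show the data flow.

```python
from string import Template

# Hypothetical preset template: one layout per scene.
PRESET_TEMPLATE = [
    {"scene": "title", "layout": Template("TITLE CARD: $title")},
    {"scene": "summary", "layout": Template("SUMMARY SLIDE: $summary")},
]


def extract_populating_info(document):
    """Toy extraction: first non-empty line is the title, the rest is the summary."""
    lines = [ln.strip() for ln in document.splitlines() if ln.strip()]
    return {
        "title": {"title": lines[0]},
        "summary": {"summary": " ".join(lines[1:])},
    }


def generate_video(document):
    """Return one clip per scene, pairing image information with audio information."""
    info = extract_populating_info(document)
    clips = []
    for scene in PRESET_TEMPLATE:
        fields = info[scene["scene"]]
        # Filling the layout stands in for rendering the scene image.
        image = scene["layout"].substitute(fields)
        # A narration string stands in for text-to-speech output.
        audio = "NARRATION: " + " ".join(fields.values())
        clips.append({"scene": scene["scene"], "image": image, "audio": audio})
    return clips
```

A real system would replace the string placeholders with rendered frames and synthesized audio; the sketch only shows that both are derived from the same per-scene populating information.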

3.

Invention application
METHOD FOR PROCESSING SIGNAL, ELECTRONIC DEVICE, AND STORAGE MEDIUM (in force)

Publication No.: US20230135109A1

Publication date: 2023-05-04

Application No.: US18050672

Filing date: 2022-10-28

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Tianyi Wu, Sitong Wu, Guodong Guo

IPC: G06K9/62

Abstract: A method for processing a signal includes: in response to receiving an input feature map of the signal, dividing the input feature map into patches of a plurality of rows and patches of a plurality of columns, in which the input feature map represents features of the signal; selecting a row subset from the plurality of rows and a column subset from the plurality of columns, in which rows in the row subset are at least one row apart from each other, and columns in the column subset are at least one column apart from each other; and obtaining aggregated features by performing self-attention calculation on patches of the row subset and patches of the column subset.
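
A rough sketch of the selection and attention steps described above follows. It assumes that "at least one row apart" means a stride of 2 or more between selected indices, that the patches in the selected rows and columns are attended over jointly, and that attention uses a single head with identity projections. The abstract confirms none of these choices, and all function names are illustrative.

```python
import math


def select_spaced(n, stride=2, offset=0):
    """Indices offset, offset + stride, ...; stride >= 2 leaves a gap between picks."""
    if stride < 2:
        raise ValueError("stride must be >= 2 so selected indices are not adjacent")
    return list(range(offset, n, stride))


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product attention with Q = K = V = tokens."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens))
                    for c in range(d)])
    return out


def aggregate(grid, stride=2):
    """grid[r][c] is the feature vector of the patch at row r, column c.

    Returns a dict mapping (row, col) to the aggregated feature of every
    patch that lies in a selected row or a selected column.
    """
    rows = set(select_spaced(len(grid), stride))
    cols = set(select_spaced(len(grid[0]), stride))
    coords = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))
              if r in rows or c in cols]
    tokens = [grid[r][c] for r, c in coords]
    return dict(zip(coords, self_attention(tokens)))
```

Because only a spaced subset of rows and columns takes part, the attention runs over fewer tokens than full self-attention on the whole patch grid would.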

4.

Granted invention patent
Video generation method, apparatus, electronic device, storage medium and program product (in force)

Publication No.: US11929100B2

Publication date: 2024-03-12

Application No.: US17887179

Filing date: 2022-08-12

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Jimin Pi, Xin Wang, Guodong Guo

IPC: G11B27/036, G06F16/735, G06V30/416, G10L13/08

CPC classification number: G11B27/036, G06F16/735, G06V30/416, G10L13/08

Abstract: This disclosure provides a video generation method, a video generation apparatus, an electronic device, a storage medium and a program product, and relates to the field of artificial intelligence technology, and in particular to the field of computer vision technology and deep learning technology. A specific implementation includes: obtaining document content information of a document; extracting, from the document content information, populating information for multiple scenes in a preset video template; populating the populating information for the multiple scenes into corresponding scenes in the preset video template, respectively, to obtain image information of the multiple scenes; generating audio information of the multiple scenes according to the populating information for the multiple scenes; generating a video of the document based on the image information and audio information of the multiple scenes.

5.

Invention application
METHOD FOR TRAINING STUDENT NETWORK AND METHOD FOR RECOGNIZING IMAGE (in force)

Publication No.: US20230046088A1

Publication date: 2023-02-16

Application No.: US17975874

Filing date: 2022-10-28

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Tianyi Wu, Yu Zhu, Guodong Guo

IPC: G06V10/77, G06V10/774

Abstract: Disclosed are a method for training a Student Network and a method for recognizing an image. The method includes: acquiring first prediction feature information of a sample image on the first granularity and second prediction feature information of the sample image on the second granularity by inputting the sample image into a Student Network, and acquiring first feature information of the sample image on the first granularity and second feature information of the sample image on the second granularity by inputting the sample image into a Teacher Network, and acquiring a target Student Network.
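
The abstract does not state how the Student Network features are compared with the Teacher Network features. As a hedged illustration only, the sketch below assumes a weighted sum of mean squared errors, one term per granularity. The function names and the weights parameter are hypothetical.

```python
def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def two_granularity_distillation_loss(student_feats, teacher_feats,
                                      weights=(1.0, 1.0)):
    """Compare Student and Teacher features at two granularities.

    student_feats / teacher_feats: (first_granularity, second_granularity)
    pairs of feature vectors for the same sample image.
    """
    (s1, s2), (t1, t2) = student_feats, teacher_feats
    return weights[0] * mse(s1, t1) + weights[1] * mse(s2, t2)
```

Under this assumption, minimizing the loss pulls the Student's prediction features toward the Teacher's features at both granularities at once.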
