Patent search ap:("Beijing Baidu Netcom Science Technology Co., Ltd.") AND inv:"Dongliang HE"

1.

Invention publication
METHOD FOR TRAINING MULTI-MODAL DATA MATCHING DEGREE CALCULATION MODEL, METHOD FOR CALCULATING MULTI-MODAL DATA MATCHING DEGREE, AND RELATED APPARATUSES (Status: under examination, published)

Publication number: US20230215136A1

Publication date: 2023-07-06

Application number: US18113826

Application date: 2023-02-24

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Haoran WANG, Dongliang HE, Fu LI, Errui DING

IPC: G06V10/74, G06V10/77

CPC classification number: G06V10/761, G06V10/7715

Abstract: The present disclosure provides a method and apparatus for training a multi-modal data matching degree calculation model, a method and apparatus for calculating a multi-modal data matching degree, an electronic device, a computer readable storage medium and a computer program product, and relates to the field of artificial intelligence technology such as deep learning, image processing and computer vision. The method comprises: acquiring first sample data and second sample data that are different in modalities; constructing a contrastive learning loss function comprising a semantic perplexity parameter, the semantic perplexity parameter being determined based on a semantic feature distance between the first sample data and the second sample data; and training, by using the contrastive learning loss function, an initial multi-modal data matching degree calculation model through a contrastive learning approach, to obtain a target multi-modal data matching degree calculation model.
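
The abstract does not say how the semantic perplexity parameter enters the loss, so the following is only a minimal sketch under stated assumptions: the parameter is taken to be the cosine distance between a matched pair of features, and it is assumed to soften the temperature of an InfoNCE-style contrastive loss. The names `contrastive_loss` and `base_temperature` are hypothetical and are not taken from the patent.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_loss(first_feats, second_feats, base_temperature=0.1):
    """InfoNCE-style loss over matched pairs (first_feats[i], second_feats[i]).

    Assumption: the "semantic perplexity" of a pair is its cosine distance,
    and a larger perplexity yields a softer (larger) temperature.
    """
    n = len(first_feats)
    total = 0.0
    for i in range(n):
        # Hypothetical semantic perplexity parameter, in the range [0, 2].
        perplexity = 1.0 - cosine_similarity(first_feats[i], second_feats[i])
        temperature = base_temperature * (1.0 + perplexity)
        logits = [
            cosine_similarity(first_feats[i], second_feats[j]) / temperature
            for j in range(n)
        ]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits)
        )
        # Negative log-probability of the matched pair.
        total += log_sum_exp - logits[i]
    return total / n
```

Under these assumptions, a batch whose matched pairs are well aligned gives a loss close to zero, while a batch with swapped pairs gives a much larger loss.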

2.

Invention application
VIDEO REPAIRING METHODS, APPARATUS, DEVICE, MEDIUM AND PRODUCTS (Status: granted)

Publication number: US20230008473A1

Publication date: 2023-01-12

Application number: US17944745

Application date: 2022-09-14

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Xin LI, He ZHENG, Fanglong LIU, Dongliang HE

IPC: H04N19/159, H04N19/182, G06V20/40

Abstract: A video repairing method, apparatus, device, medium, and product are provided. The method includes: acquiring a to-be-repaired video frame sequence; determining a target category corresponding to each pixel in the to-be-repaired video frame sequence based on the to-be-repaired video frame sequence and a preset category detection model; determining, from the to-be-repaired video frame sequence, to-be-repaired pixels each with a target category being a to-be-repaired category; and performing repairing on to-be-repaired areas corresponding to the to-be-repaired pixels to obtain a target video frame sequence.
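
The steps in this abstract (classify each pixel, select the to-be-repaired pixels, repair them) can be illustrated with a toy sketch. Everything here is an assumption for illustration: frames are small grayscale grids, the "preset category detection model" is replaced by a fixed-value check, and repair is a plain average of neighbouring pixels that were not flagged. The patent's actual model and repair method are not described in the abstract.

```python
def classify_pixels(frame, scratch_value=255):
    """Stand-in for the category detection model: label every pixel."""
    return [
        ["repair" if pixel == scratch_value else "keep" for pixel in row]
        for row in frame
    ]


def repair_frame(frame, labels):
    """Replace each 'repair' pixel with the mean of its 'keep' neighbours."""
    height, width = len(frame), len(frame[0])
    repaired = [row[:] for row in frame]
    for y in range(height):
        for x in range(width):
            if labels[y][x] != "repair":
                continue
            neighbours = [
                frame[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if labels[ny][nx] == "keep"
            ]
            # Leave the pixel unchanged if no clean neighbour exists.
            if neighbours:
                repaired[y][x] = sum(neighbours) // len(neighbours)
    return repaired


def repair_video(frames):
    """Apply classification and repair to every frame in the sequence."""
    return [repair_frame(frame, classify_pixels(frame)) for frame in frames]
```

For example, a 3x3 frame of value 10 with a single 255 pixel in the centre comes back with the centre set to 10 and all other pixels unchanged.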

3.

Invention application
METHOD FOR PROCESSING IMAGE, DEVICE AND STORAGE MEDIUM (Status: granted)

Publication number: US20220319141A1

Publication date: 2022-10-06

Application number: US17845843

Application date: 2022-06-21

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Fanglong LIU, Xin LI, Dongliang HE

IPC: G06V10/22, G06T7/11, H04N19/174

Abstract: A method for processing an image, a device, and a storage medium are provided. The method may include: inputting a target image into a pre-trained image segmentation model, the target image including at least one sub-image; extracting high-level semantic features and low-level features of the target image through the image segmentation model, and determining target location information of the sub-image in the target image based on the high-level semantic features and the low-level features; and performing a preset processing operation on the sub-image, based on the target location information of the sub-image.

4.

Invention publication
CROSS-MODAL FEATURE EXTRACTION, RETRIEVAL, AND MODEL TRAINING METHOD AND APPARATUS, AND MEDIUM (Status: under examination, published)

Publication number: US20240013558A1

Publication date: 2024-01-11

Application number: US18113266

Application date: 2023-02-23

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Haoran WANG, Dongliang HE, Fu LI, Errui DING

IPC: G06V20/70, G06V10/774, G06V20/40, G06F40/30, G06F40/279

CPC classification number: G06V20/70, G06V10/774, G06V20/46, G06F40/30, G06F40/279

Abstract: There is provided cross-modal feature extraction, retrieval, and model training methods and apparatuses, and a medium, which relates to the field of artificial intelligence (AI) technologies, and specifically to fields of deep learning, image processing, and computer vision technologies. A specific implementation solution involves: acquiring to-be-processed data, the to-be-processed data corresponding to at least two types of first modalities; determining first data of a second modality in the to-be-processed data, the second modality being any of the types of the first modalities; performing semantic entity extraction on the first data to obtain semantic entities; and acquiring semantic coding features of the first data based on the first data and the semantic entities and by using a pre-trained cross-modal feature extraction model.

5.

Invention application
METHOD OF PROCESSING VIDEO, METHOD OF QUERING VIDEO, AND METHOD OF TRAINING MODEL (Status: granted)

Publication number: US20230130006A1

Publication date: 2023-04-27

Application number: US18145724

Application date: 2022-12-22

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Dongliang HE, Errui DING, Haifeng WANG

IPC: G06V20/40, G06V10/774, G06V10/86, G06F16/73, G06F16/783

Abstract: The present application provides a method of processing a video, a method of querying a video, and a method of training a video processing model. A specific implementation solution of the method of processing the video includes: extracting, for a video to be processed, a plurality of video features under a plurality of receptive fields; extracting a local feature of the video to be processed according to a video feature under a target receptive field in the plurality of receptive fields; obtaining a global feature of the video to be processed according to a video feature under a largest receptive field in the plurality of receptive fields; and merging the local feature and the global feature to obtain a target feature of the video to be processed.
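
As a rough illustration of the local/global split described in this abstract, the sketch below treats a video as a list of per-frame scalar values and models a "receptive field" as the width of a sliding temporal average. These choices, the max-pooling used for the local feature, and the concatenation used for merging are all assumptions made for the example. The abstract does not specify the network, the pooling, or the merge operation.

```python
def pooled_features(frame_values, window):
    """Average over sliding temporal windows of the given width."""
    return [
        sum(frame_values[i:i + window]) / window
        for i in range(len(frame_values) - window + 1)
    ]


def video_feature(frame_values, receptive_fields=(1, 2, 4), target_field=2):
    """Merge a local and a global feature into one target feature.

    Assumes len(frame_values) >= max(receptive_fields).
    """
    # One feature sequence per receptive field.
    features = {w: pooled_features(frame_values, w) for w in receptive_fields}
    # Local feature: strongest response under the target receptive field.
    local = max(features[target_field])
    # Global feature: mean response under the largest receptive field.
    largest = max(receptive_fields)
    global_feature = sum(features[largest]) / len(features[largest])
    # Merge by concatenation (an assumption; other merges are possible).
    return [local, global_feature]
```

With `frame_values = [0, 0, 4, 4]` and the default settings, the width-2 windows average to 0, 2 and 4, so the local feature is 4.0, and the single width-4 window averages to 2.0, so the result is `[4.0, 2.0]`.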

6.

Invention publication
METHOD AND APPARATUS FOR PRE-TRAINING SEMANTIC REPRESENTATION MODEL AND ELECTRONIC DEVICE (Status: under examination, published)

Publication number: US20230147550A1

Publication date: 2023-05-11

Application number: US18051594

Application date: 2022-11-01

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Dongliang HE, Errui DING

IPC: G06V10/774, G06V20/40, G06F40/30, G06V30/19

CPC classification number: G06V10/774, G06V20/41, G06F40/30, G06V30/19147

Abstract: A method for pre-training a semantic representation model includes: for each video-text pair in pre-training data, determining a mask image sequence, a mask character sequence, and a mask image-character sequence of the video-text pair; determining a plurality of feature sequences and mask position prediction results respectively corresponding to the plurality of feature sequences by inputting the mask image sequence, the mask character sequence, and the mask image-character sequence into an initial semantic representation model; and building a loss function based on the plurality of feature sequences, the mask position prediction results respectively corresponding to the plurality of feature sequences and true mask position results, and adjusting coefficients of the semantic representation model to realize training.
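
The first step of this abstract, building three masked inputs from one video-text pair, might look like the sketch below. It is a minimal illustration only: frames and characters are plain list items, masking is uniform random at an assumed `mask_ratio`, and the joint sequence is assumed to be the frames followed by the characters. The names `MASK`, `mask_sequence` and `build_masked_inputs` are hypothetical, and the model and loss from the later steps are not shown.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder for a masked frame or character


def mask_sequence(items, mask_ratio, rng):
    """Mask a random subset of positions; return the result and positions."""
    count = max(1, int(len(items) * mask_ratio))
    positions = sorted(rng.sample(range(len(items)), count))
    masked = [MASK if i in positions else item for i, item in enumerate(items)]
    return masked, positions


def build_masked_inputs(frames, characters, mask_ratio=0.25, seed=0):
    """Build the three masked sequences for one video-text pair."""
    rng = random.Random(seed)
    mask_images, image_positions = mask_sequence(frames, mask_ratio, rng)
    mask_chars, char_positions = mask_sequence(characters, mask_ratio, rng)
    # Assumption: the joint sequence is frames followed by characters.
    joint, joint_positions = mask_sequence(frames + characters, mask_ratio, rng)
    return {
        "mask_images": mask_images,
        "image_positions": image_positions,
        "mask_characters": mask_chars,
        "character_positions": char_positions,
        "mask_image_characters": joint,
        "image_character_positions": joint_positions,
    }
```

The recorded positions play the role of the "true mask position results" against which a model's mask position predictions would be compared when building the loss.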
