Patent search ap:("Samsung Electronics Co., Ltd.") AND inv:"Saurabh Sahu"

1.

Patent application
SYSTEM AND METHOD FOR ENHANCING MACHINE LEARNING MODEL FOR AUDIO/VIDEO UNDERSTANDING USING GATED MULTI-LEVEL ATTENTION AND TEMPORAL ADVERSARIAL TRAINING (Status: Active)

Publication No.: US20220300740A1

Publication date: 2022-09-22

Application No.: US17387889

Filing date: 2021-07-28

Applicant: Samsung Electronics Co., Ltd.

Inventor: Saurabh Sahu, Palash Goyal

IPC: G06K9/00, G06K9/62

Abstract: A method includes obtaining, using at least one processor, audio/video content. The method also includes processing, using the at least one processor, the audio/video content with a trained attention-based machine learning model to classify the audio/video content. Processing the audio/video content includes, using the trained attention-based machine learning model, generating a global representation of the audio/video content based on the audio/video content, generating a local representation of the audio/video content based on different portions of the audio/video content, and combining the global representation of the audio/video content and the local representation of the audio/video content to generate an output representation of the audio/video content. The audio/video content is classified based on the output representation.
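
The abstract above describes combining a global and a local representation of the content. The following is a minimal sketch of that general idea only, not the patented implementation. The function names, the dot-product attention, the fixed-length segments, and the scalar sigmoid gate are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Weighted average of frame vectors; weights are softmax(frame . query)."""
    weights = softmax([sum(f * q for f, q in zip(frame, query)) for frame in frames])
    dim = len(frames[0])
    return [sum(w * frame[d] for w, frame in zip(weights, frames)) for d in range(dim)]


def combine_global_local(frames, query, segment_len, gate_logit):
    """Hypothetical helper: gate between a global and a local representation."""
    # Global representation: attend over every frame at once.
    global_rep = attention_pool(frames, query)
    # Local representation: attend within each segment, then average the segments.
    # Fixed-length segmentation is an assumption, not taken from the abstract.
    segments = [frames[i:i + segment_len] for i in range(0, len(frames), segment_len)]
    local_reps = [attention_pool(seg, query) for seg in segments]
    local_rep = [sum(col) / len(local_reps) for col in zip(*local_reps)]
    # Scalar sigmoid gate (an assumption; a learned gate would normally be a vector).
    gate = 1.0 / (1.0 + math.exp(-gate_logit))
    return [gate * g + (1.0 - gate) * l for g, l in zip(global_rep, local_rep)]
```

In a trained model the query and the gate would be learned parameters, and the output representation would feed a classifier; here they are plain arguments so the sketch stays self-contained.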

2.

Granted patent
System and method for enhancing machine learning model for audio/video understanding using gated multi-level attention and temporal adversarial training (Status: Active)

Publication No.: US11989939B2

Publication date: 2024-05-21

Application No.: US17387889

Filing date: 2021-07-28

Applicant: Samsung Electronics Co., Ltd.

Inventor: Saurabh Sahu, Palash Goyal

IPC: G06V20/40, G06F18/214

CPC classification number: G06V20/41, G06F18/214

Abstract: A method includes obtaining, using at least one processor, audio/video content. The method also includes processing, using the at least one processor, the audio/video content with a trained attention-based machine learning model to classify the audio/video content. Processing the audio/video content includes, using the trained attention-based machine learning model, generating a global representation of the audio/video content based on the audio/video content, generating a local representation of the audio/video content based on different portions of the audio/video content, and combining the global representation of the audio/video content and the local representation of the audio/video content to generate an output representation of the audio/video content. The audio/video content is classified based on the output representation.

3.

Patent application
MICROGENRE-BASED HYPER-PERSONALIZATION WITH MULTI-MODAL MACHINE LEARNING (Status: Active)

Publication No.: US20220245424A1

Publication date: 2022-08-04

Application No.: US17368683

Filing date: 2021-07-06

Applicant: Samsung Electronics Co., Ltd.

Inventor: Palash Goyal, Saurabh Sahu, Shalini Ghosh, Hyun Chul Lee

IPC: G06N3/04, G06N3/08

Abstract: A method includes accessing video data that includes at least two different modalities. The method also includes using a convolutional neural network layer to incorporate temporal coherence into a machine learning model architecture configured to process the video data. The method further includes learning dependency among the at least two different modalities in an attention space of the machine learning model architecture. In addition, the method includes predicting one or more correlations among the at least two different modalities.
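
The two mechanisms named in the abstract above, a convolutional layer for temporal coherence and attention across modalities, can be illustrated with a rough sketch. This is not the claimed architecture: the function names, the per-feature 1-D kernel, the zero padding, and the scaled dot-product attention are assumptions chosen for illustration.

```python
import math


def temporal_conv(seq, kernel):
    """Slide an odd-length kernel along time, per feature, with zero padding.

    As in most deep learning libraries, the kernel is not flipped
    (strictly speaking this is a cross-correlation).
    """
    half = len(kernel) // 2
    dim = len(seq[0])
    out = []
    for t in range(len(seq)):
        row = [0.0] * dim
        for k, w in enumerate(kernel):
            src = t + k - half
            if 0 <= src < len(seq):  # positions outside the sequence count as zero
                for d in range(dim):
                    row[d] += w * seq[src][d]
        out.append(row)
    return out


def cross_modal_attention(queries, keys):
    """Return one row of attention weights per query step over all key steps.

    Hypothetical helper: `queries` might be video features and `keys`
    audio features with the same feature dimension.
    """
    scale = math.sqrt(len(keys[0]))
    weights = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights
```

Smoothing each modality with `temporal_conv` before calling `cross_modal_attention` gives a weight matrix that can be read as a soft alignment between time steps of the two modalities, which is one plausible reading of "learning dependency in an attention space".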
