Patent search ap:("Google LLC") AND inv:"Sourish Chaudhuri" Page 2

11.

Patent application
GATING MODEL FOR VIDEO ANALYSIS (Status: under examination, published)

Publication No.: US20200293783A1

Publication date: 2020-09-17

Application No.: US16352605

Filing date: 2019-03-13

Applicant: Google LLC

Inventors: Sharadh Ramaswamy, Sourish Chaudhuri, Joseph Roth

IPC: G06K9/00, G06K9/62, G06N3/08, G06F17/24

Abstract: Implementations described herein relate to methods, devices, and computer-readable media to perform gating for video analysis. In some implementations, a computer-implemented method includes obtaining a video comprising a plurality of frames and corresponding audio. The method further includes performing sampling to select a subset of the plurality of frames based on a target frame rate and extracting a respective audio spectrogram for each frame in the subset of the plurality of frames. The method further includes reducing resolution of the subset of the plurality of frames. The method further includes applying a machine-learning based gating model to the subset of the plurality of frames and corresponding audio spectrograms and obtaining, as output of the gating model, an indication of whether to analyze the video to add one or more video annotations.
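
The pipeline in this abstract (sample frames at a target rate, reduce their resolution, then ask a gating model whether full analysis is worthwhile) can be sketched roughly as follows. This is an illustrative sketch only, not the implementation claimed in the application: the function names, the block-averaging downscale, the 0.5 threshold, and the assumption that one spectrogram has already been extracted per source frame are all hypothetical, and `gating_model` stands in for whatever trained model would be used.

```python
from typing import Callable, List, Sequence

Frame = List[List[float]]  # hypothetical: a grayscale frame as rows of pixel values


def sample_frame_indices(num_frames: int, source_fps: float, target_fps: float) -> List[int]:
    """Pick frame indices so that the selected subset approximates target_fps."""
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    step = max(source_fps / target_fps, 1.0)  # never sample more densely than the source
    indices = []
    position = 0.0
    while int(position) < num_frames:
        indices.append(int(position))
        position += step
    return indices


def downscale(frame: Frame, factor: int) -> Frame:
    """Reduce resolution by averaging non-overlapping factor x factor pixel blocks."""
    height = len(frame) - len(frame) % factor
    width = len(frame[0]) - len(frame[0]) % factor
    out = []
    for r in range(0, height, factor):
        row = []
        for c in range(0, width, factor):
            block = [frame[r + i][c + j] for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def should_analyze(
    frames: Sequence[Frame],
    spectrograms: Sequence[object],  # assumed: one precomputed spectrogram per source frame
    source_fps: float,
    target_fps: float,
    factor: int,
    gating_model: Callable[[list, list], float],  # hypothetical: returns a score in [0, 1]
    threshold: float = 0.5,  # assumed cut-off, not taken from the abstract
) -> bool:
    """Return True if the gating model indicates the video is worth full analysis."""
    indices = sample_frame_indices(len(frames), source_fps, target_fps)
    small_frames = [downscale(frames[i], factor) for i in indices]
    audio = [spectrograms[i] for i in indices]
    return gating_model(small_frames, audio) >= threshold
```

The design point the abstract suggests is that the gate sees only a small, low-resolution subset of the video, so it can be much cheaper than the annotation step it guards.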

12.

Granted patent
Audio classifier (Status: in force)

Publication No.: US10566009B1

Publication date: 2020-02-18

Application No.: US16520633

Filing date: 2019-07-24

Applicant: Google LLC

Inventors: Sourish Chaudhuri, Achal D. Dave, Bryan Andrew Seybold

IPC: G10L25/57, G06F16/638, G10L17/00, G10L15/06, G10L17/04, G10L17/26, G10L15/04, G10L15/01, G06K9/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for audio classifiers. In one aspect, a method includes obtaining a plurality of video frames from a plurality of videos, wherein each of the plurality of video frames is associated with one or more image labels of a plurality of image labels determined based on image recognition; obtaining a plurality of audio segments corresponding to the plurality of video frames, wherein each audio segment has a specified duration relative to the corresponding video frame; and generating an audio classifier trained using the plurality of audio segments and the associated image labels as input, wherein the audio classifier is trained such that the one or more groups of audio segments are determined to be associated with respective one or more audio labels.
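
The idea of using image labels from video frames as supervision for the accompanying audio can be illustrated with a small sketch. Everything here is an assumption made for illustration: the window is centred on the frame timestamp, audio segments are represented by hypothetical precomputed feature vectors, and a nearest-centroid classifier is swapped in as a stand-in for whatever model the patent actually trains.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def audio_window(frame_time: float, duration: float, clip_length: float) -> Tuple[float, float]:
    """Return a (start, end) window of about `duration` seconds around a frame.

    Assumption: the window is centred on the frame and clamped to the clip bounds.
    """
    start = max(0.0, frame_time - duration / 2)
    end = min(clip_length, start + duration)
    return start, end


def train_centroids(
    examples: Iterable[Tuple[Sequence[float], Sequence[str]]],
) -> Dict[str, List[float]]:
    """Average the audio features of all segments that share an image label.

    Each example is (audio_feature_vector, image_labels_of_the_matching_frame).
    """
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for features, labels in examples:
        for label in labels:
            if label not in sums:
                sums[label] = [0.0] * len(features)
            sums[label] = [s + f for s, f in zip(sums[label], features)]
            counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def classify(centroids: Dict[str, List[float]], features: Sequence[float]) -> str:
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid: Sequence[float]) -> float:
        return sum((c - f) ** 2 for c, f in zip(centroid, features))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))
```

For example, training on features from segments paired with frames labelled "dog" and "guitar" yields one centroid per label, and `classify` then maps a new audio feature vector to the nearest one.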
