Patent search ap:("Google LLC") AND inv:"Sourish Chaudhuri" Page 1

1.

Granted invention patent
Hot-word free adaptation of automated assistant function(s) [Active]

Publication (announcement) number: US11688417B2

Publication (announcement) date: 2023-06-27

Application number: US16622771

Application date: 2019-05-02

Applicant: Google LLC

Inventor: Jaclyn Konzelmann, Kenneth Mixter, Sourish Chaudhuri, Tuan Nguyen, Hideaki Matsui, Caroline Pantofaru, Vinay Bettadapura

IPC: G10L25/78, G06F3/16, G06V40/18, G06F40/30

CPC classification number: G10L25/78, G06F3/167, G06V40/18, G06F40/30

Abstract: Hot-word free adaptation of one or more function(s) of an automated assistant. Sensor data, from one or more sensor components of an assistant device that provides an automated assistant interface (graphical and/or audible), is processed to determine occurrence and/or confidence metric(s) of various attributes of a user that is proximal to the assistant device. Whether to adapt each of one or more of the function(s) of the automated assistant is based on the occurrence and/or the confidence of one or more of the various attributes. For example, certain processing of at least some of the sensor data can be initiated, such as initiating previously dormant local processing of at least some of the sensor data and/or initiating transmission of at least some of the audio data to remote automated assistant component(s).
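The decision described in this abstract can be pictured with a minimal sketch. The attribute names (`gaze`, `mouth_movement`, `voice_activity`), the thresholds, and the policy that maps passed attributes to actions are all hypothetical; the abstract does not name specific attributes or rules.

```python
from dataclasses import dataclass

# Hypothetical attributes and thresholds; none of these come from the patent.
THRESHOLDS = {"gaze": 0.7, "mouth_movement": 0.6, "voice_activity": 0.8}


@dataclass
class Adaptation:
    start_local_processing: bool  # wake previously dormant on-device processing
    transmit_audio: bool          # send audio to remote assistant components


def decide_adaptation(confidences: dict) -> Adaptation:
    """Map per-attribute confidence metrics to assistant adaptations."""
    passed = {
        name
        for name, threshold in THRESHOLDS.items()
        if confidences.get(name, 0.0) >= threshold
    }
    # Assumed policy: any confident attribute starts local processing,
    # while transmission requires every attribute to pass its threshold.
    return Adaptation(
        start_local_processing=bool(passed),
        transmit_audio=passed == set(THRESHOLDS),
    )
```

Under this assumed policy, a confident gaze alone would start local processing without sending any audio off the device.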

2.

Published invention application
HOT-WORD FREE ADAPTATION OF AUTOMATED ASSISTANT FUNCTION(S) [Under examination, published]

Publication (announcement) number: US20230253009A1

Publication (announcement) date: 2023-08-10

Application number: US18135611

Application date: 2023-04-17

Applicant: GOOGLE LLC

Inventor: Jaclyn Konzelmann, Kenneth Mixter, Sourish Chaudhuri, Tuan Nguyen, Hideaki Matsui, Caroline Pantofaru, Vinay Bettadapura

IPC: G10L25/78, G06F3/16, G06V40/18

CPC classification number: G10L25/78, G06F3/167, G06V40/18, G06F40/30

Abstract: Hot-word free adaptation of one or more function(s) of an automated assistant. Sensor data, from one or more sensor components of an assistant device that provides an automated assistant interface (graphical and/or audible), is processed to determine occurrence and/or confidence metric(s) of various attributes of a user that is proximal to the assistant device. Whether to adapt each of one or more of the function(s) of the automated assistant is based on the occurrence and/or the confidence of one or more of the various attributes. For example, certain processing of at least some of the sensor data can be initiated, such as initiating previously dormant local processing of at least some of the sensor data and/or initiating transmission of at least some of the audio data to remote automated assistant component(s).

3.

Granted invention patent
Gating model for video analysis [Active]

Publication (announcement) number: US11587319B2

Publication (announcement) date: 2023-02-21

Application number: US17216925

Application date: 2021-03-30

Applicant: Google LLC

Inventor: Sharadh Ramaswamy, Sourish Chaudhuri, Joseph Roth

IPC: G06V20/40, G06F40/169, G06K9/62, G06N3/08

Abstract: Implementations described herein relate to methods, devices, and computer-readable media to perform gating for video analysis. In some implementations, a computer-implemented method includes obtaining a video comprising a plurality of frames and corresponding audio. The method further includes performing sampling to select a subset of the plurality of frames based on a target frame rate and extracting a respective audio spectrogram for each frame in the subset of the plurality of frames. The method further includes reducing resolution of the subset of the plurality of frames. The method further includes applying a machine-learning based gating model to the subset of the plurality of frames and corresponding audio spectrograms and obtaining, as output of the gating model, an indication of whether to analyze the video to add one or more video annotations.
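The preprocessing steps named in this abstract (frame sampling at a target rate, resolution reduction, then a gating decision) can be sketched as follows. This is an illustration only: the function names, the average-pooling downscale, the 0.5 threshold, and the `gating_model` callable are assumptions, and spectrogram extraction is left to the caller.

```python
from typing import Callable, List


def sample_frame_indices(num_frames: int, source_fps: float, target_fps: float) -> List[int]:
    """Pick frame indices so the subset approximates target_fps."""
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    step = max(source_fps / target_fps, 1.0)  # never upsample
    indices = []
    position = 0.0
    while int(position) < num_frames:
        indices.append(int(position))
        position += step
    return indices


def downscale(frame: List[List[float]], factor: int) -> List[List[float]]:
    """Reduce resolution by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(frame) // factor, len(frame[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [
                frame[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def should_analyze(frames, spectrograms, gating_model: Callable, threshold: float = 0.5) -> bool:
    """Run the (hypothetical) gating model and threshold its score."""
    return gating_model(frames, spectrograms) >= threshold
```

The point of such a gate is cost: the cheap model sees only a few low-resolution frames, and the expensive annotation pass runs only when the gate says the video is worth analyzing.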

4.

Invention application
GATING MODEL FOR VIDEO ANALYSIS [Active]

Publication (announcement) number: US20210216778A1

Publication (announcement) date: 2021-07-15

Application number: US17216925

Application date: 2021-03-30

Applicant: Google LLC

Inventor: Sharadh Ramaswamy, Sourish Chaudhuri, Joseph Roth

IPC: G06K9/00, G06F40/169, G06K9/62, G06N3/08

Abstract: Implementations described herein relate to methods, devices, and computer-readable media to perform gating for video analysis. In some implementations, a computer-implemented method includes obtaining a video comprising a plurality of frames and corresponding audio. The method further includes performing sampling to select a subset of the plurality of frames based on a target frame rate and extracting a respective audio spectrogram for each frame in the subset of the plurality of frames. The method further includes reducing resolution of the subset of the plurality of frames. The method further includes applying a machine-learning based gating model to the subset of the plurality of frames and corresponding audio spectrograms and obtaining, as output of the gating model, an indication of whether to analyze the video to add one or more video annotations.

5.

Invention application
CONTEXT-BASED SPEAKER COUNTER FOR A SPEAKER DIARIZATION SYSTEM [Active]

Publication (announcement) number: US20230103060A1

Publication (announcement) date: 2023-03-30

Application number: US17909879

Application date: 2020-03-13

Applicant: Google LLC

Inventor: Sourish Chaudhuri, Lev Finkelstein

IPC: G06V20/40, G06V40/16, G06V10/762, G10L25/57, G10L21/028, G10L17/02

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining the number of speakers in a video and a corresponding audio using visual context. In one aspect, a method includes detecting multiple speakers within the video, determining a bounding box for each detected speaker that includes the detected person and objects within a threshold distance of the detected person in an image frame, determining a unique descriptor for that person based in part on image information depicting the objects within the bounding box, determining a cardinality of unique speakers in the video, and providing the cardinality of unique speakers to the speaker diarization system.
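One way to picture the "cardinality of unique speakers" step is to group per-detection descriptors and count the groups. The sketch below uses a greedy distance-threshold grouping over plain numeric vectors; the grouping rule, the Euclidean metric, and the `max_distance` parameter are assumptions and are not taken from the patent.

```python
import math
from typing import List, Sequence


def count_unique_speakers(descriptors: Sequence[Sequence[float]], max_distance: float) -> int:
    """Count speakers by greedily grouping descriptors that lie close together.

    A descriptor joins an existing group when it is within max_distance of
    that group's first member; otherwise it starts a new group.
    """
    representatives: List[Sequence[float]] = []
    for descriptor in descriptors:
        if not any(math.dist(descriptor, rep) <= max_distance for rep in representatives):
            representatives.append(descriptor)
    return len(representatives)
```

The resulting count would then be handed to the diarization system, which can use it to fix the number of speaker clusters instead of estimating it from audio alone.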

6.

Invention application
AUTOMATIC DETERMINATION OF TIMING WINDOWS FOR SPEECH CAPTIONS IN AN AUDIO STREAM [Under examination, published]

Publication (announcement) number: US20200090678A1

Publication (announcement) date: 2020-03-19

Application number: US16685187

Application date: 2019-11-15

Applicant: GOOGLE LLC

Inventor: Sourish Chaudhuri, Nebojsa Ciric, Khiem Pham

IPC: G10L25/27, G10L19/022, G10L25/93, G11B27/28, G10L25/87, G10L25/48

Abstract: The technology disclosed herein may determine timing windows for speech captions of an audio stream. In one example, the technology may involve accessing audio data comprising a plurality of segments; determining, by a processing device, that one or more of the plurality of segments comprise speech sounds; identifying a time duration for the speech sounds; and providing a user interface element corresponding to the time duration for the speech sounds, wherein the user interface element indicates an estimate of a beginning and ending of the speech sounds and is configured to receive caption text associated with the speech sounds of the audio data.
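The step from per-segment speech decisions to caption timing windows can be sketched as a run-length merge. The sketch assumes fixed-length segments and a speech/non-speech flag already computed for each one; `speech_windows` and its parameters are hypothetical names, and the speech detector itself is out of scope.

```python
from typing import List, Tuple


def speech_windows(is_speech: List[bool], segment_seconds: float) -> List[Tuple[float, float]]:
    """Merge consecutive speech segments into (start, end) windows in seconds."""
    windows = []
    start = None  # index of the first segment in the current speech run
    for index, flag in enumerate(is_speech):
        if flag and start is None:
            start = index
        elif not flag and start is not None:
            windows.append((start * segment_seconds, index * segment_seconds))
            start = None
    if start is not None:  # audio ended while speech was still running
        windows.append((start * segment_seconds, len(is_speech) * segment_seconds))
    return windows
```

Each returned window corresponds to one user interface element in the abstract's terms: an estimated beginning and ending of speech into which caption text can be entered.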

7.

Granted invention patent
Audio classifier [Active]

Publication (announcement) number: US10381022B1

Publication (announcement) date: 2019-08-13

Application number: US15041379

Application date: 2016-02-11

Applicant: Google LLC

Inventor: Sourish Chaudhuri, Achal D. Dave, Bryan Andrew Seybold

IPC: G06K9/00, G10L15/01, G10L15/04, G10L15/06, G10L17/00, G10L17/04, G10L17/26, G10L25/57, G06F16/638

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for audio classifiers. In one aspect, a method includes obtaining a plurality of video frames from a plurality of videos, wherein each of the plurality of video frames is associated with one or more image labels of a plurality of image labels determined based on image recognition; obtaining a plurality of audio segments corresponding to the plurality of video frames, wherein each audio segment has a specified duration relative to the corresponding video frame; and generating an audio classifier trained using the plurality of audio segments and the associated image labels as input, wherein the audio classifier is trained such that the one or more groups of audio segments are determined to be associated with respective one or more audio labels.
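The data-preparation idea in this abstract, pairing each labeled video frame with an audio segment of a specified duration around it, can be sketched as below. Centering the window on the frame timestamp, clipping it to the audio length, and the function names are assumptions made for illustration; the abstract only says the duration is specified relative to the frame.

```python
from typing import List, Sequence, Tuple


def audio_window(frame_time: float, duration: float, audio_length: float) -> Tuple[float, float]:
    """Window of `duration` seconds centered on the frame, clipped to the audio."""
    start = max(0.0, frame_time - duration / 2)
    end = min(audio_length, frame_time + duration / 2)
    return (start, end)


def build_training_pairs(
    frames: Sequence[Tuple[float, Sequence[str]]],
    duration: float,
    audio_length: float,
) -> List[Tuple[Tuple[float, float], str]]:
    """Turn (timestamp, image_labels) frames into (audio_window, label) pairs."""
    pairs = []
    for timestamp, labels in frames:
        window = audio_window(timestamp, duration, audio_length)
        for label in labels:
            # Each image label becomes a weak label for the co-occurring audio.
            pairs.append((window, label))
    return pairs
```

Pairs built this way would serve as weakly labeled training data: the image labels come from image recognition, so no one has to annotate the audio by hand.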

8.

Granted invention patent
Automatic determination of timing windows for speech captions in an audio stream [Active]

Publication (announcement) number: US11011184B2

Publication (announcement) date: 2021-05-18

Application number: US16685187

Application date: 2019-11-15

Applicant: GOOGLE LLC

Inventor: Sourish Chaudhuri, Nebojsa Ciric, Khiem Pham

IPC: G10L21/00, G10L25/27, G11B27/28, G10L25/87, G10L25/48, G10L19/022, G10L25/93

Abstract: The technology disclosed herein may determine timing windows for speech captions of an audio stream. In one example, the technology may involve accessing audio data comprising a plurality of segments; determining, by a processing device, that one or more of the plurality of segments comprise speech sounds; identifying a time duration for the speech sounds; and providing a user interface element corresponding to the time duration for the speech sounds, wherein the user interface element indicates an estimate of a beginning and ending of the speech sounds and is configured to receive caption text associated with the speech sounds of the audio data.

9.

Granted invention patent
Gating model for video analysis [Active]

Publication (announcement) number: US10984246B2

Publication (announcement) date: 2021-04-20

Application number: US16352605

Application date: 2019-03-13

Applicant: Google LLC

Inventor: Sharadh Ramaswamy, Sourish Chaudhuri, Joseph Roth

IPC: G06K9/00, G06F40/169, G06K9/62, G06N3/08

Abstract: Implementations described herein relate to methods, devices, and computer-readable media to perform gating for video analysis. In some implementations, a computer-implemented method includes obtaining a video comprising a plurality of frames and corresponding audio. The method further includes performing sampling to select a subset of the plurality of frames based on a target frame rate and extracting a respective audio spectrogram for each frame in the subset of the plurality of frames. The method further includes reducing resolution of the subset of the plurality of frames. The method further includes applying a machine-learning based gating model to the subset of the plurality of frames and corresponding audio spectrograms and obtaining, as output of the gating model, an indication of whether to analyze the video to add one or more video annotations.

10.

Invention application
HOT-WORD FREE ADAPTATION OF AUTOMATED ASSISTANT FUNCTION(S) [Under examination, published]

Publication (announcement) number: US20200349966A1

Publication (announcement) date: 2020-11-05

Application number: US16622771

Application date: 2019-05-02

Applicant: Google LLC

Inventor: Jaclyn Konzelmann, Kenneth Mixter, Sourish Chaudhuri, Tuan Nguyen, Hideaki Matsui, Caroline Pantofaru, Vinay Bettadapura

IPC: G10L25/78, G06F3/16, G06K9/00

Abstract: Hot-word free adaptation of one or more function(s) of an automated assistant. Sensor data, from one or more sensor components of an assistant device that provides an automated assistant interface (graphical and/or audible), is processed to determine occurrence and/or confidence metric(s) of various attributes of a user that is proximal to the assistant device. Whether to adapt each of one or more of the function(s) of the automated assistant is based on the occurrence and/or the confidence of one or more of the various attributes. For example, certain processing of at least some of the sensor data can be initiated, such as initiating previously dormant local processing of at least some of the sensor data and/or initiating transmission of at least some of the audio data to remote automated assistant component(s).
