-
Publication No.: US10566009B1
Publication date: 2020-02-18
Application No.: US16520633
Filing date: 2019-07-24
Applicant: Google LLC
Inventor: Sourish Chaudhuri, Achal D. Dave, Bryan Andrew Seybold
IPC: G10L25/57, G06F16/638, G10L17/00, G10L15/06, G10L17/04, G10L17/26, G10L15/04, G10L15/01, G06K9/00
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for audio classifiers. In one aspect, a method includes obtaining a plurality of video frames from a plurality of videos, wherein each of the plurality of video frames is associated with one or more image labels of a plurality of image labels determined based on image recognition; obtaining a plurality of audio segments corresponding to the plurality of video frames, wherein each audio segment has a specified duration relative to the corresponding video frame; and generating an audio classifier trained using the plurality of audio segments and the associated image labels as input, wherein the audio classifier is trained such that one or more groups of audio segments are determined to be associated with respective one or more audio labels.
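The data-preparation step in this abstract (pairing each labeled video frame with an audio segment of a specified duration) might look roughly like the following sketch. It is illustrative only and not taken from the patent: the function names `audio_window` and `build_training_pairs`, the choice to center the window on the frame timestamp, and the 1-second default duration are all assumptions. The classifier training itself is not shown.

```python
from typing import List, Sequence, Tuple


def audio_window(samples: Sequence[float], sample_rate: int,
                 frame_time_s: float, duration_s: float) -> List[float]:
    """Return the samples in a window of duration_s seconds.

    Assumption: the window is centered on the frame timestamp and is
    clipped to the bounds of the audio track.
    """
    half = duration_s / 2.0
    start = max(0, int(round((frame_time_s - half) * sample_rate)))
    end = min(len(samples), int(round((frame_time_s + half) * sample_rate)))
    return list(samples[start:end])


def build_training_pairs(
    samples: Sequence[float],
    sample_rate: int,
    labeled_frames: Sequence[Tuple[float, Sequence[str]]],
    duration_s: float = 1.0,  # hypothetical default, not from the source
) -> List[Tuple[List[float], List[str]]]:
    """Pair each frame's image labels with its surrounding audio segment.

    labeled_frames holds (timestamp in seconds, image labels) tuples,
    where the labels are assumed to come from an image recognizer.
    """
    pairs = []
    for frame_time_s, labels in labeled_frames:
        segment = audio_window(samples, sample_rate, frame_time_s, duration_s)
        if segment:  # skip frames whose window falls outside the audio
            pairs.append((segment, list(labels)))
    return pairs
```

The resulting (audio segment, image labels) pairs would then serve as training input for whatever audio classifier is chosen; the abstract does not specify the model.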
-
Publication No.: US11669977B2
Publication date: 2023-06-06
Application No.: US17214327
Filing date: 2021-03-26
Applicant: Google LLC
Inventor: Susanna Maria Ricco, Bryan Andrew Seybold
IPC: G06T7/73, G06T7/215, G06T7/246, G06V10/764, G06V20/40
CPC classification number: G06T7/215, G06T7/248, G06T7/74, G06V10/764, G06V20/40, G06T2207/10016, G06T2207/20081, G06T2207/20084
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an optical flow object localization system and a novel object localization system. In a first aspect, the optical flow object localization system is trained to process an optical flow image to generate object localization data defining locations of objects depicted in a video frame corresponding to the optical flow image. In a second aspect, a novel object localization system is trained to process a video frame to generate object localization data defining locations of novel objects depicted in the video frame.
-
Publication No.: US10991122B2
Publication date: 2021-04-27
Application No.: US16264222
Filing date: 2019-01-31
Applicant: Google LLC
Inventor: Susanna Maria Ricco, Bryan Andrew Seybold
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an optical flow object localization system and a novel object localization system. In a first aspect, the optical flow object localization system is trained to process an optical flow image to generate object localization data defining locations of objects depicted in a video frame corresponding to the optical flow image. In a second aspect, a novel object localization system is trained to process a video frame to generate object localization data defining locations of novel objects depicted in the video frame.
-
Publication No.: US10381022B1
Publication date: 2019-08-13
Application No.: US15041379
Filing date: 2016-02-11
Applicant: Google LLC
Inventor: Sourish Chaudhuri, Achal D. Dave, Bryan Andrew Seybold
IPC: G06K9/00, G10L15/01, G10L15/04, G10L15/06, G10L17/00, G10L17/04, G10L17/26, G10L25/57, G06F16/638
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for audio classifiers. In one aspect, a method includes obtaining a plurality of video frames from a plurality of videos, wherein each of the plurality of video frames is associated with one or more image labels of a plurality of image labels determined based on image recognition; obtaining a plurality of audio segments corresponding to the plurality of video frames, wherein each audio segment has a specified duration relative to the corresponding video frame; and generating an audio classifier trained using the plurality of audio segments and the associated image labels as input, wherein the audio classifier is trained such that one or more groups of audio segments are determined to be associated with respective one or more audio labels.
-
Publication No.: US20220383652A1
Publication date: 2022-12-01
Application No.: US17775529
Filing date: 2020-11-04
Applicant: Google LLC
Inventor: Bryan Andrew Seybold, Shan Yang, Bo Hu, Kevin Patrick Murphy, David Alexander Ross
Abstract: A computing system comprising one or more computing devices can obtain one or more images of an animal. The computing system can determine, using at least one of one or more machine-learned models, a plurality of joint positions associated with the animal based on the one or more images. The computing system can determine a body model for the animal. The computing system can estimate a body pose for the animal based on the one or more images, the plurality of joint positions, and the determined body model.
-
Publication No.: US20200349722A1
Publication date: 2020-11-05
Application No.: US16464608
Filing date: 2017-12-01
Applicant: Google LLC
Inventor: Cordelia Luise Schmid, Sudheendra Vijayanarasimhan, Susanna Maria Ricco, Bryan Andrew Seybold, Rahul Sukthankar, Aikaterini Fragkiadaki
Abstract: A system comprising an encoder neural network, a scene structure decoder neural network, and a motion decoder neural network. The encoder neural network is configured to: receive a first image and a second image; and process the first image and the second image to generate an encoded representation of the first image and the second image. The scene structure decoder neural network is configured to process the encoded representation to generate a structure output characterizing a structure of a scene depicted in the first image. The motion decoder neural network is configured to process the encoded representation to generate a motion output characterizing motion between the first image and the second image.
-
Publication No.: US11763466B2
Publication date: 2023-09-19
Application No.: US17132623
Filing date: 2020-12-23
Applicant: Google LLC
Inventor: Cordelia Luise Schmid, Sudheendra Vijayanarasimhan, Susanna Maria Ricco, Bryan Andrew Seybold, Rahul Sukthankar, Aikaterini Fragkiadaki
IPC: G06T7/269, G06N3/02, G06T3/40, G06T9/00, G06T7/215, G06T7/70, G06N3/045, G06N3/048, G06V10/82, G06V10/44
CPC classification number: G06T7/269, G06N3/045, G06N3/048, G06T7/215, G06T7/70, G06V10/454, G06V10/82, G06T2207/10028, G06T2207/20081, G06T2207/20084
Abstract: A system comprising an encoder neural network, a scene structure decoder neural network, and a motion decoder neural network. The encoder neural network is configured to: receive a first image and a second image; and process the first image and the second image to generate an encoded representation of the first image and the second image. The scene structure decoder neural network is configured to process the encoded representation to generate a structure output characterizing a structure of a scene depicted in the first image. The motion decoder neural network is configured to process the encoded representation to generate a motion output characterizing motion between the first image and the second image.
-
Publication No.: US20210217197A1
Publication date: 2021-07-15
Application No.: US17214327
Filing date: 2021-03-26
Applicant: Google LLC
Inventor: Susanna Maria Ricco, Bryan Andrew Seybold
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an optical flow object localization system and a novel object localization system. In a first aspect, the optical flow object localization system is trained to process an optical flow image to generate object localization data defining locations of objects depicted in a video frame corresponding to the optical flow image. In a second aspect, a novel object localization system is trained to process a video frame to generate object localization data defining locations of novel objects depicted in the video frame.
-
Publication No.: US20210118153A1
Publication date: 2021-04-22
Application No.: US17132623
Filing date: 2020-12-23
Applicant: Google LLC
Inventor: Cordelia Luise Schmid, Sudheendra Vijayanarasimhan, Susanna Maria Ricco, Bryan Andrew Seybold, Rahul Sukthankar, Aikaterini Fragkiadaki
Abstract: A system comprising an encoder neural network, a scene structure decoder neural network, and a motion decoder neural network. The encoder neural network is configured to: receive a first image and a second image; and process the first image and the second image to generate an encoded representation of the first image and the second image. The scene structure decoder neural network is configured to process the encoded representation to generate a structure output characterizing a structure of a scene depicted in the first image. The motion decoder neural network is configured to process the encoded representation to generate a motion output characterizing motion between the first image and the second image.
-
Publication No.: US20200151905A1
Publication date: 2020-05-14
Application No.: US16264222
Filing date: 2019-01-31
Applicant: Google LLC
Inventor: Susanna Maria Ricco, Bryan Andrew Seybold
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an optical flow object localization system and a novel object localization system. In a first aspect, the optical flow object localization system is trained to process an optical flow image to generate object localization data defining locations of objects depicted in a video frame corresponding to the optical flow image. In a second aspect, a novel object localization system is trained to process a video frame to generate object localization data defining locations of novel objects depicted in the video frame.