Patent search ap:("Google LLC") AND inv:"Joonseok Lee" Page 1

1.

Published invention application
Diffusion Models for Generation of Audio Data Based on Descriptive Textual Prompts (Status: Under examination, published)

Publication number: US20240282294A1

Publication date: 2024-08-22

Application number: US18651296

Application date: 2024-04-30

Applicant: Google LLC

Inventor: Qingqing Huang, Daniel Sung-Joon Park, Aren Jansen, Timo Immanuel Denk, Yue Li, Ravi Ganti, Dan Ellis, Tao Wang, Wei Han, Joonseok Lee

IPC: G10L15/06, G10L15/16

CPC classification number: G10L15/063, G10L15/16

Abstract: A corpus of textual data is generated with a machine-learned text generation model. The corpus of textual data includes a plurality of sentences. Each sentence is descriptive of a type of audio. For each of a plurality of audio recordings, the audio recording is processed with a machine-learned audio classification model to obtain training data including the audio recording and one or more sentences of the plurality of sentences closest to the audio recording within a joint audio-text embedding space of the machine-learned audio classification model. The sentence(s) are processed with a machine-learned generation model to obtain an intermediate representation of the one or more sentences. The intermediate representation is processed with a machine-learned cascaded diffusion model to obtain audio data. The machine-learned cascaded diffusion model is trained based on a difference between the audio data and the audio recording.
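The pairing step in this abstract (attaching to each audio recording the sentences closest to it in a joint audio-text embedding space) can be illustrated with a minimal sketch. The abstract does not name a distance measure, so cosine similarity is an assumption here, and the function names, sentences, and embedding values are all hypothetical toy data rather than anything taken from the patent.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def closest_sentences(audio_embedding, sentence_embeddings, k=1):
    """Return the k sentences whose embeddings are most similar to the audio embedding."""
    ranked = sorted(
        sentence_embeddings,
        key=lambda s: cosine(audio_embedding, sentence_embeddings[s]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings (assumed values); a real system would obtain these
# from the joint audio-text embedding model.
sentences = {
    "slow piano ballad": [0.9, 0.1, 0.0],
    "fast drum solo": [0.0, 0.2, 0.9],
    "calm acoustic guitar": [0.7, 0.6, 0.1],
}
recording = [0.8, 0.3, 0.0]

# Pair the recording with its two closest sentences to form a training example.
labels = closest_sentences(recording, sentences, k=2)
```

Each (recording, sentences) pair produced this way would then serve as one training example for the downstream generation model.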

2.

Published invention application
Hierarchical Video Encoders (Status: Under examination, published)

Publication number: US20240114158A1

Publication date: 2024-04-04

Application number: US18529173

Application date: 2023-12-05

Applicant: Google LLC

Inventor: Vihan Jain, Joonseok Lee, Ming Zhao, Sheide Chammas, Hexiang Hu, Bowen Zhang, Fei Sha, Tze Way Eugene Ie

IPC: H04N19/30, G06N20/00, H04N19/172

CPC classification number: H04N19/30, G06N20/00, H04N19/172

Abstract: A computer-implemented method for generating video representations utilizing a hierarchical video encoder includes obtaining a video, wherein the video includes a plurality of frames; processing each of the plurality of frames with a machine-learned frame-level encoder model to respectively generate a plurality of frame representations for the plurality of frames, the plurality of frame representations respective to the plurality of frames; determining a plurality of segment representations representative of a plurality of video segments including one or more of the plurality of frames, the plurality of segment representations based at least in part on the plurality of frame representations; processing the plurality of segment representations with a machine-learned segment-level encoder model to generate a plurality of contextualized segment representations; determining a video representation based at least in part on the plurality of contextualized segment representations; and providing the video representation as an output.
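The stages listed in this abstract (frame encoding, segment pooling, segment-level contextualization, video-level aggregation) can be traced with a minimal sketch. Every function below is a hypothetical stand-in: the abstract does not specify the encoders or how representations are combined, so simple averaging is assumed throughout and the toy "frames" are just short lists of pixel values.

```python
def mean_vectors(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def encode_frame(frame):
    """Stand-in frame-level encoder: maps a frame to [mean pixel, max pixel]."""
    return [sum(frame) / len(frame), float(max(frame))]


def contextualize(segments):
    """Stand-in segment-level encoder: mixes each segment with the global context."""
    context = mean_vectors(segments)
    return [[0.5 * s + 0.5 * c for s, c in zip(seg, context)] for seg in segments]


def encode_video(frames, frames_per_segment=2):
    """Frames -> frame representations -> segments -> contextualized segments -> video."""
    frame_reps = [encode_frame(f) for f in frames]
    segment_reps = [
        mean_vectors(frame_reps[i:i + frames_per_segment])
        for i in range(0, len(frame_reps), frames_per_segment)
    ]
    contextualized = contextualize(segment_reps)
    return mean_vectors(contextualized)


# Four toy frames with two pixel values each (assumed data).
video = [[0, 2], [2, 4], [4, 6], [6, 8]]
representation = encode_video(video)
```

In a learned system both stand-in encoders would be trained models; the sketch only shows how the two levels of the hierarchy feed into each other.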

3.

Granted invention patent
Framework for training machine-learned models on extremely large datasets (Status: In force)

Publication number: US11295171B2

Publication date: 2022-04-05

Application number: US16657042

Application date: 2019-10-18

Applicant: Google LLC

Inventor: Joonseok Lee, Balakrishnan Varadarajan, Ariel Gordon, Apostol Ivanov Natsev, Seong Jae Hwang

IPC: G06K9/62, G06N20/00, G06K9/00, G06K1/00

Abstract: A MapReduce-based training framework exploits both data parallelism and model parallelism to scale training of complex models. Particular model architectures facilitate and benefit from use of such training framework. As one example, a machine-learned model can include a shared feature extraction portion configured to receive and process a data input to produce an intermediate feature representation and a plurality of prediction heads that are configured to receive and process the intermediate feature representation to respectively produce a plurality of predictions. For example, the data input can be a video and the plurality of predictions can be a plurality of classifications for content of the video (e.g., relative to a plurality of classes).
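One way to picture the architecture in this abstract is a shared feature extractor feeding several independent prediction heads, with gradients computed per data shard (map) and summed per head (reduce). The sketch below is only an illustration under that assumption: the abstract does not describe how the work is split, and the extractor, the linear heads, the squared-error loss, and the class names are all hypothetical.

```python
from collections import defaultdict


def extract_features(x):
    """Stand-in shared feature extractor: maps a raw input to a feature vector."""
    return [x, x * x]


def map_shard(shard, heads):
    """Map step: for one data shard, emit (head name, gradient) pairs."""
    for x, targets in shard:
        features = extract_features(x)  # computed once, shared by all heads
        for name, weights in heads.items():
            prediction = sum(w * f for w, f in zip(weights, features))
            error = prediction - targets[name]
            # Gradient of the squared error with respect to this head's weights.
            yield name, [2 * error * f for f in features]


def reduce_gradients(pairs):
    """Reduce step: sum the gradients emitted for each head."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for name, grad in pairs:
        totals[name] = [t + g for t, g in zip(totals[name], grad)]
    return dict(totals)


# Two linear heads (assumed class names) and two data shards of toy examples.
heads = {"is_music": [0.0, 0.0], "is_sports": [1.0, 0.0]}
shards = [
    [(1.0, {"is_music": 1.0, "is_sports": 0.0})],
    [(2.0, {"is_music": 0.0, "is_sports": 1.0})],
]
pairs = [pair for shard in shards for pair in map_shard(shard, heads)]
gradients = reduce_gradients(pairs)
```

Because each shard can be mapped independently and each head is reduced independently, the same structure admits parallelism over both the data and the heads.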

4.

Granted invention patent
Hierarchical video encoders (Status: In force)

Publication number: US11876986B2

Publication date: 2024-01-16

Application number: US18070556

Application date: 2022-11-29

Applicant: Google LLC

Inventor: Vihan Jain, Joonseok Lee, Ming Zhao, Sheide Chammas, Hexiang Hu, Bowen Zhang, Fei Sha, Tze Way Eugene Ie

IPC: H04N19/30, H04N19/00, H04N19/172, G06N20/00

CPC classification number: H04N19/30, G06N20/00, H04N19/172

Abstract: A computer-implemented method for generating video representations utilizing a hierarchical video encoder includes obtaining a video, wherein the video includes a plurality of frames; processing each of the plurality of frames with a machine-learned frame-level encoder model to respectively generate a plurality of frame representations for the plurality of frames, the plurality of frame representations respective to the plurality of frames; determining a plurality of segment representations representative of a plurality of video segments including one or more of the plurality of frames, the plurality of segment representations based at least in part on the plurality of frame representations; processing the plurality of segment representations with a machine-learned segment-level encoder model to generate a plurality of contextualized segment representations; determining a video representation based at least in part on the plurality of contextualized segment representations; and providing the video representation as an output.

5.

Granted invention patent
Hierarchical video encoders (Status: In force)

Publication number: US11533495B2

Publication date: 2022-12-20

Application number: US17162150

Application date: 2021-01-29

Applicant: Google LLC

Inventor: Vihan Jain, Joonseok Lee, Ming Zhao, Sheide Chammas, Hexiang Hu, Bowen Zhang, Fei Sha, Tze Way Eugene Ie

IPC: H04N19/30, H04N19/00, H04N19/172, G06N20/00

Abstract: A computer-implemented method for generating video representations utilizing a hierarchical video encoder includes obtaining a video, wherein the video includes a plurality of frames; processing each of the plurality of frames with a machine-learned frame-level encoder model to respectively generate a plurality of frame representations for the plurality of frames, the plurality of frame representations respective to the plurality of frames; determining a plurality of segment representations representative of a plurality of video segments including one or more of the plurality of frames, the plurality of segment representations based at least in part on the plurality of frame representations; processing the plurality of segment representations with a machine-learned segment-level encoder model to generate a plurality of contextualized segment representations; determining a video representation based at least in part on the plurality of contextualized segment representations; and providing the video representation as an output.

6.

Invention application
Hierarchical Video Encoders (Status: In force)

Publication number: US20230103148A1

Publication date: 2023-03-30

Application number: US18070556

Application date: 2022-11-29

Applicant: Google LLC

Inventor: Vihan Jain, Joonseok Lee, Ming Zhao, Sheide Chammas, Hexiang Hu, Bowen Zhang, Fei Sha, Tze Way Eugene Ie

IPC: H04N19/30, G06N20/00, H04N19/172

Abstract: A computer-implemented method for generating video representations utilizing a hierarchical video encoder includes obtaining a video, wherein the video includes a plurality of frames; processing each of the plurality of frames with a machine-learned frame-level encoder model to respectively generate a plurality of frame representations for the plurality of frames, the plurality of frame representations respective to the plurality of frames; determining a plurality of segment representations representative of a plurality of video segments including one or more of the plurality of frames, the plurality of segment representations based at least in part on the plurality of frame representations; processing the plurality of segment representations with a machine-learned segment-level encoder model to generate a plurality of contextualized segment representations; determining a video representation based at least in part on the plurality of contextualized segment representations; and providing the video representation as an output.

7.

Invention application
Hierarchical Video Encoders (Status: In force)

Publication number: US20220256175A1

Publication date: 2022-08-11

Application number: US17162150

Application date: 2021-01-29

Applicant: Google LLC

Inventor: Vihan Jain, Joonseok Lee, Ming Zhao, Sheide Chammas, Hexiang Hu, Bowen Zhang, Fei Sha, Tze Way Eugene Ie

IPC: H04N19/30, H04N19/172, G06N20/00

Abstract: A computer-implemented method for generating video representations utilizing a hierarchical video encoder includes obtaining a video, wherein the video includes a plurality of frames; processing each of the plurality of frames with a machine-learned frame-level encoder model to respectively generate a plurality of frame representations for the plurality of frames, the plurality of frame representations respective to the plurality of frames; determining a plurality of segment representations representative of a plurality of video segments including one or more of the plurality of frames, the plurality of segment representations based at least in part on the plurality of frame representations; processing the plurality of segment representations with a machine-learned segment-level encoder model to generate a plurality of contextualized segment representations; determining a video representation based at least in part on the plurality of contextualized segment representations; and providing the video representation as an output.

8.

Invention application
Framework for Training Machine-Learned Models on Extremely Large Datasets (Status: In force)

Publication number: US20210117728A1

Publication date: 2021-04-22

Application number: US16657042

Application date: 2019-10-18

Applicant: Google LLC

Inventor: Joonseok Lee, Balakrishnan Varadarajan, Ariel Gordon, Apostol Ivanov Natsev, Seong Jae Hwang

IPC: G06K9/62, G06K9/00, G06N20/00

Abstract: A MapReduce-based training framework exploits both data parallelism and model parallelism to scale training of complex models. Particular model architectures facilitate and benefit from use of such training framework. As one example, a machine-learned model can include a shared feature extraction portion configured to receive and process a data input to produce an intermediate feature representation and a plurality of prediction heads that are configured to receive and process the intermediate feature representation to respectively produce a plurality of predictions. For example, the data input can be a video and the plurality of predictions can be a plurality of classifications for content of the video (e.g., relative to a plurality of classes).
