-
Publication No.: US12236676B2
Publication Date: 2025-02-25
Application No.: US17438687
Application Date: 2019-07-19
Applicant: Google LLC
Inventor: Mikael Pierre Bonnevie, Aaron Maschinot, Aaron Sarna, Shuchao Bi, Jingbin Wang, Michael Spencer Krainin, Wenchao Tong, Dilip Krishnan, Haifeng Gong, Ce Liu, Hossein Talebi, Raanan Sayag, Piotr Teterwak
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating realistic extensions of images. In one aspect, a method comprises providing an input that comprises a provided image to a generative neural network having a plurality of generative neural network parameters. The generative neural network processes the input in accordance with trained values of the plurality of generative neural network parameters to generate an extended image. The extended image has (i) more rows, more columns, or both than the provided image, and (ii) is predicted to be a realistic extension of the provided image. The generative neural network is trained using an adversarial loss objective function.
-
Publication No.: US20250111671A1
Publication Date: 2025-04-03
Application No.: US18900457
Application Date: 2024-09-27
Applicant: Google LLC
Inventor: Tao Zhu, Jiahui Yu, Jingchen Feng, Kai Chen, Pooya Abolghasemi, Gagan Bansal, Jieren Xu, Hui Miao, Yaping Zhang, Shuchao Bi, Yonghui Wu, Claire Cui, Rohan Anil
IPC: G06V20/40, G06F40/284, G10L25/57
Abstract: Methods and systems for media item characterization based on multimodal embeddings are provided herein. A media item including a sequence of video frames is identified. A set of video embeddings representing visual features of the sequence of video frames is obtained. A set of audio embeddings representing audio features of the sequence of video frames is obtained. A set of audiovisual embeddings is generated based on the set of video embeddings and the set of audio embeddings. Each of the set of audiovisual embeddings represents a visual feature and an audio feature of a respective video frame of the sequence of video frames. One or more media characteristics associated with the media item are determined based on the set of audiovisual embeddings.
-
Publication No.: US20220148299A1
Publication Date: 2022-05-12
Application No.: US17438687
Application Date: 2019-07-19
Applicant: Google LLC
Inventor: Mikael Pierre Bonnevie, Aaron Maschinot, Aaron Sarna, Shuchao Bi, Jingbin Wang, Michael Spencer Krainin, Wenchao Tong, Dilip Krishnan, Haifeng Gong, Ce Liu, Hossein Talebi, Raanan Sayag, Piotr Teterwak
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating realistic extensions of images. In one aspect, a method comprises providing an input that comprises a provided image to a generative neural network having a plurality of generative neural network parameters. The generative neural network processes the input in accordance with trained values of the plurality of generative neural network parameters to generate an extended image. The extended image has (i) more rows, more columns, or both than the provided image, and (ii) is predicted to be a realistic extension of the provided image. The generative neural network is trained using an adversarial loss objective function.
-