-
Publication number: US20220375205A1
Publication date: 2022-11-24
Application number: US17664402
Filing date: 2022-05-20
Applicant: Google LLC
Inventor: Zizhao Zhang, Han Zhang, Long Zhao, Tomas Pfister
IPC: G06V10/77, G06V10/764, G06V10/22, G06V10/44
Abstract: A method includes receiving image data including a series of image patches of an image. The method includes generating, using a first set of transformers of a vision transformer (V-T) model, a first set of higher order feature representations based on the series of image patches and aggregating the first set of higher order feature representations into a second set of higher order feature representations that is smaller than the first set. The method includes generating, using a second set of transformers of the V-T model, a third set of higher order feature representations based on the second set of higher order feature representations and aggregating the third set of higher order feature representations into a fourth set of higher order feature representations that is smaller than the third set. The method includes generating, using the V-T model, an image classification of the image based on the fourth set.
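The two-stage "transform, then aggregate into a smaller set" flow in this abstract can be illustrated with a toy sketch. It is not the patented V-T model: the single-head attention with identity projections, the pairwise averaging used for aggregation, and the names `self_attention`, `aggregate`, `classify`, and `class_weights` are all assumptions made for illustration.

```python
import math


def self_attention(tokens):
    """Toy single-head self-attention with identity projections.

    Assumption: stands in for a "set of transformers"; a real model would
    use learned query/key/value projections and feed-forward layers.
    """
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) / total
                    for i in range(dim)])
    return out


def aggregate(tokens, factor=2):
    """Average consecutive groups of `factor` tokens, giving a smaller set."""
    groups = (tokens[i:i + factor] for i in range(0, len(tokens), factor))
    return [[sum(col) / len(group) for col in zip(*group)] for group in groups]


def classify(patches, class_weights):
    """Return (predicted class index, sizes of the four representation sets)."""
    first = self_attention(patches)    # first set of representations
    second = aggregate(first)          # second set, smaller than the first
    third = self_attention(second)     # third set
    fourth = aggregate(third)          # fourth set, smaller than the third
    pooled = [sum(col) / len(fourth) for col in zip(*fourth)]
    logits = [sum(w * x for w, x in zip(row, pooled)) for row in class_weights]
    label = max(range(len(logits)), key=logits.__getitem__)
    return label, (len(first), len(second), len(third), len(fourth))
```

With eight input patches, the four sets have 8, 4, 4, and 2 elements, matching the abstract's requirement that each aggregation step shrinks the set.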
-
Publication number: US20240265586A1
Publication date: 2024-08-08
Application number: US18564841
Filing date: 2022-05-27
Applicant: Google LLC
Inventor: Long Zhao, Han Zhang, Zizhao Zhang, Ting Chen
IPC: G06T11/00, G06T3/4046
CPC classification number: G06T11/00, G06T3/4046
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating high-resolution images using self-attention based neural networks. One of the systems includes a neural network configured to generate images, the neural network comprising a sequence of one or more first network blocks followed by a sequence of one or more second network blocks, wherein: each first network block is configured to perform operations comprising: applying a self-attention mechanism over at least a subset of first elements of a first block input to generate an updated first block input; and upsampling the updated first block input to generate a first block output; and each second network block is configured to perform operations comprising: processing a second block input using one or more neural network layers to generate an updated second block input; and upsampling the updated second block input to generate a second block output.
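The block structure in this abstract (attention-plus-upsampling blocks followed by blocks that use other layers plus upsampling) can be sketched as follows. This is a hedged, one-dimensional simplification, not the claimed network: it works on a token sequence rather than an image grid, attends over all elements rather than a subset, uses nearest-neighbour repetition for upsampling, and a scaled ReLU as a stand-in for "one or more neural network layers". All function names and the `scale` parameter are hypothetical.

```python
import math


def attend(tokens):
    """Toy self-attention over all tokens (identity projections, assumed)."""
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) / total
                    for i in range(dim)])
    return out


def upsample(tokens, factor=2):
    """Nearest-neighbour upsampling: repeat each token `factor` times."""
    return [list(t) for t in tokens for _ in range(factor)]


def first_block(tokens):
    """First block type: self-attention, then upsampling."""
    return upsample(attend(tokens))


def second_block(tokens, scale=0.5):
    """Second block type: a non-attention layer (scaled ReLU), then upsampling."""
    updated = [[max(0.0, scale * x) for x in t] for t in tokens]
    return upsample(updated)


def generate(seed_tokens, n_first=2, n_second=2):
    """Run the first-block sequence followed by the second-block sequence."""
    x = seed_tokens
    for _ in range(n_first):
        x = first_block(x)
    for _ in range(n_second):
        x = second_block(x)
    return x
```

Each block doubles the number of elements, so two seed tokens passed through two blocks of each type yield 32 output tokens.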
-
Publication number: US20250111675A1
Publication date: 2025-04-03
Application number: US18900467
Filing date: 2024-09-27
Applicant: Google LLC
Inventor: Hui Miao, Chun-Te Chu, Mingyan Gao, Huanfen Yao, Ting Liu, Long Zhao, Liangzhe Yuan, Yukun Zhu, Vinay Kumar Bettadapura, Ye Jin
IPC: G06V20/40, G06V10/74, G06V10/75, G06V10/762, G06V10/80
Abstract: Methods and systems for media trend detection and maintenance are provided herein. A set of media items each having common media characteristics is identified. A set of pose values is determined for each respective media item of the set of media items. Each pose value is associated with a particular predefined pose for objects depicted by the set of media items. A set of distance scores is calculated. Each distance score represents a distance between the respective set of pose values determined for a media item and a respective set of pose values determined for an additional media item. A coherence score is determined for the set of media items based on the calculated set of distance scores. Responsive to a determination that the coherence score satisfies one or more coherence criteria, a determination is made that the set of media items corresponds to a media trend of a platform.
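The scoring pipeline in this abstract (pose values per media item, pairwise distance scores, a coherence score, then a threshold test) can be sketched as below. The abstract does not specify the distance metric, the coherence formula, or the criteria, so the Euclidean distance, the `1 / (1 + mean distance)` mapping, and the `min_coherence` threshold are assumptions chosen purely for illustration.

```python
import math
from itertools import combinations


def pose_distance(a, b):
    """Euclidean distance between two items' pose-value vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def coherence_score(pose_sets):
    """Map the mean pairwise distance to (0, 1]; higher means more alike.

    Assumption: the patent does not give a formula; 1 / (1 + mean) is a
    simple monotone choice. Fewer than two items yields 0.0.
    """
    distances = [pose_distance(a, b) for a, b in combinations(pose_sets, 2)]
    if not distances:
        return 0.0
    mean = sum(distances) / len(distances)
    return 1.0 / (1.0 + mean)


def is_trend(pose_sets, min_coherence=0.5):
    """Apply a single hypothetical coherence criterion: score >= threshold."""
    return coherence_score(pose_sets) >= min_coherence
```

Here each inner list holds one media item's pose values, with each position corresponding to one predefined pose. Items with near-identical pose vectors score close to 1.0 and pass the criterion, while widely separated vectors score low and do not.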
-