Aggregating Nested Vision Transformers

    Publication No.: US20220375205A1

    Publication date: 2022-11-24

    Application No.: US17664402

    Filing date: 2022-05-20

    Applicant: Google LLC

    Abstract: A method includes receiving image data including a series of image patches of an image. The method includes generating, using a first set of transformers of a vision transformer (V-T) model, a first set of higher order feature representations based on the series of image patches and aggregating the first set of higher order feature representations into a second set of higher order feature representations that is smaller than the first set. The method includes generating, using a second set of transformers of the V-T model, a third set of higher order feature representations based on the second set of higher order feature representations and aggregating the third set of higher order feature representations into a fourth set of higher order feature representations that is smaller than the third set. The method includes generating, using the V-T model, an image classification of the image based on the fourth set.
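
The four-stage flow in the abstract (transform, aggregate to a smaller set, transform again, aggregate again, classify) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: a single self-attention pass with identity projections stands in for each "set of transformers", aggregation is a mean over groups of four tokens, and the names `self_attention`, `aggregate`, `classify`, and `class_weights` are hypothetical.

```python
import math


def self_attention(tokens):
    """Single-head self-attention with identity projections (an assumption)."""
    dim = len(tokens[0])
    out = []
    for query in tokens:
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * value[d] for w, value in zip(weights, tokens)) / total
            for d in range(dim)
        ])
    return out


def aggregate(tokens, group=4):
    """Average each run of `group` tokens, producing a smaller set."""
    return [
        [sum(col) / len(col) for col in zip(*tokens[i:i + group])]
        for i in range(0, len(tokens), group)
    ]


def classify(patches, class_weights):
    """Return (predicted class index, sizes of the four representation sets)."""
    first = self_attention(patches)    # first set of representations
    second = aggregate(first)          # second set, smaller than the first
    third = self_attention(second)     # third set, built from the second
    fourth = aggregate(third)          # fourth set, smaller than the third
    pooled = [sum(col) / len(col) for col in zip(*fourth)]
    logits = [sum(w * x for w, x in zip(row, pooled)) for row in class_weights]
    sizes = (len(first), len(second), len(third), len(fourth))
    return logits.index(max(logits)), sizes


# Hypothetical input: 16 patch embeddings of dimension 2, two classes.
patches = [[i / 16, 1 - i / 16] for i in range(16)]
label, sizes = classify(patches, [[1.0, 0.0], [0.0, 1.0]])
print(sizes)
```

With 16 input patches and groups of four, the set sizes shrink from 16 to 4 and then from 4 to 1, which is the "smaller than" relationship the abstract requires at each aggregation step.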

Generating High-Resolution Images Using Self-Attention

    Publication No.: US20240265586A1

    Publication date: 2024-08-08

    Application No.: US18564841

    Filing date: 2022-05-27

    Applicant: Google LLC

    CPC classification number: G06T11/00 G06T3/4046

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating high-resolution images using self-attention based neural networks. One of the systems includes a neural network configured to generate images, the neural network comprising a sequence of one or more first network blocks followed by a sequence of one or more second network blocks, wherein: each first network block is configured to perform operations comprising: applying a self-attention mechanism over at least a subset of first elements of a first block input to generate an updated first block input; and upsampling the updated first block input to generate a first block output; and each second network block is configured to perform operations comprising: processing a second block input using one or more neural network layers to generate an updated second block input; and upsampling the updated second block input to generate a second block output.
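
The two block types in the abstract can be sketched as follows. This is an illustrative toy under stated assumptions, not the claimed network: each grid element is a single float, self-attention uses identity projections over every element of the grid, the "one or more neural network layers" of the second block are reduced to one pointwise `tanh` layer, and upsampling is nearest-neighbour. All function names and the `weight` and `bias` parameters are hypothetical.

```python
import math


def attend(grid):
    """Self-attention over all grid elements, each treated as a 1-d feature."""
    flat = [value for row in grid for value in row]
    out = []
    for query in flat:
        scores = [query * key for key in flat]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, flat)) / total)
    width = len(grid[0])
    return [out[i:i + width] for i in range(0, len(out), width)]


def upsample(grid, factor=2):
    """Nearest-neighbour upsampling by an integer factor."""
    rows = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        rows.extend(list(wide) for _ in range(factor))
    return rows


def first_block(grid):
    """First block type: self-attention, then upsampling."""
    return upsample(attend(grid))


def second_block(grid, weight=0.5, bias=0.1):
    """Second block type: a pointwise layer (no attention), then upsampling."""
    updated = [[math.tanh(weight * v + bias) for v in row] for row in grid]
    return upsample(updated)


def generate(seed, n_first=2, n_second=2):
    """Run the first blocks in sequence, followed by the second blocks."""
    grid = seed
    for _ in range(n_first):
        grid = first_block(grid)
    for _ in range(n_second):
        grid = second_block(grid)
    return grid


image = generate([[0.2, -0.4], [0.7, 0.1]])
print(len(image), len(image[0]))
```

Starting from a 2x2 seed, two first blocks followed by two second blocks yield a 32x32 grid. Self-attention compares every element with every other element, so in this sketch it runs only on the 2x2 and 4x4 grids, while the larger grids are handled by the cheaper pointwise layer.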

Media Trend Detection and Maintenance at a Content Sharing Platform

    Publication No.: US20250111675A1

    Publication date: 2025-04-03

    Application No.: US18900467

    Filing date: 2024-09-27

    Applicant: Google LLC

    Abstract: Methods and systems for media trend detection and maintenance are provided herein. A set of media items each having common media characteristics is identified. A set of pose values is determined for each respective media item of the set of media items. Each pose value is associated with a particular predefined pose for objects depicted by the set of media items. A set of distance scores is calculated. Each distance score represents a distance between the respective set of pose values determined for a media item and a respective set of pose values determined for an additional media item. A coherence score is determined for the set of media items based on the calculated set of distance scores. Responsive to a determination that the coherence score satisfies one or more coherence criteria, a determination is made that the set of media items corresponds to a media trend of a platform.
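
The scoring steps in the abstract can be sketched as below. The abstract does not specify the distance metric, the coherence formula, or the criteria, so everything concrete here is an assumption made for illustration: Euclidean distance between pose-value vectors, a coherence score of `1 / (1 + mean pairwise distance)`, and a single threshold of 0.5. The function names are hypothetical.

```python
import math
from itertools import combinations


def pose_distance(a, b):
    """Euclidean distance between two media items' pose-value vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def coherence_score(pose_sets):
    """Map the mean pairwise distance into (0, 1]; higher means more alike."""
    distances = [pose_distance(a, b) for a, b in combinations(pose_sets, 2)]
    if not distances:
        return 0.0  # fewer than two items: nothing to compare
    return 1.0 / (1.0 + sum(distances) / len(distances))


def is_media_trend(pose_sets, threshold=0.5):
    """Assumed coherence criterion: the score meets a fixed threshold."""
    return coherence_score(pose_sets) >= threshold


# Hypothetical pose values: one entry per predefined pose, per media item.
similar = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0], [0.9, 0.0, 0.1]]
dissimilar = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(is_media_trend(similar), is_media_trend(dissimilar))
```

Items whose pose values nearly coincide produce small pairwise distances and a score close to 1, while items dominated by different poses produce larger distances and a score that falls below the threshold.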
