Patent search ap:("Google LLC") AND inv:"Mario Lucic" Page 1

1.

Invention patent application
SELF-ATTENTION BASED NEURAL NETWORKS FOR PROCESSING NETWORK INPUTS FROM MULTIPLE MODALITIES (status: in force)

Publication No.: US20240403636A1

Publication date: 2024-12-05

Application No.: US18697257

Application date: 2022-10-05

Applicant: GOOGLE LLC

Inventor: Valerii Likhosherstov, Mostafa Dehghani, Anurag Arnab, Krzysztof Marcin Choromanski, Mario Lucic, Yi Tay

IPC: G06N3/08, G06N3/045

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for executing and training a multi-modal, multi-task self-attention neural network.

2.

Invention patent application
Systems and Methods for Improved Video Understanding (status: in force)

Publication No.: US20240428587A1

Publication date: 2024-12-26

Application No.: US18827133

Application date: 2024-09-06

Applicant: Google LLC

Inventor: Anurag Arnab, Mostafa Dehghani, Georg Heigold, Chen Sun, Mario Lucic, Cordelia Luise Schmid

IPC: G06V20/40, G06N20/00

Abstract: A computer-implemented method for classifying video data with improved accuracy includes obtaining, by a computing system comprising one or more computing devices, video data comprising a plurality of video frames; extracting, by the computing system, a plurality of video tokens from the video data, the plurality of video tokens comprising a representation of spatiotemporal information in the video data; providing, by the computing system, the plurality of video tokens as input to a video understanding model, the video understanding model comprising a video transformer encoder model; and receiving, by the computing system, a classification output from the video understanding model.
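The token-extraction step in this abstract can be illustrated with a minimal sketch. The abstract does not say how video tokens are formed; the version below assumes non-overlapping spatiotemporal blocks ("tubelets") flattened into vectors, and the function name `extract_tubelet_tokens`, the block sizes, and the single-channel toy video are all hypothetical choices made for illustration.

```python
from typing import List

Video = List[List[List[float]]]  # [frame][row][col], single channel for brevity


def extract_tubelet_tokens(video: Video, t: int, h: int, w: int) -> List[List[float]]:
    """Split a video into non-overlapping t x h x w blocks and flatten each one.

    Assumption: this tubelet scheme is one possible way to obtain tokens that
    carry spatiotemporal information; the abstract does not prescribe it.
    """
    num_frames, height, width = len(video), len(video[0]), len(video[0][0])
    if num_frames % t or height % h or width % w:
        raise ValueError("video dimensions must be divisible by the tubelet size")
    tokens = []
    for f0 in range(0, num_frames, t):
        for r0 in range(0, height, h):
            for c0 in range(0, width, w):
                # Each token spans t frames, so it mixes space and time.
                tokens.append([
                    video[f][r][c]
                    for f in range(f0, f0 + t)
                    for r in range(r0, r0 + h)
                    for c in range(c0, c0 + w)
                ])
    return tokens


# Toy video: 4 frames of 4x4 pixels; each pixel value encodes its position.
video = [[[f * 16 + r * 4 + c for c in range(4)] for r in range(4)] for f in range(4)]
tokens = extract_tubelet_tokens(video, t=2, h=2, w=2)
print(len(tokens), len(tokens[0]))
```

In a full pipeline each flattened token would typically be linearly projected and passed to the transformer encoder; that part is omitted here because the abstract gives no details about it.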

3.

Granted invention patent
Systems and methods for improved video understanding (status: in force)

Publication No.: US12112538B2

Publication date: 2024-10-08

Application No.: US17370522

Application date: 2021-07-08

Applicant: Google LLC

Inventor: Anurag Arnab, Mostafa Dehghani, Georg Heigold, Chen Sun, Mario Lucic, Cordelia Luise Schmid

IPC: G06V20/40, G06N20/00

CPC classification number: G06V20/41, G06N20/00, G06V20/46, G06V20/49

Abstract: A computer-implemented method for classifying video data with improved accuracy includes obtaining, by a computing system comprising one or more computing devices, video data comprising a plurality of video frames; extracting, by the computing system, a plurality of video tokens from the video data, the plurality of video tokens comprising a representation of spatiotemporal information in the video data; providing, by the computing system, the plurality of video tokens as input to a video understanding model, the video understanding model comprising a video transformer encoder model; and receiving, by the computing system, a classification output from the video understanding model.

4.

Invention patent application
Systems And Methods For Improved Video Understanding (status: in force)

Publication No.: US20230017072A1

Publication date: 2023-01-19

Application No.: US17370522

Application date: 2021-07-08

Applicant: Google LLC

Inventor: Anurag Arnab, Mostafa Dehghani, Georg Heigold, Chen Sun, Mario Lucic, Cordelia Luise Schmid

IPC: G06K9/00, G06N20/00

Abstract: A computer-implemented method for classifying video data with improved accuracy includes obtaining, by a computing system comprising one or more computing devices, video data comprising a plurality of video frames; extracting, by the computing system, a plurality of video tokens from the video data, the plurality of video tokens comprising a representation of spatiotemporal information in the video data; providing, by the computing system, the plurality of video tokens as input to a video understanding model, the video understanding model comprising a video transformer encoder model; and receiving, by the computing system, a classification output from the video understanding model.

5.

Invention patent application
MULTI-LAYER PERCEPTRON-BASED COMPUTER VISION NEURAL NETWORKS (status: in force)

Publication No.: US20220375211A1

Publication date: 2022-11-24

Application No.: US17737507

Application date: 2022-05-05

Applicant: Google LLC

Inventor: Ilya Tolstikhin, Neil Matthew Tinmouth Houlsby, Alexander Kolesnikov, Lucas Klaus Beyer, Alexey Dosovitskiy, Mario Lucic, Xiaohua Zhai, Thomas Unterthiner, Daniel M. Keysers, Jakob D. Uszkoreit, Yin Ching Jessica Yung, Andreas Steiner

IPC: G06V10/82, G06V10/764, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using mixer neural networks. One of the methods includes obtaining one or more images comprising a plurality of pixels; determining, for each image of the one or more images, a plurality of image patches of the image, wherein each image patch comprises a different subset of the pixels of the image; processing, for each image of the one or more images, the corresponding plurality of image patches to generate an input sequence comprising a respective input element at each of a plurality of input positions, wherein a plurality of the input elements correspond to respective different image patches; and processing the input sequences using a neural network to generate a network output that characterizes the one or more images, wherein the neural network comprises one or more mixer neural network layers.
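The patch-to-sequence step and the idea of a mixer layer can be sketched as follows. The abstract does not define what a mixer layer computes; this sketch assumes the common formulation in which one step mixes information across input positions and another mixes within each position, and it replaces the usual MLPs, nonlinearities, and normalization with single linear maps for brevity. All names (`image_to_patches`, `mixer_layer`, `w_tokens`, `w_channels`) are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def image_to_patches(image: Matrix, p: int) -> Matrix:
    """Flatten non-overlapping p x p patches into rows of a [positions, channels] table."""
    height, width = len(image), len(image[0])
    if height % p or width % p:
        raise ValueError("image dimensions must be divisible by the patch size")
    return [
        [image[r][c] for r in range(r0, r0 + p) for c in range(c0, c0 + p)]
        for r0 in range(0, height, p)
        for c0 in range(0, width, p)
    ]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product a @ b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum, used for the residual connections."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mixer_layer(seq: Matrix, w_tokens: Matrix, w_channels: Matrix) -> Matrix:
    """Simplified mixer layer (assumed form, linear maps only)."""
    seq = add(seq, matmul(w_tokens, seq))     # mix across input positions
    return add(seq, matmul(seq, w_channels))  # mix within each position


# Toy 4x4 single-channel image; each pixel value encodes its position.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
seq = image_to_patches(image, p=2)          # 4 positions, 4 values each
w_tokens = [[0.25] * 4 for _ in range(4)]   # average over all positions
w_channels = [[0.0] * 4 for _ in range(4)]  # zero weights: channel mixing is a no-op here
out = mixer_layer(seq, w_tokens, w_channels)
print(out[0])
```

With the averaging weights above, every output position receives the mean patch added to its own values, which shows how information from different image patches is combined without any attention mechanism.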

6.

Invention patent application
Systems and Methods for Improved Video Understanding (status: in force)

Publication No.: US20240428586A1

Publication date: 2024-12-26

Application No.: US18827088

Application date: 2024-09-06

Applicant: Google LLC

Inventor: Anurag Arnab, Mostafa Dehghani, Georg Heigold, Chen Sun, Mario Lucic, Cordelia Luise Schmid

IPC: G06V20/40, G06N20/00

Abstract: A computer-implemented method for classifying video data with improved accuracy includes obtaining, by a computing system comprising one or more computing devices, video data comprising a plurality of video frames; extracting, by the computing system, a plurality of spatiotemporal representations from the video data, the plurality of spatiotemporal representations comprising a representation of spatiotemporal information in the video data; providing, by the computing system, the plurality of spatiotemporal representations as input to a video understanding model, the video understanding model comprising a video transformer encoder model; and receiving, by the computing system, a classification output from the video understanding model.

7.

Published invention application
ACTION LOCALIZATION IN VIDEOS USING LEARNED QUERIES (status: under examination, published)

Publication No.: US20240346824A1

Publication date: 2024-10-17

Application No.: US18634794

Application date: 2024-04-12

Applicant: Google LLC

Inventor: Alexey Alexeevich Gritsenko, Xuehan Xiong, Josip Djolonga, Mostafa Dehghani, Chen Sun, Mario Lucic, Cordelia Luise Schmid, Anurag Arnab

IPC: G06V20/40, G06T7/73, G06V10/62, G06V10/764, G06V10/77, G06V10/774, G06V10/776, G06V10/82

CPC classification number: G06V20/46, G06T7/73, G06V10/62, G06V10/764, G06V10/7715, G06V10/774, G06V10/776, G06V10/82, G06T2207/10016, G06T2207/20081, G06T2207/20084

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing action localization on an input video. In particular, a system maintains a set of query vectors and uses the input video and the set of query vectors to generate an action localization output for the input video. The action localization output includes, for each of one or more agents depicted in the video, data specifying, for each of one or more video frames in the video, a respective bounding box in the video frame that depicts the agent and a respective action from a set of actions that is being performed by the agent in the video frame.
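The shape of the action localization output described in this abstract (per agent, per frame, a bounding box and an action) can be sketched as a small data structure. This only models the output, not the query-vector model that produces it. The class names, field names, agent identifiers, and the `(x_min, y_min, x_max, y_max)` box convention are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Assumed box convention: normalized (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


@dataclass
class FrameDetection:
    """One agent in one frame: where it is and what it is doing."""
    box: Box
    action: str


@dataclass
class ActionLocalizationOutput:
    # agent id -> frame index -> detection
    tracks: Dict[str, Dict[int, FrameDetection]] = field(default_factory=dict)

    def add(self, agent_id: str, frame: int, box: Box, action: str) -> None:
        """Record the box and action for one agent in one frame."""
        self.tracks.setdefault(agent_id, {})[frame] = FrameDetection(box, action)

    def actions_in_frame(self, frame: int) -> Dict[str, str]:
        """Return the action of every agent that appears in the given frame."""
        return {
            agent_id: track[frame].action
            for agent_id, track in self.tracks.items()
            if frame in track
        }


output = ActionLocalizationOutput()
output.add("agent_0", 0, (0.1, 0.2, 0.4, 0.9), "walk")
output.add("agent_0", 1, (0.15, 0.2, 0.45, 0.9), "walk")
output.add("agent_1", 1, (0.6, 0.3, 0.8, 0.95), "sit")
print(output.actions_in_frame(1))
```

Keying by agent first and frame second keeps each agent's boxes together as a track, which matches the abstract's "for each of one or more agents ... for each of one or more video frames" ordering.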

8.

Published invention application
Latent Pose Queries for Machine-Learned Image View Synthesis (status: under examination, published)

Publication No.: US20240169662A1

Publication date: 2024-05-23

Application No.: US18517190

Application date: 2023-11-22

Applicant: Google LLC

Inventor: Seyed Mohammad Mehdi Sajjadi, Klaus Greff, Etienne François Régis Pot, Daniel Christopher Duckworth, Mario Lucic, Aravindh Mahendran, Thomas Kipf

IPC: G06T15/20, B25J9/16, G06T7/73

CPC classification number: G06T15/205, B25J9/1697, G06T7/73, G06T2207/20081, G06T2207/20084

Abstract: An example method includes obtaining, by a computing system, one or more source images of a scene; obtaining, by the computing system, a query associated with a target view of the scene, wherein at least a portion of the query is parameterized in a latent pose space; and generating, by the computing system and using a machine-learned image view synthesis model, an output image of the scene associated with the target view.
