Patent search ap:("DEEPMIND TECHNOLOGIES LIMITED") AND inv:"Andrew Zisserman" Page 2

11.

Invention application
ACTION CLASSIFICATION IN VIDEO CLIPS USING ATTENTION-BASED NEURAL NETWORKS (In force)

Publication (announcement) No.: US20220019807A1

Publication (announcement) date: 2022-01-20

Application No.: US17295329

Application date: 2019-11-20

Applicant: DeepMind Technologies Limited

Inventor: Joao Carreira, Carl Doersch, Andrew Zisserman

IPC: G06K9/00, G06N3/04, G06K9/32

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying actions in a video. One of the methods includes obtaining a feature representation of a video clip; obtaining data specifying a plurality of candidate agent bounding boxes in the key video frame; and, for each candidate agent bounding box, processing the feature representation through an action transformer neural network.

12.

Invention application
ALIGNING SEQUENCES BY GENERATING ENCODED REPRESENTATIONS OF DATA ITEMS (In force)

Publication (announcement) No.: US20220004883A1

Publication (announcement) date: 2022-01-06

Application No.: US17295286

Application date: 2019-11-21

Applicant: DeepMind Technologies Limited

Inventor: Yusuf Aytar, Debidatta Dwibedi, Andrew Zisserman, Jonathan Tompson, Pierre Sermanet

IPC: G06N3/08, G06K9/62, G06T7/00

Abstract: An encoder neural network is described which can encode a data item, such as a frame of a video, to form a respective encoded data item. Data items of a first data sequence are associated with respective data items of a second sequence, by determining which of the encoded data items of the second sequence is closest to the encoded data item produced from each data item of the first sequence. Thus, the two data sequences are aligned. The encoder neural network is trained automatically using a training set of data sequences, by an iterative process of successively increasing cycle consistency between pairs of the data sequences.
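The alignment step in this abstract can be illustrated with a minimal sketch. It assumes the encoder has already produced embeddings (the 2-D vectors below are made up for illustration) and uses squared Euclidean distance as the notion of "closest", which the abstract does not specify. The `cycle_consistent` helper is a hard, non-differentiable check of the property that training is said to increase; it is not the training procedure itself.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two embeddings (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(query, candidates):
    """Index of the candidate embedding closest to the query embedding."""
    return min(range(len(candidates)), key=lambda j: sq_dist(query, candidates[j]))


def align(first, second):
    """For each item of the first sequence, the index of its match in the second."""
    return [nearest(e, second) for e in first]


def cycle_consistent(i, first, second):
    """True if mapping item i to the second sequence and back returns to i."""
    j = nearest(first[i], second)
    return nearest(second[j], first) == i


# Hypothetical encoded frames of two sequences.
first = [[0.0, 0.0], [1.0, 0.1], [2.0, 0.0]]
second = [[0.1, 0.0], [0.9, 0.0], [1.4, 0.3], [2.1, 0.1]]

alignment = align(first, second)
print(alignment)
print([cycle_consistent(i, first, second) for i in range(len(first))])
```

Each entry of `alignment` gives, for one item of the first sequence, the position of its nearest encoded item in the second sequence.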

13.

Invention application
LOCAL CROSS-ATTENTION OPERATIONS IN NEURAL NETWORKS (In force)

Publication (announcement) No.: US20250103856A1

Publication (announcement) date: 2025-03-27

Application No.: US18832817

Application date: 2023-01-30

Applicant: DeepMind Technologies Limited

Inventor: Joao Carreira, Andrew Coulter Jaegle, Skanda Kumar Koppula, Daniel Zoran, Adrià Recasens Continente, Catalin-Dumitru Ionescu, Olivier Jean Hénaff, Evan Gerard Shelhamer, Relja Arandjelovic, Matthew Botvinick, Oriol Vinyals, Karen Simonyan, Andrew Zisserman

IPC: G06N3/045

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for using a neural network to generate a network output that characterizes an entity. In one aspect, a method includes: obtaining a representation of the entity as a set of data element embeddings, obtaining a set of latent embeddings, and processing: (i) the set of data element embeddings, and (ii) the set of latent embeddings, using the neural network to generate the network output. The neural network includes a sequence of neural network blocks including: (i) one or more local cross-attention blocks, and (ii) an output block. Each local cross-attention block partitions the set of latent embeddings and the set of data element embeddings into proper subsets, and updates each proper subset of the set of latent embeddings using attention over only the corresponding proper subset of the set of data element embeddings.
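The partition-then-attend idea in this abstract can be sketched as follows. This is an illustration under several assumptions that the abstract does not state: contiguous, equal-sized groups; plain scaled dot-product attention with no learned query, key, or value projections; and no residual connection. The function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(latents, data):
    """Replace each latent with an attention-weighted average of the data embeddings."""
    scale = math.sqrt(len(latents[0]))
    updated = []
    for q in latents:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in data]
        weights = softmax(scores)
        updated.append(
            [sum(w * k[d] for w, k in zip(weights, data)) for d in range(len(q))]
        )
    return updated


def partition(items, num_groups):
    """Split a list into contiguous equal-sized groups (length must divide evenly)."""
    size = len(items) // num_groups
    return [items[g * size:(g + 1) * size] for g in range(num_groups)]


def local_cross_attention(latents, data, num_groups):
    """Each latent group attends only to the corresponding data group."""
    out = []
    for lat_group, data_group in zip(
        partition(latents, num_groups), partition(data, num_groups)
    ):
        out.extend(cross_attend(lat_group, data_group))
    return out


# Two groups: the first latent only ever sees the first two data embeddings.
data = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
latents = [[0.5, 0.5], [0.5, 0.5]]
result = local_cross_attention(latents, data, num_groups=2)
print(result)
```

Because attention is restricted to matching subsets, the cost per block scales with the group sizes rather than with the full product of latent and data set sizes.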

14.

Invention publication
ACTION CLASSIFICATION IN VIDEO CLIPS USING ATTENTION-BASED NEURAL NETWORKS (Under examination, published)

Publication (announcement) No.: US20240029436A1

Publication (announcement) date: 2024-01-25

Application No.: US18375941

Application date: 2023-10-02

Applicant: DeepMind Technologies Limited

Inventor: Joao Carreira, Carl Doersch, Andrew Zisserman

IPC: G06V20/40, G06V10/25, G06N3/045

CPC classification number: G06V20/46, G06V10/25, G06V20/41, G06N3/045

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying actions in a video. One of the methods includes obtaining a feature representation of a video clip; obtaining data specifying a plurality of candidate agent bounding boxes in the key video frame; and, for each candidate agent bounding box, processing the feature representation through an action transformer neural network.

15.

Invention publication
PARALLEL VIDEO PROCESSING SYSTEMS (Under examination, published)

Publication (announcement) No.: US20230186625A1

Publication (announcement) date: 2023-06-15

Application No.: US18108873

Application date: 2023-02-13

Applicant: DeepMind Technologies Limited

Inventor: Simon Osindero, Joao Carreira, Viorica Patraucean, Andrew Zisserman

IPC: G06V20/40, G06N3/049, G06T1/20, G06N3/044, G06N3/045

CPC classification number: G06V20/40, G06N3/049, G06T1/20, G06N3/044, G06N3/045, G06T2200/28, G06T2207/20084

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for parallel processing of video frames using neural networks. One of the methods includes receiving a video sequence comprising a respective video frame at each of a plurality of time steps; and processing the video sequence using a video processing neural network to generate a video processing output for the video sequence, wherein the video processing neural network includes a sequence of network components, wherein the network components comprise a plurality of layer blocks each comprising one or more neural network layers, wherein each component is active for a respective subset of the plurality of time steps, and wherein each layer block is configured to, at each time step at which the layer block is active, receive an input generated at a previous time step and to process the input to generate a block output.
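The key property in this abstract, that each layer block consumes an input produced at a previous time step, is what allows the blocks to run concurrently. A minimal sketch follows, with assumptions not taken from the abstract: every block is active at every time step, the first block reads the current frame directly, and the "blocks" are toy arithmetic functions standing in for neural network layers.

```python
def run_pipeline(frames, blocks):
    """Run blocks as a pipeline: block k reads what block k-1 produced one step earlier.

    Within a time step no block depends on another block's current output,
    so all blocks could execute in parallel.
    """
    prev_outputs = [None] * len(blocks)
    results = []
    for frame in frames:
        new_outputs = []
        for k, block in enumerate(blocks):
            # Assumption: the first block reads the current frame directly.
            inp = frame if k == 0 else prev_outputs[k - 1]
            new_outputs.append(None if inp is None else block(inp))
        prev_outputs = new_outputs
        results.append(new_outputs[-1])  # output of the last block at this step
    return results


# Toy stand-ins for layer blocks.
blocks = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
outputs = run_pipeline([10, 20, 30, 40], blocks)
print(outputs)
```

The leading `None` entries reflect pipeline latency: with three blocks, the result for the first frame only emerges at the third time step.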

16.

Invention application
NEURAL NETWORK SYSTEMS FOR DECOMPOSING VIDEO DATA INTO LAYERED REPRESENTATIONS (In force)

Publication (announcement) No.: US20220012898A1

Publication (announcement) date: 2022-01-13

Application No.: US17295321

Application date: 2019-11-20

Applicant: DeepMind Technologies Limited

Inventor: Joao Carreira, Jean-Baptiste Alayrac, Andrew Zisserman

IPC: G06T7/215, G06N3/04, G06N3/08, G06K9/62

Abstract: A computer-implemented neural network system for decomposing input video data. A video data input receives a sequence of video image frames. The sequence is encoded, using a 3D spatio-temporal encoder neural network, into a set of latent variables representing a compressed version of the sequence. A 3D spatio-temporal decoder neural network processes the set of latent variables to generate two or more sets of decomposed video data; these may be stored, communicated, and/or made available to a user interface. Input video including undesired features such as reflections, shadows, and occlusions may thus be decomposed into two or more video sequences, one in which the undesired features are suppressed, and another containing the undesired features.

17.

Invention application
SAMPLING LATENT VARIABLES TO GENERATE MULTIPLE SEGMENTATIONS OF AN IMAGE (Under examination, published)

Publication (announcement) No.: US20200372654A1

Publication (announcement) date: 2020-11-26

Application No.: US16881775

Application date: 2020-05-22

Applicant: DeepMind Technologies Limited

Inventor: Simon Kohl, Bernardino Romera-Paredes, Danilo Jimenez Rezende, Seyed Mohammadali Eslami, Pushmeet Kohli, Andrew Zisserman, Olaf Ronneberger

IPC: G06T7/10, G06T7/00, A61B5/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a plurality of possible segmentations of an image. In one aspect, a method comprises: receiving a request to generate a plurality of possible segmentations of an image; sampling a plurality of latent variables from a latent space, wherein each latent variable is sampled from the latent space in accordance with a respective probability distribution over the latent space that is determined based on the image; generating a plurality of possible segmentations of the image, comprising, for each latent variable, processing the image and the latent variable using a segmentation neural network having a plurality of segmentation neural network parameters to generate the possible segmentation of the image; and providing the plurality of possible segmentations of the image in response to the request.
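The control flow of this method (derive a distribution from the image, sample several latents, run the segmentation model once per latent) can be sketched with toy stand-ins. Everything model-specific here is an assumption made for illustration: `prior` and `segment` are hypothetical placeholders for the learned networks, the latent is a single scalar, and the "image" is a flat list of intensities.

```python
import random


def prior(image):
    """Hypothetical stand-in for the image-conditioned distribution: (mean, std)."""
    mean = sum(image) / len(image)
    return mean, 0.1


def segment(image, z):
    """Hypothetical stand-in for the segmentation network: threshold set by latent z."""
    return [1 if pixel > z else 0 for pixel in image]


def sample_segmentations(image, n, seed=0):
    """Sample n latents from the image-conditioned prior; one segmentation per latent."""
    rng = random.Random(seed)
    mean, std = prior(image)
    return [segment(image, rng.gauss(mean, std)) for _ in range(n)]


image = [0.1, 0.4, 0.5, 0.6, 0.9]
segmentations = sample_segmentations(image, n=4)
for s in segmentations:
    print(s)
```

Different latent samples can yield different but individually coherent segmentations of the same image, which is the point of sampling rather than predicting a single output.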

18.

Invention grant
Action recognition in videos using 3D spatio-temporal convolutional neural networks (In force)

Publication (announcement) No.: US10789479B2

Publication (announcement) date: 2020-09-29

Application No.: US16681671

Application date: 2019-11-12

Applicant: DeepMind Technologies Limited

Inventor: Joao Carreira, Andrew Zisserman

IPC: G06K9/00, G06K9/46, G06K9/62, G06N3/04, G06N3/08, G06T7/269

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing video data. An example system receives video data and generates optical flow data. An image sequence from the video data is provided to a first 3D spatio-temporal convolutional neural network to process the image data in at least three space-time dimensions and to provide a first convolutional neural network output. A corresponding sequence of optical flow image frames is provided to a second 3D spatio-temporal convolutional neural network to process the optical flow data in at least three space-time dimensions and to provide a second convolutional neural network output. The first and second convolutional neural network outputs are combined to provide a system output.

19.

Invention publication
ANIMATING IMAGES USING POINT TRAJECTORIES (Under examination, published)

Publication (announcement) No.: US20240303897A1

Publication (announcement) date: 2024-09-12

Application No.: US18600552

Application date: 2024-03-08

Applicant: DeepMind Technologies Limited

Inventor: Carl Doersch, Yi Yang, Mel Vecerik, Dilara Gokay, Ankush Gupta, Yusuf Aytar, Joao Carreira, Andrew Zisserman

IPC: G06T13/80, G06T3/18, G06T7/70, G06T9/00

CPC classification number: G06T13/80, G06T3/18, G06T7/70, G06T9/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for animating images using point trajectories.

20.

Invention grant
Spatial transformer modules (In force)

Publication (announcement) No.: US11734572B2

Publication (announcement) date: 2023-08-22

Application No.: US16995307

Application date: 2020-08-17

Applicant: DeepMind Technologies Limited

Inventor: Maxwell Elliot Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu

IPC: G06N3/045, G06N3/088, G06N3/084, G06V10/44

CPC classification number: G06N3/084, G06N3/045, G06N3/088, G06V10/454

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using an image processing neural network system that includes a spatial transformer module. One of the methods includes receiving an input feature map derived from the one or more input images, and applying a spatial transformation to the input feature map to generate a transformed feature map, comprising: processing the input feature map to generate spatial transformation parameters for the spatial transformation, and sampling from the input feature map in accordance with the spatial transformation parameters to generate the transformed feature map.
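The sampling half of this method can be sketched as below. The sketch makes simplifying assumptions that do not come from the abstract: the transformation is a 2x3 affine matrix, it is supplied directly rather than predicted from the feature map, coordinates are in pixels rather than normalized, the feature map has a single channel, and out-of-range sample points are clamped to the border.

```python
def bilinear_sample(fmap, x, y):
    """Bilinearly interpolate fmap (list of rows) at a real-valued location (x, y)."""
    h, w = len(fmap), len(fmap[0])
    x = min(max(x, 0.0), w - 1.0)  # clamp to the border (assumed policy)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def spatial_transform(fmap, theta):
    """Resample fmap; theta = [[a, b, tx], [c, d, ty]] maps output coords to input coords."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for v in range(h):
        row = []
        for u in range(w):
            x = theta[0][0] * u + theta[0][1] * v + theta[0][2]
            y = theta[1][0] * u + theta[1][1] * v + theta[1][2]
            row.append(bilinear_sample(fmap, x, y))
        out.append(row)
    return out


fmap = [[1, 2, 3], [4, 5, 6]]
identity = spatial_transform(fmap, [[1, 0, 0], [0, 1, 0]])
shifted = spatial_transform(fmap, [[1, 0, 0.5], [0, 1, 0]])
print(identity)
print(shifted)
```

Because bilinear interpolation is piecewise linear in the sample coordinates, gradients can flow back to the transformation parameters, which is what lets such a module be trained end to end.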
