Patent search ap:("Google LLC") AND inv:"Yongqin Xian" Page 1

1.

Patent application
TEXT CONDITIONED VIDEO RESAMPLER FOR VIDEO UNDERSTANDING (legal status: in force)

Publication No.: US20250166379A1

Publication date: 2025-05-22

Application No.: US18949777

Filing date: 2024-11-15

Applicant: Google LLC

Inventor: Alessio Tonioni, Bruno Korbar, Federico Tombari, Andrew Zisserman, Yongqin Xian

IPC: G06V20/40, G06F40/35, G06V10/46

Abstract: Methods, systems, and apparatus for video understanding. In one aspect, a conditioned resampler model receives video features of multiple video frames of a video processed by a visual encoder and token embeddings for a specified task. The conditioned resampler model generates conditioned resampler embeddings according to the specified task in response to the video features and token embeddings provided as input. The conditioned resampler embeddings are provided to a large language model as input. The large language model generates, in response to the input conditioned resampler embeddings, a text response to the specified task.
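Illustrative sketch (not part of the patent record): the abstract describes a data flow from visual encoder features and task token embeddings, through a conditioned resampler, to a large language model. The minimal Python below shows one way such a resampler could produce a fixed number of task-conditioned embeddings. The abstract does not disclose the internal mechanism, so the cross-attention design, the learned `latents`, and all function names and sizes here are assumptions made purely for illustration; random vectors stand in for the visual encoder output and the embedded task prompt, and the language model itself is omitted.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    # Single-head scaled dot-product attention with no learned projections
    # (a simplification; a real model would use trained weight matrices).
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        weights = softmax([dot(q, c) / scale for c in context])
        out.append([sum(w * c[k] for w, c in zip(weights, context))
                    for k in range(len(q))])
    return out


def conditioned_resampler(video_features, token_embeddings, latents):
    # ASSUMPTION: conditioning is done by letting the latent queries attend
    # jointly over video features and task token embeddings. The abstract
    # only states that both are inputs to the resampler.
    context = video_features + token_embeddings
    return cross_attention(latents, context)


def random_matrix(rows, dim, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(rows)]


rng = random.Random(0)
DIM = 8
video_features = random_matrix(32, DIM, rng)   # stand-in for visual encoder output
token_embeddings = random_matrix(5, DIM, rng)  # stand-in for the embedded task prompt
latents = random_matrix(4, DIM, rng)           # hypothetical learned queries

resampled = conditioned_resampler(video_features, token_embeddings, latents)
print(len(resampled), len(resampled[0]))  # 4 8
```

In this sketch the output always has as many rows as there are latents, regardless of how many frame features go in, which is the property that would let the resulting embeddings be passed to a language model as a short, fixed-length input.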
