-
Publication No.: US12112538B2
Publication date: 2024-10-08
Application No.: US17370522
Filing date: 2021-07-08
Applicant: Google LLC
Inventor: Anurag Arnab, Mostafa Dehghani, Georg Heigold, Chen Sun, Mario Lucic, Cordelia Luise Schmid
Abstract: A computer-implemented method for classifying video data with improved accuracy includes obtaining, by a computing system comprising one or more computing devices, video data comprising a plurality of video frames; extracting, by the computing system, a plurality of video tokens from the video data, the plurality of video tokens comprising a representation of spatiotemporal information in the video data; providing, by the computing system, the plurality of video tokens as input to a video understanding model, the video understanding model comprising a video transformer encoder model; and receiving, by the computing system, a classification output from the video understanding model.
-
Publication No.: US20230017072A1
Publication date: 2023-01-19
Application No.: US17370522
Filing date: 2021-07-08
Applicant: Google LLC
Inventor: Anurag Arnab, Mostafa Dehghani, Georg Heigold, Chen Sun, Mario Lucic, Cordelia Luise Schmid
Abstract: A computer-implemented method for classifying video data with improved accuracy includes obtaining, by a computing system comprising one or more computing devices, video data comprising a plurality of video frames; extracting, by the computing system, a plurality of video tokens from the video data, the plurality of video tokens comprising a representation of spatiotemporal information in the video data; providing, by the computing system, the plurality of video tokens as input to a video understanding model, the video understanding model comprising a video transformer encoder model; and receiving, by the computing system, a classification output from the video understanding model.
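The four steps in the abstract (obtain frames, extract spatiotemporal tokens, run a transformer encoder, read out a classification) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the patent text: the non-overlapping block ("tubelet") tokenization, the block size, the single-head attention layer, the mean pooling, the random weights, and all function and parameter names.

```python
import numpy as np


def extract_video_tokens(video, tubelet=(2, 4, 4)):
    """Split a (T, H, W, C) video into non-overlapping spatiotemporal blocks.

    Each block spans several frames, so every token carries both spatial
    and temporal information. The tubelet shape is a hypothetical choice.
    """
    T, H, W, C = video.shape
    t, h, w = tubelet
    assert T % t == 0 and H % h == 0 and W % w == 0, "video must tile evenly"
    blocks = video.reshape(T // t, t, H // h, h, W // w, w, C)
    # Move the block-grid axes to the front, then flatten each block.
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5, 6)
    return blocks.reshape(-1, t * h * w * C)


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def encoder_layer(x, params):
    """One single-head self-attention layer with a residual connection."""
    q, k, v = x @ params["wq"], x @ params["wk"], x @ params["wv"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return x + attn @ v


def init_params(token_dim, model_dim, num_classes, seed=0):
    """Random, untrained weights; a real model would learn these."""
    rng = np.random.default_rng(seed)
    shapes = {
        "embed": (token_dim, model_dim),
        "wq": (model_dim, model_dim),
        "wk": (model_dim, model_dim),
        "wv": (model_dim, model_dim),
        "head": (model_dim, num_classes),
    }
    return {name: rng.normal(scale=0.02, size=shape)
            for name, shape in shapes.items()}


def classify_video(video, params):
    tokens = extract_video_tokens(video) @ params["embed"]  # (N, model_dim)
    encoded = encoder_layer(tokens, params)
    pooled = encoded.mean(axis=0)  # average over all tokens
    return softmax(pooled @ params["head"])  # class probabilities


# Toy input: 4 frames of 8x8 RGB, so 2*2*2 = 8 tokens of length 2*4*4*3 = 96.
video = np.arange(4 * 8 * 8 * 3, dtype=np.float64).reshape(4, 8, 8, 3) / 1000.0
params = init_params(token_dim=2 * 4 * 4 * 3, model_dim=16, num_classes=5)
probs = classify_video(video, params)
```

Because the weights are random, `probs` is only a shape-correct probability vector over five hypothetical classes; the sketch shows the data flow, not a trained classifier.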
-