LONG DURATION STRUCTURED VIDEO ACTION SEGMENTATION

Invention Publication

US20240104915A1 LONG DURATION STRUCTURED VIDEO ACTION SEGMENTATION Pending (published)

Patent title: LONG DURATION STRUCTURED VIDEO ACTION SEGMENTATION
Application number: US18459824

Filing date: 2023-09-01
Publication number: US20240104915A1

Publication date: 2024-03-28
Inventors: Anthony Daniel Rhodes, Byungsu Min, Subarna Tripathi, Giuseppe Raffa, Sovan Biswas
Applicant: Intel Corporation
Applicant address: Santa Clara, CA, US
Patentee: Intel Corporation
Current patentee: Intel Corporation
Current patentee address: Santa Clara, CA, US
Main classification: G06V10/82
IPC classifications: G06V10/82; G06V10/75; G06V10/86; G06V20/40

Abstract:

Machine learning models can process a video and generate outputs such as action segmentation assigning portions of the video to a particular action, or action classification assigning an action class for each frame of the video. Some machine learning models can accurately make predictions for short videos but may not be particularly suited for performing action segmentation for long duration, structured videos. An effective machine learning model may include a hybrid architecture involving a temporal convolutional network and a bi-directional graph neural network. The machine learning model can process long duration structured videos by using a temporal convolutional network as a first pass action segmentation model to generate rich, frame-wise features. The frame-wise features can be converted into a graph having forward edges and backward edges. A graph neural network can process the graph to refine a final fine-grain per-frame action prediction.
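The graph-construction step described in the abstract can be sketched as follows. This is a minimal illustration, not the method claimed in the application: the abstract does not say how frames are connected, so the fixed temporal `window`, the function names `build_temporal_graph` and `aggregate`, and the use of a plain mean in place of a learned graph neural network layer are all assumptions made for this example.

```python
from typing import List, Tuple

Edge = Tuple[int, int]  # (source frame index, target frame index)


def build_temporal_graph(num_frames: int, window: int = 2) -> Tuple[List[Edge], List[Edge]]:
    """Connect each frame node to the frames at most `window` steps later.

    Forward edges point from an earlier frame to a later one; backward edges
    point from a later frame to an earlier one. The window-based connectivity
    is an assumption for illustration only.
    """
    if num_frames < 0 or window < 1:
        raise ValueError("num_frames must be >= 0 and window must be >= 1")
    forward: List[Edge] = []
    backward: List[Edge] = []
    for i in range(num_frames):
        last = min(i + window, num_frames - 1)
        for j in range(i + 1, last + 1):
            forward.append((i, j))   # earlier -> later
            backward.append((j, i))  # later -> earlier
    return forward, backward


def aggregate(features: List[List[float]], edges: List[Edge]) -> List[List[float]]:
    """Average the source-node features arriving at each target node.

    A hypothetical stand-in for one message-passing step; a real graph neural
    network would apply learned weights. Nodes with no incoming edge get zeros.
    """
    dim = len(features[0]) if features else 0
    sums = [[0.0] * dim for _ in features]
    counts = [0] * len(features)
    for src, dst in edges:
        counts[dst] += 1
        for k in range(dim):
            sums[dst][k] += features[src][k]
    return [[s / c for s in row] if c else row for row, c in zip(sums, counts)]


# Toy frame-wise features, standing in for the first-pass model's output.
frame_features = [[1.0], [2.0], [3.0], [4.0]]
fwd_edges, bwd_edges = build_temporal_graph(len(frame_features), window=2)
past_context = aggregate(frame_features, fwd_edges)    # information from earlier frames
future_context = aggregate(frame_features, bwd_edges)  # information from later frames
```

Keeping the two edge sets separate lets a model treat past and future context differently, which is one plausible reading of "forward edges and backward edges" in the abstract.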

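IPC classification:

G	Physics
G06	Computing; calculating or counting
G06V	Image or video recognition or understanding
G06V10/00	Arrangements for image or video recognition or understanding (character recognition in images or video G06V30/10)
G06V10/70	.using pattern recognition or machine learning (optical pattern recognition or electronic computation G06V10/88)
G06V10/82	..using neural networks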