-
Publication (Announcement) No.: US20230141037A1
Publication (Announcement) Date: 2023-05-11
Application No.: US17590379
Application Date: 2022-02-01
Applicant: Honda Motor Co., Ltd.
Inventor: Reza GHODDOOSIAN, Isht DWIVEDI, Nakul AGARWAL, Chiho CHOI, Behzad DARIUSH
IPC: G06V10/778, G06V10/26, G06V20/40, G06V40/20, G06V10/82
CPC classification number: G06V10/7792, G06V10/26, G06V20/49, G06V40/20, G06V10/82
Abstract: A system and method for providing weakly-supervised online action segmentation that include receiving image data associated with multi-view videos of a procedure, wherein the procedure involves a plurality of atomic actions. The system and method also include analyzing the image data using weakly-supervised action segmentation to identify each of the plurality of atomic actions by using an ordered sequence of action labels. The system and method additionally include training a neural network with data pertaining to the plurality of atomic actions based on the weakly-supervised action segmentation. The system and method further include executing online action segmentation to label atomic actions that are occurring in real-time based on the plurality of atomic actions trained to the neural network.
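
The abstract says the weak supervision comes from an ordered sequence of action labels rather than per-frame annotations. A common way to use such a sequence is to align it to the frames with dynamic programming; the sketch below illustrates that general idea only and is not taken from the patent. The function name `align_to_transcript`, the per-frame log-score input, and the rule that every listed action covers at least one frame are all assumptions made for illustration.

```python
import math


def align_to_transcript(log_probs, transcript):
    """Assign one label per frame so the labels follow `transcript` in order.

    log_probs:  list of T rows; row t holds a log-score per action class
                (hypothetical output of some frame-wise classifier).
    transcript: ordered list of action class indices (the weak label).
    Returns a list of T labels maximizing the summed log-score, under the
    assumption that each transcript entry covers at least one frame.
    """
    T, N = len(log_probs), len(transcript)
    if T < N:
        raise ValueError("need at least one frame per transcript action")
    NEG = -math.inf
    # best[t][n]: best score for frames 0..t with frame t at transcript[n]
    best = [[NEG] * N for _ in range(T)]
    back = [[0] * N for _ in range(T)]
    best[0][0] = log_probs[0][transcript[0]]
    for t in range(1, T):
        for n in range(N):
            stay = best[t - 1][n]                       # same action continues
            advance = best[t - 1][n - 1] if n > 0 else NEG  # next action starts
            if advance > stay:
                prev, src = advance, n - 1
            else:
                prev, src = stay, n
            if prev == NEG:
                continue  # this position is unreachable at frame t
            best[t][n] = prev + log_probs[t][transcript[n]]
            back[t][n] = src
    # Backtrack from the last frame, which must sit on the last action.
    labels = [0] * T
    n = N - 1
    for t in range(T - 1, -1, -1):
        labels[t] = transcript[n]
        n = back[t][n]
    return labels
```

For example, with five frames whose scores favor actions 0, 0, 2, 1, 1 and the transcript `[0, 2, 1]`, the function returns one label per frame in transcript order. Labels produced this way could then serve as training targets for a frame-wise network, which is one plausible reading of the training step the abstract mentions.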
-
Publication (Announcement) No.: US20240371166A1
Publication (Announcement) Date: 2024-11-07
Application No.: US18308542
Application Date: 2023-04-27
Applicant: Honda Motor Co., Ltd.
Inventor: Reza GHODDOOSIAN, Isht DWIVEDI, Nakul AGARWAL, Behzad DARIUSH
IPC: G06V20/40, G06V10/774
Abstract: According to one aspect, weakly-supervised action segmentation may include performing feature extraction to extract one or more features associated with a current frame of a video including a series of one or more actions, feeding one or more of the features to a recognition network to generate a predicted action score for the current frame of the video, feeding one or more of the features and the predicted action score to an action transition model to generate a potential subsequent action, feeding the potential subsequent action and the predicted action score to a hybrid segmentation model to generate a predicted sequence of actions from a first frame of the video to the current frame of the video, and segmenting or labeling one or more frames of the video based on the predicted sequence of actions from the first frame of the video to the current frame of the video.
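
The per-frame data flow in this abstract (features, then a predicted action score, then a potential subsequent action, then an updated action sequence) can be traced with a toy sketch. Every function body below is a hypothetical stand-in chosen only to make the flow runnable: the abstract does not disclose how the recognition network, action transition model, or hybrid segmentation model work internally, and the class name `OnlineSegmenter` is likewise an assumption.

```python
import math


def extract_features(frame):
    # Stand-in: treat the raw frame (a list of floats) as its own features.
    return list(frame)


def recognition_network(features):
    # Stand-in: softmax over the features, read as per-action scores.
    m = max(features)
    exps = [math.exp(x - m) for x in features]
    total = sum(exps)
    return [e / total for e in exps]


def action_transition_model(features, scores, current):
    # Stand-in: propose the best-scoring action other than the current one.
    # (`features` is accepted to mirror the abstract's inputs but unused here.)
    candidates = [a for a in range(len(scores)) if a != current]
    return max(candidates, key=lambda a: scores[a])


def hybrid_segmentation_model(sequence, candidate, scores):
    # Stand-in: extend the sequence when the candidate outscores the
    # action that is currently last in the sequence.
    if not sequence or scores[candidate] > scores[sequence[-1]]:
        return sequence + [candidate]
    return sequence


class OnlineSegmenter:
    """Keeps the predicted action sequence from the first frame to now."""

    def __init__(self):
        self.sequence = []      # ordered actions predicted so far
        self.frame_labels = []  # one label per processed frame

    def step(self, frame):
        features = extract_features(frame)
        scores = recognition_network(features)
        current = self.sequence[-1] if self.sequence else None
        candidate = action_transition_model(features, scores, current)
        self.sequence = hybrid_segmentation_model(
            self.sequence, candidate, scores
        )
        self.frame_labels.append(self.sequence[-1])
        return self.frame_labels[-1]
```

Calling `step` once per incoming frame labels that frame using only the frames seen so far, which is the online property the abstract describes; a real system would replace each stand-in with a trained model.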
-