-
Publication No.: US11669743B2
Publication Date: 2023-06-06
Application No.: US16874478
Filing Date: 2020-05-14
Applicant: Niamul Quader, Juwei Lu, Peng Dai, Wei Li
Inventor: Niamul Quader, Juwei Lu, Peng Dai, Wei Li
IPC Classification: H04N21/462, G06K9/62, H04N21/466, G06N3/08, G06V20/40, G06N3/04, H04N21/4402, G06F9/50
CPC Classification: H04N21/4621, G06F9/5055, G06K9/6277, G06N3/0454, G06N3/08, G06V20/41, H04N21/440227, H04N21/4666, G06V20/44
Abstract: An adaptive action recognizer for video that performs multiscale spatiotemporal decomposition of video to generate lower complexity video. The adaptive action recognizer has a number of processing pathways, one for each level of video complexity, with each processing pathway having a different computational cost. The adaptive action recognizer applies a decision making scheme that encourages using low average computational costs while retaining high accuracy.
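
The abstract above does not spell out the decision making scheme, so the following is only a minimal sketch under assumptions: the decomposition is modeled as plain temporal and spatial striding, and the scheme is modeled as a confidence-threshold early exit that tries the cheapest pathway first. The names `downsample`, `recognize`, and `threshold` are hypothetical and do not come from the patent.

```python
from typing import Callable, List, Tuple

Frame = List[List[float]]          # a frame as rows of pixel values
Video = List[Frame]                # a video as a list of frames
Pathway = Callable[[Video], Tuple[str, float]]  # returns (label, confidence)


def downsample(video: Video, t_stride: int, s_stride: int) -> Video:
    """Assumed decomposition: keep every t_stride-th frame and every
    s_stride-th pixel along each spatial axis."""
    return [
        [row[::s_stride] for row in frame[::s_stride]]
        for frame in video[::t_stride]
    ]


def recognize(
    video: Video,
    levels: List[Tuple[int, int, Pathway]],
    threshold: float = 0.8,
) -> Tuple[str, float, int]:
    """Run pathways from cheapest to most expensive and stop at the first
    one whose confidence reaches the (hypothetical) threshold.

    Returns (label, confidence, index of the pathway that answered).
    """
    result = ("unknown", 0.0, -1)
    for index, (t_stride, s_stride, pathway) in enumerate(levels):
        label, confidence = pathway(downsample(video, t_stride, s_stride))
        result = (label, confidence, index)
        if confidence >= threshold:
            break  # assumed early exit: skip the costlier pathways
    return result
```

Under this assumption, easy videos are answered by the low-cost pathway and only hard ones reach the full-resolution pathway, which is one way to keep the average cost low.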
-
Publication No.: US11902548B2
Publication Date: 2024-02-13
Application No.: US17203613
Filing Date: 2021-03-16
Applicant: Deepak Sridhar, Niamul Quader, Srikanth Muralidharan, Yaoxin Li, Juwei Lu, Peng Dai
Inventor: Deepak Sridhar, Niamul Quader, Srikanth Muralidharan, Yaoxin Li, Juwei Lu, Peng Dai
Abstract: Systems, methods, and computer media of processing a video are disclosed. An example method may include: receiving a plurality of video frames of a video; generating a plurality of first input features based on the plurality of video frames; generating a plurality of second input features based on reversing a temporal order of the plurality of first input features; generating a first set of joint attention features based on the plurality of first input features; generating a second set of joint attention features based on the plurality of second input features; and concatenating the first set of joint attention features and the second set of joint attention features to generate a final set of joint attention features.
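
The steps listed in the abstract above can be sketched as follows. This is an illustration under assumptions, not the patented method: the abstract does not define how joint attention features are computed, so `causal_attention` is a hypothetical, order-sensitive stand-in, and concatenation along the sequence axis is likewise assumed.

```python
import math
from typing import Callable, List

Features = List[List[float]]  # one feature vector per time step


def causal_attention(feats: Features) -> Features:
    """Hypothetical stand-in for the attention step: each time step
    attends to itself and to earlier steps with softmax weights."""
    out: Features = []
    for t, query in enumerate(feats):
        past = feats[: t + 1]
        scale = math.sqrt(len(query))
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in past]
        peak = max(scores)  # subtract the maximum for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * value[d] for w, value in zip(weights, past)) / total
            for d in range(len(query))
        ])
    return out


def bidirectional_features(
    first: Features,
    attend: Callable[[Features], Features] = causal_attention,
) -> Features:
    """Attend over the features in forward and reversed temporal order,
    then concatenate the two result sets (assumed: along the sequence)."""
    second = first[::-1]  # second input features: reversed temporal order
    return attend(first) + attend(second)
```

Because the stand-in only looks backward in time, the reversed pass sees each step in the context of the frames that follow it, so the two halves of the result carry different information.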
-
Publication No.: US11698926B2
Publication Date: 2023-07-11
Application No.: US17524862
Filing Date: 2021-11-12
Applicant: Arnab Kumar Mondal, Deepak Sridhar, Niamul Quader, Juwei Lu, Peng Dai, Chao Xing
Inventor: Arnab Kumar Mondal, Deepak Sridhar, Niamul Quader, Juwei Lu, Peng Dai, Chao Xing
IPC Classification: G06F16/30, G06F16/732, G06N3/04, G06F16/783, G06V20/40
CPC Classification: G06F16/7343, G06F16/783, G06N3/04, G06V20/40
Abstract: Methods and systems are described for performing video retrieval together with video grounding. A word-based query for a video is received and encoded into a query representation using a trained query encoder. One or more similar video representations that are similar to the query representation are identified from a plurality of video representations. Each similar video representation represents a respective relevant video. A grounding is generated for each relevant video by forward propagating each respective similar video representation together with the query representation through a trained grounding module. The relevant videos or identifiers of the relevant videos are outputted together with the grounding generated for each relevant video.
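
The retrieval-then-grounding flow in the abstract above can be sketched as follows. Everything concrete here is an assumption: similarity is taken to be cosine similarity, the trained query encoder and grounding module are passed in as plain callables, and the names `retrieve_and_ground` and `top_k` are hypothetical.

```python
import math
from typing import Any, Callable, Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_and_ground(
    query: str,
    encode_query: Callable[[str], Vector],    # stands in for the trained query encoder
    video_reps: Dict[str, Vector],            # video identifier -> video representation
    ground: Callable[[Vector, Vector], Any],  # stands in for the trained grounding module
    top_k: int = 2,
) -> List[Tuple[str, Any]]:
    """Encode the query, pick the top_k most similar video representations,
    and pair each video identifier with the grounding produced for it."""
    query_rep = encode_query(query)
    ranked = sorted(
        video_reps,
        key=lambda vid: cosine(query_rep, video_reps[vid]),
        reverse=True,
    )
    return [(vid, ground(video_reps[vid], query_rep)) for vid in ranked[:top_k]]
```

Returning identifiers paired with groundings mirrors the final step of the abstract, where the relevant videos or their identifiers are output together with the grounding for each.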