-
Publication No.: US20240144489A1
Publication Date: 2024-05-02
Application No.: US18480127
Filing Date: 2023-10-03
Applicant: VIETTEL GROUP
Inventors: Hong Dang Nguyen, Thi Hanh Vu, Manh Quy Nguyen
CPC Classification: G06T7/248, G06V10/82, G06V20/46, G06T2207/10016
Abstract: A method for multi-object tracking from video. The method includes the following steps: (1) capturing frames from a streaming source and preprocessing the data; (2) extracting video features using one of three options: a 3D-CNN backbone followed by a Transformer Encoder, a Video Transformer Encoder, or a 2D-CNN Encoder that takes a stack of frames as input, followed by a Transformer Encoder; (3) multi-object tracking using a new end-to-end multi-task deep learning model named JDAT (Joint Detection Association Transformer), then post-processing and updating the tracking state with a Temporal Aggregation Module (TAM). The deep learning models in steps 2 and 3 are trained simultaneously end-to-end with a loss function that is accumulated over multiple timesteps (Collective Average Loss, CAL). The model can also first be pretrained on a weakly labeled image dataset in a self-supervised manner, then fine-tuned on supervised video datasets with full tracking labels.
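The abstract says only that the training loss is accumulated over multiple timesteps; it does not give the formula. As a rough illustration, the sketch below assumes each timestep contributes a detection loss plus a weighted association loss and that the result is the plain mean over timesteps. The function name `collective_average_loss`, its parameters, and the `assoc_weight` factor are hypothetical and not taken from the patent.

```python
from typing import Sequence


def collective_average_loss(
    det_losses: Sequence[float],
    assoc_losses: Sequence[float],
    assoc_weight: float = 1.0,
) -> float:
    """Mean of per-timestep multi-task losses (assumed form, not the patent's)."""
    if not det_losses or len(det_losses) != len(assoc_losses):
        raise ValueError("need equal-length, non-empty loss sequences")
    total = 0.0
    # Accumulate the combined loss of every timestep in the clip.
    for det, assoc in zip(det_losses, assoc_losses):
        total += det + assoc_weight * assoc
    # Normalize by the number of timesteps so clip length does not scale the loss.
    return total / len(det_losses)
```

In an actual training loop the per-timestep losses would be tensors and the averaged value would be backpropagated once per clip, so that the feature extractor and the tracking model receive gradients from all timesteps together.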
-
Publication No.: US20240144724A1
Publication Date: 2024-05-02
Application No.: US18476745
Filing Date: 2023-09-28
Applicant: VIETTEL GROUP
Inventors: Hong Phuc Vu, Thi Hanh Vu, Hong Dang Nguyen, Manh Quy Nguyen
Abstract: This invention proposes a method for detecting abnormal crowd behavior from video using artificial intelligence. The method includes three steps: step 1: data preprocessing; step 2: feature extraction and abnormality prediction using a three-dimensional convolutional neural network (3D CNN); step 3: post-processing and synthesizing information to issue a warning.
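The abstract does not say how step 3 combines predictions before issuing a warning. One common approach, shown here purely as an assumption, is to smooth the per-clip abnormality scores with a moving average and raise a warning when the average exceeds a threshold. The function name `warning_indices`, the `window` size, and the `threshold` value are all hypothetical.

```python
from collections import deque
from typing import Iterable, List


def warning_indices(
    scores: Iterable[float], window: int = 3, threshold: float = 0.5
) -> List[int]:
    """Return clip indices where the moving-average score exceeds the threshold."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # holds the last `window` scores
    alerts: List[int] = []
    for index, score in enumerate(scores):
        recent.append(score)
        # Only decide once a full window of scores is available.
        if len(recent) == window and sum(recent) / window > threshold:
            alerts.append(index)
    return alerts
```

Smoothing of this kind suppresses isolated high scores from a single clip, at the cost of delaying the warning by up to `window - 1` clips.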
-