-
Publication No.: US10713491B2
Publication Date: 2020-07-14
Application No.: US16047362
Filing Date: 2018-07-27
Applicant: Google LLC
Inventor: Menglong Zhu, Mason Liu
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing object detection. In one aspect, a method includes receiving multiple video frames. The video frames are sequentially processed using an object detection neural network to generate an object detection output for each video frame. The object detection neural network includes a convolutional neural network layer and a recurrent neural network layer. For each video frame after an initial video frame, processing the video frame using the object detection neural network includes generating a spatial feature map for the video frame using the convolutional neural network layer and generating a spatio-temporal feature map for the video frame using the recurrent neural network layer.
-
Publication No.: US20220189170A1
Publication Date: 2022-06-16
Application No.: US17432221
Filing Date: 2019-02-22
Applicant: Google LLC
Inventor: Menglong Zhu, Mason Liu, Marie Charisse White, Dmitry Kalenichenko, Yinxiao Li
IPC: G06V20/40, G06V10/70, G06V10/80, G06V10/82, G06V10/94, G06V10/776, G06V10/774
Abstract: Systems and methods for detecting objects in a video are provided. A method can include inputting a video comprising a plurality of frames into an interleaved object detection model comprising a plurality of feature extractor networks and a shared memory layer. For each of one or more frames, the operations can include selecting one of the plurality of feature extractor networks to analyze the one or more frames, analyzing the one or more frames by the selected feature extractor network to determine one or more features of the one or more frames, determining an updated set of features based at least in part on the one or more features and one or more previously extracted features extracted from a previous frame stored in the shared memory layer, and detecting an object in the one or more frames based at least in part on the updated set of features.
-
Publication No.: US20200034627A1
Publication Date: 2020-01-30
Application No.: US16047362
Filing Date: 2018-07-27
Applicant: Google LLC
Inventor: Menglong Zhu, Mason Liu
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing object detection. In one aspect, a method includes receiving multiple video frames. The video frames are sequentially processed using an object detection neural network to generate an object detection output for each video frame. The object detection neural network includes a convolutional neural network layer and a recurrent neural network layer. For each video frame after an initial video frame, processing the video frame using the object detection neural network includes generating a spatial feature map for the video frame using the convolutional neural network layer and generating a spatio-temporal feature map for the video frame using the recurrent neural network layer.
-
Publication No.: US20240212347A1
Publication Date: 2024-06-27
Application No.: US18603946
Filing Date: 2024-03-13
Applicant: Google LLC
Inventor: Dmitry Kalenichenko, Menglong Zhu, Marie Charisse White, Mason Liu, Yinxiao Li
IPC: G06V20/40, G06V10/70, G06V10/774, G06V10/776, G06V10/80, G06V10/82, G06V10/94
CPC classification number: G06V20/40, G06V10/774, G06V10/776, G06V10/806, G06V10/82, G06V10/87, G06V10/955, G06V20/46
Abstract: Systems and methods for detecting objects in a video are provided. A method can include inputting a video comprising a plurality of frames into an interleaved object detection model comprising a plurality of feature extractor networks and a shared memory layer. For each of one or more frames, the operations can include selecting one of the plurality of feature extractor networks to analyze the one or more frames, analyzing the one or more frames by the selected feature extractor network to determine one or more features of the one or more frames, determining an updated set of features based at least in part on the one or more features and one or more previously extracted features extracted from a previous frame stored in the shared memory layer, and detecting an object in the one or more frames based at least in part on the updated set of features.
-
Publication No.: US11961298B2
Publication Date: 2024-04-16
Application No.: US17432221
Filing Date: 2019-02-22
Applicant: Google LLC
Inventor: Menglong Zhu, Mason Liu, Marie Charisse White, Dmitry Kalenichenko, Yinxiao Li
IPC: G06V10/00, G06V10/70, G06V10/774, G06V10/776, G06V10/80, G06V10/82, G06V10/94, G06V20/40
CPC classification number: G06V20/40, G06V10/774, G06V10/776, G06V10/806, G06V10/82, G06V10/87, G06V10/955, G06V20/46
Abstract: Systems and methods for detecting objects in a video are provided. A method can include inputting a video comprising a plurality of frames into an interleaved object detection model comprising a plurality of feature extractor networks and a shared memory layer. For each of one or more frames, the operations can include selecting one of the plurality of feature extractor networks to analyze the one or more frames, analyzing the one or more frames by the selected feature extractor network to determine one or more features of the one or more frames, determining an updated set of features based at least in part on the one or more features and one or more previously extracted features extracted from a previous frame stored in the shared memory layer, and detecting an object in the one or more frames based at least in part on the updated set of features.
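Illustrative sketch (not taken from the patent text): the per-frame loop described in the abstract above can be outlined as select an extractor, extract features, fuse them with the features held in shared memory, then detect. Everything named here is a hypothetical stand-in chosen for illustration: `large_extractor` and `small_extractor` replace the feature extractor networks, `SharedMemory` uses a simple exponential blend in place of a learned recurrent layer, the fixed-interval rule in `select_extractor` is only one possible selection policy, and `detect` is a plain threshold rather than a detection head.

```python
from typing import List, Optional, Sequence

Features = List[float]


def large_extractor(frame: Sequence[float]) -> Features:
    """Hypothetical expensive extractor: keeps full per-position detail."""
    return [float(x) for x in frame]


def small_extractor(frame: Sequence[float]) -> Features:
    """Hypothetical cheap extractor: only a coarse summary (the frame mean)."""
    mean = sum(frame) / len(frame)
    return [mean] * len(frame)


class SharedMemory:
    """Assumed stand-in for the shared memory layer: an exponential blend
    of the stored features with the newly extracted ones."""

    def __init__(self, decay: float = 0.5) -> None:
        self.decay = decay
        self.state: Optional[Features] = None

    def update(self, features: Features) -> Features:
        if self.state is None:
            # First frame: nothing stored yet, so adopt the features as-is.
            self.state = list(features)
        else:
            self.state = [
                self.decay * old + (1.0 - self.decay) * new
                for old, new in zip(self.state, features)
            ]
        return self.state


def select_extractor(frame_index: int, interval: int = 3) -> int:
    """Assumed policy: index 0 (large) every `interval` frames, else 1 (small)."""
    return 0 if frame_index % interval == 0 else 1


def detect(features: Features, threshold: float = 0.4) -> List[int]:
    """Toy detector: positions whose fused feature exceeds the threshold."""
    return [i for i, value in enumerate(features) if value > threshold]


def run_interleaved(frames: Sequence[Sequence[float]], interval: int = 3) -> List[List[int]]:
    extractors = [large_extractor, small_extractor]
    memory = SharedMemory()
    outputs = []
    for index, frame in enumerate(frames):
        extractor = extractors[select_extractor(index, interval)]
        fused = memory.update(extractor(frame))
        outputs.append(detect(fused))
    return outputs


if __name__ == "__main__":
    frames = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
    print(run_interleaved(frames))  # [[0], [0], []]
```

In this toy run the detection at position 0 persists on the second frame only because the memory still holds features from the first frame, then fades once the stored value drops below the threshold.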