VIDEO PREDICTION USING SPATIALLY DISPLACED CONVOLUTION

    公开(公告)号:US20190297326A1

    公开(公告)日:2019-09-26

    申请号:US16360853

    申请日:2019-03-21

    Abstract: A neural network architecture is disclosed for performing video frame prediction using a sequence of video frames and corresponding pairwise optical flows. The neural network processes the sequence of video frames and optical flows utilizing three-dimensional convolution operations, where time (or multiple video frames in the sequence of video frames) provides the third dimension in addition to the two-dimensional pixel space of the video frames. The neural network generates a set of parameters used to predict a next video frame in the sequence of video frames by sampling a previous video frame utilizing spatially-displaced convolution operations. In one embodiment, the set of parameters includes a displacement vector and at least one convolution kernel per pixel. Generating a pixel value in the next video frame includes applying the convolution kernel to a corresponding patch of pixels in the previous video frame based on the displacement vector.

Patent Agency Ranking