IMAGE PROCESSING USING COUPLED SEGMENTATION AND EDGE LEARNING

    Publication No.: US20230015989A1

    Publication Date: 2023-01-19

    Application No.: US17365877

    Filing Date: 2021-07-01

    Abstract: The disclosure provides a learning framework that unifies semantic segmentation and semantic edge detection. A learnable recurrent message-passing layer is disclosed in which semantic edges act as explicitly learned gating signals that refine segmentation and improve dense-prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image; (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map with a single backbone network of a convolutional neural network (CNN); and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, with the smoothing controlled by both affinity values from the affinity map and edge values from the semantic edge map.
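The core idea of step (3) can be illustrated with a minimal sketch of edge-gated spatial propagation on a single row of the feature map. This is not the patented implementation (which uses a learnable recurrent layer inside a CNN); the function name, the single left-to-right pass, and the gating formula `affinity * (1 - edge)` are simplifying assumptions chosen to show how an edge value of 1 closes the gate and stops smoothing across a semantic boundary.

```python
def edge_gated_propagation(x, affinity, edge):
    """One left-to-right pass of edge-gated spatial propagation (illustrative only).

    x        : list of feature values (one row of the semantic feature map)
    affinity : per-pixel affinity to the left neighbor, in [0, 1]
    edge     : per-pixel semantic-edge probability, in [0, 1]

    The effective propagation weight is affinity * (1 - edge): a strong
    edge closes the gate, so smoothing does not bleed across the boundary.
    """
    h = [float(x[0])]
    for i in range(1, len(x)):
        w = affinity[i] * (1.0 - edge[i])          # gated message weight
        h.append((1.0 - w) * x[i] + w * h[i - 1])  # blend pixel with propagated value
    return h
```

With an edge marked at a boundary pixel, the two flat regions on either side stay intact; with the edge map zeroed out, the left region's values bleed into the right one.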

    VIDEO PREDICTION USING SPATIALLY DISPLACED CONVOLUTION

    Publication No.: US20190297326A1

    Publication Date: 2019-09-26

    Application No.: US16360853

    Filing Date: 2019-03-21

    Abstract: A neural network architecture is disclosed for performing video frame prediction using a sequence of video frames and corresponding pairwise optical flows. The neural network processes the sequence of video frames and optical flows utilizing three-dimensional convolution operations, where time (or multiple video frames in the sequence of video frames) provides the third dimension in addition to the two-dimensional pixel space of the video frames. The neural network generates a set of parameters used to predict a next video frame in the sequence of video frames by sampling a previous video frame utilizing spatially-displaced convolution operations. In one embodiment, the set of parameters includes a displacement vector and at least one convolution kernel per pixel. Generating a pixel value in the next video frame includes applying the convolution kernel to a corresponding patch of pixels in the previous video frame based on the displacement vector.
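The sampling step described in the last sentence can be sketched for a single output pixel. This is a simplified, assumed formulation (integer displacements, border clamping, plain Python lists), not the patent's implementation: the kernel is applied to the patch of the previous frame centered at the location shifted by that pixel's displacement vector.

```python
def sdc_predict_pixel(prev, y, x, disp, kernel):
    """Predict one output pixel via spatially-displaced convolution (illustrative only).

    prev   : previous frame as a 2-D list of pixel values
    (y, x) : output pixel location
    disp   : per-pixel displacement vector (dy, dx), integers for simplicity
    kernel : k x k per-pixel convolution kernel
    """
    k = len(kernel)
    r = k // 2
    cy, cx = y + disp[0], x + disp[1]  # displaced patch center in the previous frame
    val = 0.0
    for i in range(k):
        for j in range(k):
            py = min(max(cy + i - r, 0), len(prev) - 1)     # clamp rows to the border
            px = min(max(cx + j - r, 0), len(prev[0]) - 1)  # clamp columns to the border
            val += kernel[i][j] * prev[py][px]
    return val
```

With an identity kernel this reduces to pure warping by the displacement vector; a non-trivial kernel lets the network blend a neighborhood around the displaced location, which is what distinguishes the approach from plain optical-flow warping.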

    INVERSE RENDERING OF A SCENE FROM A SINGLE IMAGE

    Publication No.: US11295514B2

    Publication Date: 2022-04-05

    Application No.: US16685538

    Filing Date: 2019-11-15

    Abstract: Inverse rendering estimates physical scene attributes (e.g., reflectance, geometry, and lighting) from one or more images and is used for gaming, virtual reality, augmented reality, and robotics. An inverse rendering network (IRN) receives a single input image of a 3D scene and generates the physical scene attributes for the image. The IRN is trained by using the estimated physical scene attributes generated by the IRN to reproduce the input image and updating parameters of the IRN to reduce differences between the reproduced input image and the input image. A direct renderer and a residual appearance renderer (RAR) together reproduce the input image. The RAR predicts a residual image representing complex appearance effects of the real (not synthetic) image based on features extracted from the image and the reflectance and geometry properties. The residual image captures near-field illumination, cast shadows, inter-reflections, and realistic shading that are not provided by the direct renderer.
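The self-supervised training signal described above can be sketched numerically. This is a toy stand-in, not the patent's networks: the "direct renderer" is reduced to per-pixel reflectance times shading, and the RAR's output is represented as a given residual list, so the only point shown is that the reconstruction loss (reproduced image vs. input image) goes to zero when direct rendering plus residual matches the input.

```python
def direct_render(reflectance, shading):
    # Toy direct renderer: per-pixel reflectance times shading (Lambertian-style)
    return [r * s for r, s in zip(reflectance, shading)]

def reconstruct(reflectance, shading, residual):
    # Reproduced image = direct rendering + residual appearance (RAR output)
    return [d + e for d, e in zip(direct_render(reflectance, shading), residual)]

def reconstruction_loss(image, reflectance, shading, residual):
    # Mean absolute difference between the reproduced image and the input image;
    # in training, this is the quantity reduced by updating the IRN's parameters
    recon = reconstruct(reflectance, shading, residual)
    return sum(abs(a - b) for a, b in zip(recon, image)) / len(image)
```

The residual term is what lets the reconstruction account for cast shadows and inter-reflections that a simple direct renderer cannot produce: with the residual zeroed out, the same attributes leave a nonzero loss.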
