Budget-aware method for detecting activity in video

    Publication Number: US10860859B2

    Publication Date: 2020-12-08

    Application Number: US16202703

    Filing Date: 2018-11-28

    Abstract: Detection of activity in video content, and more particularly detecting the start and end frames of an activity in a video along with a classification of that activity, is fundamental for video analytics, including categorizing, searching, indexing, segmentation, and retrieval of videos. Existing activity detection processes rely on a large set of features and classifiers that exhaustively run over every time step of a video at multiple temporal scales or, as a modest computational improvement, first propose segments of the video on which to perform classification. These existing activity detection processes, however, are computationally expensive, particularly when trying to achieve high detection accuracy, and moreover cannot be configured for a particular time or computation budget. The present disclosure provides a time- and/or computation-budget-aware method for detecting activity in video that relies on a recurrent neural network implementing a learned policy.
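
    A minimal sketch of the budget-aware idea, assuming a PyTorch-style GRU policy; the class and parameter names (BudgetPolicy, feature_dim, budget, etc.) are illustrative, not taken from the patent. The learned policy observes only a fixed number of frames, decides at each step where to look next, and then emits candidate start/end frames and an activity class:

        # Illustrative sketch, not the patented implementation: a recurrent
        # policy that inspects a fixed "budget" of frames, chooses where to
        # look next at each step, and finally predicts activity bounds and
        # a class label.
        import torch
        import torch.nn as nn

        class BudgetPolicy(nn.Module):
            def __init__(self, feature_dim=512, hidden_dim=256, num_classes=20):
                super().__init__()
                self.rnn = nn.GRUCell(feature_dim, hidden_dim)
                self.next_loc = nn.Linear(hidden_dim, 1)  # where to sample next (normalized time)
                self.bounds = nn.Linear(hidden_dim, 2)    # predicted start/end (normalized time)
                self.cls = nn.Linear(hidden_dim, num_classes)

            def forward(self, frame_features, budget=6):
                # frame_features: (num_frames, feature_dim) precomputed per-frame features
                num_frames = frame_features.shape[0]
                h = torch.zeros(1, self.rnn.hidden_size)
                t = torch.tensor([0.5])                   # start by observing the middle frame
                for _ in range(budget):                   # only `budget` frames are ever observed
                    idx = int(t.clamp(0, 1).item() * (num_frames - 1))
                    h = self.rnn(frame_features[idx:idx + 1], h)
                    t = torch.sigmoid(self.next_loc(h)).squeeze(0)  # policy: next location
                start_end = torch.sigmoid(self.bounds(h)).squeeze(0)
                return start_end, self.cls(h).squeeze(0)

        policy = BudgetPolicy()
        features = torch.randn(300, 512)                  # e.g., CNN features for a 300-frame clip
        start_end, logits = policy(features, budget=6)
        print(start_end, logits.argmax())

    Because the computation grows with the budget rather than with the video length, the same model can be run under different time or computation budgets at inference.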

    INVERSE RENDERING OF A SCENE FROM A SINGLE IMAGE

    Publication Number: US20200160593A1

    Publication Date: 2020-05-21

    Application Number: US16685538

    Filing Date: 2019-11-15

    Abstract: Inverse rendering estimates physical scene attributes (e.g., reflectance, geometry, and lighting) from image(s) and is used for gaming, virtual reality, augmented reality, and robotics. An inverse rendering network (IRN) receives a single input image of a 3D scene and generates the physical scene attributes for the image. The IRN is trained by using the estimated physical scene attributes generated by the IRN to reproduce the input image and updating parameters of the IRN to reduce differences between the reproduced input image and the input image. A direct renderer and a residual appearance renderer (RAR) reproduce the input image. The RAR predicts a residual image representing complex appearance effects of the real (not synthetic) image based on features extracted from the image and the reflectance and geometry properties. The residual image represents near-field illumination, cast shadows, inter-reflections, and realistic shading that are not provided by the direct renderer.
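
    A hedged sketch of the training signal described above: an IRN stand-in estimates albedo, normals, and lighting; a simple Lambertian direct renderer and an RAR stand-in reproduce the input; and the reconstruction loss drives the update. The tiny network bodies and the Lambertian shading model are assumptions for illustration, not the architecture in the application:

        # Illustrative reconstruction loop: IRN estimates scene attributes,
        # direct renderer + residual appearance renderer (RAR) rebuild the
        # input image, and the L1 difference trains the IRN.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        def conv_head(out_ch):
            return nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(32, out_ch, 3, padding=1))

        class IRN(nn.Module):
            """Estimates reflectance (albedo), geometry (normals), and lighting."""
            def __init__(self):
                super().__init__()
                self.albedo, self.normals = conv_head(3), conv_head(3)
                self.light = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                           nn.Linear(3, 3))  # one directional light (assumed)

            def forward(self, img):
                a = torch.sigmoid(self.albedo(img))
                n = F.normalize(self.normals(img), dim=1)
                l = F.normalize(self.light(img), dim=1)
                return a, n, l

        def direct_render(albedo, normals, light):
            # Simple Lambertian shading: albedo * max(0, n . l)
            shading = (normals * light[:, :, None, None]).sum(1, keepdim=True).clamp(min=0)
            return albedo * shading

        class RAR(nn.Module):
            """Predicts a residual image for effects the direct renderer misses
            (cast shadows, inter-reflections, near-field illumination)."""
            def __init__(self):
                super().__init__()
                self.net = nn.Sequential(nn.Conv2d(9, 32, 3, padding=1), nn.ReLU(),
                                         nn.Conv2d(32, 3, 3, padding=1))

            def forward(self, img, albedo, normals):
                return self.net(torch.cat([img, albedo, normals], dim=1))

        irn, rar = IRN(), RAR()
        img = torch.rand(1, 3, 64, 64)                    # a single real input image
        a, n, l = irn(img)
        recon = direct_render(a, n, l) + rar(img, a, n)   # direct + residual appearance
        loss = F.l1_loss(recon, img)                      # update IRN parameters to reduce this
        loss.backward()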

    LEARNING AFFINITY VIA A SPATIAL PROPAGATION NEURAL NETWORK

    Publication Number: US20190095791A1

    Publication Date: 2019-03-28

    Application Number: US16134716

    Filing Date: 2018-09-18

    Abstract: A spatial linear propagation network (SLPN) system learns the affinity matrix for vision tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The SLPN system is trained for a particular computer vision task and refines an input map (i.e., affinity matrix) that indicates pixels that share a particular property (e.g., color, object, texture, shape, etc.). Inputs to the SLPN system are input data (e.g., pixel values for an image) and the input map corresponding to the input data to be propagated. The input data is processed to produce task-specific affinity values (guidance data). The task-specific affinity values are applied to values in the input map, with at least two weighted values from each column contributing to a value in the refined map for the adjacent column.
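
    The column-to-column propagation can be made concrete with a hedged sketch of a single left-to-right pass, in which each pixel blends its own input value with three neighboring pixels of the previous refined column; a real SLPN predicts the guidance weights from the input image (random weights are used here purely for illustration), and a complete system would presumably combine passes in several directions:

        # One left-to-right spatial linear propagation pass (illustrative).
        import torch

        def propagate_left_to_right(input_map, guidance):
            # input_map: (H, W) coarse map to refine
            # guidance:  (H, W, 3) affinity weights toward the up/mid/down
            #            neighbors in the previous column (rows sum to < 1)
            H, W = input_map.shape
            refined = input_map.clone()
            for x in range(1, W):
                prev = refined[:, x - 1]
                up   = torch.cat([prev[:1], prev[:-1]])   # neighbor one row up (edge replicated)
                down = torch.cat([prev[1:], prev[-1:]])   # neighbor one row down
                g = guidance[:, x, :]                     # task-specific affinity values
                blend = g[:, 0] * up + g[:, 1] * prev + g[:, 2] * down
                keep = 1.0 - g.sum(dim=1)                 # remaining weight stays on the input
                refined[:, x] = keep * input_map[:, x] + blend
            return refined

        H, W = 8, 8
        coarse = torch.rand(H, W)                         # e.g., a coarse segmentation map
        g = torch.rand(H, W, 3)
        g = g / (g.sum(dim=2, keepdim=True) + 1.0)        # keep each pixel's weights summing < 1
        print(propagate_left_to_right(coarse, g).shape)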

    Image generation of a three-dimensional scene using multiple focal lengths

    Publication Number: US10212406B2

    Publication Date: 2019-02-19

    Application Number: US15381010

    Filing Date: 2016-12-15

    Abstract: A system and method for computational zoom generates a resulting image having two or more effective focal lengths. A first surface is identified within a three-dimensional (3D) scene that includes first and second sets of 3D objects defined by 3D information. The first and second sets of 3D objects are located within first and second depth ranges of the 3D scene, respectively. The first set of 3D objects is projected onto the first surface according to a first projection mapping to produce a first portion of image components. The second set of 3D objects is projected onto the first surface according to a second projection mapping to produce a second portion of image components. The resulting image, comprising the first portion of image components and the second portion of image components, is generated based on a camera projection from the first surface to a camera view plane.
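
    A hedged sketch of the two projection mappings, using plain pinhole-camera arithmetic in NumPy; the focal lengths, the depth threshold, and the points are illustrative assumptions, not values from the patent:

        # Project near objects with one effective focal length and far
        # objects with another, producing the two portions of image
        # components that the final camera projection composites.
        import numpy as np

        def pinhole_project(points, focal):
            # points: (N, 3) camera-space points with z > 0 -> (N, 2) image coords
            return focal * points[:, :2] / points[:, 2:3]

        points = np.array([[0.5, 0.2, 2.0],    # near object
                           [1.0, 0.4, 10.0]])  # far object

        near = points[points[:, 2] < 5.0]      # first depth range
        far  = points[points[:, 2] >= 5.0]     # second depth range

        img_near = pinhole_project(near, focal=24.0)  # first projection mapping (wide)
        img_far  = pinhole_project(far, focal=85.0)   # second projection mapping (long)
        print(img_near, img_far)

    Magnifying the far depth range with a longer effective focal length while keeping the near range wide is what yields a single image with two effective focal lengths.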

    TRANSFORMING CONVOLUTIONAL NEURAL NETWORKS FOR VISUAL SEQUENCE LEARNING

    Publication Number: US20180373985A1

    Publication Date: 2018-12-27

    Application Number: US15880472

    Filing Date: 2018-01-25

    Abstract: A method, computer readable medium, and system are disclosed for visual sequence learning using neural networks. The method includes the steps of replacing a non-recurrent layer within a trained convolutional neural network model with a recurrent layer to produce a visual sequence learning neural network model and transforming feedforward weights for the non-recurrent layer into input-to-hidden weights of the recurrent layer to produce a transformed recurrent layer. The method also includes the steps of setting hidden-to-hidden weights of the recurrent layer to initial values and processing video image data by the visual sequence learning neural network model to generate classification or regression output data.
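
    A hedged sketch of the weight transformation, assuming the non-recurrent layer is a fully connected layer and the recurrent replacement is a vanilla RNN cell; identity initialization of the hidden-to-hidden weights is one plausible choice of "initial values", not necessarily the one used in practice:

        # Copy trained feedforward weights into the input-to-hidden weights
        # of a recurrent layer, then run the transformed layer over a video
        # as a sequence of per-frame features.
        import torch
        import torch.nn as nn

        feature_dim = 256
        fc = nn.Linear(feature_dim, feature_dim)      # stands in for a trained non-recurrent layer

        rnn = nn.RNNCell(feature_dim, feature_dim)
        with torch.no_grad():
            rnn.weight_ih.copy_(fc.weight)            # feedforward -> input-to-hidden weights
            rnn.bias_ih.copy_(fc.bias)
            rnn.weight_hh.copy_(torch.eye(feature_dim))  # new hidden-to-hidden weights
            rnn.bias_hh.zero_()

        frames = torch.randn(16, 1, feature_dim)      # (time, batch, features) for one clip
        h = torch.zeros(1, feature_dim)
        for x in frames:
            h = rnn(x, h)                             # recurrent pass over the sequence
        print(h.shape)                                # final state feeds classification/regression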

    SYSTEMS AND METHODS FOR COMPUTATIONAL ZOOM

    Publication Number: US20180176532A1

    Publication Date: 2018-06-21

    Application Number: US15381010

    Filing Date: 2016-12-15

    Abstract: A system and method for computational zoom generates a resulting image having two or more effective focal lengths. A first surface is identified within a three-dimensional (3D) scene that includes first and second sets of 3D objects defined by 3D information. The first and second sets of 3D objects are located within first and second depth ranges of the 3D scene, respectively. The first set of 3D objects is projected onto the first surface according to a first projection mapping to produce a first portion of image components. The second set of 3D objects is projected onto the first surface according to a second projection mapping to produce a second portion of image components. The resulting image, comprising the first portion of image components and the second portion of image components, is generated based on a camera projection from the first surface to a camera view plane.
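
    Complementing the projection sketch under the granted counterpart above, here is a hedged sketch of the final step named in the abstract, generating the resulting image from the two portions of image components; treating each portion as an RGBA layer and compositing back-to-front with the standard "over" operator is an illustrative simplification, not the patented geometry:

        # Composite the two projected portions into the resulting image.
        import numpy as np

        H, W = 4, 4
        far_layer = np.zeros((H, W, 4))
        far_layer[..., 2] = 1.0; far_layer[..., 3] = 1.0   # opaque blue background portion
        near_layer = np.zeros((H, W, 4))
        near_layer[1:3, 1:3, 0] = 1.0                      # red foreground object...
        near_layer[1:3, 1:3, 3] = 1.0                      # ...opaque only where drawn

        def over(top, bottom):
            # Back-to-front "over" operator on straight-alpha RGBA layers.
            a = top[..., 3:4]
            rgb = top[..., :3] * a + bottom[..., :3] * (1 - a)
            alpha = a + bottom[..., 3:4] * (1 - a)
            return np.concatenate([rgb, alpha], axis=-1)

        result = over(near_layer, far_layer)               # resulting image, both portions
        print(result[..., :3].round(2))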

    SYSTEMS AND METHODS FOR PRUNING NEURAL NETWORKS FOR RESOURCE EFFICIENT INFERENCE

    Publication Number: US20180114114A1

    Publication Date: 2018-04-26

    Application Number: US15786406

    Filing Date: 2017-10-17

    CPC classification number: G06N3/082 G06N3/0454 G06N3/084

    Abstract: A method, computer readable medium, and system are disclosed for neural network pruning. The method includes the steps of receiving first-order gradients of a cost function relative to layer parameters for a trained neural network and computing a pruning criterion for each layer parameter based on the first-order gradient corresponding to the layer parameter, where the pruning criterion indicates an importance of each neuron that is included in the trained neural network and is associated with the layer parameter. The method includes the additional steps of identifying at least one neuron having a lowest importance and removing the at least one neuron from the trained neural network to produce a pruned neural network.
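
    A hedged sketch of a first-order pruning criterion in the spirit of the abstract: each neuron is scored by |parameter x gradient|, a Taylor-style estimate of how much the cost would change if its weights were zeroed, and the lowest-scoring neuron is removed; the exact criterion and neuron grouping in the application may differ:

        # Score neurons with first-order gradients and prune the weakest one.
        import torch
        import torch.nn as nn

        net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
        x, y = torch.randn(32, 8), torch.randint(0, 2, (32,))
        loss = nn.functional.cross_entropy(net(x), y)
        loss.backward()                                   # first-order gradients w.r.t. parameters

        layer = net[0]
        # One score per hidden neuron: sum |w * dL/dw| over its incoming weights.
        scores = (layer.weight * layer.weight.grad).abs().sum(dim=1)
        weakest = scores.argmin()

        with torch.no_grad():                             # "remove" the least important neuron
            layer.weight[weakest].zero_()
            layer.bias[weakest] = 0.0
        print(f"pruned neuron {weakest.item()}, score {scores[weakest].item():.4f}")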
