Abstract:
A method for convolution in a convolutional neural network (CNN) is provided that includes accessing a coefficient value of a filter corresponding to an input feature map of a convolution layer of the CNN, and performing a block multiply accumulation operation on a block of data elements of the input feature map, the block of data elements corresponding to the coefficient value, wherein, for each data element of the block of data elements, a value of the data element is multiplied by the coefficient value and a result of the multiply is added to a corresponding data element in a corresponding output block of data elements comprised in an output feature map.
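The block multiply accumulation described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names `block_mac` and `convolve_block` are hypothetical, feature maps are plain nested lists, and a stride of 1 with no padding is assumed.

```python
def block_mac(in_block, coeff, out_block):
    """Multiply every element of in_block by one coefficient and add the
    product to the element at the same position in out_block (in place)."""
    for r, row in enumerate(in_block):
        for c, value in enumerate(row):
            out_block[r][c] += value * coeff
    return out_block


def convolve_block(fmap, kernel, top, left, height, width):
    """Compute a height x width output block whose top-left corner is
    (top, left), visiting the filter one coefficient at a time."""
    out = [[0] * width for _ in range(height)]
    for i, kernel_row in enumerate(kernel):
        for j, coeff in enumerate(kernel_row):
            # Block of input elements that corresponds to coefficient (i, j).
            in_block = [row[left + j:left + j + width]
                        for row in fmap[top + i:top + i + height]]
            block_mac(in_block, coeff, out)
    return out
```

Iterating over coefficients in the outer loops means each coefficient is fetched once per block and reused for every data element in that block.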
Abstract:
Disclosed examples include image processing methods and systems to process image data, including computing a plurality of scaled images according to input image data for a current image frame, computing feature vectors for locations of the individual scaled images, classifying the feature vectors to determine sets of detection windows, and grouping detection windows to identify objects in the current frame, where the grouping includes determining first clusters of the detection windows using non-maxima suppression grouping processing, determining positions and scores of second clusters using mean shift clustering according to the first clusters, and determining final clusters representing identified objects in the current image frame using non-maxima suppression grouping of the second clusters. Disclosed examples also include methods and systems to track identified objects from one frame to another using feature vectors and overlap of identified objects between frames to minimize computation intensive operations involving feature vectors.
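As an illustration of the non-maxima suppression grouping step only (the mean shift stage and the frame-to-frame tracking are not shown), a greedy suppression pass might look like the sketch below. The window layout `(x1, y1, x2, y2, score)`, the overlap measure, and the `iou_threshold` parameter are assumptions for this example and are not taken from the abstract.

```python
def iou(a, b):
    """Intersection over union of two windows given as (x1, y1, x2, y2, score)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_maxima_suppression(windows, iou_threshold=0.5):
    """Keep the highest-scoring windows, dropping any window that overlaps
    an already kept window by more than iou_threshold."""
    kept = []
    for window in sorted(windows, key=lambda w: w[4], reverse=True):
        if all(iou(window, k) <= iou_threshold for k in kept):
            kept.append(window)
    return kept
```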
Abstract:
Estimation of the ground plane of a three dimensional (3D) point cloud based on modifications to the random sample consensus (RANSAC) algorithm is provided. The modifications may include applying roll and pitch constraints to the selection of random planes in the 3D point cloud, using a cost function based on the number of inliers in the random plane and the number of 3D points below the random plane in the 3D point cloud, and computing a distance threshold for the 3D point cloud that is used in determining whether or not a 3D point in the 3D point cloud is an inlier of a random plane.
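A cost function of the kind described, one that rewards inliers and penalizes points lying below a candidate plane, might be sketched as follows. The specific form `below_weight * below - inliers`, the `below_weight` parameter, and the choice of +z as the up direction are assumptions for illustration; the roll and pitch constraints and the distance threshold computation are not shown.

```python
import math


def plane_from_points(p, q, r):
    """Return (a, b, c, d) with a unit normal for the plane through three
    points, or None if the points are collinear."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = math.sqrt(sum(x * x for x in n))
    if norm == 0:
        return None
    n = [x / norm for x in n]
    if n[2] < 0:  # orient the normal toward +z (assumed "up")
        n = [-x for x in n]
    d = -sum(n[i] * p[i] for i in range(3))
    return (n[0], n[1], n[2], d)


def plane_cost(plane, points, dist_threshold, below_weight=1.0):
    """Lower is better: each inlier reduces the cost, each point more than
    dist_threshold below the plane increases it."""
    a, b, c, d = plane
    inliers = below = 0
    for x, y, z in points:
        dist = a * x + b * y + c * z + d  # signed distance to the plane
        if abs(dist) <= dist_threshold:
            inliers += 1
        elif dist < 0:
            below += 1
    return below_weight * below - inliers
```

Penalizing points below the plane steers the search away from planes fitted to elevated surfaces, since a true ground plane should have few points beneath it.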
Abstract:
A method for estimating time to collision (TTC) of a detected object in a computer vision system is provided that includes determining a three dimensional (3D) position of a camera in the computer vision system, determining a 3D position of the detected object based on a 2D position of the detected object in an image captured by the camera and an estimated ground plane corresponding to the image, computing a relative 3D position of the camera, a velocity of the relative 3D position, and an acceleration of the relative 3D position based on the 3D position of the camera and the 3D position of the detected object, wherein the relative 3D position of the camera is relative to the 3D position of the detected object, and computing the TTC of the detected object based on the relative 3D position, the velocity, and the acceleration.
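One way to turn a relative position, velocity, and acceleration into a TTC is a constant-acceleration model. The sketch below assumes the relative 3D quantities have already been reduced to scalars along the line between camera and object, and it solves `distance + velocity*t + 0.5*acceleration*t**2 = 0` for the smallest positive `t`. Both the scalar reduction and the function name are assumptions for this example, not details given in the abstract.

```python
import math


def time_to_collision(distance, velocity, acceleration, eps=1e-9):
    """Smallest t > 0 at which the modeled distance reaches zero, or None
    if the model never predicts a collision.

    velocity is the rate of change of distance (negative when closing)."""
    if abs(acceleration) < eps:
        # Constant-velocity case: collision only if the gap is closing.
        if velocity >= -eps:
            return None
        return -distance / velocity
    disc = velocity ** 2 - 2.0 * acceleration * distance
    if disc < 0:
        return None  # the gap never closes
    root = math.sqrt(disc)
    candidates = [(-velocity - root) / acceleration,
                  (-velocity + root) / acceleration]
    positive = [t for t in candidates if t > 0]
    return min(positive) if positive else None
```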
Abstract:
This invention predicts that intra mode prediction is more effective for macroblocks where motion estimation in inter mode prediction fails. This failure is indicated by a large inter mode sum of absolute differences (SAD). This invention performs intra mode prediction only for macroblocks that have large inter mode SADs. What counts as a large inter mode SAD differs from one type of content to another, so this invention compares the inter mode SAD of the current macroblock with an adaptive threshold. This adaptive threshold depends on the average and variance of the SADs of the previous predicted frame. An adaptive threshold is calculated for each new predictive frame.
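The abstract does not give the exact formula for the adaptive threshold. A minimal sketch, assuming the threshold is the mean of the previous frame's SADs plus a multiple `k` of their standard deviation (both the formula and `k` are hypothetical), could look like this:

```python
import math


def adaptive_sad_threshold(prev_frame_sads, k=1.0):
    """Threshold derived from the average and variance of the inter mode
    SADs of the previous predicted frame (assumed form: mean + k * stddev)."""
    n = len(prev_frame_sads)
    mean = sum(prev_frame_sads) / n
    variance = sum((s - mean) ** 2 for s in prev_frame_sads) / n
    return mean + k * math.sqrt(variance)


def needs_intra_prediction(inter_sad, threshold):
    """Run intra mode prediction only when the inter mode SAD is large."""
    return inter_sad > threshold
```

The threshold would be recomputed once per predictive frame and then applied to every macroblock of the next one.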
Abstract:
Several systems and methods for filtering noise from a picture in a picture sequence associated with video data are disclosed. In an embodiment, the method includes accessing a plurality of pixel blocks associated with the picture and filtering noise from at least one pixel block from among the plurality of pixel blocks. The filtering of noise from a pixel block from among the at least one pixel block includes identifying pixel blocks corresponding to the pixel block in one or more reference pictures associated with the picture sequence. Each identified pixel block is associated with a cost value. One or more pixel blocks are selected from among the identified pixel blocks based on associated cost values. Weights are assigned to the selected one or more pixel blocks and a set of filtered pixels for the pixel block is generated based on the weights.
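The select, weight, and combine steps might be sketched as below. Several details are assumptions made for illustration: blocks are flat lists of pixel values, the lowest-cost candidates are the ones selected, each weight is `1 / (1 + cost)`, and the current block takes part in the average with weight 1.

```python
def filter_block(current, candidates, max_candidates=2):
    """Return filtered pixels for `current`.

    candidates: list of (cost, pixels) pairs for the pixel blocks
    identified in the reference pictures."""
    # Select the candidates with the lowest cost values.
    selected = sorted(candidates, key=lambda c: c[0])[:max_candidates]
    # Assign weights: lower cost gives a higher weight (assumed form).
    weighted = [(1.0, current)]
    weighted += [(1.0 / (1.0 + cost), pixels) for cost, pixels in selected]
    total = sum(w for w, _ in weighted)
    # Weighted average, pixel position by pixel position.
    return [sum(w * pixels[i] for w, pixels in weighted) / total
            for i in range(len(current))]
```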
Abstract:
In described examples, a method for quantizing data for a convolutional neural network (CNN) is provided. A set of data is received and quantized using a power-of-2 parametric activation (PACT2) function. The PACT2 function arranges the set of data as a histogram and discards a portion of the data corresponding to a tail of the histogram to form a remaining set of data. A clipping value is determined by expanding the remaining set of data to a nearest power of two value. The set of data is then quantized using the clipping value. With PACT2, a model can be quantized using either post training quantization or quantization aware training. PACT2 helps a quantized model achieve accuracy close to that of the corresponding floating-point model.
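The clipping and quantization steps can be sketched as follows. This is an illustration under stated assumptions rather than the PACT2 function itself: sorting magnitudes and dropping the largest `tail_fraction` of them stands in for the histogram tail removal, the power of two is found by rounding the exponent up, and quantization is symmetric and signed. The names `pact2_clip_value`, `quantize`, `tail_fraction`, and `bits` are hypothetical.

```python
import math


def pact2_clip_value(data, tail_fraction=0.01):
    """Discard the largest magnitudes (the tail), then expand the largest
    remaining magnitude up to a power of two."""
    mags = sorted(abs(x) for x in data)
    discard = int(len(mags) * tail_fraction)
    keep = max(1, len(mags) - discard)
    largest = mags[keep - 1]
    if largest == 0:
        return 1.0  # degenerate all-zero input; arbitrary fallback
    return 2.0 ** math.ceil(math.log2(largest))


def quantize(data, clip, bits=8):
    """Symmetric signed quantization with values clipped to [-clip, clip)."""
    levels = 2 ** (bits - 1)
    scale = clip / levels  # real-valued size of one integer step
    return [max(-levels, min(levels - 1, round(x / scale))) for x in data]
```

Because the clipping value is a power of two, the step size is also a power of two, so rescaling can be done with bit shifts instead of multiplications.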
Abstract:
Disclosed herein are systems and methods for determining the scaling factors for a neural network that satisfy the activation functions employed by the nodes of the network. A processor identifies a saturation point of an activation function. Next, the processor determines a scaling factor for an output feature map based on the saturation point of the activation function. Then, the processor determines a scaling factor for an accumulator based on the scaling factor for the output feature map and further based on a shift value related to a quantization. Finally, the processor determines a scaling factor for a weight map based on the scaling factor for the accumulator.
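The chain of scaling factors might be computed as in the sketch below, which is only one possible reading of the steps above. It assumes that a scaling factor means integer units per real unit, that the saturation point maps to the largest output integer `out_max_int`, that the accumulator is right-shifted by `shift` bits to produce the output, and that an input scaling factor `input_scale` is already known. None of these details are stated in the abstract.

```python
def scaling_factors(saturation, shift, input_scale, out_max_int=255):
    """Return (output_scale, accumulator_scale, weight_scale)."""
    # Output feature map: the saturation point maps to the largest integer.
    output_scale = out_max_int / saturation
    # Accumulator: the output is the accumulator shifted right by `shift`
    # bits, so the accumulator carries 2**shift times the output scale.
    accumulator_scale = output_scale * (2 ** shift)
    # Weights: the accumulator sums input * weight products, so its scale
    # is the product of the input scale and the weight scale.
    weight_scale = accumulator_scale / input_scale
    return output_scale, accumulator_scale, weight_scale
```

For example, `scaling_factors(6.0, 8, 16.0)` models an activation that saturates at 6 with an 8-bit unsigned output.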