Abstract:
An image feature extraction method for person re-identification includes performing person re-identification by means of aligned local descriptor extraction and graded global feature extraction; performing the aligned local descriptor extraction by processing an original image by affine transformation and performing a summation pooling operation on image block features of the same regions to obtain an aligned local descriptor; preserving spatial information between inner blocks of the image for the aligned local descriptor; and performing the graded global feature extraction by grading a positioned pedestrian region block and solving for a corresponding feature mean value to obtain a global feature. The method can resolve the problem of feature misalignment caused by factors such as pedestrian posture changes, and eliminate the effect of unrelated backgrounds on re-identification, thus improving the precision and robustness of person re-identification.
Abstract:
A method is described that includes executing a convolutional neural network layer on an image processor having an array of execution lanes and a two-dimensional shift register. The executing of the convolutional neural network includes loading a plane of image data of a three-dimensional block of image data into the two-dimensional shift register. The executing of the convolutional neural network also includes performing a two-dimensional convolution of the plane of image data with an array of coefficient values by sequentially: concurrently multiplying within the execution lanes respective pixel and coefficient values to produce an array of partial products; concurrently summing within the execution lanes the partial products with respective accumulations of partial products being kept within the two-dimensional register for different stencils within the image data; and, effecting alignment of values for the two-dimensional convolution within the execution lanes by shifting content within the two-dimensional shift register array.
Abstract:
An image processing device for producing in real time a digital composite image from a sequence of digital images recorded by a camera device, in particular an endoscopic camera device, the image processing device including a selecting unit, a key point detection unit, a transforming unit and a joining unit, wherein the key point detection unit includes a maximum detection unit configured for executing the following steps separately for the filter response for the reference image and for the filter response for the further image, wherein a variable threshold is used: i) creating blocks by dividing the respective filter response, ii) calculating the variable threshold for each of the blocks, iii) discarding from further consideration those blocks in which the respective filter response at a reference point of the respective block is less than the respective variable threshold.
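Steps i) to iii) can be illustrated with a minimal Python sketch. The abstract does not say how the variable threshold is computed or where the reference point lies, so the rule used here (`k` times the block mean) and the choice of the block center as reference point are assumptions made only for illustration; `surviving_blocks` is a hypothetical name.

```python
def surviving_blocks(response, block, k=1.0):
    """Return the (row, col) origins of blocks that are NOT discarded."""
    h, w = len(response), len(response[0])
    kept = []
    # i) create blocks by dividing the filter response into block x block tiles
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            values = [response[y][x]
                      for y in range(by, by + block)
                      for x in range(bx, bx + block)]
            # ii) per-block threshold (assumed rule: k times the block mean)
            threshold = k * sum(values) / len(values)
            # assumed reference point: the block center
            ref = response[by + block // 2][bx + block // 2]
            # iii) discard the block if the reference response is below the threshold
            if ref >= threshold:
                kept.append((by, bx))
    return kept
```

Because the threshold is recomputed per block, a weak but locally dominant response can survive in a dark region while a stronger response is discarded in a bright one.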
Abstract:
A method is described that includes executing a convolutional neural network layer on an image processor having an array of execution lanes and a two-dimensional shift register. The two-dimensional shift register provides local respective register space for the execution lanes. The executing of the convolutional neural network includes loading a plane of image data of a three-dimensional block of image data into the two-dimensional shift register. The executing of the convolutional neural network also includes performing a two-dimensional convolution of the plane of image data with an array of coefficient values by sequentially: concurrently multiplying within the execution lanes respective pixel and coefficient values to produce an array of partial products; concurrently summing within the execution lanes the partial products with respective accumulations of partial products being kept within the two-dimensional register for different stencils within the image data; and, effecting alignment of values for the two-dimensional convolution within the execution lanes by shifting content within the two-dimensional shift register array.
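The multiply, accumulate, and shift sequence can be emulated in software. The following is a rough Python sketch, not the patented hardware: each array position stands in for one execution lane, `shift` is a hypothetical helper that models the shift register with wrap-around at the edges (an assumption; real hardware would handle borders differently), and each shift is recomputed from the original plane for clarity rather than applied incrementally.

```python
def shift(plane, dy, dx):
    """Return a shifted copy so that out[y][x] == plane[y + dy][x + dx] (wrapping)."""
    h, w = len(plane), len(plane[0])
    return [[plane[(y + dy) % h][(x + dx) % w] for x in range(w)]
            for y in range(h)]


def conv2d_by_shifting(plane, coeffs):
    """Accumulate one stencil per array position, one coefficient at a time."""
    h, w = len(plane), len(plane[0])
    acc = [[0] * w for _ in range(h)]  # per-lane accumulators
    for ky, row in enumerate(coeffs):
        for kx, c in enumerate(row):
            # shifting aligns the pixel each lane needs for this coefficient
            shifted = shift(plane, ky, kx)
            for y in range(h):
                for x in range(w):
                    # every lane multiplies and accumulates the same coefficient
                    acc[y][x] += shifted[y][x] * c
    return acc
```

For a 2x2 array of ones as coefficients, `conv2d_by_shifting` returns at each position the sum of the 2x2 window whose top-left corner is that position. The two inner loops over `y` and `x` are what the lane array would perform concurrently.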
Abstract:
Training a generative adversarial network (GAN) for use in facial recognition, comprising providing an input image of a particular face into a facial recognition system to obtain a faceprint; obtaining, based on the input faceprint and a noise value, a set of output images from a GAN generator; obtaining feedback from a GAN discriminator, wherein obtaining feedback comprises inputting each output image into the GAN discriminator and determining a set of likelihood values indicative of whether each output image comprises a facial image; determining, based on each output image, a modified noise value; inputting each output image into a second facial recognition network to determine a set of modified faceprints; defining, based on each modified noise value and modified faceprint, feedback for the GAN generator, wherein the feedback comprises a first value and a second value; and modifying control parameters of the GAN generator.
Abstract:
An image processing method includes obtaining a sensed image, wherein the sensed image comprises a pattern; dividing the sensed image into a plurality of blocks; calculating a direction field according to the pattern in each of the blocks; calculating a similarity degree between the direction field of a first block and the direction fields of adjacent blocks of the first block; and classifying the first block into a first part according to the similarity degree of the first block.
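A minimal Python sketch of the block-wise direction field and neighbor similarity follows. The abstract does not specify how the direction or the similarity degree is calculated, so the gradient-based orientation estimate, the `cos(2 * difference)` similarity, the 4-neighborhood, and the `0.8` threshold are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def direction_field(img, size):
    """One dominant gradient orientation (radians, modulo pi) per size x size block."""
    h, w = len(img), len(img[0])
    field = []
    for by in range(0, h - size + 1, size):
        row = []
        for bx in range(0, w - size + 1, size):
            sxy = sdiff = 0.0
            for y in range(by, by + size):
                for x in range(bx, bx + size):
                    # central differences, clamped at the image border
                    gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
                    gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
                    sxy += 2.0 * gx * gy
                    sdiff += gx * gx - gy * gy
            row.append(0.5 * math.atan2(sxy, sdiff))
        field.append(row)
    return field


def similarity(field, r, c):
    """Mean cos(2 * angle difference) between block (r, c) and its 4-neighbors."""
    sims = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < len(field) and 0 <= cc < len(field[0]):
            sims.append(math.cos(2.0 * (field[r][c] - field[rr][cc])))
    return sum(sims) / len(sims) if sims else 0.0


def in_first_part(field, r, c, threshold=0.8):
    """Assumed rule: a block joins the first part when it agrees with its neighbors."""
    return similarity(field, r, c) >= threshold
```

Doubling the angle difference makes orientations that differ by 180 degrees count as identical, which suits patterns such as ridges that have an orientation but no direction sign.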
Abstract:
Rescaling or reconstructing of a digital image may be accomplished by directional interpolation, so that interpolation is done in the direction perpendicular to the gradient, that is, the direction in which the change in pixel values is the smallest. Each pixel is generated by interpolation in the output image as a weighted average of nearby pixels, in which the weighting is done in the direction of the gradient. The interpolation is accomplished with an adaptive filter that has an elliptical frequency response determined by the direction of the gradient. The filter uses filter coefficients that are a function of the direction. Rather than storing coefficients for each of several directions, three sets of filter coefficients are stored: one set for a non-directional filter, one for one direction such as 45 degrees, and another for another direction such as 135 degrees. A blending of the filter coefficients is used.
Abstract:
In a system for determining liveness of an image presented for authentication, a reference signal is rendered on a display, and a reflection of the rendered signal from a target is analyzed to determine liveness thereof. The analysis includes spatially and/or temporally band pass filtering the reflected signal, and determining RGB values for each frame in the reflected signal and/or each pixel in one or more frames of the reflected signal. Frame level and/or pixel-by-pixel correlations between the determined RGB values and the rendered signal are computed, and a determination of whether an image presented is live or fake is made using either or both correlations.
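The frame-level correlation step can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the band-pass filtering is omitted, each frame is represented as a flat list of `(r, g, b)` tuples, the rendered signal is one `(r, g, b)` value per frame, and Pearson correlation is assumed as the correlation measure. The function names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def frame_level_correlation(frames, rendered):
    """Correlate per-frame mean R, G, B of the reflection with the rendered signal."""
    corrs = []
    for ch in range(3):
        observed = [sum(p[ch] for p in frame) / len(frame) for frame in frames]
        corrs.append(pearson(observed, [s[ch] for s in rendered]))
    return corrs
```

A pixel-by-pixel variant would apply `pearson` to the time series of each pixel position instead of the per-frame mean. How the resulting correlations map to a live or fake decision is not stated in the abstract and is left out here.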
Abstract:
An approach to computation of kernel descriptors is accelerated using precomputed tables. In one aspect, a fast algorithm for kernel descriptor computation takes O(1) operations per pixel in each patch, based on pre-computed kernel values. This speeds up computation of the kernel descriptor features under consideration to levels that are comparable with D-SIFT and color SIFT, and two orders of magnitude faster than STIP and HoG3D. In some examples, kernel descriptors are applied to extract gradient, flow and texture based features for video analysis. In tests of the approach on a large database of internet videos used in the TRECVID MED 2011 evaluations, the flow based kernel descriptors are up to two orders of magnitude faster than STIP and HoG3D, and also produce significant performance improvements. Further, using features from multiple color planes produces small but consistent gains.
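The idea of trading kernel evaluations for table lookups can be sketched in Python. This is not the algorithm of the abstract, whose details are not given; it assumes a Gaussian kernel over gradient orientations, a fixed list of basis orientations, and uniform quantization of orientations into bins. `build_table` and `patch_feature` are hypothetical names.

```python
import math


def build_table(n_bins, basis, gamma=5.0):
    """Precompute table[b][j] = exp(-gamma * |u(theta_b) - u(basis_j)|^2)."""
    table = []
    for b in range(n_bins):
        t = 2 * math.pi * b / n_bins
        row = []
        for a in basis:
            d2 = (math.cos(t) - math.cos(a)) ** 2 + (math.sin(t) - math.sin(a)) ** 2
            row.append(math.exp(-gamma * d2))
        table.append(row)
    return table


def patch_feature(orientations, magnitudes, table):
    """Sum magnitude-weighted kernel values over a patch using lookups only."""
    n_bins = len(table)
    feat = [0.0] * len(table[0])
    for theta, m in zip(orientations, magnitudes):
        # quantize the orientation to the nearest bin
        b = int(round(theta / (2 * math.pi) * n_bins)) % n_bins
        # one table row replaces the per-pixel exp() evaluations
        for j, k in enumerate(table[b]):
            feat[j] += m * k
    return feat
```

With the table built once, the work per pixel is a fixed number of lookups and additions that does not depend on the patch size, at the cost of a quantization error that shrinks as `n_bins` grows.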
Abstract:
In techniques for adaptive denoising with internal and external patches, example image patches taken from example images are grouped into partitions of similar patches, and a partition center patch is determined for each of the partitions. An image denoising technique is applied to image patches of a noisy image to generate modified image patches, and a closest partition center patch to each of the modified image patches is determined. The image patches of the noisy image are then classified as either a common patch or a complex patch of the noisy image, where an image patch is classified based on a distance between the corresponding modified image patch and the closest partition center patch. A denoising operator can be applied to an image patch based on the classification, such as applying respective denoising operators to denoise the image patches that are classified as the common patches of the noisy image.
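The classification step can be sketched in Python under stated assumptions: patches are flattened into lists of pixel values, the partition center patches are already available, Euclidean distance is used, and a patch is labeled common when its modified version lies within `max_dist` of the closest center. The abstract only says the classification is based on that distance, so the threshold rule and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two flattened patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify_patches(modified_patches, centers, max_dist):
    """Label each patch 'common' or 'complex' by distance to the closest center."""
    labels = []
    for patch in modified_patches:
        # distance to the closest partition center patch
        nearest = min(sq_dist(patch, c) for c in centers)
        # assumed rule: close to a known partition means common, otherwise complex
        labels.append("common" if nearest <= max_dist ** 2 else "complex")
    return labels
```

A denoising operator could then be selected per label, for example by looking it up in a dictionary keyed on `"common"` and `"complex"`.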