Abstract:
An image is segmented into superpixels by constructing a graph with vertices connected by edges, wherein each vertex corresponds to a pixel in the image, and each edge is associated with a weight indicating a similarity of the corresponding pixels. A subset of edges in the graph is selected to segment the graph into subgraphs, wherein the selecting maximizes an objective function based on an entropy rate and a balancing term. The edges with maximum gains are added to the graph until the number of subgraphs equals a threshold.
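The greedy edge-addition loop described in this abstract can be sketched as follows. This is a minimal illustration, not the claimed method: the abstract does not give the form of the entropy-rate or balancing terms, so the default gain here is simply the edge's similarity weight (a stand-in), and the names `greedy_segment`, `find`, and the `gain` callback are hypothetical. A caller could pass a different `gain` function to approximate a richer objective.

```python
def find(parent, v):
    """Return the root of v's component, compressing the path as we go."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v


def greedy_segment(num_vertices, edges, num_segments, gain=None):
    """Greedily add the highest-gain edge until num_segments components remain.

    edges: list of (u, v, weight) tuples, weight = pixel similarity.
    gain:  hypothetical callback (u, v, weight, size_u, size_v) -> float.
           Defaults to the raw weight, a stand-in for the real objective.
    Returns one component label per vertex.
    """
    if gain is None:
        gain = lambda u, v, w, size_u, size_v: w
    parent = list(range(num_vertices))
    size = [1] * num_vertices
    components = num_vertices
    while components > num_segments:
        best, best_gain = None, None
        for u, v, w in edges:
            ru, rv = find(parent, u), find(parent, v)
            if ru == rv:
                continue  # edge would not merge two subgraphs
            g = gain(u, v, w, size[ru], size[rv])
            if best_gain is None or g > best_gain:
                best, best_gain = (ru, rv), g
        if best is None:
            break  # graph is disconnected; cannot merge further
        ru, rv = best
        parent[rv] = ru
        size[ru] += size[rv]
        components -= 1
    return [find(parent, v) for v in range(num_vertices)]
```

The scan over all edges at every step keeps the sketch short; a practical version would use a priority queue with lazy gain updates.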
Abstract:
A pose of an object is determined by acquiring sets of images of the object by a camera, wherein the object has a thread arranged on a surface such that a local region of the object appears substantially spherical, wherein the camera is at a different point of view for each set, and wherein each image in each set is acquired while the scene is illuminated from a different direction. A set of features is extracted from each image, wherein the features correspond to points on the surface having normals towards the camera. A parametric line is fitted to the points for each image, wherein the line lies on a plane joining a center of the camera and an axis of the object. Then, geometric constraints are applied to the lines to determine the pose of the object.
Abstract:
A method includes determining a detection output that represents an object in a two-dimensional image using a detection model, wherein the detection output includes a shape definition that describes a shape and size of the object; defining a three-dimensional representation based on the shape definition, wherein the three-dimensional representation includes a three-dimensional model that represents the object that is placed in three-dimensional space according to a position and a rotation; determining a three-dimensional detection loss that describes a difference between the three-dimensional representation and three-dimensional sensor information; and updating the detection model based on the three-dimensional detection loss.
Abstract:
A neural network is trained to defend against adversarial attacks, such as by preparing an input image for classification by a neural network where the input image includes a noise-based perturbation. The input image is divided into source patches. Replacement patches are selected for the source patches by searching a patch library for candidate patches available for replacing ones of those source patches, such as based on sizes of those source patches. A denoised image reconstructed from a number of replacement patches is then output to the neural network for classification. The denoised image may be produced based on reconstruction errors determined for individual candidate patches identified from the patch library. Alternatively, the denoised image may be selected from amongst a number of candidate denoised images. A set of training images is used to construct the patch library, such as based on salient data within patches of those training images.
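The patch-replacement step can be illustrated with a small sketch. It assumes a grayscale image stored as a list of rows, non-overlapping square patches, image dimensions divisible by the patch size, and a sum-of-squared-differences reconstruction error; none of these choices come from the abstract, and the function names are hypothetical.

```python
def split_patches(image, size):
    """Yield (row, col, patch) for each non-overlapping size x size patch."""
    for r in range(0, len(image), size):
        for c in range(0, len(image[0]), size):
            yield r, c, [row[c:c + size] for row in image[r:r + size]]


def patch_error(a, b):
    """Sum of squared differences between two equally sized patches."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def denoise(image, library, size):
    """Replace every source patch with its lowest-error library patch."""
    out = [row[:] for row in image]
    for r, c, patch in split_patches(image, size):
        best = min(library, key=lambda cand: patch_error(patch, cand))
        for dr, row in enumerate(best):
            out[r + dr][c:c + size] = row
    return out
```

Because every output pixel comes from the library, small perturbations in the input are discarded as long as they do not change which library patch is nearest.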
Abstract:
A method clusters samples using a mean shift procedure. A kernel matrix is determined from the samples in a first dimension. A constraint matrix and a scaling matrix are determined from a constraint set. The kernel matrix is projected to a feature space having a second dimension using the constraint matrix, wherein the second dimension is higher than the first dimension. Then, the samples are clustered according to the kernel matrix.
Abstract:
A compressed state sequence s is determined directly from the input sequence of data x. A deterministic function ƒ(x) only tracks unique state transitions, and not the dwell times in each state. A polynomial time compressed state sequence inference method outperforms conventional compressed state sequence inference techniques.
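To make the notion of a compressed state sequence concrete, the sketch below collapses runs of repeated states so that only the transitions remain. It assumes the compression amounts to dropping consecutive duplicates, which is one reading of "tracks unique state transitions, and not the dwell times"; the abstract does not define ƒ further. The sketch shows the mapping only, not the inference method, which per the abstract obtains s directly from x. The name `compress_states` is hypothetical.

```python
from itertools import groupby


def compress_states(states):
    """Collapse each run of identical consecutive states to a single state."""
    return [state for state, _run in groupby(states)]
```

For example, the sequence `a a a b b a c` becomes `a b a c`: the order of visited states is kept, while how long each state was held is discarded.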
Abstract:
A method includes obtaining training samples that include images that depict objects and annotations of annotated key point locations for the objects. The method also includes training a machine learning model to determine estimated key point locations for the objects and key point uncertainty values for the estimated key point locations by minimizing a loss function that is based in part on a key point localization loss value that represents a difference between the annotated key point locations and the estimated key point locations and is weighted by the key point uncertainty values.
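One common way to weight a localization loss by predicted uncertainty is sketched below. It assumes the model outputs a log-variance per key point and uses a Gaussian negative-log-likelihood style term, `exp(-s) * squared_error + s`; this specific form is an assumption and is not stated in the abstract. The function name is hypothetical.

```python
import math


def uncertainty_weighted_loss(annotated, estimated, log_variances):
    """Mean over key points of exp(-s) * squared error + s.

    annotated, estimated: lists of (x, y) key point locations.
    log_variances: one predicted log-variance s per key point (assumed form).
    """
    total = 0.0
    for (ax, ay), (ex, ey), s in zip(annotated, estimated, log_variances):
        sq_err = (ax - ex) ** 2 + (ay - ey) ** 2
        # exp(-s) down-weights errors on uncertain points;
        # the additive s penalizes claiming high uncertainty everywhere.
        total += math.exp(-s) * sq_err + s
    return total / len(annotated)
```

Under this form, a key point with a large error contributes less when the model also reports high uncertainty for it, which is the weighting behavior the abstract describes.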
Abstract:
A pose for an object in a scene is determined by first rendering sets of virtual images of a model of the object using a virtual camera. Each set of virtual images is for a different known pose of the model, and a virtual depth edge map is constructed from each virtual image and stored in a database. A set of real images of the object at an unknown pose is acquired by a real camera, and a real depth edge map is constructed for each real image. The real depth edge maps are compared with the virtual depth edge maps using a cost function to determine the known pose that best matches the unknown pose, wherein the matching is based on locations and orientations of pixels in the depth edge maps.
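A matching cost that uses both the locations and the orientations of edge pixels could look like the brute-force sketch below. The abstract does not name the cost function; a directional-chamfer-style cost is assumed here, with each edge pixel given as `(x, y, theta)` and a hypothetical weight `lam` trading off distance against orientation difference. The database is modeled as a plain dictionary from pose label to edge pixel list.

```python
import math


def angle_diff(a, b):
    """Smallest difference between two undirected edge orientations (mod pi)."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def directional_chamfer_cost(real_edges, virtual_edges, lam=1.0):
    """Mean, over real edge pixels, of the cheapest match in the virtual map."""
    total = 0.0
    for x, y, t in real_edges:
        total += min(
            math.hypot(x - u, y - v) + lam * angle_diff(t, s)
            for u, v, s in virtual_edges
        )
    return total / len(real_edges)


def best_pose(real_edges, database, lam=1.0):
    """Return the known pose whose virtual edge map has the lowest cost."""
    return min(
        database,
        key=lambda pose: directional_chamfer_cost(real_edges, database[pose], lam),
    )
```

The exhaustive search over virtual pixels is quadratic; distance transforms are the usual way to make this practical for full-size edge maps.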