Learning image categorization using related attributes

    Publication No.: US09953425B2

    Publication date: 2018-04-24

    Application No.: US14447296

    Filing date: 2014-07-30

    CPC classification number: G06T7/33 G06K9/627 G06N3/0454

    Abstract: A first set of attributes (e.g., style) is generated through pre-trained single-column neural networks and leveraged to regularize the training process of a regularized double-column convolutional neural network (RDCNN). Parameters of the first column (e.g., style) of the RDCNN are fixed during RDCNN training. Parameters of the second column (e.g., aesthetics) are fine-tuned while training the RDCNN, and the learning process is supervised by the label identified by the second column (e.g., aesthetics). Thus, features of the images may be leveraged to boost classification accuracy of other features by learning an RDCNN.
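
The training rule in this abstract (first column frozen, second column fine-tuned under the second column's label) can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: each "column" is reduced to a single linear unit, the two column outputs are summed into one logit, and the names `predict` and `train_second_column` are hypothetical. It is not the patented network, only the freeze-one-column idea.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def predict(x, style_w, aes_w):
    """Probability from a toy two-column model: both column outputs feed one logit."""
    return 1.0 / (1.0 + math.exp(-(dot(style_w, x) + dot(aes_w, x))))

def train_second_column(samples, style_w, aes_w, lr=0.5, epochs=200):
    """Stochastic gradient descent on the second column only.

    style_w (the pre-trained first column) is read but never updated;
    the labels supervise the second column, as in the abstract.
    """
    aes_w = list(aes_w)
    for _ in range(epochs):
        for x, label in samples:
            err = predict(x, style_w, aes_w) - label  # d(log-loss)/d(logit)
            aes_w = [w - lr * err * xi for w, xi in zip(aes_w, x)]
    return aes_w

# Toy data: label 1 when the first input dominates, 0 otherwise.
samples = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 0.2], 1), ([0.1, 1.0], 0)]
style_w = (0.3, -0.3)  # frozen first-column weights (hypothetical values)
trained = train_second_column(samples, style_w, [0.0, 0.0])
```

Storing the frozen weights in a tuple makes accidental in-place updates impossible in this sketch; a real framework would instead exclude those parameters from the optimizer.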

    Finding semantic parts in images
    12.
    Granted invention patent

    Publication No.: US09940577B2

    Publication date: 2018-04-10

    Application No.: US14793157

    Filing date: 2015-07-07

    Abstract: Embodiments of the present invention relate to finding semantic parts in images. In implementation, a convolutional neural network (CNN) is applied to a set of images to extract features for each image. Each feature is defined by a feature vector that enables a subset of the set of images to be clustered in accordance with a similarity between feature vectors. Normalized cuts may be utilized to help preserve pose within each cluster. The images in the cluster are aligned and part proposals are generated by sampling various regions in various sizes across the aligned images. To determine which part proposal corresponds to a semantic part, a classifier is trained for each part proposal and semantic part to determine which part proposal best fits the correlation pattern given by the true semantic part. In this way, semantic parts in images can be identified without any previous part annotations.

    Image assessment using deep convolutional neural networks
    13.
    Granted invention patent (status: in force)

    Publication No.: US09536293B2

    Publication date: 2017-01-03

    Application No.: US14447290

    Filing date: 2014-07-30

    Abstract: Deep convolutional neural networks receive local and global representations of images as inputs and learn the best representation for a particular feature through multiple convolutional and fully connected layers. A double-column neural network structure receives each of the local and global representations as two heterogeneous parallel inputs to the two columns. After some layers of transformations, the two columns are merged to form the final classifier. Additionally, features may be learned in one of the fully connected layers. The features of the images may be leveraged to boost classification accuracy of other features by learning a regularized double-column neural network.

    Facial Expression Capture for Character Animation
    14.
    Invention patent application (status: in force)

    Publication No.: US20160275341A1

    Publication date: 2016-09-22

    Application No.: US14661788

    Filing date: 2015-03-18

    Abstract: Techniques for facial expression capture for character animation are described. In one or more implementations, facial key points are identified in a series of images. Each image, in the series of images, is normalized from the identified facial key points. Facial features are determined from each of the normalized images. Then a facial expression is classified, based on the determined facial features, for each of the normalized images. In additional implementations, a series of images are captured that include performances of one or more facial expressions. The facial expressions in each image of the series of images are classified by a facial expression classifier. Then the facial expression classifications are used by a character animator system to produce a series of animated images of an animated character that include animated facial expressions that are associated with the facial expression classification of the corresponding image in the series of images.

    Video Denoising using Optical Flow
    15.
    Invention patent application (status: under examination, published)

    Publication No.: US20160191753A1

    Publication date: 2016-06-30

    Application No.: US15063240

    Filing date: 2016-03-07

    Abstract: In techniques for video denoising using optical flow, image frames of video content include noise that corrupts the video content. A reference frame is selected, and matching patches to an image patch in the reference frame are determined from within the reference frame. A noise estimate is computed for previous and subsequent image frames relative to the reference frame. The noise estimate for an image frame is computed based on optical flow, and is usable to determine a contribution of similar motion patches to denoise the image patch in the reference frame. The similar motion patches from the previous and subsequent image frames that correspond to the image patch in the reference frame are determined based on the optical flow computations. The image patch is denoised based on an average of the matching patches from the reference frame and the similar motion patches determined from the previous and subsequent image frames.
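
The final averaging step of this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: patches are flat lists of pixel values of equal length, every patch contributes equally (the abstract's noise-estimate-based contribution is not modeled), and the function names are hypothetical. Finding the patches via optical flow is out of scope here.

```python
def average_patches(patches):
    """Element-wise mean of equally sized patches (flat lists of pixel values)."""
    if not patches:
        raise ValueError("need at least one patch")
    n = len(patches)
    return [sum(values) / n for values in zip(*patches)]

def denoise_patch(matching_patches, motion_patches):
    """Average patches from the reference frame with those from neighboring frames.

    matching_patches: similar patches found inside the reference frame.
    motion_patches:   corresponding patches from previous/subsequent frames.
    """
    return average_patches(list(matching_patches) + list(motion_patches))

denoised = denoise_patch([[10, 20], [12, 22]], [[8, 18], [10, 20]])
```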

    LOCAL FEATURE REPRESENTATION FOR IMAGE RECOGNITION
    16.
    Invention patent application (status: under examination, published)

    Publication No.: US20160132750A1

    Publication date: 2016-05-12

    Application No.: US14535963

    Filing date: 2014-11-07

    Abstract: Techniques are disclosed for image feature representation. The techniques exhibit discriminative power that can be used in any number of classification tasks, and are particularly effective with respect to fine-grained image classification tasks. In an embodiment, a given image to be classified is divided into image patches. A vector is generated for each image patch. Each image patch vector is compared to the Gaussian mixture components (each mixture component is also a vector) of a Gaussian Mixture Model (GMM). Each such comparison generates a similarity score for each image patch vector. For each Gaussian mixture component, the image patch vectors associated with a similarity score that is too low are eliminated. The selectively pooled vectors from all the Gaussian mixture components are then concatenated to form the final image feature vector, which can be provided to a classifier so the given input image can be properly categorized.
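
The selective pooling described in this abstract can be sketched roughly as below. The abstract does not say how similarity is scored or how retained vectors are pooled, so this sketch assumes a dot-product similarity, mean pooling, and a zero vector for a component that retains no patches; `selective_pool` and its parameters are hypothetical names.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def selective_pool(patch_vectors, components, threshold):
    """Build an image feature by pooling, per mixture component, only similar patches.

    For each component vector, patch vectors whose similarity score falls
    below `threshold` are dropped; the survivors are mean-pooled, and the
    pooled vectors of all components are concatenated.
    """
    dim = len(components[0])
    feature = []
    for comp in components:
        kept = [p for p in patch_vectors if dot(p, comp) >= threshold]
        if kept:
            pooled = [sum(col) / len(kept) for col in zip(*kept)]
        else:
            pooled = [0.0] * dim  # assumption: empty selection pools to zeros
        feature.extend(pooled)
    return feature

patches = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.2]]
components = [[1.0, 0.0], [0.0, 1.0]]
feature = selective_pool(patches, components, threshold=0.5)
```

The output length is the number of components times the patch-vector dimension, so it stays fixed regardless of how many patches the image was divided into.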

    Text detection in natural images
    17.
    Granted invention patent (status: in force)

    Publication No.: US09076056B2

    Publication date: 2015-07-07

    Application No.: US13970993

    Filing date: 2013-08-20

    CPC classification number: G06K9/18 G06K9/3258

    Abstract: A system and method of text detection in an image are described. A component detection module applies a filter having a stroke width constraint and a stroke color constraint to an image to identify text stroke pixels in the image and to generate both a first map based on the stroke width constraint and a second map based on the stroke color constraint. A component filtering module has a first classifier and second classifier. The first classifier is applied to both the first map and the second map to generate a third map identifying a component of a text in the image. The second classifier is applied to the third map to generate a fourth map identifying a text line of the text in the image. A text region locator module thresholds the fourth map to identify text regions in the image.
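
The last stage of this abstract, thresholding a map to locate text regions, might look like the sketch below. It assumes the map is a 2D list of scores, that a region is a 4-connected group of cells at or above the threshold, and that each region is reported as a bounding box; none of these details come from the abstract, and `text_regions` is a hypothetical name. The stroke filters and the two classifiers that produce the map are not modeled.

```python
def text_regions(score_map, threshold):
    """Return bounding boxes (top, left, bottom, right) of above-threshold regions."""
    rows, cols = len(score_map), len(score_map[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or score_map[r][c] < threshold:
                continue
            # Flood-fill one connected region while tracking its extent.
            stack = [(r, c)]
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while stack:
                y, x = stack.pop()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and score_map[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

score_map = [
    [0.9, 0.8, 0.1, 0.0],
    [0.1, 0.7, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.9],
]
boxes = text_regions(score_map, threshold=0.5)
```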

    Generating a hierarchy of visual pattern classes
    18.
    Granted invention patent (status: in force)

    Publication No.: US09053392B2

    Publication date: 2015-06-09

    Application No.: US14012770

    Filing date: 2013-08-28

    Abstract: A hierarchy machine may be configured as a clustering machine that utilizes local feature embedding to organize visual patterns into nodes that each represent one or more visual patterns. These nodes may be arranged as a hierarchy in which a node may have a parent-child relationship with one or more other nodes. The hierarchy machine may implement a node splitting and tree-learning algorithm that includes hard-splitting of nodes and soft-assignment of nodes to perform error-bounded splitting of nodes into clusters. This may enable the hierarchy machine, which may form all or part of a visual pattern recognition system, to perform large-scale visual pattern recognition, such as font recognition or facial recognition, based on a learned error-bounded tree of visual patterns.

    ADAPTIVE DENOISING WITH INTERNAL AND EXTERNAL PATCHES
    19.
    Invention patent application (status: in force)

    Publication No.: US20150131915A1

    Publication date: 2015-05-14

    Application No.: US14080659

    Filing date: 2013-11-14

    Abstract: In techniques for adaptive denoising with internal and external patches, example image patches taken from example images are grouped into partitions of similar patches, and a partition center patch is determined for each of the partitions. An image denoising technique is applied to image patches of a noisy image to generate modified image patches, and a closest partition center patch to each of the modified image patches is determined. The image patches of the noisy image are then classified as either a common patch or a complex patch of the noisy image, where an image patch is classified based on a distance between the corresponding modified image patch and the closest partition center patch. A denoising operator can be applied to an image patch based on the classification, such as applying respective denoising operators to denoise the image patches that are classified as the common patches of the noisy image.
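
The classification step in this abstract can be sketched as a nearest-center lookup followed by a distance test. The abstract only says the classification is "based on a distance"; this sketch assumes Euclidean distance and assumes that a small distance means a common patch, with `max_distance` as a hypothetical tuning parameter. Building the partitions and the initial denoising pass are not shown.

```python
import math

def closest_center(patch, centers):
    """Return (index, distance) of the partition center nearest to `patch`."""
    best = min(range(len(centers)), key=lambda i: math.dist(patch, centers[i]))
    return best, math.dist(patch, centers[best])

def classify_patch(modified_patch, centers, max_distance):
    """Label a pre-denoised patch as 'common' or 'complex'.

    Assumption: patches close to some partition center are 'common';
    patches far from every center are 'complex'.
    """
    _, distance = closest_center(modified_patch, centers)
    return "common" if distance <= max_distance else "complex"

centers = [[0.0, 0.0], [10.0, 10.0]]
labels = [classify_patch(p, centers, max_distance=2.0)
          for p in ([1.0, 1.0], [5.0, 5.0], [9.0, 9.0])]
```

A denoising operator could then be chosen per label, for example one tied to the matched partition for common patches.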

    IMAGE TAGGING
    20.
    Invention patent application (status: in force)

    Publication No.: US20150120760A1

    Publication date: 2015-04-30

    Application No.: US14068238

    Filing date: 2013-10-31

    CPC classification number: G06F17/30265 G06K9/6263 G06K2209/27

    Abstract: A system is configured to annotate an image with tags. As configured, the system accesses an image and generates a set of vectors for the image. The set of vectors may be generated by mathematically transforming the image, such as by applying a mathematical transform to predetermined regions of the image. The system may then query a database of tagged images by submitting the set of vectors as search criteria to a search engine. The querying of the database may obtain a set of tagged images. Next, the system may rank the obtained set of tagged images according to similarity scores that quantify degrees of similarity between the image and each tagged image obtained. Tags from a top-ranked subset of the tagged images may be extracted by the system, which may then annotate the image with these extracted tags.
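
The ranking and tag-extraction steps of this abstract can be sketched as below. Several simplifications are assumptions: each image is represented by one vector rather than a set of vectors, similarity is cosine similarity, the "database" is an in-memory list of `(vector, tags)` pairs, and `suggest_tags` and `top_k` are hypothetical names.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)

def suggest_tags(query_vec, tagged_images, top_k=2):
    """Rank tagged images by similarity to the query and collect tags from the top ones."""
    ranked = sorted(tagged_images,
                    key=lambda item: cosine(query_vec, item[0]),
                    reverse=True)
    tags = []
    for _, image_tags in ranked[:top_k]:
        for tag in image_tags:
            if tag not in tags:  # keep first occurrence, preserve rank order
                tags.append(tag)
    return tags

database = [
    ([1.0, 0.0], ["beach", "sea"]),
    ([0.9, 0.1], ["sea", "sky"]),
    ([0.0, 1.0], ["forest"]),
]
tags = suggest_tags([1.0, 0.05], database, top_k=2)
```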
