-
Publication No.: US08131066B2
Publication date: 2012-03-06
Application No.: US12098026
Filing date: 2008-04-04
Applicant: Gang Hua, Paul Viola
Inventor: Gang Hua, Paul Viola
IPC classification: G06K9/62
CPC classification: G06F17/3025, G06F17/30262, G06K9/00664, G06K9/4642, G06K9/4652, G06K9/6256, G06K9/6285
Abstract: Images are classified as photos (e.g., natural photographs) or graphics (e.g., cartoons, synthetically generated images), such that when searched (online) with a filter, an image database returns images corresponding to the filter criteria (e.g., either photos or graphics will be returned). A set of image statistics pertaining to various visual cues (e.g., color, texture, shape) are identified in classifying the images. These image statistics, combined with pre-tagged image metadata defining an image as either a graphic or a photo, may be used to train a boosting decision tree. The trained boosting decision tree may be used to classify additional images as graphics or photos based on image statistics determined for the additional images.
-
12.
Publication No.: US08059902B2
Publication date: 2011-11-15
Application No.: US11929420
Filing date: 2007-10-30
Applicant: Onur G. Guleryuz, Gang Hua
Inventor: Onur G. Guleryuz, Gang Hua
IPC classification: G06K9/46
CPC classification: H04N19/583, H04N19/105, H04N19/117, H04N19/137, H04N19/18, H04N19/196, H04N19/197, H04N19/46, H04N19/48, H04N19/51, H04N19/61, H04N19/82
Abstract: A method and apparatus are disclosed herein for spatial sparsity induced temporal prediction. In one embodiment, the method comprises: performing motion compensation to generate a first motion compensated prediction using a first block from a previously coded frame; generating a second motion compensated prediction for a second block to be coded from the first motion compensated prediction using a plurality of predictions in the spatial domain, including generating each of the plurality of predictions by generating block transform coefficients for the first block using a transform, generating predicted transform coefficients of the second block to be coded using the block transform coefficients, and performing an inverse transform on the predicted transform coefficients to create the second motion compensated prediction in the pixel domain; subtracting the second motion compensated prediction from a block in a current frame to produce a residual frame; and coding the residual frame.
-
Publication No.: US08027541B2
Publication date: 2011-09-27
Application No.: US11725129
Filing date: 2007-03-15
Applicant: Gang Hua, Steven M. Drucker, Michael Revow, Paul A. Viola, Richard Zemel
Inventor: Gang Hua, Steven M. Drucker, Michael Revow, Paul A. Viola, Richard Zemel
IPC classification: G06K9/46
CPC classification: G06K9/00228, G06K9/6251
Abstract: A system for organizing images includes an extraction component that extracts visual information (e.g., faces, scenes, etc.) from the images. The extracted visual information is provided to a comparison component which computes similarity confidence data between the extracted visual information. The similarity confidence data is an indication of the likelihood that items of extracted visual information are similar. The comparison component then generates a visual distribution of the extracted visual information based upon the similarity confidence data. The visual distribution can include groupings of the extracted visual information based on computed similarity confidence data. For example, the visual distribution can be a two-dimensional layout of faces organized based on the computed similarity confidence data, with faces in closer proximity computed to have a greater probability of representing the same person. The visual distribution can then be utilized by a user to sort, organize and/or tag images.
-
Publication No.: US08023742B2
Publication date: 2011-09-20
Application No.: US11868988
Filing date: 2007-10-09
Applicant: Matthew Alun Brown, Gang Hua, Simon A. J. Winder
Inventor: Matthew Alun Brown, Gang Hua, Simon A. J. Winder
CPC classification: G06K9/4609, G06K9/6256
Abstract: To render the comparison of image patches more efficient, the data of an image patch can be projected into a smaller-dimensioned subspace, resulting in a descriptor of the image patch. The projection into the descriptor subspace is known as a linear discriminant embedding, and can be performed with reference to a linear discriminant embedding matrix. The linear discriminant embedding matrix can be constructed from projection vectors that maximize those elements that are shared by matching image patches or that are used to distinguish non-matching image patches, while also minimizing those elements that are common to non-matching image patches or that distinguish matching image patches. The determination of such projection vectors can be limited such that only orthogonal vectors comprise the linear discriminant embedding matrix. The determination of the linear discriminant embedding matrix can likewise be constrained to avoid overfitting to training data.
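The projection step in the abstract above can be illustrated with a minimal Python sketch, assuming the patch has been flattened to a vector and the embedding matrix has already been learned. The function names `project_patch` and `descriptor_distance`, the toy matrix values, and the use of Euclidean distance to compare descriptors are all assumptions for illustration; the abstract does not specify them.

```python
import math

def project_patch(embedding_rows, patch):
    """Project a flattened patch onto each projection vector (one per matrix row).

    The result is a descriptor with one entry per projection vector, so it is
    shorter than the patch whenever the matrix has fewer rows than columns.
    """
    if any(len(row) != len(patch) for row in embedding_rows):
        raise ValueError("each projection vector must match the patch length")
    return [sum(w * p for w, p in zip(row, patch)) for row in embedding_rows]

def descriptor_distance(a, b):
    """Euclidean distance between two descriptors (an assumed comparison metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical 2x4 embedding matrix with orthogonal rows; a learned matrix
# would come from training on matching and non-matching patch pairs.
EMBEDDING = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]

patch_a = [3.0, 4.0, 5.0, 6.0]
patch_b = [0.0, 0.0, 9.0, 9.0]

desc_a = project_patch(EMBEDDING, patch_a)  # 4-element patch -> 2-element descriptor
desc_b = project_patch(EMBEDDING, patch_b)
```

Comparing the two-element descriptors is cheaper than comparing the full patches, which is the efficiency gain the abstract refers to.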
-
Publication No.: US20090251594A1
Publication date: 2009-10-08
Application No.: US12060890
Filing date: 2008-04-02
Applicant: Gang Hua, Cha Zhang, Zhengyou Zhang, Zicheng Liu, Ying Shan
Inventor: Gang Hua, Cha Zhang, Zhengyou Zhang, Zicheng Liu, Ying Shan
IPC classification: H04N7/01
CPC classification: G06T11/00, H04N1/3875, H04N7/0122, H04N21/2662
Abstract: Videos are retargeted to a target display for viewing with little to no geometric distortion or video information loss. Salient regions of video frames may be determined using scale-space spatiotemporal information. Video information loss may be a result of spatial loss, due to cropping, and resolution loss, due to resizing. A desired cropping window may be determined using a coarse-to-fine searching strategy. Video frames may be cropped with a window that matches an aspect ratio of the target display, and resized isotropically to match a size of the target display.
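The final crop-and-resize step in the abstract above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the window center and requested width are taken as inputs (in the abstract they would come from the saliency analysis and the coarse-to-fine search, which are not reproduced here), and the function name `retarget_window` and its parameters are hypothetical.

```python
def retarget_window(frame_w, frame_h, target_w, target_h,
                    center_x, center_y, requested_w):
    """Return (left, top, crop_w, crop_h, scale) for one frame.

    The crop window has the target display's aspect ratio, is kept inside the
    frame, and `scale` is the single isotropic factor that maps the cropped
    region onto the target display width.
    """
    # Widest window with the target aspect ratio that still fits in the frame.
    max_w = min(frame_w, frame_h * target_w // target_h)
    crop_w = max(1, min(requested_w, max_w))
    crop_h = max(1, round(crop_w * target_h / target_w))

    # Center the window on the requested point, then clamp it to the frame.
    left = min(max(center_x - crop_w // 2, 0), frame_w - crop_w)
    top = min(max(center_y - crop_h // 2, 0), frame_h - crop_h)

    # One scale factor for both axes, so no geometric distortion is introduced.
    scale = target_w / crop_w
    return left, top, crop_w, crop_h, scale

# Example: a 1920x1080 frame retargeted to a 320x240 display.
window = retarget_window(1920, 1080, 320, 240,
                         center_x=960, center_y=540, requested_w=800)
```

A smaller window discards more of the frame (spatial loss) but needs less downscaling (resolution loss); the requested width is the knob that trades one against the other.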
-
Publication No.: US20090091802A1
Publication date: 2009-04-09
Application No.: US11868988
Filing date: 2007-10-09
Applicant: Matthew Alun Brown, Gang Hua, Simon A. J. Winder
Inventor: Matthew Alun Brown, Gang Hua, Simon A. J. Winder
IPC classification: H04N1/40
CPC classification: G06K9/4609, G06K9/6256
Abstract: To render the comparison of image patches more efficient, the data of an image patch can be projected into a smaller-dimensioned subspace, resulting in a descriptor of the image patch. The projection into the descriptor subspace is known as a linear discriminant embedding, and can be performed with reference to a linear discriminant embedding matrix. The linear discriminant embedding matrix can be constructed from projection vectors that maximize those elements that are shared by matching image patches or that are used to distinguish non-matching image patches, while also minimizing those elements that are common to non-matching image patches or that distinguish matching image patches. The determination of such projection vectors can be limited such that only orthogonal vectors comprise the linear discriminant embedding matrix. The determination of the linear discriminant embedding matrix can likewise be constrained to avoid overfitting to training data.
-
17.
Publication No.: US20070122039A1
Publication date: 2007-05-31
Application No.: US11291309
Filing date: 2005-11-29
Applicant: Zhengyou Zhang, Zicheng Liu, Gang Hua
Inventor: Zhengyou Zhang, Zicheng Liu, Gang Hua
CPC classification: G06K9/38, G06K9/6226, G06T7/11, G06T7/143, G06T7/149, G06T2207/20116
Abstract: An “Image Segmenter” provides a variational energy formulation for segmentation of natural objects from images. In general, the Image Segmenter operates by adopting Gaussian mixture models (GMM) to capture the appearance variation of objects in one or more images. A global image data likelihood potential is then computed and combined with local region potentials to obtain a robust and accurate estimation of pixel foreground and background distributions. Iterative minimization of a “global-local energy function” is then accomplished by evolution of a foreground/background boundary curve by level set, and estimation of a foreground/background model by fixed-point iteration, termed “quasi-semi-supervised EM.” In various embodiments, this process is further improved by providing general object shape information for use in rectifying objects segmented from the image.
-
Publication No.: US07212665B2
Publication date: 2007-05-01
Application No.: US11266830
Filing date: 2005-11-03
Applicant: Ming-Hsuan Yang, Gang Hua
Inventor: Ming-Hsuan Yang, Gang Hua
IPC classification: G06K9/62
CPC classification: G06K9/00362
Abstract: A statistical formulation estimates two-dimensional human pose from single images. This is based on a Markov network and on inferring pose parameters from cues such as appearance, shape, edge, and color. A data-driven belief propagation Monte Carlo algorithm performs efficient Bayesian inferencing within a rigorous statistical framework. Experimental results demonstrate the effectiveness of the method in estimating human pose from single images.
-
19.
Publication No.: US08712109B2
Publication date: 2014-04-29
Application No.: US12437561
Filing date: 2009-05-08
Applicant: Gang Hua, John Wright, Amir Akbarzadeh
Inventor: Gang Hua, John Wright, Amir Akbarzadeh
IPC classification: G06K9/00
CPC classification: G06K9/00281, G06K9/38, G06K9/4642, G06K9/6215
Abstract: Representing a face by jointly quantizing features and spatial location to perform implicit elastic matching between features. A plurality of the features are extracted from a face image and expanded with a corresponding spatial location in the face image. Each of the expanded features is quantized based on one or more randomized decision trees. A histogram of the quantized features is calculated to represent the face image. The histogram is compared to histograms of other face images to identify a match, or to calculate a distance metric representative of a difference between faces.
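The expand, quantize, histogram, and compare pipeline in the abstract above can be sketched in Python as follows. Note the substitution: the abstract quantizes with randomized decision trees, whereas this sketch uses plain uniform binning as a stand-in so that it stays short and deterministic. The function names, the `bin_width` parameter, and the L1 histogram distance are all assumptions for illustration.

```python
from collections import Counter

def expand(features, locations):
    """Append each feature's (x, y) location to the feature vector itself."""
    return [tuple(f) + tuple(loc) for f, loc in zip(features, locations)]

def quantize(vector, bin_width=0.5):
    """Stand-in quantizer: uniform binning per coordinate.

    The abstract uses randomized decision trees here; binning is only a
    placeholder that also maps an expanded feature to a discrete code.
    """
    return tuple(int(v // bin_width) for v in vector)

def face_histogram(features, locations, bin_width=0.5):
    """Histogram of quantized, location-expanded features for one face."""
    return Counter(quantize(v, bin_width) for v in expand(features, locations))

def l1_distance(hist_a, hist_b):
    """Sum of absolute bin differences (an assumed distance metric)."""
    return sum(abs(hist_a[k] - hist_b[k]) for k in set(hist_a) | set(hist_b))

# Two toy faces with one-dimensional features at the same two locations.
locations = [(0.0, 0.0), (1.0, 1.0)]
face_a = face_histogram([(0.1,), (0.9,)], locations)
face_b = face_histogram([(0.1,), (0.1,)], locations)
```

Because the location is part of the quantized vector, two features fall into the same bin only when they are similar in both appearance and position, which is the joint quantization the abstract describes.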
-
Publication No.: US20120141020A1
Publication date: 2012-06-07
Application No.: US13371719
Filing date: 2012-02-13
Applicant: Gang Hua, Paul Viola
Inventor: Gang Hua, Paul Viola
IPC classification: G06K9/62
CPC classification: G06F17/3025, G06F17/30262, G06K9/00664, G06K9/4642, G06K9/4652, G06K9/6256, G06K9/6285
Abstract: Images are classified as photos (e.g., natural photographs) or graphics (e.g., cartoons, synthetically generated images), such that when searched (online) with a filter, an image database returns images corresponding to the filter criteria (e.g., either photos or graphics will be returned). A set of image statistics pertaining to various visual cues (e.g., color, texture, shape) are identified in classifying the images. These image statistics, combined with pre-tagged image metadata defining an image as either a graphic or a photo, may be used to train a boosting decision tree. The trained boosting decision tree may be used to classify additional images as graphics or photos based on image statistics determined for the additional images.
-