Transductive multi-label learning for video concept detection
    41.
    Granted Patent
    Transductive multi-label learning for video concept detection (In force)

    Publication No.: US08218859B2

    Publication Date: 2012-07-10

    Application No.: US12329293

    Filing Date: 2008-12-05

    IPC Classes: G06K9/62 G06K9/00

    CPC Classes: G06K9/00718 G06K9/6297

    Abstract: This disclosure describes various exemplary methods and computer program products for transductive multi-label classification in detecting video concepts for information retrieval. It describes utilizing a hidden Markov random field formulation to detect labels for concepts in video content, and modeling the multi-label interdependence between the labels by a pairwise Markov random field. The process groups the labels into several parts to speed up labeling inference and calculates a conditional probability score for each label; the conditional probability scores are then ordered for ranking in a video retrieval evaluation.

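    The label-inference step can be pictured as a toy pairwise MRF: each concept label carries its own detector evidence, co-occurring labels pull each other via pairwise weights, and the resulting conditional probability scores are ordered for ranking. The names, weights, and the simple ICM-style sweep below are illustrative assumptions, not the patented formulation.

    ```python
    import math

    def rank_concept_labels(unary, pairwise, iters=10):
        """Score concept labels for one shot with a toy pairwise Markov random
        field and return them ranked by conditional probability.

        unary: {label: score}, per-label detector evidence (assumed input).
        pairwise: {(a, b): weight}, co-occurrence affinity between label pairs.
        """
        labels = sorted(unary)
        assign = {l: unary[l] > 0 for l in labels}  # initial hard assignment
        probs = {}
        for _ in range(iters):  # iterated-conditional-modes style sweeps
            for l in labels:
                # evidence for turning label l "on": its own score plus
                # agreement with currently-on neighbouring labels
                e_on = unary[l] + sum(
                    w for (a, b), w in pairwise.items()
                    if l in (a, b) and assign[b if a == l else a])
                probs[l] = 1.0 / (1.0 + math.exp(-e_on))  # conditional prob.
                assign[l] = probs[l] > 0.5
        # order labels by conditional probability score for retrieval ranking
        return sorted(labels, key=lambda l: -probs[l]), probs
    ```

    The grouping of labels into parts (to speed up inference) is omitted here; the sketch runs one joint sweep over all labels.
    
    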

    NEAR-LOSSLESS VIDEO SUMMARIZATION
    42.
    Patent Application
    NEAR-LOSSLESS VIDEO SUMMARIZATION (In force)

    Publication No.: US20110267544A1

    Publication Date: 2011-11-03

    Application No.: US12768769

    Filing Date: 2010-04-28

    IPC Classes: H04N5/14

    Abstract: Described is perceptually near-lossless video summarization for use in maintaining video summaries, which can substantially reconstruct the original video. A video stream is summarized with little information loss into a relatively very small piece of summary metadata. The summary metadata comprises an image set of synthesized mosaics and representative keyframes, audio data, and metadata about video structure and motion. In one implementation, the metadata is computed and maintained (e.g., as a file) to summarize a relatively large video sequence by segmenting a video shot into subshots and selecting keyframes and mosaics based upon motion data corresponding to those subshots. The motion data is maintained as a semantic description associated with the image set. To reconstruct the video, the metadata is processed, including simulating motion using the image set and the semantic description, which recovers the audiovisual content without any significant information loss.

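    The subshot segmentation and motion-driven keyframe selection can be sketched minimally as follows. The motion measure, the threshold, and the "steadiest frame" keyframe rule are illustrative assumptions, not the patent's actual selection criteria.

    ```python
    def summarize_shot(motion, threshold=0.5):
        """Split one video shot into subshots wherever per-frame motion jumps
        past `threshold`, then pick the lowest-motion frame of each subshot
        as its representative keyframe (a stand-in for the patent's
        motion-data-driven keyframe/mosaic choice).

        motion: list of per-frame motion magnitudes (assumed precomputed).
        Returns (subshots, keyframes): frame-index ranges and chosen frames.
        """
        subshots, start = [], 0
        for i in range(1, len(motion)):
            if abs(motion[i] - motion[i - 1]) > threshold:  # discontinuity
                subshots.append((start, i - 1))
                start = i
        subshots.append((start, len(motion) - 1))
        # keyframe = steadiest (minimum-motion) frame inside each subshot
        keyframes = [min(range(a, b + 1), key=lambda i: motion[i])
                     for a, b in subshots]
        return subshots, keyframes
    ```

    The selected indices would then be stored alongside the mosaics and audio as the summary metadata described in the abstract.
    
    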

    ENRICHING ONLINE VIDEOS BY CONTENT DETECTION, SEARCHING, AND INFORMATION AGGREGATION
    43.
    Patent Application
    ENRICHING ONLINE VIDEOS BY CONTENT DETECTION, SEARCHING, AND INFORMATION AGGREGATION (In force)

    Publication No.: US20110264700A1

    Publication Date: 2011-10-27

    Application No.: US12767114

    Filing Date: 2010-04-26

    Abstract: Many internet users consume content through online videos. For example, users may view movies, television shows, music videos, and/or homemade videos. It may be advantageous to provide additional information to users consuming the online videos. Unfortunately, many current techniques may be unable to provide additional information relevant to the online videos from outside sources. Accordingly, one or more systems and/or techniques for determining a set of additional information relevant to an online video are disclosed herein. In particular, visual, textual, audio, and/or other features may be extracted from an online video (e.g., original content of the online video and/or embedded advertisements). Using the extracted features, additional information (e.g., images, advertisements, etc.) may be determined based upon matching the extracted features with content of a database. The additional information may be presented to a user consuming the online video.

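    The matching step (extracted features compared against a content database) might look like this cosine-similarity sketch; the feature representation and item names are assumptions for illustration, since the claim does not fix a specific matcher.

    ```python
    import math

    def match_additional_info(video_features, database):
        """Match features extracted from an online video against a database
        of candidate items (images, advertisements, ...) by cosine
        similarity, and return the item names ranked best-first.

        video_features: feature vector for the video (assumed extracted).
        database: {item_name: feature_vector}.
        """
        def cosine(u, v):
            dot = sum(a * b for a, b in zip(u, v))
            nu = math.sqrt(sum(a * a for a in u))
            nv = math.sqrt(sum(b * b for b in v))
            return dot / (nu * nv) if nu and nv else 0.0

        scored = [(cosine(video_features, feats), name)
                  for name, feats in database.items()]
        return [name for score, name in sorted(scored, reverse=True)]
    ```

    The top-ranked items would then be aggregated and presented to the user alongside the video.
    
    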

    CONTEXTUAL IMAGE SEARCH
    44.
    Patent Application
    CONTEXTUAL IMAGE SEARCH (Pending, published)

    Publication No.: US20110191336A1

    Publication Date: 2011-08-04

    Application No.: US12696591

    Filing Date: 2010-01-29

    IPC Classes: G06F17/30

    CPC Classes: G06F16/00

    Abstract: Techniques for image search using contextual information related to a user query are described. A user query including at least one of textual data or image data from a collection of data displayed by a computing device is received from a user. At least one other subset of data selected from the collection is received as contextual information that is related to, and different from, the user query. Data files such as image files are retrieved and ranked based on the user query to provide a pre-ranked set of data files. The pre-ranked data files are then ranked based on the contextual information to provide a re-ranked set of data files to be displayed to the user.

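    The two-stage pre-rank / re-rank flow can be sketched as below. The linear blend and its weight `alpha` are assumptions; the abstract only fixes the two-stage structure.

    ```python
    def contextual_rerank(query_scores, context_scores, top_k=5, alpha=0.5):
        """Two-stage ranking: pre-rank images by query relevance alone, then
        re-rank the top-k candidates by blending in similarity to the user's
        surrounding context.

        query_scores, context_scores: {image: relevance} maps (assumed
        produced by some upstream retrieval model).
        """
        # stage 1: pre-rank on the query alone and keep the best candidates
        pre = sorted(query_scores, key=lambda im: -query_scores[im])[:top_k]
        # stage 2: re-rank those candidates using the contextual information
        blended = {im: (1 - alpha) * query_scores[im]
                       + alpha * context_scores.get(im, 0.0) for im in pre}
        return sorted(pre, key=lambda im: -blended[im])
    ```

    Note the context only reorders the pre-ranked set; an image the query alone ranks outside the top k is never recovered, matching the pipeline order in the abstract.
    
    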

    Video booklet
    45.
    Granted Patent
    Video booklet (In force)

    Publication No.: US07840898B2

    Publication Date: 2010-11-23

    Application No.: US11264357

    Filing Date: 2005-11-01

    IPC Classes: G06F3/00 G06F3/048

    Abstract: Systems and methods are described for creating a video booklet that allows browsing and search of a video library. In one implementation, each video in the video library is divided into segments. Each segment is represented by a thumbnail image. Signatures of the representative thumbnails are extracted and stored in a database. The thumbnail images are then printed into an artistic paper booklet. A user can photograph one of the thumbnails in the paper booklet to automatically play the video segment corresponding to the thumbnail. Active shape modeling is used to identify and restore the photo information to the form of a thumbnail image from which a signature can be extracted for comparison with the database.

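    A minimal sketch of the signature extraction and database lookup, assuming a simple block-average signature and L1 nearest-match comparison (the patent additionally restores the photographed thumbnail with active shape modeling before extraction, which is omitted here):

    ```python
    def thumbnail_signature(pixels, grid=2):
        """Reduce a thumbnail (2-D list of grey values) to a tiny grid of
        block averages, a toy stand-in for the signature extracted from each
        representative thumbnail."""
        h, w = len(pixels), len(pixels[0])
        sig = []
        for gy in range(grid):
            for gx in range(grid):
                block = [pixels[y][x]
                         for y in range(gy * h // grid, (gy + 1) * h // grid)
                         for x in range(gx * w // grid, (gx + 1) * w // grid)]
                sig.append(sum(block) / len(block))
        return sig

    def lookup_segment(photo, signature_db):
        """Return the video segment whose stored signature is nearest
        (L1 distance) to the signature of a photographed thumbnail."""
        sig = thumbnail_signature(photo)
        return min(signature_db,
                   key=lambda seg: sum(abs(a - b)
                                       for a, b in zip(sig, signature_db[seg])))
    ```

    Matching the photographed thumbnail to its nearest stored signature is what lets the system select and play the corresponding segment.
    
    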

    Multi-Label Multi-Instance Learning for Image Classification
    46.
    Patent Application
    Multi-Label Multi-Instance Learning for Image Classification (In force)

    Publication No.: US20090310854A1

    Publication Date: 2009-12-17

    Application No.: US12140247

    Filing Date: 2008-06-16

    IPC Classes: G06K9/62

    CPC Classes: G06K9/4638 G06K9/342

    Abstract: Described is a technology by which an image is classified (e.g., grouped and/or labeled) based on multi-label multi-instance learning-based classification according to semantic labels and regions. An image is processed in an integrated framework into multi-label multi-instance data, including region and image labels. The framework determines local association data based on each region of an image. Other multi-label multi-instance data is based on relationships between region labels of the image, relationships between image labels of the image, and relationships between the region and image labels. These data are combined to classify the image. Training is also described.

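    One way to picture how the cues combine into per-image label scores: local association (does any region show the label?), image-label relations, and a region-to-image coupling weight. The additive combination and hand-set weights are illustrative assumptions; the patent learns these jointly within an integrated framework.

    ```python
    def classify_image(region_scores, label_cooccur, region_to_image):
        """Combine toy MLMI cues into per-image label scores.

        region_scores: {label: [score per region]}, local association
            evidence (assumed input).
        label_cooccur: {(a, b): weight}, relations between image labels.
        region_to_image: weight tying region evidence to the image label.
        """
        labels = sorted(region_scores)
        # local association: an image shows a label if some region does
        local = {l: max(region_scores[l]) for l in labels}
        scores = {}
        for l in labels:
            # image-label relations: pull from co-occurring labels' evidence
            relation = sum(w * local[b if a == l else a]
                           for (a, b), w in label_cooccur.items()
                           if l in (a, b))
            scores[l] = region_to_image * local[l] + relation
        return scores
    ```
    
    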

    VIDEO CONCEPT DETECTION USING MULTI-LAYER MULTI-INSTANCE LEARNING
    47.
    Patent Application
    VIDEO CONCEPT DETECTION USING MULTI-LAYER MULTI-INSTANCE LEARNING (In force)

    Publication No.: US20090274434A1

    Publication Date: 2009-11-05

    Application No.: US12111202

    Filing Date: 2008-04-29

    IPC Classes: G11B27/00

    Abstract: Visual concepts contained within a video clip are classified based upon a set of target concepts. The clip is segmented into shots and a multi-layer multi-instance (MLMI) structured metadata representation of each shot is constructed. A set of pre-generated trained models of the target concepts is validated using a set of training shots. An MLMI kernel is recursively generated which models the MLMI structured metadata representation of each shot by comparing prescribed pairs of shots. The MLMI kernel is subsequently utilized to generate a learned objective decision function which learns a classifier for determining whether a particular shot (not in the set of training shots) contains instances of the target concepts. A regularization framework can also be utilized in conjunction with the MLMI kernel to generate modified learned objective decision functions. The regularization framework introduces explicit constraints which serve to maximize the precision of the classifier.
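    The recursive kernel over structured shots can be caricatured as a two-layer averaging set kernel: a shot is a bag of key frames, a key frame is a bag of instance vectors, and the same averaging construction is applied at each layer. The RBF instance kernel and uniform averaging are simplifying assumptions relative to the patented MLMI kernel.

    ```python
    import math

    def instance_kernel(x, y, gamma=1.0):
        """RBF kernel between two instance feature vectors."""
        return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

    def avg_kernel(bag_a, bag_b, k):
        """Normalized set kernel: average of k over all cross pairs."""
        return (sum(k(x, y) for x in bag_a for y in bag_b)
                / (len(bag_a) * len(bag_b)))

    def mlmi_kernel(shot_a, shot_b, gamma=1.0):
        """Toy two-layer kernel between shots: apply the averaging set
        kernel recursively, frame layer over instance layer, echoing (while
        greatly simplifying) the recursive MLMI kernel construction."""
        def frame_kernel(fa, fb):
            return avg_kernel(fa, fb,
                              lambda x, y: instance_kernel(x, y, gamma))
        return avg_kernel(shot_a, shot_b, frame_kernel)
    ```

    Such a kernel could be plugged into any kernel machine (e.g., an SVM) to learn the shot-level concept classifier the abstract describes.
    
    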

    摘要翻译: 视频剪辑中包含的视觉概念基于一组目标概念进行分类。 剪辑被分割成镜头,并且构建每个镜头的多层多实例(MLMI)结构化元数据表示。 使用一组训练镜头验证了一组预先生成的目标概念训练模型。 通过比较规定的拍摄对,递归地生成MLMI内核,以对每个镜头的MLMI结构化元数据表示进行建模。 MLMI内核随后被用于生成学习的客观决策函数,该函数学习用于确定特定镜头(不在该组训练镜头中)是否包含目标概念的实例的分类器。 正则化框架也可以与MLMI内核一起使用,以生成修改后的学习目标决策函数。 正则化框架引入明确的约束,用于最大化分类器的精度。