Video concept detection using multi-layer multi-instance learning
    1.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08804005B2

    Publication date: 2014-08-12

    Application No.: US12111202

    Filing date: 2008-04-29

    IPC classification: G06K9/62 G06K9/34

    Abstract: Visual concepts contained within a video clip are classified based upon a set of target concepts. The clip is segmented into shots and a multi-layer multi-instance (MLMI) structured metadata representation of each shot is constructed. A set of pre-generated trained models of the target concepts is validated using a set of training shots. An MLMI kernel is recursively generated which models the MLMI structured metadata representation of each shot by comparing prescribed pairs of shots. The MLMI kernel is subsequently utilized to generate a learned objective decision function which learns a classifier for determining if a particular shot (that is not in the set of training shots) contains instances of the target concepts. A regularization framework can also be utilized in conjunction with the MLMI kernel to generate modified learned objective decision functions. The regularization framework introduces explicit constraints which serve to maximize the precision of the classifier.
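
The idea of a kernel that is generated recursively over a layered representation can be illustrated with a small sketch. This is not the kernel defined in the patent: the tree layout (shot, key frames, regions), the RBF base kernel, and the averaging over child pairs are all assumptions chosen only to show the recursion.

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) base kernel on two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def tree_kernel(node_a, node_b):
    """Hypothetical recursive kernel between two layered nodes.

    A node is a tuple (feature_vector, children). The value is the base
    kernel on the node features multiplied by the mean kernel over all
    pairs of children; nodes without children use the base kernel only.
    """
    feat_a, kids_a = node_a
    feat_b, kids_b = node_b
    base = rbf(feat_a, feat_b)
    if not kids_a or not kids_b:
        return base
    child_sum = sum(tree_kernel(a, b) for a in kids_a for b in kids_b)
    return base * child_sum / (len(kids_a) * len(kids_b))

# Toy shots: shot -> one key frame -> regions (all feature values are made up).
shot_a = ([0.2, 0.1], [([0.5, 0.5], [([1.0, 0.0], []), ([0.0, 1.0], [])])])
shot_b = ([0.3, 0.1], [([0.4, 0.6], [([0.9, 0.1], [])])])
k_ab = tree_kernel(shot_a, shot_b)
```

A kernel value computed this way for every pair of training shots could then be passed to any kernel-based classifier; which classifier and regularization the patent uses is not reproduced here.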

    Enhancing photo browsing through music and advertising
    2.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08504422B2

    Publication date: 2013-08-06

    Application No.: US12786020

    Filing date: 2010-05-24

    IPC classification: G06Q30/00

    Abstract: Techniques for recommending music and advertising to enhance a user's experience while photo browsing are described. In some instances, songs and ads are ranked for relevance to at least one photo from a photo album. The songs, ads and photo(s) from the photo album are then mapped to a style and mood ontology to obtain vector-based representations. The vector-based representations can include real valued terms, each term associated with a human condition defined by the ontology. A re-ranking process generates a relevancy term for each song and each ad indicating relevancy to the photo album. The relevancy terms can be calculated by summing weighted terms from the ranking and the mapping. Recommended music and ads may then be provided to a user, as the user browses a series of photos obtained from the photo album. The ads may be seamlessly embedded into the music in a nonintrusive manner.
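
The re-ranking step, which sums weighted terms from the ranking and from the ontology mapping, can be sketched as follows. The weight `alpha`, the use of cosine similarity between mood vectors, and the sample scores are assumptions made for illustration; the abstract does not specify them.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length real-valued vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rerank(candidates, album_vector, alpha=0.5):
    """Re-rank songs or ads against a photo album.

    candidates: list of (name, rank_score, mood_vector), where rank_score
    is the initial relevance score and mood_vector is the (hypothetical)
    style-and-mood representation. Relevancy is the weighted sum
    alpha * rank_score + (1 - alpha) * cosine(mood_vector, album_vector).
    """
    scored = [
        (name, alpha * rank_score + (1 - alpha) * cosine(vec, album_vector))
        for name, rank_score, vec in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)

album = [1.0, 0.0]  # made-up two-term mood vector for the album
songs = [("song_a", 0.9, [0.0, 1.0]), ("song_b", 0.5, [1.0, 0.0])]
ranked = rerank(songs, album, alpha=0.5)
```

In this toy input, `song_b` overtakes `song_a` after re-ranking because its mood vector matches the album even though its initial rank score is lower.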

    Intelligent overlay for video advertising
    3.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08369686B2

    Publication date: 2013-02-05

    Application No.: US12571373

    Filing date: 2009-09-30

    IPC classification: H04N9/80

    Abstract: Video advertising overlay technique embodiments are presented that generally detect a set of spatio-temporal nonintrusive positions within a series of consecutive video frames in shots of a digital video and then overlay contextually relevant ads on these positions. In one general embodiment, this is accomplished by decomposing the video into a series of shots, and then identifying a video advertisement for each of a selected set of the shots. The identified video advertisement is one that is determined to be the most relevant to the content of the shot. An overlay area is also identified in each of the shots, where the selected overlay area is the least intrusive among a plurality of prescribed areas to a viewer of the video. The video advertisements identified for the shots are then respectively scheduled to be overlaid in the identified overlay area of a shot, whenever the shot is played.
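
Choosing the least intrusive of several prescribed overlay areas can be sketched as a minimum over a per-area intrusiveness score. The abstract does not say how intrusiveness is measured; the sketch below assumes a per-pixel saliency map and uses mean saliency inside each area as a stand-in. The area names and all values are hypothetical.

```python
def mean_saliency(saliency, area):
    """Average saliency inside a rectangle given as (top, left, height, width)."""
    top, left, height, width = area
    values = [
        saliency[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    ]
    return sum(values) / len(values)

def least_intrusive_area(saliency, areas):
    """Return the name of the candidate area with the lowest mean saliency."""
    return min(areas, key=lambda name: mean_saliency(saliency, areas[name]))

# Toy 4x4 saliency map for one shot (higher means more visually important).
saliency = [
    [0.1, 0.1, 0.2, 0.2],
    [0.1, 0.9, 0.9, 0.2],
    [0.1, 0.9, 0.9, 0.2],
    [0.0, 0.0, 0.1, 0.1],
]
areas = {"top": (0, 0, 1, 4), "bottom": (3, 0, 1, 4), "center": (1, 1, 2, 2)}
best = least_intrusive_area(saliency, areas)
```

A real system would aggregate such a score over the consecutive frames of the shot rather than a single map.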

    Enhancing Photo Browsing through Music and Advertising
    4.
    Type: Patent application publication
    Status: In force

    Publication No.: US20110288929A1

    Publication date: 2011-11-24

    Application No.: US12786020

    Filing date: 2010-05-24

    IPC classification: G06Q30/00 G06F17/30 G06Q10/00

    Abstract: Techniques for recommending music and advertising to enhance a user's experience while photo browsing are described. In some instances, songs and ads are ranked for relevance to at least one photo from a photo album. The songs, ads and photo(s) from the photo album are then mapped to a style and mood ontology to obtain vector-based representations. The vector-based representations can include real valued terms, each term associated with a human condition defined by the ontology. A re-ranking process generates a relevancy term for each song and each ad indicating relevancy to the photo album. The relevancy terms can be calculated by summing weighted terms from the ranking and the mapping. Recommended music and ads may then be provided to a user, as the user browses a series of photos obtained from the photo album. The ads may be seamlessly embedded into the music in a nonintrusive manner.

    Video Collage Presentation
    5.
    Type: Patent application publication
    Status: Under examination (published)

    Publication No.: US20090003712A1

    Publication date: 2009-01-01

    Application No.: US12055267

    Filing date: 2008-03-25

    IPC classification: G06K9/62

    Abstract: A method, a computer-readable storage media, and a user interface describe techniques for creating a video collage synthesized from video content, selecting representative images from the video content, extracting and resizing regions of interest (ROI) from the representative images from the video content, and arranging the regions of interest on a canvas without seams while preserving a temporal structure of the video content. The described method, computer-readable storage, and user interface enhance the experience of the user in browsing a video collage that is compact.

    Enriching online videos by content detection, searching, and information aggregation
    6.
    Type: Granted invention patent
    Status: In force

    Publication No.: US09443147B2

    Publication date: 2016-09-13

    Application No.: US12767114

    Filing date: 2010-04-26

    Abstract: Many internet users consume content through online videos. For example, users may view movies, television shows, music videos, and/or homemade videos. It may be advantageous to provide additional information to users consuming the online videos. Unfortunately, many current techniques may be unable to provide additional information relevant to the online videos from outside sources. Accordingly, one or more systems and/or techniques for determining a set of additional information relevant to an online video are disclosed herein. In particular, visual, textual, audio, and/or other features may be extracted from an online video (e.g., original content of the online video and/or embedded advertisements). Using the extracted features, additional information (e.g., images, advertisements, etc.) may be determined based upon matching the extracted features with content of a database. The additional information may be presented to a user consuming the online video.
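
Matching extracted features against database content can be sketched with a simple set-overlap score. The abstract does not name a matching method; representing features as keyword sets, scoring with Jaccard similarity, and the `min_score` threshold are all assumptions, and the database entries are invented.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def match_additional_info(video_features, database, min_score=0.1):
    """Return database items ordered by overlap with the video's features.

    database maps an item identifier to the set of tags describing it.
    Items scoring below min_score are dropped.
    """
    scored = [(jaccard(video_features, tags), item) for item, tags in database.items()]
    return [item for score, item in sorted(scored, reverse=True) if score >= min_score]

# Hypothetical features extracted from a video and a tiny database.
video_features = {"guitar", "concert", "stage"}
database = {
    "ad:guitar-shop": {"guitar", "amplifier", "strings"},
    "image:concert-poster": {"concert", "stage", "guitar", "poster"},
    "ad:car-rental": {"car", "travel"},
}
results = match_additional_info(video_features, database)
```

Visual or audio features would more plausibly be compared as numeric vectors; the set-based score is used here only to keep the example short.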

    Advertisement insertion points detection for online video advertising
    7.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08654255B2

    Publication date: 2014-02-18

    Application No.: US11858628

    Filing date: 2007-09-20

    IPC classification: H04N9/74

    Abstract: Systems and methods for determining insertion points in a first video stream are described. The insertion points are configured for inserting at least one second video into the first video. In accordance with one embodiment, a method for determining the insertion points includes parsing the first video into a plurality of shots. The plurality of shots includes one or more shot boundaries. The method then determines one or more insertion points by balancing a discontinuity metric and an attractiveness metric of each shot boundary.
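
Balancing the two metrics at each shot boundary can be sketched as a weighted score followed by a top-k selection. The linear combination, the `weight` parameter, and the assumption that both metrics are already normalized to [0, 1] are illustrative choices, not details taken from the patent.

```python
def select_insertion_points(boundaries, k, weight=0.5):
    """Pick k shot boundaries as insertion points.

    boundaries: list of (time_sec, discontinuity, attractiveness), with both
    metrics assumed to lie in [0, 1]. Each boundary is scored as
    weight * discontinuity + (1 - weight) * attractiveness, and the k
    highest-scoring boundaries are returned in playback order.
    """
    scored = sorted(
        boundaries,
        key=lambda b: weight * b[1] + (1 - weight) * b[2],
        reverse=True,
    )
    return sorted(t for t, _, _ in scored[:k])

# Made-up shot boundaries: (time in seconds, discontinuity, attractiveness).
boundaries = [(12.0, 0.9, 0.2), (47.5, 0.6, 0.8), (90.0, 0.1, 0.3), (130.0, 0.8, 0.7)]
points = select_insertion_points(boundaries, k=2)
```

Raising `weight` favors boundaries where the content changes sharply; lowering it favors boundaries next to attractive content.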

    NEAR-LOSSLESS VIDEO SUMMARIZATION
    8.
    Type: Patent application publication
    Status: In force

    Publication No.: US20110267544A1

    Publication date: 2011-11-03

    Application No.: US12768769

    Filing date: 2010-04-28

    IPC classification: H04N5/14

    Abstract: Described is perceptually near-lossless video summarization for use in maintaining video summaries, which operates to substantially reconstruct an original video in a generally perceptually near-lossless manner. A video stream is summarized with little information loss by using a relatively very small piece of summary metadata. The summary metadata comprises an image set of synthesized mosaics and representative keyframes, audio data, and the metadata about video structure and motion. In one implementation, the metadata is computed and maintained (e.g., as a file) to summarize a relatively large video sequence, by segmenting a video shot into subshots, and selecting keyframes and mosaics based upon motion data corresponding to those subshots. The motion data is maintained as a semantic description associated with the image set. To reconstruct the video, the metadata is processed, including simulating motion using the image set and the semantic description, which recovers the audiovisual content without any significant information loss.

    ENRICHING ONLINE VIDEOS BY CONTENT DETECTION, SEARCHING, AND INFORMATION AGGREGATION
    9.
    Type: Patent application publication
    Status: In force

    Publication No.: US20110264700A1

    Publication date: 2011-10-27

    Application No.: US12767114

    Filing date: 2010-04-26

    Abstract: Many internet users consume content through online videos. For example, users may view movies, television shows, music videos, and/or homemade videos. It may be advantageous to provide additional information to users consuming the online videos. Unfortunately, many current techniques may be unable to provide additional information relevant to the online videos from outside sources. Accordingly, one or more systems and/or techniques for determining a set of additional information relevant to an online video are disclosed herein. In particular, visual, textual, audio, and/or other features may be extracted from an online video (e.g., original content of the online video and/or embedded advertisements). Using the extracted features, additional information (e.g., images, advertisements, etc.) may be determined based upon matching the extracted features with content of a database. The additional information may be presented to a user consuming the online video.

    Multi-Label Multi-Instance Learning for Image Classification
    10.
    Type: Patent application publication
    Status: In force

    Publication No.: US20090310854A1

    Publication date: 2009-12-17

    Application No.: US12140247

    Filing date: 2008-06-16

    IPC classification: G06K9/62

    CPC classification: G06K9/4638 G06K9/342

    Abstract: Described is a technology by which an image is classified (e.g., grouped and/or labeled), based on multi-label multi-instance data learning-based classification according to semantic labels and regions. An image is processed in an integrated framework into multi-label multi-instance data, including region and image labels. The framework determines local association data based on each region of an image. Other multi-label multi-instance data is based on relationships between region labels of the image, relationships between image labels of the image, and relationships between the region and image labels. These data are combined to classify the image. Training is also described.
