Patent search: ap:("Tao Mei" OR "Xian-Sheng Hua" OR "Shipeng Li" OR "Yan Wang") AND inv:"Shipeng Li" (page 1)

1.

Granted patent
Detecting key roles and their relationships from video (in force)

Publication No.: US09271035B2

Publication date: 2016-02-23

Application No.: US13085288

Filing date: 2011-04-12

Applicants: Tao Mei, Xian-Sheng Hua, Shipeng Li, Yan Wang

Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Yan Wang

IPC classification: H04N21/44, G06Q30/02, H04N21/84, G06K9/00

CPC classification: H04N21/44008, G06F17/30793, G06F17/30843, G06K9/00718, G06Q30/0276, H04N21/84

Abstract: Tools and techniques for acquiring key roles and their relationships from a video independent of metadata, such as cast lists and scripts, are described herein. These techniques include discovering key roles and their relationships by treating a video (e.g., a movie, television program, music video, and personal video, etc.) as a community. For instance, a video is segmented into a hierarchical structure that includes levels for scenes, shots, and key frames. In some implementations, the techniques include performing face detection and grouping on the detected key frames. In some implementations, the techniques include exploiting the key roles and their correlations in this video to discover a community. The discovered community provides for a wide variety of applications, including the automatic generation of visual summaries or video posters including acquired key roles.
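
The idea of treating a video as a community of roles can be illustrated with a small co-occurrence count. This is only a sketch under stated assumptions, not the patented method: it assumes face detection and grouping have already produced one set of face-cluster labels per scene, and the function name `build_role_graph`, the labels, and the "top two by appearances" rule for key roles are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def build_role_graph(scenes):
    """Count per-role appearances and pairwise co-occurrences across scenes.

    scenes: iterable of sets of face-cluster labels (assumed input format).
    """
    appearances = Counter()
    cooccurrence = Counter()
    for faces in scenes:
        appearances.update(faces)
        # Sort so each unordered pair maps to a single key.
        for a, b in combinations(sorted(faces), 2):
            cooccurrence[(a, b)] += 1
    return appearances, cooccurrence

# Hypothetical face clusters per scene.
scenes = [{"A", "B"}, {"A", "B", "C"}, {"A"}, {"B", "C"}]
appearances, cooccurrence = build_role_graph(scenes)

# Assumed rule: the two most frequently appearing clusters are the key roles.
key_roles = [role for role, _ in appearances.most_common(2)]
```

Here the pair counts act as edge weights between roles; a real system would need a far more careful notion of correlation than raw scene co-occurrence.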

2.

Invention application
Detecting Key Roles and Their Relationships from Video (in force)

Publication No.: US20120263433A1

Publication date: 2012-10-18

Application No.: US13085288

Filing date: 2011-04-12

Applicants: Tao Mei, Xian-Sheng Hua, Shipeng Li, Yan Wang

Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Yan Wang

IPC classification: H04N9/80

CPC classification: H04N21/44008, G06F17/30793, G06F17/30843, G06K9/00718, G06Q30/0276, H04N21/84

Abstract: Tools and techniques for acquiring key roles and their relationships from a video independent of metadata, such as cast lists and scripts, are described herein. These techniques include discovering key roles and their relationships by treating a video (e.g., a movie, television program, music video, and personal video, etc.) as a community. For instance, a video is segmented into a hierarchical structure that includes levels for scenes, shots, and key frames. In some implementations, the techniques include performing face detection and grouping on the detected key frames. In some implementations, the techniques include exploiting the key roles and their correlations in this video to discover a community. The discovered community provides for a wide variety of applications, including the automatic generation of visual summaries or video posters including acquired key roles.

3.

Granted patent
Video concept detection using multi-layer multi-instance learning (in force)

Publication No.: US08804005B2

Publication date: 2014-08-12

Application No.: US12111202

Filing date: 2008-04-29

Applicants: Tao Mei, Xian-Sheng Hua, Shipeng Li, Zhiwei Gu

Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Zhiwei Gu

IPC classification: G06K9/62, G06K9/34

CPC classification: G11B27/28, G06K9/00711, G06K9/6282

Abstract: Visual concepts contained within a video clip are classified based upon a set of target concepts. The clip is segmented into shots and a multi-layer multi-instance (MLMI) structured metadata representation of each shot is constructed. A set of pre-generated trained models of the target concepts is validated using a set of training shots. An MLMI kernel is recursively generated which models the MLMI structured metadata representation of each shot by comparing prescribed pairs of shots. The MLMI kernel is subsequently utilized to generate a learned objective decision function which learns a classifier for determining if a particular shot (that is not in the set of training shots) contains instances of the target concepts. A regularization framework can also be utilized in conjunction with the MLMI kernel to generate modified learned objective decision functions. The regularization framework introduces explicit constraints which serve to maximize the precision of the classifier.

4.

Granted patent
Enhancing photo browsing through music and advertising (in force)

Publication No.: US08504422B2

Publication date: 2013-08-06

Application No.: US12786020

Filing date: 2010-05-24

Applicants: Tao Mei, Xian-Sheng Hua, Shipeng Li, Jinlian Guo, Fei Sheng

Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Jinlian Guo, Fei Sheng

IPC classification: G06Q30/00

CPC classification: G06F17/30056, G06F17/30047, G06Q30/02, G06Q30/0243

Abstract: Techniques for recommending music and advertising to enhance a user's experience while photo browsing are described. In some instances, songs and ads are ranked for relevance to at least one photo from a photo album. The songs, ads and photo(s) from the photo album are then mapped to a style and mood ontology to obtain vector-based representations. The vector-based representations can include real valued terms, each term associated with a human condition defined by the ontology. A re-ranking process generates a relevancy term for each song and each ad indicating relevancy to the photo album. The relevancy terms can be calculated by summing weighted terms from the ranking and the mapping. Recommended music and ads may then be provided to a user, as the user browses a series of photos obtained from the photo album. The ads may be seamlessly embedded into the music in a nonintrusive manner.
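
The re-ranking step, in which a relevancy term is calculated by summing weighted terms from the ranking and the mapping, might look roughly like the sketch below. It is an illustration under assumptions, not the method claimed in the patent: the weight `alpha`, the use of cosine similarity between mood vectors, and every name and number here are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rerank(candidates, album_vector, alpha=0.5):
    """Re-rank candidates by a weighted sum of two terms.

    candidates: list of (name, rank_score, mood_vector); rank_score is the
    score from the initial ranking, mood_vector the ontology mapping.
    alpha: assumed weight on the initial ranking term.
    """
    scored = []
    for name, rank_score, mood_vector in candidates:
        mapping_term = cosine(mood_vector, album_vector)
        relevancy = alpha * rank_score + (1 - alpha) * mapping_term
        scored.append((name, relevancy))
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical three-term mood vectors for an album and two songs.
album = [0.9, 0.1, 0.0]
songs = [
    ("song_a", 0.8, [0.0, 1.0, 0.0]),
    ("song_b", 0.6, [1.0, 0.0, 0.0]),
]
ranked = rerank(songs, album)
```

In this toy input, `song_b` starts with the lower ranking score but moves ahead of `song_a` because its mood vector lies much closer to the album's.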

5.

Granted patent
Intelligent overlay for video advertising (in force)

Publication No.: US08369686B2

Publication date: 2013-02-05

Application No.: US12571373

Filing date: 2009-09-30

Applicants: Tao Mei, Xian-Sheng Hua, Shipeng Li, Jinlian Guo

Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Jinlian Guo

IPC classification: H04N9/80

CPC classification: H04N21/458, G06Q30/02, G06Q30/0243, G11B27/034, G11B27/28, H04N21/44008, H04N21/440236, H04N21/466, H04N21/812

Abstract: Video advertising overlay technique embodiments are presented that generally detect a set of spatio-temporal nonintrusive positions within a series of consecutive video frames in shots of a digital video and then overlay contextually relevant ads on these positions. In one general embodiment, this is accomplished by decomposing the video into a series of shots, and then identifying a video advertisement for each of a selected set of the shots. The identified video advertisement is one that is determined to be the most relevant to the content of the shot. An overlay area is also identified in each of the shots, where the selected overlay area is the least intrusive among a plurality of prescribed areas to a viewer of the video. The video advertisements identified for the shots are then respectively scheduled to be overlaid in the identified overlay area of a shot, whenever the shot is played.
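
Choosing the least intrusive of several prescribed overlay areas can be sketched as a minimum over a per-area intrusiveness score. The abstract does not say how intrusiveness is measured, so this sketch assumes a hypothetical per-cell saliency grid for the shot and uses the mean saliency inside each candidate rectangle; the area names and values are made up for illustration.

```python
def mean_saliency(saliency, area):
    """Mean of the saliency grid inside area = (top, left, height, width)."""
    top, left, height, width = area
    cells = [
        saliency[row][col]
        for row in range(top, top + height)
        for col in range(left, left + width)
    ]
    return sum(cells) / len(cells)

def least_intrusive_area(saliency, areas):
    """Return the name of the prescribed area with the lowest mean saliency."""
    return min(areas, key=lambda name: mean_saliency(saliency, areas[name]))

# Hypothetical 4x4 saliency grid for one shot (higher = more visually important).
saliency = [
    [0.1, 0.1, 0.2, 0.1],
    [0.2, 0.9, 0.8, 0.2],
    [0.2, 0.8, 0.9, 0.3],
    [0.0, 0.1, 0.1, 0.0],
]
# Hypothetical prescribed areas as (top, left, height, width) in grid cells.
areas = {
    "top_strip": (0, 0, 1, 4),
    "center": (1, 1, 2, 2),
    "bottom_strip": (3, 0, 1, 4),
}
best = least_intrusive_area(saliency, areas)
```

With this grid the bottom strip is selected, since the salient content sits in the center of the frame.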

6.

Invention application
Enhancing Photo Browsing through Music and Advertising (in force)

Publication No.: US20110288929A1

Publication date: 2011-11-24

Application No.: US12786020

Filing date: 2010-05-24

Applicants: Tao Mei, Xian-Sheng Hua, Shipeng Li, Jinlian Guo, Fei Sheng

Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Jinlian Guo, Fei Sheng

IPC classification: G06Q30/00, G06F17/30, G06Q10/00

CPC classification: G06F17/30056, G06F17/30047, G06Q30/02, G06Q30/0243

Abstract: Techniques for recommending music and advertising to enhance a user's experience while photo browsing are described. In some instances, songs and ads are ranked for relevance to at least one photo from a photo album. The songs, ads and photo(s) from the photo album are then mapped to a style and mood ontology to obtain vector-based representations. The vector-based representations can include real valued terms, each term associated with a human condition defined by the ontology. A re-ranking process generates a relevancy term for each song and each ad indicating relevancy to the photo album. The relevancy terms can be calculated by summing weighted terms from the ranking and the mapping. Recommended music and ads may then be provided to a user, as the user browses a series of photos obtained from the photo album. The ads may be seamlessly embedded into the music in a nonintrusive manner.

7.

Invention application
Video Collage Presentation (under examination, published)

Publication No.: US20090003712A1

Publication date: 2009-01-01

Application No.: US12055267

Filing date: 2008-03-25

Applicants: Tao Mei, Xian-Sheng Hua, Shipeng Li

Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li

IPC classification: G06K9/62

CPC classification: G06F16/745, G06F16/739, G06K9/00744, G06K9/3233

Abstract: A method, a computer-readable storage media, and a user interface describe techniques for creating a video collage synthesized from video content, selecting representative images from the video content, extracting and resizing regions of interest (ROI) from the representative images from the video content, and arranging the regions of interest on a canvas without seams while preserving a temporal structure of the video content. The described method, computer-readable storage, and user interface enhance the experience of the user in browsing a video collage that is compact.

8.

Granted patent
Enriching online videos by content detection, searching, and information aggregation (in force)

Publication No.: US09443147B2

Publication date: 2016-09-13

Application No.: US12767114

Filing date: 2010-04-26

Applicants: Tao Mei, Xian-Sheng Hua, Shipeng Li

Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li

IPC classification: G06K9/00, H04N21/462, H04N21/4722, G06F17/30, G06Q30/02

CPC classification: G06K9/00751, G06F17/30017, G06F17/30828, G06K9/00765, G06Q30/02, H04N21/4622, H04N21/4722

Abstract: Many internet users consume content through online videos. For example, users may view movies, television shows, music videos, and/or homemade videos. It may be advantageous to provide additional information to users consuming the online videos. Unfortunately, many current techniques may be unable to provide additional information relevant to the online videos from outside sources. Accordingly, one or more systems and/or techniques for determining a set of additional information relevant to an online video are disclosed herein. In particular, visual, textual, audio, and/or other features may be extracted from an online video (e.g., original content of the online video and/or embedded advertisements). Using the extracted features, additional information (e.g., images, advertisements, etc.) may be determined based upon matching the extracted features with content of a database. The additional information may be presented to a user consuming the online video.

9.

Granted patent
Advertisement insertion points detection for online video advertising (in force)

Publication No.: US08654255B2

Publication date: 2014-02-18

Application No.: US11858628

Filing date: 2007-09-20

Applicants: Xian-Sheng Hua, Tao Mei, Linjun Yang, Shipeng Li

Inventors: Xian-Sheng Hua, Tao Mei, Linjun Yang, Shipeng Li

IPC classification: H04N9/74

CPC classification: G11B27/28, G11B27/036, H04N21/2365, H04N21/4347, H04N21/812, H04N21/8456, H04N21/854

Abstract: Systems and methods for determining insertion points in a first video stream are described. The insertion points are configured for inserting at least one second video into the first video. In accordance with one embodiment, a method for determining the insertion points includes parsing the first video into a plurality of shots. The plurality of shots includes one or more shot boundaries. The method then determines one or more insertion points by balancing a discontinuity metric and an attractiveness metric of each shot boundary.
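
Balancing a discontinuity metric against an attractiveness metric at each shot boundary can be sketched as a scored selection. The abstract does not specify how the two metrics are combined, so the linear blend below, the `weight` parameter, and the sample scores are assumptions made only for illustration.

```python
def pick_insertion_points(boundaries, k=1, weight=0.5):
    """Pick the k shot boundaries with the best blended score.

    boundaries: list of (time_sec, discontinuity, attractiveness), with both
    metrics assumed to be normalized to [0, 1].
    weight: assumed share given to discontinuity in the blend.
    Returns the chosen boundary times in playback order.
    """
    scored = [
        (weight * discontinuity + (1 - weight) * attractiveness, time_sec)
        for time_sec, discontinuity, attractiveness in boundaries
    ]
    top = sorted(scored, reverse=True)[:k]
    return sorted(time_sec for _, time_sec in top)

# Hypothetical shot boundaries with precomputed metrics.
boundaries = [
    (12.0, 0.9, 0.2),
    (47.5, 0.7, 0.8),
    (90.0, 0.3, 0.9),
]
points = pick_insertion_points(boundaries, k=2)
```

Shifting `weight` toward 1.0 favors boundaries where the content changes sharply, while shifting it toward 0.0 favors boundaries next to more attractive content.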

10.

Invention application
NEAR-LOSSLESS VIDEO SUMMARIZATION (in force)

Publication No.: US20110267544A1

Publication date: 2011-11-03

Application No.: US12768769

Filing date: 2010-04-28

Applicants: Tao Mei, Xian-Sheng Hua, Shipeng Li, Lin-Xie Tang

Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Lin-Xie Tang

IPC classification: H04N5/14

CPC classification: H04N5/144, G06K9/00751, G11B27/034, G11B27/105, G11B27/28, H04N19/159, H04N19/25, H04N19/61, H04N21/816, H04N21/84, H04N21/8453, H04N21/8455, H04N21/8456, H04N21/8549

Abstract: Described is perceptually near-lossless video summarization for use in maintaining video summaries, which operates to substantially reconstruct an original video in a generally perceptually near-lossless manner. A video stream is summarized with little information loss by using a relatively very small piece of summary metadata. The summary metadata comprises an image set of synthesized mosaics and representative keyframes, audio data, and the metadata about video structure and motion. In one implementation, the metadata is computed and maintained (e.g., as a file) to summarize a relatively large video sequence, by segmenting a video shot into subshots, and selecting keyframes and mosaics based upon motion data corresponding to those subshots. The motion data is maintained as a semantic description associated with the image set. To reconstruct the video, the metadata is processed, including simulating motion using the image set and the semantic description, which recovers the audiovisual content without any significant information loss.
