Video concept detection using multi-layer multi-instance learning

Granted invention patent

US08804005B2 Video concept detection using multi-layer multi-instance learning (status: in force)

Patent title: Video concept detection using multi-layer multi-instance learning
Application number: US12111202
Filing date: 2008-04-29
Publication (announcement) number: US08804005B2
Publication (announcement) date: 2014-08-12
Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Zhiwei Gu
Applicants: Tao Mei, Xian-Sheng Hua, Shipeng Li, Zhiwei Gu
Applicant address: Redmond, WA, US
Patent owner: Microsoft Corporation
Current patent owner: Microsoft Corporation
Current patent owner address: Redmond, WA, US
Agents: Carole Boelitz; Micky Minhas
Main classification: G06K9/62
IPC classifications: G06K9/62; G06K9/34

Abstract:

Visual concepts contained within a video clip are classified based upon a set of target concepts. The clip is segmented into shots and a multi-layer multi-instance (MLMI) structured metadata representation of each shot is constructed. A set of pre-generated trained models of the target concepts is validated using a set of training shots. An MLMI kernel is recursively generated which models the MLMI structured metadata representation of each shot by comparing prescribed pairs of shots. The MLMI kernel is subsequently utilized to generate a learned objective decision function which learns a classifier for determining if a particular shot (that is not in the set of training shots) contains instances of the target concepts. A regularization framework can also be utilized in conjunction with the MLMI kernel to generate modified learned objective decision functions. The regularization framework introduces explicit constraints which serve to maximize the precision of the classifier.
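
The abstract does not give the kernel's formula, so the following is only a minimal sketch of what "recursively generated" could look like, assuming each shot is a tree (for example shot, key frames, regions) of equal depth with one feature vector per node. The names `Node`, `rbf`, `mlmi_kernel` and the parameter `gamma` are hypothetical, and the recursion shown is a generic normalized set kernel applied layer by layer, not the patented formulation.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of an assumed shot -> key-frame -> region hierarchy."""
    features: list[float]
    children: list["Node"] = field(default_factory=list)


def rbf(x, y, gamma=0.5):
    """Gaussian similarity between two equal-length feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mlmi_kernel(a, b, gamma=0.5):
    """Compare two hierarchies recursively (assumed form, not from the patent).

    The similarity of two nodes is the similarity of their own features
    multiplied by the mean similarity over all pairs of their children.
    """
    k = rbf(a.features, b.features, gamma)
    if a.children and b.children:
        pair_sum = sum(
            mlmi_kernel(ca, cb, gamma)
            for ca in a.children
            for cb in b.children
        )
        k *= pair_sum / (len(a.children) * len(b.children))
    return k


# Two toy "shots" with made-up two-dimensional features.
shot_a = Node([0.0, 0.0], [Node([1.0, 0.0]), Node([0.0, 1.0])])
shot_b = Node([0.0, 0.0], [Node([1.0, 0.0])])
similarity = mlmi_kernel(shot_a, shot_b)
```

Under these assumptions, evaluating `mlmi_kernel` on every pair of training shots would yield a kernel matrix that a kernel classifier could consume to learn the decision function mentioned in the abstract. The regularization framework is not sketched here because the abstract does not describe its constraints.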

Publication/grant documents

US20090274434A1 VIDEO CONCEPT DETECTION USING MULTI-LAYER MULTI-INSTANCE LEARNING Publication/grant date: 2009-11-05

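IPC classification:

G	Physics
G06	Computing; calculating or counting
G06K	Graphical data reading (image or video recognition or understanding G06V); presentation of data; record carriers; handling record carriers
G06K9/00	Methods or arrangements for recognising patterns (methods or arrangements for graph-reading or for converting the pattern of mechanical parameters, e.g. force or presence, into electrical signals G06K11/00) (image or video recognition or understanding G06V) (speech recognition G10L15/00)
G06K9/62	. Methods or arrangements for recognition using electronic means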