METHOD AND APPARATUS FOR DETECTING TALKING SEGMENTS IN A VIDEO SEQUENCE USING VISUAL CUES

Invention Application

US20130271361A1 METHOD AND APPARATUS FOR DETECTING TALKING SEGMENTS IN A VIDEO SEQUENCE USING VISUAL CUES (status: in force)

Patent Title: METHOD AND APPARATUS FOR DETECTING TALKING SEGMENTS IN A VIDEO SEQUENCE USING VISUAL CUES
Application No.: US13800486

Application Date: 2013-03-13
Publication No.: US20130271361A1

Publication Date: 2013-10-17
Inventor: Sudha Velusamy, Viswanath Gopalakrishnan, Bilva Bhalachandra Navathe, Anshul Sharma
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Applicant Address: KR Suwon-si
Assignee: Samsung Electronics Co., Ltd.
Current Assignee: Samsung Electronics Co., Ltd.
Current Assignee Address: KR Suwon-si
Priority: IN1519/CHE/2012 20120417; KR10-2012-0086189 20120807
Main IPC: G06F3/01
IPC: G06F3/01

Abstract:

A method and system for detecting temporal segments of talking faces in a video sequence using visual cues. The system detects talking segments by classifying the segments of a sequence of image frames as talking or non-talking. The face and eyes are localized first and, from them, the mouth region. The localized mouth regions across the video frames are then encoded as an integrated gradient histogram (IGH) of visual features and quantified by the entropy evaluated from the IGH. The time series of per-frame entropy values is further clustered using an online temporal segmentation (K-Means clustering) algorithm to distinguish talking mouth patterns from other mouth movements. The segmented time series data is then used to enhance an emotion recognition system.
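
The entropy and clustering steps named in the abstract can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the patent: the mouth region is already cropped to a grayscale array, the histogram is a plain magnitude-weighted gradient-orientation histogram (the abstract does not define how the IGH is built), and an ordinary batch two-cluster K-Means stands in for the online temporal segmentation. The function names `gradient_histogram_entropy` and `two_means_1d` and the choice of 9 bins are hypothetical.

```python
import numpy as np


def gradient_histogram_entropy(mouth, bins=9):
    """Shannon entropy (in bits) of a gradient-orientation histogram.

    Assumption: a simple magnitude-weighted orientation histogram is used
    here as a stand-in for the IGH named in the abstract.
    """
    gy, gx = np.gradient(np.asarray(mouth, dtype=float))
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation folded into [0, pi).
    orientation = np.arctan2(gy, gx) % np.pi
    hist, _ = np.histogram(
        orientation, bins=bins, range=(0.0, np.pi), weights=magnitude
    )
    total = hist.sum()
    if total == 0:
        return 0.0  # Flat patch: no gradients, so no uncertainty.
    p = hist / total
    p = p[p > 0]  # Skip empty bins; 0 * log(0) is treated as 0.
    return float(-(p * np.log2(p)).sum())


def two_means_1d(values, iters=50):
    """Batch K-Means with k=2 on a 1-D series (assumed stand-in, not online).

    Returns (labels, centers); label 1 marks the higher-valued cluster.
    """
    x = np.asarray(values, dtype=float)
    centers = np.array([x.min(), x.max()])
    for _ in range(iters):
        # Assign each value to the nearer of the two centers.
        labels = (np.abs(x - centers[1]) < np.abs(x - centers[0])).astype(int)
        updated = centers.copy()
        for k in (0, 1):
            if np.any(labels == k):
                updated[k] = x[labels == k].mean()
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels, centers


# Usage on synthetic data: one entropy value per frame, then two clusters.
rng = np.random.default_rng(0)
frames = [rng.random((16, 32)) for _ in range(10)]
entropies = [gradient_histogram_entropy(f) for f in frames]
labels, centers = two_means_1d(entropies)
```

Which of the two clusters corresponds to talking is not decided by this sketch; the abstract only states that clustering the entropy series separates talking mouth patterns from other mouth movements.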

Granted publication:

US09110501B2 Method and apparatus for detecting talking segments in a video sequence using visual cues. Grant date: 2015-08-18

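IPC classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: see G06N)
G06F3/00	Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
G06F3/01	. Input arrangements or combined input and output arrangements for interaction between user and computer (G06F3/16 takes precedence)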