Method and apparatus for using face detection information to improve speaker segmentation

Invention Grant

US09165182B2 Method and apparatus for using face detection information to improve speaker segmentation (in force)

Patent Title: Method and apparatus for using face detection information to improve speaker segmentation
Application No.: US13969914

Application Date: 2013-08-19
Publication No.: US09165182B2

Publication Date: 2015-10-20
Inventor: Sachin S. Kajarekar, Mainak Sen
Applicant: Cisco Technology, Inc.
Applicant Address: US CA San Jose
Assignee: Cisco Technology, Inc.
Current Assignee: Cisco Technology, Inc.
Current Assignee Address: US CA San Jose
Agent: P. Su
Main IPC: G06K9/00
IPC: G06K9/00; H04S7/00; H04N7/14

Abstract:

In one embodiment, a method includes obtaining media that includes a video stream and an audio stream. The method also includes detecting a number of faces visible in the video stream, and performing a speaker segmentation on the media. Performing the speaker segmentation on the media includes utilizing the number of faces visible in the video stream to augment the speaker segmentation.
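
The abstract does not state how the face count enters the segmentation. One plausible reading, offered here only as an assumption, is that the number of detected faces serves as an upper bound on the number of speaker clusters. The sketch below illustrates that idea with a simple agglomerative merge over toy one-dimensional segment features. The function name `segment_speakers`, the scalar features, and the merge rule are hypothetical and are not taken from the patent.

```python
from typing import List


def segment_speakers(features: List[float], num_faces: int) -> List[int]:
    """Assign a speaker label to each audio segment.

    Hypothetical sketch: merge the two closest clusters (by mean feature
    value) until the cluster count is no larger than num_faces.
    """
    if not features:
        return []
    # Assumption: even with no visible face, at least one speaker exists.
    max_speakers = max(1, num_faces)
    # Start with one cluster per segment; a cluster is a list of indices.
    clusters = [[i] for i in range(len(features))]

    def mean(cluster: List[int]) -> float:
        return sum(features[i] for i in cluster) / len(cluster)

    while len(clusters) > max_speakers:
        # Find the pair of clusters whose mean features are closest.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = abs(mean(clusters[a]) - mean(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a stays valid

    # Number the clusters by the time of their first segment.
    clusters.sort(key=min)
    labels = [0] * len(features)
    for label, cluster in enumerate(clusters):
        for i in cluster:
            labels[i] = label
    return labels
```

For example, `segment_speakers([0.1, 0.2, 5.0, 5.1, 0.15], 2)` returns `[0, 0, 1, 1, 0]`: with two faces detected, the five segments collapse into two speakers. A real system would use acoustic embeddings rather than scalars, and might treat the face count as a prior rather than a hard cap.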

Published/granted documents

US20150049247A1 Method and apparatus for using face detection information to improve speaker segmentation. Publication date: 2015-02-19

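IPC classification:

G	Physics
G06	Computing; calculating or counting
G06K	Graphical data reading (image or video recognition or understanding G06V); presentation of data; record carriers; handling record carriers
G06K9/00	Methods or arrangements for recognising patterns (methods or arrangements for graph-reading or for converting the pattern of mechanical parameters, e.g. force or presence, into electrical signals G06K11/00) (image or video recognition or understanding G06V) (speech recognition G10L15/00)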