Patent search: ap:("Ross Cutler" OR "Xinding Sun" OR "Senthil Velayutham") AND inv:"Xinding Sun" (page 1)

1.

Granted patent
Identifying far-end sound [In force]

Publication No.: US08219387B2

Publication date: 2012-07-10

Application No.: US11953764

Filing date: 2007-12-10

Applicants: Ross Cutler, Xinding Sun, Senthil Velayutham

Inventors: Ross Cutler, Xinding Sun, Senthil Velayutham

IPC classification: G06F15/00, G10L11/00, G10L19/12, G10L21/02, G10L17/00

CPC classification: G06K9/6293, G10L2021/02082, G10L2021/02166

Abstract: Frames containing audio data may be received, the audio data having been derived from a microphone array, at least some of the frames containing residual acoustic echo after having acoustic echo partially removed therefrom. Probability distribution functions are determined from the frames of audio data. A probability distribution function comprises likelihoods that respective directions are directions of sources of sounds. An active speaker may be identified in frames of video data based on the video data and based on audio information derived from the audio data, where use of the audio information as a basis for identifying the active speaker is controlled by determining whether the probability distribution functions indicate that corresponding audio data includes residual acoustic echo.

2.

Patent application
Identifying far-end sound [In force]

Publication No.: US20090150149A1

Publication date: 2009-06-11

Application No.: US11953764

Filing date: 2007-12-10

Applicants: Ross Cutler, Xinding Sun, Senthil Velayutham

Inventors: Ross Cutler, Xinding Sun, Senthil Velayutham

IPC classification: G10L17/00

CPC classification: G06K9/6293, G10L2021/02082, G10L2021/02166

Abstract: Frames containing audio data may be received, the audio data having been derived from a microphone array, at least some of the frames containing residual acoustic echo after having acoustic echo partially removed therefrom. Probability distribution functions are determined from the frames of audio data. A probability distribution function comprises likelihoods that respective directions are directions of sources of sounds. An active speaker may be identified in frames of video data based on the video data and based on audio information derived from the audio data, where use of the audio information as a basis for identifying the active speaker is controlled by determining whether the probability distribution functions indicate that corresponding audio data includes residual acoustic echo.

3.

Patent application
Identification of people using multiple types of input [In force]

Publication No.: US20110313766A1

Publication date: 2011-12-22

Application No.: US13221640

Filing date: 2011-08-30

Applicants: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui

Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui

IPC classification: G10L17/00

CPC classification: G06K9/6256, G06K9/4614, G10L25/78, G10L2021/02166, H04N7/147, H04N7/15, H04N21/42203, H04N21/4223, H04N21/4394, H04N21/44008, H04N21/44213, H04N21/4788

Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.

4.

Granted patent
Object activity modeling method [Lapsed]

Publication No.: US07362806B2

Publication date: 2008-04-22

Application No.: US09916210

Filing date: 2001-07-27

Applicants: Yang-lim Choi, Yun-ju Yu, Bangalore S. Manjunath, Xinding Sun, Ching-wei Chen

Inventors: Yang-lim Choi, Yun-ju Yu, Bangalore S. Manjunath, Xinding Sun, Ching-wei Chen

IPC classification: H04B1/66

CPC classification: G06K9/00335, G06T7/269, G06T7/277, G06T2207/10016

Abstract: An object activity modeling method which can efficiently model complex objects such as a human body is provided. The object activity modeling method includes the steps of (a) obtaining an optical flow vector from a video sequence; (b) obtaining the probability distribution of the feature vector for a plurality of video frames, using the optical flow vector; (c) modeling states, using the probability distribution of the feature vector; and (d) expressing the activity of the object in the video sequence based on state transition. According to the modeling method, in video indexing and recognition field, complex activities such as human activities can be efficiently modeled and recognized without segmenting objects.

5.

Granted patent
Object activity modeling method [Lapsed]

Publication No.: US07308030B2

Publication date: 2007-12-11

Application No.: US11103588

Filing date: 2005-04-12

Applicants: Yang-lim Choi, Yun-ju Yu, Bangalore S. Manjunath, Xinding Sun, Ching-wei Chen

Inventors: Yang-lim Choi, Yun-ju Yu, Bangalore S. Manjunath, Xinding Sun, Ching-wei Chen

IPC classification: H04B1/66

CPC classification: G06K9/00335, G06T7/269, G06T7/277, G06T2207/10016

Abstract: An object activity modeling method which can efficiently model complex objects such as a human body is provided. The object activity modeling method includes the steps of (a) obtaining an optical flow vector from a video sequence; (b) obtaining the probability distribution of the feature vector for a plurality of video frames, using the optical flow vector; (c) modeling states, using the probability distribution of the feature vector; and (d) expressing the activity of the object in the video sequence based on state transition. According to the modeling method, in video indexing and recognition field, complex activities such as human activities can be efficiently modeled and recognized without segmenting objects.

6.

Patent application
Dynamic Switching of Microphone Inputs for Identification of a Direction of a Source of Speech Sounds [In force]

Publication No.: US20100092007A1

Publication date: 2010-04-15

Application No.: US12251525

Filing date: 2008-10-15

Applicant: Xinding Sun

Inventor: Xinding Sun

IPC classification: H04R3/00

CPC classification: H04R3/005, G10L25/00, G10L2021/02166, H04N7/147, H04N7/15

Abstract: This disclosure describes techniques of automatically identifying a direction of a speech source relative to an array of directional microphones using audio streams from some or all of the directional microphones. Whether the direction of the speech source is identified using audio streams from some of the directional microphones or from all of the directional microphones depends on whether using audio streams from a subgroup of the directional microphones or using audio streams from all of the directional microphones is more likely to correctly identify the direction of the speech source. Switching between using audio streams from some of the directional microphones and using audio streams from all of the directional microphones may occur automatically to best identify the direction of the speech source. A display screen at a remote venue may then display images having angles of view that are centered generally in the direction of the speech source.

7.

Granted patent
Digital video processing method and apparatus thereof [In force]

Publication No.: US07656951B2

Publication date: 2010-02-02

Application No.: US10633617

Filing date: 2003-08-05

Applicants: Hyun-doo Shin, Yang-lim Choi, B. S. Manjunath, Xinding Sun

Inventors: Hyun-doo Shin, Yang-lim Choi, B. S. Manjunath, Xinding Sun

IPC classification: H04N7/12, H04B1/66

CPC classification: G06F17/30784, H04N5/147, H04N19/48

Abstract: A digital video processing method and an apparatus thereof are provided. The method for processing digital images received in the form of compressed video streams comprising the step of determining a region intensity histogram (RIH) based on information on motion compensation of inter frames. The RIH information is obtained based on the motion compensation values of inter frames, and the RIH information is a good indicator of motion information of a video scene. Also, since the RIH information is quite a good indicator of intensity of the video scene, video streams having similar intensities can be effectively searched by searching for similar video scenes based on the RIH information obtained by the digital video processing method.

8.

Granted patent
Activity descriptor for video sequences [Lapsed]

Publication No.: US07003038B2

Publication date: 2006-02-21

Application No.: US10217918

Filing date: 2002-08-13

Applicants: Ajay Divakaran, Huifang Sun, Hae-Kwang Kim, Chul-Soo Park, Xinding Sun, Bangalore S. Manjunath, Vinod V. Vasudevan, Manoranjan D. Jesudoss, Ganesh Rattinassababady, Hyundoo Shin

Inventors: Ajay Divakaran, Huifang Sun, Hae-Kwang Kim, Chul-Soo Park, Xinding Sun, Bangalore S. Manjunath, Vinod V. Vasudevan, Manoranjan D. Jesudoss, Ganesh Rattinassababady, Hyundoo Shin

IPC classification: H04B1/66

CPC classification: G06K9/00711, G06F17/30811, G06F17/30843, G06F17/30852, G06T7/20, G11B27/28, H04N19/196, H04N19/463, H04N19/61

Abstract: A method describes activity in a video sequence. The method measures intensity, direction, spatial, and temporal attributes in the video sequence, and the measured attributes are combined in a digital descriptor of the activity of the video sequence.

9.

Patent application
Object activity modeling method [Lapsed]

Publication No.: US20050220191A1

Publication date: 2005-10-06

Application No.: US11103588

Filing date: 2005-04-12

Applicants: Yang-lim Choi, Yun-ju Yu, Bangalore Manjunath, Xinding Sun, Ching-wei Chen

Inventors: Yang-lim Choi, Yun-ju Yu, Bangalore Manjunath, Xinding Sun, Ching-wei Chen

IPC classification: G06T17/00, G06K9/00, G06T5/00, G06T7/20, H04N7/12

CPC classification: G06K9/00335, G06T7/269, G06T7/277, G06T2207/10016

Abstract: An object activity modeling method which can efficiently model complex objects such as a human body is provided. The object activity modeling method includes the steps of (a) obtaining an optical flow vector from a video sequence; (b) obtaining the probability distribution of the feature vector for a plurality of video frames, using the optical flow vector; (c) modeling states, using the probability distribution of the feature vector; and (d) expressing the activity of the object in the video sequence based on state transition. According to the modeling method, in video indexing and recognition field, complex activities such as human activities can be efficiently modeled and recognized without segmenting objects.

10.

Granted patent
Identification of people using multiple types of input [In force]

Publication No.: US08510110B2

Publication date: 2013-08-13

Application No.: US13546153

Filing date: 2012-07-11

Applicants: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui

Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui

IPC classification: G10L15/00

CPC classification: G06K9/6256, G06K9/4614, G10L25/78, G10L2021/02166, H04N7/147, H04N7/15, H04N21/42203, H04N21/4223, H04N21/4394, H04N21/44008, H04N21/44213, H04N21/4788

Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
