Identifying far-end sound
    1.
    Granted patent
    Identifying far-end sound (status: in force)

    Publication No.: US08219387B2

    Publication date: 2012-07-10

    Application No.: US11953764

    Filing date: 2007-12-10

    Abstract: Frames containing audio data may be received, the audio data having been derived from a microphone array, at least some of the frames containing residual acoustic echo after having acoustic echo partially removed therefrom. Probability distribution functions are determined from the frames of audio data. A probability distribution function comprises likelihoods that respective directions are directions of sources of sounds. An active speaker may be identified in frames of video data based on the video data and based on audio information derived from the audio data, where use of the audio information as a basis for identifying the active speaker is controlled by determining whether the probability distribution functions indicate that corresponding audio data includes residual acoustic echo.
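The gating idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how a probability distribution function is judged to indicate residual echo, so the test used here (a sharp peak pointing at a known loudspeaker direction) and the names `speaker_bins` and `min_peak` are assumptions made for illustration only.

```python
def peak_direction(pdf):
    """Return the index of the most likely direction bin."""
    return max(range(len(pdf)), key=lambda i: pdf[i])


def looks_like_residual_echo(pdf, speaker_bins, min_peak=0.5):
    """Hypothetical test: the peak is sharp and points at a loudspeaker bin.

    pdf          -- likelihoods per direction bin (assumed to sum to 1)
    speaker_bins -- set of bins where the loudspeaker sits (assumption)
    min_peak     -- how concentrated the peak must be (assumption)
    """
    peak = peak_direction(pdf)
    return pdf[peak] >= min_peak and peak in speaker_bins


def audio_weight(pdf, speaker_bins):
    """Weight of the audio cue when fusing with video; 0.0 disables it."""
    return 0.0 if looks_like_residual_echo(pdf, speaker_bins) else 1.0
```

With four direction bins and the loudspeaker in bin 1, `audio_weight([0.1, 0.7, 0.1, 0.1], {1})` disables the audio cue, while a peak in any other bin leaves it enabled.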

    Identifying far-end sound
    2.
    Patent application
    Identifying far-end sound (status: in force)

    Publication No.: US20090150149A1

    Publication date: 2009-06-11

    Application No.: US11953764

    Filing date: 2007-12-10

    IPC classification: G10L17/00

    Abstract: Frames containing audio data may be received, the audio data having been derived from a microphone array, at least some of the frames containing residual acoustic echo after having acoustic echo partially removed therefrom. Probability distribution functions are determined from the frames of audio data. A probability distribution function comprises likelihoods that respective directions are directions of sources of sounds. An active speaker may be identified in frames of video data based on the video data and based on audio information derived from the audio data, where use of the audio information as a basis for identifying the active speaker is controlled by determining whether the probability distribution functions indicate that corresponding audio data includes residual acoustic echo.

    Object activity modeling method
    4.
    Granted patent
    Object activity modeling method (status: lapsed)

    Publication No.: US07362806B2

    Publication date: 2008-04-22

    Application No.: US09916210

    Filing date: 2001-07-27

    IPC classification: H04B1/66

    Abstract: An object activity modeling method which can efficiently model complex objects such as a human body is provided. The object activity modeling method includes the steps of (a) obtaining an optical flow vector from a video sequence; (b) obtaining the probability distribution of the feature vector for a plurality of video frames, using the optical flow vector; (c) modeling states, using the probability distribution of the feature vector; and (d) expressing the activity of the object in the video sequence based on state transition. According to the modeling method, in the field of video indexing and recognition, complex activities such as human activities can be efficiently modeled and recognized without segmenting objects.
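Steps (c) and (d) of this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented procedure: the per-frame feature vectors are assumed to have been derived from optical flow already, each state is reduced to a mean vector (`state_means`, a hypothetical input), and a frame is assigned to the nearest mean, which corresponds to a maximum-likelihood choice only if all states share the same isotropic Gaussian spread.

```python
def nearest_state(feature, state_means):
    """Index of the state whose mean is closest (squared Euclidean distance)."""
    def dist(mean):
        return sum((f - m) ** 2 for f, m in zip(feature, mean))
    return min(range(len(state_means)), key=lambda k: dist(state_means[k]))


def transition_matrix(features, state_means):
    """Label each frame with a state, then count state-to-state transitions.

    Returns the per-frame state sequence and a row-normalised matrix in
    which entry [a][b] estimates the probability of moving from a to b.
    """
    n = len(state_means)
    states = [nearest_state(f, state_means) for f in features]
    counts = [[0] * n for _ in range(n)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return states, matrix
```

The resulting state sequence and transition matrix are one simple way to express an activity as state transitions; a full treatment would more likely fit a hidden Markov model to the feature distributions.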

    Object activity modeling method
    5.
    Granted patent
    Object activity modeling method (status: lapsed)

    Publication No.: US07308030B2

    Publication date: 2007-12-11

    Application No.: US11103588

    Filing date: 2005-04-12

    IPC classification: H04B1/66

    Abstract: An object activity modeling method which can efficiently model complex objects such as a human body is provided. The object activity modeling method includes the steps of (a) obtaining an optical flow vector from a video sequence; (b) obtaining the probability distribution of the feature vector for a plurality of video frames, using the optical flow vector; (c) modeling states, using the probability distribution of the feature vector; and (d) expressing the activity of the object in the video sequence based on state transition. According to the modeling method, in the field of video indexing and recognition, complex activities such as human activities can be efficiently modeled and recognized without segmenting objects.

    Dynamic Switching of Microphone Inputs for Identification of a Direction of a Source of Speech Sounds
    6.
    Patent application
    Dynamic Switching of Microphone Inputs for Identification of a Direction of a Source of Speech Sounds (status: in force)

    Publication No.: US20100092007A1

    Publication date: 2010-04-15

    Application No.: US12251525

    Filing date: 2008-10-15

    Applicant: Xinding Sun

    Inventor: Xinding Sun

    IPC classification: H04R3/00

    Abstract: This disclosure describes techniques of automatically identifying a direction of a speech source relative to an array of directional microphones using audio streams from some or all of the directional microphones. Whether the direction of the speech source is identified using audio streams from some of the directional microphones or from all of the directional microphones depends on whether using audio streams from a subgroup of the directional microphones or using audio streams from all of the directional microphones is more likely to correctly identify the direction of the speech source. Switching between using audio streams from some of the directional microphones and using audio streams from all of the directional microphones may occur automatically to best identify the direction of the speech source. A display screen at a remote venue may then display images having angles of view that are centered generally in the direction of the speech source.
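The switching decision described in this abstract can be illustrated with a short Python sketch. The abstract does not specify how "more likely to correctly identify" is measured, so everything concrete here is an assumption: each microphone is keyed by its pointing direction in degrees, the estimator simply picks the loudest microphone, and its confidence is the margin of the loudest over the runner-up. Each candidate set must contain at least two microphones.

```python
def estimate(energies):
    """Hypothetical estimator: loudest mic wins.

    energies -- dict mapping mic direction (degrees) to signal energy
    Returns (direction, confidence), where confidence is the relative
    margin of the loudest mic over the second loudest.
    """
    ranked = sorted(energies.items(), key=lambda kv: kv[1], reverse=True)
    (best, e1), (_, e2) = ranked[0], ranked[1]
    confidence = (e1 - e2) / e1 if e1 > 0 else 0.0
    return best, confidence


def identify_direction(energies, subgroup):
    """Use the subgroup or all mics, whichever estimate is more confident."""
    sub = {mic: energies[mic] for mic in subgroup}
    candidates = {"subgroup": estimate(sub), "all": estimate(energies)}
    mode = max(candidates, key=lambda m: candidates[m][1])
    direction, _ = candidates[mode]
    return mode, direction
```

Calling `identify_direction` once per audio frame makes the switch between the two modes automatic, since the choice is re-evaluated whenever the energies change.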

    Digital video processing method and apparatus thereof
    7.
    Granted patent
    Digital video processing method and apparatus thereof (status: in force)

    Publication No.: US07656951B2

    Publication date: 2010-02-02

    Application No.: US10633617

    Filing date: 2003-08-05

    IPC classification: H04N7/12, H04B1/66

    Abstract: A digital video processing method and an apparatus thereof are provided. The method for processing digital images received in the form of compressed video streams comprises the step of determining a region intensity histogram (RIH) based on information on motion compensation of inter frames. The RIH information is obtained based on the motion compensation values of inter frames, and the RIH information is a good indicator of motion information of a video scene. Also, since the RIH information is quite a good indicator of intensity of the video scene, video streams having similar intensities can be effectively searched by searching for similar video scenes based on the RIH information obtained by the digital video processing method.
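A region intensity histogram and its use for similarity search can be sketched in Python as below. The abstract does not define the regions, the bins, or the comparison measure, so this sketch assumes one motion-compensation value per region (for example a motion-vector magnitude), caller-supplied `bin_edges`, and an L1 distance between histograms; all three are illustrative choices rather than the patent's definitions.

```python
def region_intensity_histogram(motion_values, bin_edges):
    """Normalised histogram of per-region motion-compensation values.

    motion_values -- one non-negative value per region (assumption)
    bin_edges     -- ascending thresholds; n edges give n + 1 bins
    """
    counts = [0] * (len(bin_edges) + 1)
    for v in motion_values:
        # The bin index is the number of edges the value has reached.
        idx = sum(1 for edge in bin_edges if v >= edge)
        counts[idx] += 1
    total = len(motion_values)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def histogram_distance(h1, h2):
    """L1 distance between two histograms; 0.0 means identical shapes."""
    return sum(abs(a - b) for a, b in zip(h1, h2))
```

Scenes whose histograms lie close together under `histogram_distance` would be treated as having similar motion intensity, which is the kind of search the abstract describes.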

    Object activity modeling method
    9.
    Patent application
    Object activity modeling method (status: lapsed)

    Publication No.: US20050220191A1

    Publication date: 2005-10-06

    Application No.: US11103588

    Filing date: 2005-04-12

    Abstract: An object activity modeling method which can efficiently model complex objects such as a human body is provided. The object activity modeling method includes the steps of (a) obtaining an optical flow vector from a video sequence; (b) obtaining the probability distribution of the feature vector for a plurality of video frames, using the optical flow vector; (c) modeling states, using the probability distribution of the feature vector; and (d) expressing the activity of the object in the video sequence based on state transition. According to the modeling method, in the field of video indexing and recognition, complex activities such as human activities can be efficiently modeled and recognized without segmenting objects.
