MULTIMODAL OBJECT LOCALIZATION
    1.
    Patent application
    MULTIMODAL OBJECT LOCALIZATION (in force)

    Publication No.: US20100315905A1

    Publication date: 2010-12-16

    Application No.: US12482773

    Filing date: 2009-06-11

    Applicants: Bowon Lee, Kar-Han Tan

    Inventors: Bowon Lee, Kar-Han Tan

    IPC classification: G01S3/80

    CPC classification: G01S5/28

    Abstract: Various embodiments of the present invention are directed to systems and methods for multimodal object localization using one or more depth sensors and two or more microphones. In one aspect, a method comprises capturing three-dimensional images of a region of space in which the object is located. The images comprise three-dimensional depth sensor observations. The method collects ambient audio generated by the object, providing acoustic observations of the time difference of arrival of the ambient audio at the audio sensors. The method determines a coordinate location of the object corresponding to the maximum of a joint probability distribution characterizing the probability of the acoustic observations emanating from each coordinate location in the region of space and the probability of each coordinate location in the region of space given the depth sensor observations.
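
The maximization described in this abstract can be illustrated with a small sketch. It is not the patented method: the Gaussian error models, the noise scales `sigma_tdoa` and `sigma_depth`, the speed of sound, and all function names are assumptions made for illustration, and the depth term is approximated by proximity to hypothetical depth-sensor detections.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def tdoa(point, mic_a, mic_b):
    """Expected time difference of arrival (seconds) at two microphones."""
    return (math.dist(point, mic_a) - math.dist(point, mic_b)) / SPEED_OF_SOUND


def gaussian(error, sigma):
    """Unnormalized Gaussian weight for an error with scale sigma."""
    return math.exp(-0.5 * (error / sigma) ** 2)


def localize(candidates, mics, observed_tdoa, depth_points,
             sigma_tdoa=5e-5, sigma_depth=0.2):
    """Return the candidate location that maximizes the joint score.

    The score is the product of an acoustic term (how well the candidate
    explains the observed TDOA) and a depth term (how close the candidate
    is to any point reported by the depth sensor).
    """
    best, best_score = None, -1.0
    for c in candidates:
        p_audio = gaussian(tdoa(c, *mics) - observed_tdoa, sigma_tdoa)
        p_depth = max(gaussian(math.dist(c, d), sigma_depth) for d in depth_points)
        score = p_audio * p_depth
        if score > best_score:
            best, best_score = c, score
    return best


# Hypothetical setup: two microphones 20 cm apart, two people seen by the
# depth sensor, and a TDOA consistent with the person on the right.
mics = ((-0.1, 0.0, 0.0), (0.1, 0.0, 0.0))
grid = [(-1 + 0.25 * i, 0.0, 1 + 0.5 * j) for i in range(9) for j in range(5)]
people = [(-0.5, 0.0, 2.0), (0.5, 0.0, 2.0)]
observed = tdoa((0.5, 0.0, 2.0), *mics)
print(localize(grid, mics, observed, people))
```

In this toy setup a single microphone pair cannot resolve the source on its own, because many grid points share a similar TDOA; the depth term is what selects one person among the acoustically plausible locations.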

    Multimodal object localization
    2.
    Granted patent
    Multimodal object localization (in force)

    Publication No.: US08174932B2

    Publication date: 2012-05-08

    Application No.: US12482773

    Filing date: 2009-06-11

    Applicants: Bowon Lee, Kar-Han Tan

    Inventors: Bowon Lee, Kar-Han Tan

    IPC classification: G01S3/80

    CPC classification: G01S5/28

    Abstract: Various embodiments of the present invention are directed to systems and methods for multimodal object localization using one or more depth sensors and two or more microphones. In one aspect, a method comprises capturing three-dimensional images of a region of space in which the object is located. The images comprise three-dimensional depth sensor observations. The method collects ambient audio generated by the object, providing acoustic observations of the time difference of arrival of the ambient audio at the audio sensors. The method determines a coordinate location of the object corresponding to the maximum of a joint probability distribution characterizing the probability of the acoustic observations emanating from each coordinate location in the region of space and the probability of each coordinate location in the region of space given the depth sensor observations.

    SYSTEMS AND METHODS FOR PERFORMING SOUND SOURCE LOCALIZATION
    3.
    Patent application
    SYSTEMS AND METHODS FOR PERFORMING SOUND SOURCE LOCALIZATION (in force)

    Publication No.: US20120093336A1

    Publication date: 2012-04-19

    Application No.: US12904921

    Filing date: 2010-10-14

    IPC classification: H04R3/00

    Abstract: Systems and methods for performing sound source localization are provided. In one aspect, a method for locating a sound source using a computing device subdivides a space into subregions. The method then computes a sound source power for each of the subregions and determines which of the sound source energies is the largest. When the volume of the subregion is less than a threshold volume, the method outputs the subregion having the largest sound source power. Otherwise, the stages of partitioning, computing, and determining the subregion having the largest sound source power are repeated.
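
The subdivide-and-select loop in this abstract can be sketched as a coarse-to-fine search. The sketch below is an illustration under assumptions, not the patented procedure: the octant split, the `power` callback, and the synthetic power function used in the demo are all hypothetical. A real system would compute the power from microphone signals, for example as a steered response power.

```python
from itertools import product


def split(box):
    """Split a box ((x0, x1), (y0, y1), (z0, z1)) into its 8 octants."""
    halves = [((lo, (lo + hi) / 2), ((lo + hi) / 2, hi)) for lo, hi in box]
    return [tuple(octant) for octant in product(*halves)]


def volume(box):
    """Volume of an axis-aligned box."""
    v = 1.0
    for lo, hi in box:
        v *= hi - lo
    return v


def center(box):
    """Center point of an axis-aligned box."""
    return tuple((lo + hi) / 2 for lo, hi in box)


def locate(box, power, min_volume):
    """Repeatedly keep the subregion with the largest power.

    Stops and returns the selected subregion once its volume is below
    min_volume; otherwise that subregion is partitioned again.
    """
    while True:
        box = max(split(box), key=power)
        if volume(box) < min_volume:
            return box


# Hypothetical demo: a synthetic power that peaks at a known source position.
source = (1.3, 0.4, 2.1)


def synthetic_power(box):
    d2 = sum((c - s) ** 2 for c, s in zip(center(box), source))
    return 1.0 / (1.0 + d2)


room = ((0.0, 4.0), (0.0, 4.0), (0.0, 4.0))
print(locate(room, synthetic_power, min_volume=1e-3))
```

Evaluating the power only at subregion centers works here because the synthetic power is smooth and single-peaked; with measured data a greedy search of this kind can discard the true subregion at a coarse level.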

    Time delay estimation
    4.
    Granted patent
    Time delay estimation (in force)

    Publication No.: US08699637B2

    Publication date: 2014-04-15

    Application No.: US13204042

    Filing date: 2011-08-05

    IPC classification: H03D1/00

    Abstract: A method for time delay estimation performed by a physical computing system includes passing a first input signal obtained by a first sensor through a filter bank to form a first set of sub-band output signals, passing a second input signal obtained by a second sensor through the filter bank to form a second set of sub-band output signals, the second sensor being placed at a distance from the first sensor, computing cross-correlation data between the first set of sub-band output signals and the second set of sub-band output signals, and applying a time delay determination function to the cross-correlation data to determine a time delay estimate.
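
The pipeline in this abstract (filter bank, per-band cross-correlation, delay determination) can be sketched in a few lines. Everything specific here is an assumption for illustration: the two-band Haar-style filter bank, the choice to sum the per-band correlations, and taking the lag with the largest sum as the "time delay determination function". The abstract does not say which filter bank or determination function the patent uses.

```python
import random


def filter_bank(x):
    """Two-band Haar-style bank: returns [low-pass, high-pass] outputs."""
    prev = [0.0] + x[:-1]
    low = [(a + b) / 2 for a, b in zip(x, prev)]
    high = [(a - b) / 2 for a, b in zip(x, prev)]
    return [low, high]


def cross_correlation(a, b, lag):
    """Sum of a[n] * b[n + lag] over the samples where both are defined."""
    n = len(a)
    return sum(a[i] * b[i + lag] for i in range(max(0, -lag), min(n, n - lag)))


def estimate_delay(x1, x2, max_lag):
    """Delay of x2 relative to x1 in samples (positive: x2 lags x1)."""
    bands1, bands2 = filter_bank(x1), filter_bank(x2)

    def score(lag):
        # Assumed determination function: sum the sub-band correlations.
        return sum(cross_correlation(b1, b2, lag)
                   for b1, b2 in zip(bands1, bands2))

    return max(range(-max_lag, max_lag + 1), key=score)


# Hypothetical demo: the second sensor receives the same noise 7 samples later.
rng = random.Random(0)
first = [rng.gauss(0.0, 1.0) for _ in range(200)]
second = [0.0] * 7 + first[:-7]
print(estimate_delay(first, second, max_lag=20))
```

Dividing the estimated lag by the sampling rate converts it to seconds. Summing across bands treats every band equally; a weighted sum that favors bands with better signal-to-noise ratio is one possible alternative.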

    TIME DELAY ESTIMATION
    5.
    Patent application
    TIME DELAY ESTIMATION (in force)

    Publication No.: US20130034138A1

    Publication date: 2013-02-07

    Application No.: US13204042

    Filing date: 2011-08-05

    IPC classification: H04B17/00

    Abstract: A method for time delay estimation performed by a physical computing system includes passing a first input signal obtained by a first sensor through a filter bank to form a first set of sub-band output signals, passing a second input signal obtained by a second sensor through the filter bank to form a second set of sub-band output signals, the second sensor being placed at a distance from the first sensor, computing cross-correlation data between the first set of sub-band output signals and the second set of sub-band output signals, and applying a time delay determination function to the cross-correlation data to determine a time delay estimate.

    System And Method For Determining The Active Talkers In A Video Conference
    6.
    Patent application
    System And Method For Determining The Active Talkers In A Video Conference (in force)

    Publication No.: US20110093273A1

    Publication date: 2011-04-21

    Application No.: US12580958

    Filing date: 2009-10-16

    IPC classification: G10L11/00, H04N7/15

    Abstract: The present invention describes a method of determining the active talker for display on a video conferencing system, including the steps of: for each participant, capturing audio data using an audio capture sensor and video data using a video capture sensor; determining the probability of active speech (pA, pB, ..., pN), where the probability of active speech is a function of the probability of soft voice detection captured by the audio capture sensor and the probability of lip motion detection captured by the video capture sensor; and automatically displaying at least the participant that has the highest probability of active speech.
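
The selection step in this abstract can be sketched as follows. The abstract only says that the probability of active speech is "a function of" the voice and lip-motion probabilities, so the product used below is an assumption, and the function names and sample values are hypothetical.

```python
def active_speech_probability(p_voice, p_lip):
    """Combine the two detector outputs (assumed here: a simple product)."""
    return p_voice * p_lip


def active_talker(observations):
    """Return the participant with the highest combined probability.

    observations maps a participant name to (p_voice, p_lip).
    """
    return max(observations,
               key=lambda name: active_speech_probability(*observations[name]))


# Hypothetical detector outputs for three participants.
observations = {
    "A": (0.9, 0.2),  # loud audio, little lip motion
    "B": (0.7, 0.8),  # both detectors agree
    "C": (0.1, 0.9),  # lip motion without audio
}
print(active_talker(observations))
```

With a product, a participant needs support from both detectors to be selected, which is why "B" wins over the louder "A" in this example.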

    Systems and methods for performing sound source localization
    7.
    Granted patent
    Systems and methods for performing sound source localization (in force)

    Publication No.: US08553904B2

    Publication date: 2013-10-08

    Application No.: US12904921

    Filing date: 2010-10-14

    IPC classification: H04R3/00

    Abstract: Systems and methods for performing sound source localization are provided. In one aspect, a method for locating a sound source using a computing device subdivides a space into subregions. The method then computes a sound source power for each of the subregions and determines which of the sound source energies is the largest. When the volume of the subregion is less than a threshold volume, the method outputs the subregion having the largest sound source power. Otherwise, the stages of partitioning, computing, and determining the subregion having the largest sound source power are repeated.

    System and method for distributed meeting capture
    8.
    Granted patent
    System and method for distributed meeting capture (in force)

    Publication No.: US08451315B2

    Publication date: 2013-05-28

    Application No.: US12956033

    Filing date: 2010-11-30

    Applicant: Bowon Lee

    Inventor: Bowon Lee

    IPC classification: H04N7/15

    CPC classification: H04N7/15, H04N7/142

    Abstract: Embodiments of the present invention disclose a system and method for distributed meeting capture. According to one embodiment, the system includes a plurality of personal devices configured to capture video data and audio data associated with at least one operating user. A media hub includes a plurality of I/O ports and is configured to receive video and audio data from the plurality of personal devices. In addition, the media hub is configured to collect the video data and/or audio data from the plurality of personal devices and output at least one audio-visual data stream for facilitating video conferencing over a network.

    Methods and systems for blind dereverberation
    9.
    Granted patent
    Methods and systems for blind dereverberation (in force)

    Publication No.: US08218780B2

    Publication date: 2012-07-10

    Application No.: US12484686

    Filing date: 2009-06-15

    IPC classification: H04B3/20, H03G3/00

    CPC classification: H04M9/082

    Abstract: Various embodiments of the present invention are directed to methods for dereverberation of audio generated in a room. In one aspect, a method for dereverberating reverberant digital signals comprises transforming a reverberant digital signal from the time domain into Fourier domain signals using a computing device, each Fourier domain signal corresponding to a subband. For each subband of the Fourier domain signal, the method computes autoregressive model coefficients of the reverberation with the current and previous magnitudes of the Fourier digital signal, and inverse filters the magnitude of the Fourier domain signal using the computing device, based on the autoregressive model coefficients and previous magnitudes of the Fourier digital signal. The method includes inverse transforming the Fourier domain signals with filtered magnitudes into an approximate dereverberated digital signal.
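
The per-subband step in this abstract can be sketched on a single sequence of magnitudes. This is a deliberately reduced illustration: it assumes a first-order autoregressive model fitted by least squares, omits the forward and inverse Fourier transforms, and uses hypothetical function names. The abstract does not state the model order or the estimator used in the patent.

```python
def ar1_coefficient(mags):
    """Least-squares fit of mags[t] ~ a * mags[t - 1] for one subband."""
    num = sum(cur * prev for prev, cur in zip(mags, mags[1:]))
    den = sum(prev * prev for prev in mags[:-1])
    return num / den if den > 0 else 0.0


def inverse_filter(mags):
    """Subtract the part of each magnitude predicted from the previous one.

    Results are clamped at zero because magnitudes cannot be negative.
    """
    if not mags:
        return []
    a = ar1_coefficient(mags)
    out = [mags[0]]
    for prev, cur in zip(mags, mags[1:]):
        out.append(max(cur - a * prev, 0.0))
    return out


# Hypothetical subband: one burst followed by an exponentially decaying tail.
reverberant = [0.5 ** t for t in range(8)]
print(ar1_coefficient(reverberant))
print(inverse_filter(reverberant))
```

On this synthetic tail the fit recovers the decay factor and the filtered sequence keeps only the initial burst. Real speech magnitudes are correlated across frames for reasons other than reverberation, so a practical estimator needs more care than this sketch shows.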

    METHODS AND SYSTEMS FOR BLIND DEREVERBERATION
    10.
    Patent application
    METHODS AND SYSTEMS FOR BLIND DEREVERBERATION (in force)

    Publication No.: US20100316228A1

    Publication date: 2010-12-16

    Application No.: US12484686

    Filing date: 2009-06-15

    IPC classification: H04B3/20

    CPC classification: H04M9/082

    Abstract: Various embodiments of the present invention are directed to methods for dereverberation of audio generated in a room. In one aspect, a method for dereverberating reverberant digital signals comprises transforming a reverberant digital signal from the time domain into Fourier domain signals using a computing device, each Fourier domain signal corresponding to a subband. For each subband of the Fourier domain signal, the method computes autoregressive model coefficients of the reverberation with the current and previous magnitudes of the Fourier digital signal, and inverse filters the magnitude of the Fourier domain signal using the computing device, based on the autoregressive model coefficients and previous magnitudes of the Fourier digital signal. The method includes inverse transforming the Fourier domain signals with filtered magnitudes into an approximate dereverberated digital signal.
