Method and apparatus for passive acoustic source localization for video camera steering applications
    1.
    Type: granted invention patent
    Legal status: in force

    Publication number: US06826284B1

    Publication date: 2004-11-30

    Application number: US09498561

    Filing date: 2000-02-04

    IPC classification: H04R3/00

    Abstract: A real-time passive acoustic source localization system for video camera steering advantageously determines the relative delay between the direct paths of two estimated channel impulse responses. The illustrative system employs an approach referred to herein as the “adaptive eigenvalue decomposition algorithm” (AEDA) to make such a determination, and then advantageously employs a “one-step least-squares algorithm” (OSLS) for purposes of acoustic source localization, providing the desired features of robustness, portability, and accuracy in a reverberant environment. The AEDA technique directly estimates the (direct path) impulse response from the sound source to each of a pair of microphones, and then uses these estimated impulse responses to determine the time delay of arrival (TDOA) between the two microphones by measuring the distance between the first peaks thereof (i.e., the first significant taps of the corresponding transfer functions). In one embodiment, the system minimizes an error function (i.e., a difference) which is computed with the use of two adaptive filters, each such filter being applied to a corresponding one of the two signals received from the given pair of microphones. The filtered signals are then subtracted from one another to produce the error signal, which is minimized by a conventional adaptive filtering algorithm such as, for example, an LMS (Least Mean Squares) technique. Then, the TDOA is estimated by measuring the “distance” (i.e., the time) between the first significant taps of the two resultant adaptive filter transfer functions.
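
The two-filter idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names `estimate_tdoa` and `simulate_pair`, the tap count, the step size, and the unit-norm constrained LMS update are all assumptions chosen for illustration, and the test signal is an idealized, echo-free pure delay, so the largest tap of each filter stands in for the "first significant tap".

```python
import math
import random


def simulate_pair(delay, n=3000, seed=0):
    """Hypothetical test data: white noise at mic 1, delayed copy at mic 2."""
    rng = random.Random(seed)
    s = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x1 = s
    x2 = [0.0] * delay + s[:n - delay]
    return x1, x2


def estimate_tdoa(x1, x2, taps=16, mu=0.01):
    """Estimate the delay of x2 relative to x1, in samples (sketch only)."""
    half = taps // 2
    # u stacks the two adaptive filters as [g2, -g1]; g2 filters x1 and
    # g1 filters x2, so the error is the difference of the filtered signals.
    u = [0.0] * (2 * taps)
    u[half] = 1.0  # start with a single unit tap in the middle of g2
    for n in range(taps - 1, len(x1)):
        # Most recent `taps` samples of each channel, newest first.
        frame = x1[n - taps + 1:n + 1][::-1] + x2[n - taps + 1:n + 1][::-1]
        err = sum(ui * xi for ui, xi in zip(u, frame))
        # LMS step followed by renormalization (avoids the trivial u = 0).
        u = [ui - mu * err * xi for ui, xi in zip(u, frame)]
        norm = math.sqrt(sum(ui * ui for ui in u))
        u = [ui / norm for ui in u]
    g2 = u[:taps]
    g1 = [-v for v in u[taps:]]
    # In this echo-free example the largest tap marks the direct path.
    peak1 = max(range(taps), key=lambda k: abs(g1[k]))
    peak2 = max(range(taps), key=lambda k: abs(g2[k]))
    return peak2 - peak1
```

Calling `estimate_tdoa(*simulate_pair(delay=3))` is expected to recover the simulated delay. In a reverberant room the filters would need many more taps, and picking the first significant tap rather than the largest one becomes the important step.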

    SYSTEM AND METHOD FOR SINGLE-CHANNEL SPEECH NOISE REDUCTION
    2.
    Type: invention patent application
    Legal status: lapsed

    Publication number: US20120197636A1

    Publication date: 2012-08-02

    Application number: US13018973

    Filing date: 2011-02-01

    IPC classification: G10L21/02

    Abstract: A system and method may receive a single-channel speech input captured via a microphone. For each current frame of speech input, the system and method may (a) perform a time-frequency transformation on the input signal over L (L>1) frames including the current frame to obtain an extended observation vector of the current frame, data elements in the extended observation vector representing the coefficients of the time-frequency transformation of the L frames of the speech input, (b) compute second-order statistics of the extended observation vector and of noise, and (c) construct a noise reduction filter for the current frame of the speech input based on the second-order statistics of the extended observation vector and the second-order statistics of noise.
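
Steps (a) to (c) can be sketched for a single frequency bin as follows. The abstract does not name the filter, so this sketch assumes a multi-frame Wiener filter, h = i1 - inv(Phi_y) Phi_v i1, where i1 is the first unit vector; that choice, the per-bin stacking, and every function name below are assumptions for illustration. The time-frequency coefficients are taken as given.

```python
def extended_observation(frames, m, k, L):
    """Stack bin k of frames m, m-1, ..., m-L+1 (current frame first)."""
    return [frames[m - j][k] for j in range(L)]


def covariance(vectors):
    """Sample estimate of E[y y^H] from a list of equal-length vectors."""
    size = len(vectors[0])
    return [[sum(v[i] * v[j].conjugate() for v in vectors) / len(vectors)
             for j in range(size)] for i in range(size)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[complex(a) for a in row] + [complex(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    x = [0j] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def wiener_filter(phi_y, phi_v):
    """Assumed filter: h = i1 - inv(phi_y) @ (first column of phi_v)."""
    size = len(phi_y)
    first_col_v = [phi_v[i][0] for i in range(size)]
    correction = solve(phi_y, first_col_v)
    return [(1 if i == 0 else 0) - correction[i] for i in range(size)]


def apply_filter(h, y):
    """Estimate the clean coefficient of the current frame as h^H y."""
    return sum(hi.conjugate() * yi for hi, yi in zip(h, y))
```

With no noise (`phi_v` all zeros) the filter reduces to passing the current frame through unchanged, and with `phi_v` equal to `phi_y` it outputs zero, which are the two limiting cases one would expect from a noise reduction filter of this kind.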


    System and method for single-channel speech noise reduction
    3.
    Type: granted invention patent
    Legal status: lapsed

    Publication number: US08583429B2

    Publication date: 2013-11-12

    Application number: US13018973

    Filing date: 2011-02-01

    IPC classification: G10L21/02, H04B15/00

    Abstract: A system and method may receive a single-channel speech input captured via a microphone. For each current frame of speech input, the system and method may (a) perform a time-frequency transformation on the input signal over L (L>1) frames including the current frame to obtain an extended observation vector of the current frame, data elements in the extended observation vector representing the coefficients of the time-frequency transformation of the L frames of the speech input, (b) compute second-order statistics of the extended observation vector and of noise, and (c) construct a noise reduction filter for the current frame of the speech input based on the second-order statistics of the extended observation vector and the second-order statistics of noise.


    Data-driven method and apparatus for real-time mixing of multichannel signals in a media server
    4.
    Type: invention patent application
    Legal status: in force

    Publication number: US20050286664A1

    Publication date: 2005-12-29

    Application number: US10875553

    Filing date: 2004-06-24

    Abstract: An apparatus for mixing audio signals in a voice-over-IP teleconferencing environment comprises a preprocessor, a mixing controller, and a mixing processor. The preprocessor is divided into a media parameter estimator and a media preprocessor. The media parameter estimator estimates signal parameters such as signal-to-noise ratios, energy levels, and voice activity (i.e., the presence or absence of voice in the signal), which are used to control how different channels are mixed. The media preprocessor employs signal processing algorithms such as silence suppression, automatic gain control, and noise reduction, so that the quality of the incoming voice streams is optimized. Based on a function of the estimated signal parameters, the mixing controller specifies a particular mixing strategy and the mixing processor mixes the preprocessed voice streams according to the strategy provided by the controller.
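
The estimator, controller, and mixer roles described in this abstract can be sketched as three small functions. Everything concrete here is an assumption for illustration: the function names, the fixed noise floor, the 6 dB voice-activity threshold, and the "loudest active talkers" strategy are not taken from the patent, and the media preprocessor stage (silence suppression, gain control, noise reduction) is omitted.

```python
import math


def estimate_parameters(frame, noise_floor=1e-4):
    """Media parameter estimator: energy, SNR and a crude voice flag."""
    energy = sum(x * x for x in frame) / len(frame)
    snr_db = 10.0 * math.log10(max(energy, 1e-12) / noise_floor)
    return {"energy": energy, "snr_db": snr_db, "voice_active": snr_db > 6.0}


def choose_channels(params, max_talkers=2):
    """Mixing controller: keep the loudest channels that contain voice."""
    active = [i for i, p in enumerate(params) if p["voice_active"]]
    active.sort(key=lambda i: params[i]["energy"], reverse=True)
    return active[:max_talkers]


def mix(frames, selected):
    """Mixing processor: sum the selected channels and clip to [-1, 1]."""
    length = len(frames[0])
    mixed = [sum(frames[i][t] for i in selected) for t in range(length)]
    return [max(-1.0, min(1.0, x)) for x in mixed]


def mix_frame(frames, max_talkers=2):
    """Run estimator, controller and mixer on one frame per channel."""
    params = [estimate_parameters(f) for f in frames]
    selected = choose_channels(params, max_talkers)
    return selected, mix(frames, selected)
```

Keeping the controller separate from the mixer means the selection rule can be swapped (for example, weighting by SNR instead of picking the loudest talkers) without touching the code that sums the streams.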
