Low bit-rate speech coder using adaptive open-loop subframe pitch lag estimation and vector quantization
    1.
    Granted patent
    Legal status: In force

    Publication No.: US06345248B1

    Publication date: 2002-02-05

    Application No.: US09433002

    Filing date: 1999-11-02

    IPC classification: G10L19/12

    Abstract: A pitch lag coding device and method using interframe correlation inherent in pitch lag values to reduce coding bit requirements. A pitch lag value is extracted for a given speech frame, and then refined for each subframe. For every speech frame having N samples of speech, LPC analysis and vector quantization are performed for the whole coding frame. The LPC residual obtained for each frame is then processed such that pitch lag values for all subframes within the coding frame are analyzed concurrently. The remaining coding parameters, i.e., the codebook search, gain parameters, and excitation signal, are then analyzed sequentially according to their respective subframes.
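
The two-stage search described in the abstract (one lag for the frame, then a refinement per subframe) can be sketched as follows. This is only an illustration under stated assumptions: the normalized-correlation criterion, the function names `norm_corr` and `estimate_lags`, and the parameters `history` and `delta` are hypothetical and are not taken from the patent. The joint vector quantization of the subframe lags is not shown.

```python
import math

def norm_corr(x, start, length, lag):
    """Normalized correlation of x[start:start+length] with the same span delayed by lag."""
    num = den = 0.0
    for n in range(start, start + length):
        num += x[n] * x[n - lag]
        den += x[n - lag] * x[n - lag]
    return num / math.sqrt(den) if den > 0.0 else 0.0

def estimate_lags(residual, history, n_sub, lag_min, lag_max, delta):
    """residual holds `history` past samples followed by one frame (history >= lag_max).
    Returns (frame_lag, list of refined subframe lags)."""
    frame_len = len(residual) - history
    sub_len = frame_len // n_sub
    # Open-loop search over the whole frame (assumed criterion: normalized correlation).
    frame_lag = max(range(lag_min, lag_max + 1),
                    key=lambda lag: norm_corr(residual, history, frame_len, lag))
    # Refine each subframe within +/- delta of the frame-level lag.
    lo, hi = max(lag_min, frame_lag - delta), min(lag_max, frame_lag + delta)
    sub_lags = []
    for k in range(n_sub):
        start = history + k * sub_len
        sub_lags.append(max(range(lo, hi + 1),
                            key=lambda lag: norm_corr(residual, start, sub_len, lag)))
    return frame_lag, sub_lags

# Toy residual: unit pulses every 40 samples, 80 history samples plus a 160-sample frame.
residual = [1.0 if n % 40 == 0 else 0.0 for n in range(240)]
frame_lag, sub_lags = estimate_lags(residual, history=80, n_sub=4,
                                    lag_min=20, lag_max=70, delta=3)
print(frame_lag, sub_lags)
```

Restricting the subframe search to a small window around the frame lag is what keeps the subframe lags correlated, which is the property a low-bit-rate quantizer can exploit.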

    Low bit rate speech coder using adaptive open-loop subframe pitch lag estimation and vector quantization
    2.
    Granted patent
    Legal status: Lapsed

    Publication No.: US6014622A

    Publication date: 2000-01-11

    Application No.: US721410

    Filing date: 1996-09-26

    CPC classification: G10L19/08 G10L19/06

    Abstract: A pitch lag coding device and method using interframe correlation inherent in pitch lag values to reduce coding bit requirements. A pitch lag value is extracted for a given speech frame, and then refined for each subframe. For every speech frame having N samples of speech, LPC analysis and vector quantization are performed for the whole coding frame. The LPC residual obtained for each frame is then processed such that pitch lag values for all subframes within the coding frame are analyzed concurrently. The remaining coding parameters, i.e., the codebook search, gain parameters, and excitation signal, are then analyzed sequentially according to their respective subframes.

    Adaptive multi-microphone beamforming
    3.

    Publication No.: US10366701B1

    Publication date: 2019-07-30

    Application No.: US15681395

    Filing date: 2017-08-20

    Applicant: Huan-Yu Su

    Inventor: Huan-Yu Su

    Abstract: Provided is a method and computer program product for producing an enhanced audio signal for an output device from audio signals received by two or more microphones in close proximity to each other. For example, one embodiment of the present invention comprises the steps of receiving a first input audio signal from the first microphone, digitizing the first input audio signal to produce a first digitized audio input signal, receiving a second input audio signal from the second microphone, digitizing the second input audio signal to produce a second digitized audio input signal, using the first digitized audio input signal as a reference signal to an adaptive prediction filter, using the second digitized audio input signal as input to said adaptive prediction filter, and finally adding a prediction result signal from the adaptive prediction filter to the first digitized audio input signal to produce the enhanced audio signal. In other embodiments, any number of microphones can be used, and in all embodiments there is no requirement to detect or locate the source or direction of arrival of the input audio signals.
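
A minimal two-microphone sketch of the signal flow in this abstract is given below. The abstract does not say how the prediction filter adapts; normalized LMS (NLMS) is used here purely as an assumption, and the names `nlms_enhance`, `taps`, `mu` and `eps` are hypothetical.

```python
import random

def nlms_enhance(mic1, mic2, taps=8, mu=0.5, eps=1e-8):
    """mic1 is the reference, mic2 is the filter input.
    Returns (enhanced signal, final filter weights)."""
    w = [0.0] * taps
    buf = [0.0] * taps              # most recent mic2 samples, newest first
    out = []
    for d, x in zip(mic1, mic2):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # prediction of mic1 from mic2
        e = d - y                                    # prediction error drives adaptation
        norm = eps + sum(b * b for b in buf)
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(d + y)                            # add the prediction to the reference
    return out, w

# Toy scene: the same source reaches mic2 first and mic1 two samples later.
rng = random.Random(0)
src = [rng.gauss(0.0, 1.0) for _ in range(2000)]
mic2 = src
mic1 = [0.0, 0.0] + src[:-2]
out, w = nlms_enhance(mic1, mic2)
```

In this toy case the filter learns the two-sample delay by itself, so the prediction lines up with the reference and the sum reinforces the common source without any explicit direction-of-arrival estimate.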

    Detecting and reporting a loss of connection by a telephone
    4.
    Granted patent
    Legal status: In force

    Publication No.: US07796623B2

    Publication date: 2010-09-14

    Application No.: US12384019

    Filing date: 2009-03-30

    IPC classification: H04L12/28 H04M3/22

    Abstract: There is provided a method of detecting and reporting poor voice quality for use by a gateway device. The method comprises facilitating a connection between a telephone and a remote telephone via a network, and detecting a poor voice quality indicator during the connection. The method further comprises capturing, for a pre-determined period of time, telephone voice data being exchanged between the gateway and the telephone, network voice data being exchanged between the gateway and the network, and gateway parameters. The method also comprises packetizing the telephone voice data, the network voice data and the gateway parameters into a plurality of packets having a network address of a network storage, and transmitting the plurality of packets destined for the network storage via the network. In one aspect, the poor voice quality indicator may be generated by a user of the telephone in response to a poor voice quality of the connection.

    Pitch determination for speech processing
    5.
    Patent application
    Legal status: Under examination (published)

    Publication No.: US20080147384A1

    Publication date: 2008-06-19

    Application No.: US12069973

    Filing date: 2008-02-14

    Applicant: Huan-Yu Su, Yang Gao

    Inventor: Huan-Yu Su, Yang Gao

    IPC classification: G10L11/04

    Abstract: There is provided a method of selecting a pitch lag value for a portion of a speech signal, the method comprising: computing a weighted correlation function of the portion of the speech signal for a range of delay times, wherein the weighting of the correlation function depends on both the delay time and a characteristic of one or more previous portions of the speech signal; and selecting the pitch lag value based on a delay time from the range of delay times that maximizes the weighted correlation function.
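
One way to read the weighted correlation in this abstract is sketched below. The specific weighting is an assumption made for illustration: a mild linear penalty on long delays, and a boost near the previous lag when the previous frame was voiced. The names `select_pitch_lag`, `slope`, `boost` and `radius` are hypothetical.

```python
def select_pitch_lag(x, history, lag_min, lag_max, prev_lag=None, prev_voiced=False,
                     slope=0.1, boost=1.2, radius=2):
    """x holds `history` past samples followed by the current portion (history >= lag_max)."""
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        corr = sum(x[n] * x[n - lag] for n in range(history, len(x)))
        # Delay-dependent part: slightly favor short lags to resist pitch doubling.
        weight = 1.0 - slope * (lag - lag_min) / (lag_max - lag_min)
        # History-dependent part: favor lags close to the previous voiced lag.
        if prev_voiced and prev_lag is not None and abs(lag - prev_lag) <= radius:
            weight *= boost
        score = weight * corr
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Pulse train with period 40: lags 40 and 80 have the same raw correlation.
x = [1.0 if n % 40 == 0 else 0.0 for n in range(260)]
no_history = select_pitch_lag(x, history=100, lag_min=20, lag_max=100)
with_history = select_pitch_lag(x, history=100, lag_min=20, lag_max=100,
                                prev_lag=80, prev_voiced=True)
```

Without history the delay weighting breaks the tie toward the shorter lag; with a previous voiced lag of 80 the history term tips the decision the other way.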

    Pitch determination based on weighting of pitch lag candidates
    6.
    Granted patent
    Legal status: In force

    Publication No.: US07266493B2

    Publication date: 2007-09-04

    Application No.: US11251179

    Filing date: 2005-10-13

    Applicant: Huan-Yu Su, Yang Gao

    Inventor: Huan-Yu Su, Yang Gao

    IPC classification: G10L11/04

    Abstract: There is provided a method of selecting a pitch lag value from a plurality of pitch lag candidates for coding a speech signal. The method comprises identifying the plurality of pitch lag candidates from a frame of the speech signal using correlation; classifying the speech signal to obtain a voice classification; determining whether one or more of the plurality of pitch lag candidates are in a temporal neighborhood of one or more previous pitch lag values; favoring the one or more of the plurality of pitch lag candidates determined to be in the temporal neighborhood of the one or more previous pitch lag values, by adaptive weighting, over other ones of the plurality of pitch lag candidates; and selecting the pitch lag value based on the voice classification and the one or more of the plurality of pitch lag candidates favored by the adaptive weighting.
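
The candidate-selection step can be illustrated with a few lines operating on a candidate list. The bonus factors, the neighborhood radius and the function name `pick_candidate` are hypothetical; the abstract only states that candidates near previous lags are favored by adaptive weighting and that the voice classification influences the choice.

```python
def pick_candidate(candidates, prev_lags, voiced, radius=3):
    """candidates: list of (lag, normalized_correlation) pairs.
    prev_lags: pitch lags chosen for earlier frames."""
    # Assumed adaptation rule: trust the pitch history more during voiced speech.
    bonus = 1.25 if voiced else 1.05

    def weighted(candidate):
        lag, corr = candidate
        near = any(abs(lag - p) <= radius for p in prev_lags)
        return corr * (bonus if near else 1.0)

    return max(candidates, key=weighted)[0]

candidates = [(40, 0.80), (80, 0.85), (120, 0.70)]
lag_voiced = pick_candidate(candidates, prev_lags=[41, 39], voiced=True)
lag_unvoiced = pick_candidate(candidates, prev_lags=[41, 39], voiced=False)
```

Here the candidate at lag 40 wins for a voiced frame because it sits next to the previous lags, while for an unvoiced frame the weaker bonus lets the raw correlation maximum at lag 80 through.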

    Complexity resource manager for multi-channel speech processing
    7.
    Granted patent
    Legal status: In force

    Publication No.: US07080010B2

    Publication date: 2006-07-18

    Application No.: US10911118

    Filing date: 2004-08-03

    IPC classification: G10L19/02

    CPC classification: G10L15/285

    Abstract: A multi-channel speech processor for encoding speech in a packet network environment is disclosed. In one illustrative aspect, a complexity resource manager (CRM) is executed by a controller or processor. The CRM manages the level of complexity of encoding which is used by a signal processing unit (SPU) to convert the speech signal into packet data. In general, the CRM determines the level of complexity of encoding based on a calculated complexity budget, where the complexity budget is determined based on the time required to process prior speech signal channels and the time available to process the remaining channels. In this way, the CRM is able to control the overall complexity of the speech processor through its ability to signal the SPU to encode the speech signal in a complexity-reduced mode based on the calculated complexity budget under certain conditions.
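
The budget calculation can be sketched as a per-channel decision made before each channel is encoded. The function name `schedule_channels`, the millisecond units and the simple "remaining time divided by remaining channels" rule are assumptions for illustration, not the rule claimed in the patent.

```python
def schedule_channels(actual_full_ms, reduced_ms, frame_budget_ms, full_estimate_ms):
    """actual_full_ms[i]: time channel i takes when encoded at full complexity.
    Returns (list of modes, total elapsed time in ms)."""
    elapsed = 0.0
    modes = []
    for i, full_ms in enumerate(actual_full_ms):
        channels_left = len(actual_full_ms) - i
        # Complexity budget: time still available, shared by the channels still to process.
        budget = (frame_budget_ms - elapsed) / channels_left
        if budget >= full_estimate_ms:
            modes.append("full")
            elapsed += full_ms
        else:
            modes.append("reduced")   # signal the SPU to use the reduced-complexity mode
            elapsed += reduced_ms
    return modes, elapsed

# Four channels, 10 ms per frame; channel 1 overruns its 2.5 ms estimate.
modes, elapsed = schedule_channels([2.5, 4.0, 2.5, 2.5], reduced_ms=1.0,
                                   frame_budget_ms=10.0, full_estimate_ms=2.5)
```

In this run the overrun on the second channel shrinks the budget for the third, which is encoded in reduced mode so that the fourth can again run at full complexity within the frame deadline.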

    Using signal to noise ratio of a speech signal to adjust thresholds for extracting speech parameters for coding the speech signal
    8.
    Granted patent
    Legal status: In force

    Publication No.: US06898566B1

    Publication date: 2005-05-24

    Application No.: US09640841

    Filing date: 2000-08-16

    Abstract: There are provided speech coding methods and systems for estimating a plurality of speech parameters of a speech signal for coding the speech signal using one of a plurality of speech coding algorithms, wherein the plurality of speech parameters includes pitch information and is calculated using a plurality of thresholds. An example method includes estimating a background noise level in the speech signal to determine a signal to noise ratio (SNR) for the speech signal, adjusting one or more of the plurality of thresholds based on the SNR to generate one or more SNR adjusted thresholds, analyzing the speech signal to extract the pitch information using the one or more SNR adjusted thresholds, and repeating the estimating, the adjusting and the analyzing to code the speech signal using one of the plurality of speech coding algorithms.
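
A small sketch of an SNR-adjusted threshold follows. The direction and size of the adjustment (relaxing a voicing threshold linearly as SNR falls from 30 dB to 10 dB) are assumptions chosen for illustration, as are the names `snr_db` and `adjusted_threshold`; the abstract only states that thresholds are adjusted based on the SNR.

```python
import math

def snr_db(signal_energy, noise_energy):
    """Signal-to-noise ratio in dB from two energy estimates."""
    return 10.0 * math.log10(signal_energy / noise_energy)

def adjusted_threshold(base, snr, snr_clean=30.0, snr_noisy=10.0, max_drop=0.2):
    """Lower `base` by up to `max_drop` as snr falls from snr_clean to snr_noisy."""
    t = (snr_clean - snr) / (snr_clean - snr_noisy)
    t = min(1.0, max(0.0, t))        # clamp outside the interpolation range
    return base - max_drop * t

clean = adjusted_threshold(0.7, snr_db(1000.0, 1.0))   # about 30 dB
medium = adjusted_threshold(0.7, 20.0)
noisy = adjusted_threshold(0.7, 5.0)                   # clamped at the noisy end
```

The adjusted value would then replace the fixed threshold in whatever pitch or voicing decision uses it, and the whole cycle repeats per frame as the noise estimate is updated.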

    Flexible variable rate vocoder for wireless communication systems
    9.
    Granted patent
    Legal status: In force

    Publication No.: US06856954B1

    Publication date: 2005-02-15

    Application No.: US09627375

    Filing date: 2000-07-28

    Applicant: Huan-Yu Su

    Inventor: Huan-Yu Su

    CPC classification: H04L1/0014

    Abstract: A flexible variable rate vocoder and related method of operation. The vocoder selects a target average data rate responsive to at least one network parameter and at least one external parameter.

    Intelligent discontinuous transmission and comfort noise generation scheme for pulse code modulation speech coders
    10.
    Granted patent
    Legal status: In force

    Publication No.: US06510409B1

    Publication date: 2003-01-21

    Application No.: US09484731

    Filing date: 2000-01-18

    Applicant: Huan-Yu Su

    Inventor: Huan-Yu Su

    IPC classification: G10L11/02

    CPC classification: G10L19/012 G10L25/78

    Abstract: A fully backward compatible intelligent discontinuous transmission (DTX) and comfort noise generation (CNG) scheme that is operable in pulse code modulation (PCM) speech coding systems. The scheme, for example, provides a speech encoder comprising speech signal analysis circuitry configured to calculate a predetermined plurality of parameters from the speech signal, a voice activity detector configured to determine voice activity in the speech signal, where the speech encoder enters a discontinuous transmission mode if the voice activity detector does not detect voice activity, and a transmitter configured to transmit one or more speech samples of the speech signal after the speech encoder enters the discontinuous transmission mode, where the one or more speech samples are capable of use by a remote speech decoder to extract a parameter from the one or more speech samples in order to generate background noise based on the parameter.
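
The encoder and decoder roles in this abstract can be sketched as below. The energy-based voice activity decision, the choice of RMS level as the extracted parameter, the Gaussian noise generator and all names (`frame_rms`, `encode`, `decode`, `vad_threshold`, `n_samples`) are assumptions for illustration only.

```python
import math
import random

def frame_rms(frame):
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def encode(frames, vad_threshold, n_samples):
    """Returns one packet per frame: ('speech', frame), ('samples', pcm) or ('none', None)."""
    in_dtx = False
    packets = []
    for frame in frames:
        if frame_rms(frame) >= vad_threshold:       # assumed energy-based VAD
            in_dtx = False
            packets.append(("speech", frame))
        elif not in_dtx:
            in_dtx = True                           # entering DTX: send a few raw PCM samples
            packets.append(("samples", frame[:n_samples]))
        else:
            packets.append(("none", None))          # nothing transmitted
    return packets

def decode(packets, frame_len, seed=0):
    rng = random.Random(seed)
    noise_rms = 0.0
    out = []
    for kind, payload in packets:
        if kind == "speech":
            out.append(list(payload))
            continue
        if kind == "samples":
            noise_rms = frame_rms(payload)          # parameter extracted from the samples
        out.append([rng.gauss(0.0, noise_rms) for _ in range(frame_len)])
    return out

loud = [1000.0, -1000.0] * 40
quiet = [10.0, -10.0] * 40
packets = encode([loud, quiet, quiet, quiet], vad_threshold=100.0, n_samples=16)
decoded = decode(packets, frame_len=80)
```

Because the side information is sent as ordinary PCM samples, a decoder that knows nothing about the scheme can still play them back, which is one plausible reading of the backward compatibility the abstract mentions.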
