Patent search: ap:("Huan-Yu Su" OR "Yang Gao") AND inv:"Yang Gao", page 1

1.

Granted invention patent
Adaptive tilt compensation for synthesized speech (status: in force)

Publication No.: US09401156B2

Publication date: 2016-07-26

Application No.: US12215649

Filing date: 2008-06-27

Applicant: Huan-Yu Su, Yang Gao

Inventor: Huan-Yu Su, Yang Gao

IPC classification: G10L19/09, G10L19/12, G10L19/18, G10L19/20, G10L25/90, G10L19/02, G10L19/00

CPC classification: G10L19/12, G10L19/0204, G10L19/09, G10L19/18, G10L19/20, G10L25/90, G10L2019/0002, G10L2019/0016

Abstract: There is provided a method of using an adaptive tilt compensation by a speech decoder. The method comprises receiving a bit stream including a plurality of parameters representative of a speech signal; identifying an adaptive code vector and a fixed code vector using the plurality of parameters; scaling the adaptive code vector and the fixed code vector to generate a scaled adaptive code vector and a scaled fixed code vector; summing the scaled adaptive code vector and the scaled fixed code vector to generate a synthesized output; calculating a first reflection coefficient based on the plurality of parameters representative of the speech signal; multiplying the first reflection coefficient by a factor to generate a tilt factor; and applying the tilt factor to the synthesized output based on an encoding bit rate.
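The decoding steps listed in this abstract can be illustrated with a short Python sketch. It is not taken from the patent: the function name `decode_with_tilt`, the first-order filter form y[n] = x[n] - tilt * x[n-1], the default `factor`, and the rule that compensation is applied only below `rate_threshold` are all assumptions made for the example. The first reflection coefficient `k1` is passed in directly rather than derived from the bit stream parameters.

```python
def decode_with_tilt(adaptive_vec, fixed_vec, gain_adaptive, gain_fixed,
                     k1, bit_rate, factor=0.5, rate_threshold=8000):
    """Sum scaled code vectors, then optionally apply a tilt filter.

    All defaults are illustrative assumptions, not values from the patent.
    """
    # Scale both code vectors and sum them into the synthesized output.
    synthesized = [gain_adaptive * a + gain_fixed * f
                   for a, f in zip(adaptive_vec, fixed_vec)]

    # Assumed rate rule: skip compensation at or above the threshold.
    if bit_rate >= rate_threshold:
        return synthesized

    # Tilt factor = first reflection coefficient multiplied by a factor.
    tilt = factor * k1

    # Assumed filter form: y[n] = x[n] - tilt * x[n-1].
    compensated = []
    previous = 0.0
    for sample in synthesized:
        compensated.append(sample - tilt * previous)
        previous = sample
    return compensated
```

In this sketch the bit rate only switches the compensation on or off; a real decoder could equally vary the factor with the rate, which the abstract does not specify.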

2.

Invention patent application
Adaptive tilt compensation for synthesized speech (status: in force)

Publication No.: US20080294429A1

Publication date: 2008-11-27

Application No.: US12215649

Filing date: 2008-06-27

Applicant: Huan-Yu Su, Yang Gao

Inventor: Huan-Yu Su, Yang Gao

IPC classification: G10L19/12

CPC classification: G10L19/12, G10L19/0204, G10L19/09, G10L19/18, G10L19/20, G10L25/90, G10L2019/0002, G10L2019/0016

Abstract: There is provided a method of using an adaptive tilt compensation by a speech decoder. The method comprises receiving a bit stream including a plurality of parameters representative of a speech signal; identifying an adaptive code vector and a fixed code vector using the plurality of parameters; scaling the adaptive code vector and the fixed code vector to generate a scaled adaptive code vector and a scaled fixed code vector; summing the scaled adaptive code vector and the scaled fixed code vector to generate a synthesized output; calculating a first reflection coefficient based on the plurality of parameters representative of the speech signal; multiplying the first reflection coefficient by a factor to generate a tilt factor; and applying the tilt factor to the synthesized output based on an encoding bit rate.

3.

Invention patent application
Selection of preferential pitch value for speech processing (status: under examination, published)

Publication No.: US20080288246A1

Publication date: 2008-11-20

Application No.: US12220480

Filing date: 2008-07-23

Applicant: Huan-Yu Su, Yang Gao

Inventor: Huan-Yu Su, Yang Gao

IPC classification: G10L11/04

CPC classification: G10L19/12, G10L19/0204, G10L19/09, G10L19/18, G10L19/20, G10L25/90, G10L2019/0002, G10L2019/0016

Abstract: There is provided a method of using a processing circuitry for selecting a preferential pitch lag value from a plurality of pitch lag values, including a first pitch lag value and a second pitch lag value, for coding an input speech signal. The method comprises determining a first timing relationship between a previous pitch lag value and at least one of the plurality of pitch lag values; determining a second timing relationship between the first pitch lag value and the second pitch lag value; favoring one of the first pitch lag value and the second pitch lag value based on the first timing relationship and the second timing relationship to select one of the first pitch lag value and the second pitch lag value as the preferential pitch lag value; and converting the input speech signal into an encoded speech using the preferential pitch lag value.
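One possible reading of the two timing relationships in this abstract is sketched below in Python. The abstract does not define them, so the sketch assumes that the first relationship means "close to the previous lag" and the second means "one candidate is roughly an integer multiple of the other". The helper names, the 15% tolerance, the score bonus, and the use of normalized correlations as base scores are hypothetical.

```python
def is_near(lag_a, lag_b, tolerance=0.15):
    """True when lag_a is within a relative tolerance of lag_b."""
    return abs(lag_a - lag_b) <= tolerance * lag_b


def is_multiple(long_lag, short_lag, tolerance=0.15, max_multiple=4):
    """True when long_lag is close to 2x, 3x, ... of short_lag."""
    return any(is_near(long_lag, m * short_lag, tolerance)
               for m in range(2, max_multiple + 1))


def select_preferential_lag(first, second, previous,
                            corr_first, corr_second, bonus=0.1):
    """Pick one of two candidate lags (all thresholds are assumptions)."""
    score_first, score_second = corr_first, corr_second

    # First timing relationship (assumed): closeness to the previous lag.
    if is_near(first, previous):
        score_first += bonus
    if is_near(second, previous):
        score_second += bonus

    # Second timing relationship (assumed): if one candidate is about a
    # multiple of the other, favor the shorter one.
    if is_multiple(second, first):
        score_first += bonus
    elif is_multiple(first, second):
        score_second += bonus

    return first if score_first >= score_second else second
```

For example, with candidates 40 and 80 and a previous lag of 41, the shorter lag wins even if its raw correlation is slightly lower, because both assumed relationships favor it.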

4.

Invention patent application
System for speech encoding having an adaptive encoding arrangement (status: under examination, published)

Publication No.: US20070255561A1

Publication date: 2007-11-01

Application No.: US11827915

Filing date: 2007-07-12

Applicant: Huan-Yu Su, Yang Gao

Inventor: Huan-Yu Su, Yang Gao

IPC classification: G10L21/00

CPC classification: G10L19/12, G10L19/0204, G10L19/09, G10L19/18, G10L19/20, G10L25/90, G10L2019/0002, G10L2019/0016

Abstract: In accordance with one aspect of the invention, a selector supports the selection of a first encoding scheme or the second encoding scheme based upon the detection or absence of the triggering characteristic in the interval of the input speech signal. The first encoding scheme has a pitch pre-processing procedure for processing the input speech signal to form a revised speech signal biased toward an ideal voiced and stationary characteristic. The pre-processing procedure allows the encoder to fully capture the benefits of a bandwidth-efficient, long-term predictive procedure for a greater amount of speech components of an input speech signal than would otherwise be possible. In accordance with another aspect of the invention, the second encoding scheme entails a long-term prediction mode for encoding the pitch on a sub-frame by sub-frame basis. The long-term prediction mode is tailored to where the generally periodic component of the speech is generally not stationary or less than completely periodic and requires greater frequency of updates from the adaptive codebook to achieve a desired perceptual quality of the reproduced speech under a long-term predictive procedure.

5.

Granted invention patent
Coding based on spectral content of a speech signal (status: in force)

Publication No.: US06937979B2

Publication date: 2005-08-30

Application No.: US09896682

Filing date: 2001-06-29

Applicant: Yang Gao, Huan-Yu Su

Inventor: Yang Gao, Huan-Yu Su

IPC classification: G10L19/14, G10L21/02, G10L19/00

CPC classification: G10L19/265, G10L19/18, G10L21/0364

Abstract: In a coding procedure, a spectral content of a speech signal is estimated. A preferential coding algorithm or preferential value of at least one coding parameter is selected based on the estimated spectral content of the speech signal. The speech signal is coded in accordance with the selected coding algorithm or the selected coding parameter to control the operation of one or more of the following: a pre-processing filter, a post-processing filter, a coding control coefficient, a weighting filter, a synthesis filter, and a quantization table.
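The idea of choosing a coding parameter from estimated spectral content can be sketched as follows in Python. The abstract does not say how the spectral content is estimated or which values are chosen, so this example assumes a very simple estimate (the normalized lag-1 autocorrelation as a measure of spectral tilt) and two made-up weighting-filter coefficients. The names `spectral_tilt` and `select_weighting_coefficient`, the threshold, and the coefficient values are hypothetical.

```python
def spectral_tilt(frame):
    """Normalized lag-1 autocorrelation, used here as a crude tilt estimate.

    Values near +1 suggest energy concentrated at low frequencies.
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0
    lag1 = sum(a * b for a, b in zip(frame[1:], frame[:-1]))
    return lag1 / energy


def select_weighting_coefficient(frame, tilt_threshold=0.5):
    """Pick a (hypothetical) weighting-filter coefficient from the tilt."""
    if spectral_tilt(frame) > tilt_threshold:
        return 0.9   # assumed value for strongly low-pass frames
    return 0.75      # assumed value for all other frames
```

The same selection pattern would apply to any of the other controlled elements named in the abstract, such as a post-processing filter or a quantization table.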

6.

Granted invention patent
Codebook tables for multi-rate encoding and decoding with pre-gain and delayed-gain quantization tables (status: in force)

Publication No.: US06757649B1

Publication date: 2004-06-29

Application No.: US10409404

Filing date: 2003-04-08

Applicant: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

Inventor: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

IPC classification: G10L19/12

CPC classification: G10L19/00, G10L19/167, G10L19/24, H03G3/00

Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.
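The two-level selection described in this abstract (a rate selection for all four codecs, plus a type classification for the full-rate and half-rate codecs) can be sketched in Python. The abstract does not say how many types exist or how they are labeled, so the two type values 0 and 1, the `Rate` enum, and the returned description strings are assumptions for illustration only.

```python
from enum import Enum


class Rate(Enum):
    FULL = "full"
    HALF = "half"
    QUARTER = "quarter"
    EIGHTH = "eighth"


def select_codec(rate, frame_type=None):
    """Return a description of the codec activated for one frame.

    Only full-rate and half-rate frames use the type classification;
    the type labels 0 and 1 are hypothetical.
    """
    if rate in (Rate.FULL, Rate.HALF):
        if frame_type not in (0, 1):
            raise ValueError("full-rate and half-rate frames need a type of 0 or 1")
        return f"{rate.value}-rate codec, type {frame_type}"
    # Quarter-rate and eighth-rate codecs ignore the type classification.
    return f"{rate.value}-rate codec"
```

A caller would run the rate selection first and only compute the type classification when the result is full or half rate.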

7.

Invention patent application
Pitch determination for speech processing (status: under examination, published)

Publication No.: US20080147384A1

Publication date: 2008-06-19

Application No.: US12069973

Filing date: 2008-02-14

Applicant: Huan-Yu Su, Yang Gao

Inventor: Huan-Yu Su, Yang Gao

IPC classification: G10L11/04

CPC classification: G10L19/12, G10L19/0204, G10L19/09, G10L19/18, G10L19/20, G10L25/90, G10L2019/0002, G10L2019/0016

Abstract: There is provided a method of selecting a pitch lag value for a portion of a speech signal, the method comprising: computing a weighted correlation function of the portion of the speech signal for a range of delay times, wherein the weighting of the correlation function depends on both the delay time and a characteristic of one or more previous portions of the speech signal; and selecting the pitch lag value based on a delay time from the range of delay times that maximizes the weighted correlation function.
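A minimal Python sketch of a weighted correlation search of this kind is shown below. The abstract leaves the weighting unspecified, so the example assumes that the "characteristic of previous portions" is a voiced flag together with the previous pitch lag, and that delays near the previous lag receive a fixed boost. The function names, the `boost` and `width` values, and the use of normalized correlation are assumptions.

```python
def normalized_correlation(signal, lag):
    """Normalized correlation between the signal and its delayed copy."""
    indices = range(lag, len(signal))
    num = sum(signal[n] * signal[n - lag] for n in indices)
    energy_now = sum(signal[n] ** 2 for n in indices)
    energy_past = sum(signal[n - lag] ** 2 for n in indices)
    den = (energy_now * energy_past) ** 0.5
    return num / den if den > 0 else 0.0


def lag_weight(lag, previous_lag, previous_voiced, boost=0.2, width=5):
    """Assumed weighting: favor delays near the previous lag when the
    previous portion was voiced; otherwise weight every delay equally."""
    if previous_voiced and abs(lag - previous_lag) <= width:
        return 1.0 + boost
    return 1.0


def select_pitch_lag(signal, min_lag, max_lag, previous_lag, previous_voiced):
    """Return the delay that maximizes the weighted correlation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = (lag_weight(lag, previous_lag, previous_voiced)
                 * normalized_correlation(signal, lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

On a pulse train with period 10, the delays 10, 20 and 30 all correlate perfectly, so the unweighted search returns the first of them, while a voiced history with a previous lag of 20 pulls the result to 20.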

8.

Granted invention patent
Pitch determination based on weighting of pitch lag candidates (status: in force)

Publication No.: US07266493B2

Publication date: 2007-09-04

Application No.: US11251179

Filing date: 2005-10-13

Applicant: Huan-Yu Su, Yang Gao

Inventor: Huan-Yu Su, Yang Gao

IPC classification: G10L11/04

CPC classification: G10L19/12, G10L19/0204, G10L19/09, G10L19/18, G10L19/20, G10L25/90, G10L2019/0002, G10L2019/0016

Abstract: There is provided a method of selecting a pitch lag value from a plurality of pitch lag candidates for coding a speech signal. The method comprises identifying the plurality of pitch lag candidates from a frame of the speech signal using correlation; classifying the speech signal to obtain a voice classification; determining whether one or more of the plurality of pitch lag candidates are in a temporal neighborhood of one or more previous pitch lag values; favoring the one or more of the plurality of pitch lag candidates determined to be in the temporal neighborhood of the one or more previous pitch lag values, by adaptive weighting, over other ones of the plurality of pitch lag candidates; and selecting the pitch lag value based on the voice classification and the one or more of the plurality of pitch lag candidates favored by the adaptive weighting.
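The candidate-weighting steps in this abstract can be sketched in Python as follows. The sketch assumes that candidates arrive as (lag, correlation) pairs, that the voice classification is a simple voiced/unvoiced flag, and that "adaptive" means the boost for neighboring candidates is larger for voiced speech. The function name, the neighborhood width, and both boost values are hypothetical.

```python
def select_from_candidates(candidates, previous_lags, voiced,
                           neighborhood=5, voiced_boost=0.25,
                           unvoiced_boost=0.05):
    """Pick a lag from (lag, correlation) pairs.

    Candidates within `neighborhood` samples of any previous lag are
    favored; the boost size depends on the (assumed) voice class.
    """
    if not candidates:
        raise ValueError("at least one pitch lag candidate is required")

    boost = voiced_boost if voiced else unvoiced_boost
    best_lag, best_score = None, float("-inf")
    for lag, correlation in candidates:
        in_neighborhood = any(abs(lag - prev) <= neighborhood
                              for prev in previous_lags)
        score = correlation * (1.0 + boost) if in_neighborhood else correlation
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

With candidates (40, 0.70) and (80, 0.80) and previous lags near 40, the voiced case selects 40 while the unvoiced case keeps the raw correlation winner, 80.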

9.

Granted invention patent
Encoding and decoding speech signals variably based on signal classification (status: in force)

Publication No.: US06735567B2

Publication date: 2004-05-11

Application No.: US10409430

Filing date: 2003-04-08

Applicant: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

Inventor: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

IPC classification: G10L13/04

CPC classification: G10L19/00, G10L19/167, G10L19/24, H03G3/00

Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.

10.

Granted invention patent
Bitstream protocol for transmission of encoded voice signals (status: in force)

Publication No.: US06581032B1

Publication date: 2003-06-17

Application No.: US09662828

Filing date: 2000-09-15

Applicant: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

Inventor: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

IPC classification: G10L19/12

CPC classification: G10L19/00, G10L19/167, G10L19/24, H03G3/00

Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.
