Patent search: ap:("Yang Gao" OR "Adil Benyassine") AND inv:"Adil Benyassine" (page 1)

1.

Granted invention patent
Encoding and decoding speech signals variably based on signal classification (in force)

Publication No.: US06735567B2

Publication date: 2004-05-11

Application No.: US10409430

Filing date: 2003-04-08

Applicants: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

Inventors: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

IPC classification: G10L13/04

CPC classification: G10L19/00, G10L19/167, G10L19/24, H03G3/00

Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.

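The selection logic in the abstract above (four codecs chosen by rate, with the full- and half-rate codecs further split by a type classification) can be pictured with a minimal sketch. The names `Rate`, `select_codec` and the 0/1 `frame_type` labels are assumptions made for illustration; the abstract does not define the classes or any API.

```python
from enum import Enum


class Rate(Enum):
    """The four codec rates named in the abstract."""
    FULL = "full"
    HALF = "half"
    QUARTER = "quarter"
    EIGHTH = "eighth"


def select_codec(rate: Rate, frame_type: int) -> str:
    """Pick a codec path from a rate selection and a type classification.

    frame_type is a hypothetical label (0 or 1); the abstract does not
    say what the type classes are.
    """
    if frame_type not in (0, 1):
        raise ValueError("frame_type must be 0 or 1 in this sketch")
    # Only the full- and half-rate codecs branch on the type classification.
    if rate in (Rate.FULL, Rate.HALF):
        return f"{rate.value}-rate codec, type {frame_type}"
    # Quarter- and eighth-rate codecs are chosen by rate alone.
    return f"{rate.value}-rate codec"


print(select_codec(Rate.FULL, 1))
print(select_codec(Rate.EIGHTH, 0))
```

In this sketch the type label is simply ignored at the two lowest rates, which mirrors the abstract's statement that only the full- and half-rate codecs depend on it.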

2.

Granted invention patent
Bitstream protocol for transmission of encoded voice signals (in force)

Publication No.: US06581032B1

Publication date: 2003-06-17

Application No.: US09662828

Filing date: 2000-09-15

Applicants: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

Inventors: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

IPC classification: G10L19/12

CPC classification: G10L19/00, G10L19/167, G10L19/24, H03G3/00

Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.

3.

Invention patent application
Speech coding system and method using bi-directional mirror-image predicted pulses (in force)

Publication No.: US20090043574A1

Publication date: 2009-02-12

Application No.: US12284623

Filing date: 2008-09-23

Applicants: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

Inventors: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

IPC classification: G10L19/12, G10L19/00

CPC classification: G10L19/00, G10L19/167, G10L19/20, G10L19/22, G10L19/24, G10L2019/0001, H03G3/00

Abstract: There is provided a method of decoding speech data generated from a speech signal. The method comprises receiving the speech data having at least one main pulse in a subframe of the speech data; generating a first predicted pulse, based on the at least one main pulse, on one side of the main pulse in the subframe of the speech data, wherein the first predicted pulse has a lower gain than the main pulse; generating a second predicted pulse, as a mirror image of the first predicted pulse on a reverse time scale, on the other side of the main pulse in the subframe of the speech data; and reconstructing the speech signal using the at least one main pulse, the first predicted pulse and the second predicted pulse.

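The decoding steps in the abstract above can be illustrated with a small sketch that places one main pulse in a subframe and adds two lower-gain predicted pulses mirrored around it. The function name, the `lag` spacing, and the `attenuation` factor are all assumptions for illustration; the abstract states only that the predicted pulses lie on opposite sides of the main pulse and have a lower gain.

```python
def mirror_predicted_pulses(subframe_len, main_pos, main_gain, lag, attenuation=0.5):
    """Build a subframe excitation with a main pulse and two mirrored pulses.

    lag (distance from the main pulse) and attenuation (gain ratio) are
    hypothetical parameters; the abstract does not specify how they are set.
    """
    if not 0 <= main_pos < subframe_len:
        raise ValueError("main_pos must lie inside the subframe")
    if lag <= 0 or not 0.0 < attenuation < 1.0:
        raise ValueError("lag must be positive and attenuation in (0, 1)")

    excitation = [0.0] * subframe_len
    excitation[main_pos] = main_gain

    # Predicted pulses have a lower gain than the main pulse.
    predicted_gain = main_gain * attenuation

    # First predicted pulse: on one side of the main pulse.
    forward = main_pos + lag
    if forward < subframe_len:
        excitation[forward] += predicted_gain

    # Second predicted pulse: mirror image on the other side.
    backward = main_pos - lag
    if backward >= 0:
        excitation[backward] += predicted_gain

    return excitation


print(mirror_predicted_pulses(8, 4, 1.0, 2))
```

Predicted pulses that would fall outside the subframe are dropped here; that boundary handling is a choice made for the sketch, not something the abstract describes.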

4.

Granted invention patent
Adaptive noise state update for a voice activity detector (in force)

Publication No.: US07346502B2

Publication date: 2008-03-18

Application No.: US11342130

Filing date: 2006-01-26

Applicants: Yang Gao, Eyal Shlomot, Adil Benyassine

Inventors: Yang Gao, Eyal Shlomot, Adil Benyassine

IPC classification: G10L11/06

CPC classification: G10L25/78, G10L2025/786

Abstract: There is provided a method of updating a noise state of a voice activity detector (VAD) for indicating an active voice mode and an inactive voice mode. The method comprises receiving an input signal having a plurality of frames, determining an elapsed time since the last update of the noise state, updating the noise state of the VAD if the elapsed time exceeds a predetermined time, determining an average minimum energy based on two or more of the plurality of frames, determining a current minimum energy based on a current frame of the plurality of frames, updating the noise state of the VAD if the average minimum energy is less than the current minimum energy, and updating the noise state of the VAD if the average minimum energy is greater than the current minimum energy plus a first predetermined value.

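The three update conditions listed in the abstract above can be written as a single decision function. This is a minimal sketch: the function and parameter names are assumptions, and the units (for example frames for time, dB for energy) are left to the caller because the abstract does not state them.

```python
def should_update_noise_state(elapsed, max_elapsed,
                              avg_min_energy, cur_min_energy, margin):
    """Return True if any of the abstract's three update conditions holds.

    max_elapsed stands in for the "predetermined time" and margin for the
    "first predetermined value"; both names are hypothetical.
    """
    # Condition 1: too long since the last noise-state update.
    if elapsed > max_elapsed:
        return True
    # Condition 2: average minimum energy is below the current minimum.
    if avg_min_energy < cur_min_energy:
        return True
    # Condition 3: average minimum energy exceeds the current minimum
    # by more than the margin.
    if avg_min_energy > cur_min_energy + margin:
        return True
    return False


print(should_update_noise_state(elapsed=10, max_elapsed=100,
                                avg_min_energy=31.0, cur_min_energy=30.0,
                                margin=3.0))
```

With these conditions the noise state is left untouched only when the update is recent and the average minimum energy sits between the current minimum and the current minimum plus the margin.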

5.

Granted invention patent
Multi-mode bitstream transmission protocol of encoded voice signals with embeded characteristics (lapsed)

Publication No.: US06961698B1

Publication date: 2005-11-01

Application No.: US10420654

Filing date: 2003-04-21

Applicants: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

Inventors: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

IPC classification: G10L19/00, G10L13/00, G10L13/04, G10L19/02, G10L19/04, G10L19/08, G10L19/10, G10L19/12, G10L19/14, H03M7/30, H03M7/36

CPC classification: G10L19/00, G10L19/167, G10L19/24, H03G3/00

Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The bitstream comprises a type component and a gain component. The type component is representative of a type classification of a frame of speech signal that is transmitted. The type component comprises a first type and second type. The gain component represents an adaptive codebook gain and a fixed codebook gain component comprises a fixed codebook gain component and an adaptive codebook gain component exclusively encoded as separate components of the bitstream as a function of the bit rate when the type classification is the second type.

6.

Granted invention patent
Conference bridge processing of speech in a packet network environment (in force)

Publication No.: US06463414B1

Publication date: 2002-10-08

Application No.: US09547832

Filing date: 2000-04-12

Applicants: Huan-Yu Su, Eyal Shlomot, Jes Thyssen, Adil Benyassine, Yang Gao

Inventors: Huan-Yu Su, Eyal Shlomot, Jes Thyssen, Adil Benyassine, Yang Gao

IPC classification: G10L11/02

CPC classification: G10L19/173

Abstract: There is provided a conference bridge or transcoder configured to intelligently handle multiple speech channels in the context of a packet network, wherein various speech channels may adhere to a variety of speech encoding standards. For example, the conference bridge establishes framing and alignment of multiple incoming speech channels associated with multiple participants, extracts parameters from the speech samples, mixes the parameters, and re-encodes the resulting speech samples for transmission to the participants. In one aspect, a speech processing method comprises decoding a first bitstream according to a first coding scheme to generate first speech samples and a first side information; generating second speech samples and a second side information using the first speech samples and the first side information, for use according to a second coding scheme; and creating a second bitstream, encoded based on the second coding scheme, using the second speech samples and the second side information.

7.

Granted invention patent
Speech codec employing noise classification for noise compensation (in force)

Publication No.: US06240386B1

Publication date: 2001-05-29

Application No.: US09198414

Filing date: 1998-11-24

Applicants: Jes Thyssen, Huan-yu Su, Yang Gao, Adil Benyassine

Inventors: Jes Thyssen, Huan-yu Su, Yang Gao, Adil Benyassine

IPC classification: G10L21/00

CPC classification: G10L19/265, G10L19/002, G10L19/005, G10L19/012, G10L19/08, G10L19/083, G10L19/09, G10L19/10, G10L19/12, G10L19/125, G10L19/18, G10L19/20, G10L21/0364, G10L2019/0005, G10L2019/0007, G10L2019/0011

Abstract: A multi-rate speech codec supports a plurality of encoding bit rate modes by adaptively selecting encoding bit rate modes to match communication channel restrictions. In higher bit rate encoding modes, an accurate representation of speech through CELP (code excited linear prediction) and other associated modeling parameters are generated for higher quality decoding and reproduction. For each bit rate mode selected, pluralities of fixed or innovation subcodebooks are selected for use in generating innovation vectors. The speech coder distinguishes various voice signals as a function of their voice content. For example, a Voice Activity Detection (VAD) algorithm selects an appropriate coding scheme depending on whether the speech signal comprises active or inactive speech. The encoder may consider varying characteristics of the speech signal including sharpness, a delay correlation, a zero-crossing rate, and a residual energy. In another embodiment of the present invention, code excited linear prediction is used for voice active signals whereas random excitation is used for voice inactive signals; the energy level and spectral content of the voice inactive signal may also be used for noise coding. The multi-rate speech codec may employ distributed detection and compensation processing the speech signal. For high quality perceptual speech reproduction, the speech codec may perform noise detection in both an encoder and a decoder. The noise detection may be coordinated between the encoder and decoder. Similarly, noise compensation may be performed in a distributed manner among both the decoder and the encoder.

8.

Granted invention patent
Selection of scalar quantixation (SQ) and vector quantization (VQ) for speech coding (in force)

Publication No.: US08620647B2

Publication date: 2013-12-31

Application No.: US12321935

Filing date: 2009-01-26

Applicants: Yang Gao, Adil Benyassine

Inventors: Yang Gao, Adil Benyassine

IPC classification: G10L11/06

CPC classification: G10L19/12, G10L19/0204, G10L19/09, G10L19/18, G10L19/20, G10L25/90, G10L2019/0002, G10L2019/0016

Abstract: In accordance with one aspect of the invention, a selector supports the selection of a first encoding scheme or the second encoding scheme based upon the detection or absence of the triggering characteristic in the interval of the input speech signal. The first encoding scheme has a pitch pre-processing procedure for processing the input speech signal to form a revised speech signal biased toward an ideal voiced and stationary characteristic. The pre-processing procedure allows the encoder to fully capture the benefits of a bandwidth-efficient, long-term predictive procedure for a greater amount of speech components of an input speech signal than would otherwise be possible. In accordance with another aspect of the invention, the second encoding scheme entails a long-term prediction mode for encoding the pitch on a sub-frame by sub-frame basis. The long-term prediction mode is tailored to where the generally periodic component of the speech is generally not stationary or less than completely periodic and requires greater frequency of updates from the adaptive codebook to achieve a desired perceptual quality of the reproduced speech under a long-term predictive procedure.

9.

Granted invention patent
Embedded silence and background noise compression (in force)

Publication No.: US08032359B2

Publication date: 2011-10-04

Application No.: US12002131

Filing date: 2007-12-14

Applicants: Eyal Shlomot, Yang Gao, Adil Benyassine

Inventors: Eyal Shlomot, Yang Gao, Adil Benyassine

IPC classification: G10L21/00, G10L11/06

CPC classification: G10L19/24, G10L19/012, G10L19/0208

Abstract: There is provided a method for use by a speech encoder to encode an input speech signal. The method comprises receiving the input speech signal; determining whether the input speech signal includes an active speech signal or an inactive speech signal; low-pass filtering the inactive speech signal to generate a narrowband inactive speech signal; high-pass filtering the inactive speech signal to generate a high-band inactive speech signal; encoding the narrowband inactive speech signal using a narrowband inactive speech encoder to generate an encoded narrowband inactive speech; generating a low-to-high auxiliary signal by the narrowband inactive speech encoder based on the narrowband inactive speech signal; encoding the high-band inactive speech signal using a wideband inactive speech encoder to generate an encoded wideband inactive speech based on the low-to-high auxiliary signal from the narrowband inactive speech encoder; and transmitting the encoded narrowband inactive speech and the encoded wideband inactive speech.

10.

Invention patent application
Speech compression system and method (in force)

Publication No.: US20070136052A1

Publication date: 2007-06-14

Application No.: US11700481

Filing date: 2007-01-30

Applicants: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

Inventors: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

IPC classification: G10L11/04

CPC classification: G10L19/00, G10L19/167, G10L19/20, G10L19/22, G10L19/24, G10L2019/0001, H03G3/00

Abstract: The invention improves the encoding and decoding of speech by focusing the encoding on the perceptually important characteristics of speech. The system analyzes selected features of an input speech signal, and first performs a common frame-based speech coding of the input speech signal. The system then performs a speech coding based on either a first speech coding mode or a second speech coding mode. The selection of a mode is based on characteristics of the input speech signal. The first speech coding mode uses a first framing structure and the second speech coding mode uses a second framing structure.
