EFFICIENT SPEECH STREAM CONVERSION
    81.
    Type: Patent application
    Status: In force

    Publication No.: US20100223053A1

    Publication date: 2010-09-02

    Application No.: US12095709

    Filing date: 2005-11-30

    IPC classification: G10L19/04

    CPC classification: G10L19/012 G10L19/173

    Abstract: Speech frames of a first speech coding scheme are utilized as speech frames of a second speech coding scheme, where the two schemes use similar, preferably bit-stream-compatible, core compression schemes for the speech frames. An occurrence of a state mismatch in an energy parameter between the first speech coding scheme and the second speech coding scheme is identified, preferably either by determining an occurrence of a predetermined speech evolution, such as a speech type transition (e.g. an onset of speech following a period of speech inactivity), or by tentative decoding of the energy parameter in the two encoding schemes followed by a comparison. Subsequently, the energy parameter in at least one frame of the second speech coding scheme following the occurrence of the state mismatch is adjusted. The present invention also presents transcoders and communications systems providing such transcoding functionality.

    ADAPTIVE SOUND SOURCE VECTOR QUANTIZATION DEVICE, ADAPTIVE SOUND SOURCE VECTOR INVERSE QUANTIZATION DEVICE, AND METHOD THEREOF
    82.
    Type: Patent application
    Status: In force

    Publication No.: US20100082337A1

    Publication date: 2010-04-01

    Application No.: US12518944

    Filing date: 2007-12-14

    IPC classification: G10L19/04

    CPC classification: G10L19/12 G10L19/038

    Abstract: Disclosed is an adaptive sound source vector quantization device capable of improving the quantization accuracy of adaptive sound source vector quantization while suppressing an increase in the calculation amount in CELP sound encoding, which performs encoding in sub-frame units. In the device, a search adaptive sound source vector generation unit (103) cuts out an adaptive sound source vector of frame length (n) from an adaptive sound source codebook (102); a search impulse response matrix generation unit (105) generates an n × n search impulse response matrix by using the impulse response matrix for each of the sub-frames inputted from a synthesis filter (104); a search target vector generation unit (106) adds the target vector of each sub-frame so as to generate a search target vector of frame length (n); and an evaluation scale calculation unit (107) calculates the evaluation scale of the adaptive sound source vector quantization by using the search adaptive sound source vector, the search impulse response matrix, and the search target vector.
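
The search described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: it assumes the evaluation scale is the conventional CELP adaptive-codebook criterion (x·y)² / (y·y) with y = Hv, which the abstract does not state. The names `filter_zero_state`, `evaluation_measure`, and `search_lag` are hypothetical, and a single impulse response `h` stands in for the n × n search impulse response matrix.

```python
def filter_zero_state(h, v):
    """Return y = H v, where H is the lower-triangular Toeplitz matrix built from h."""
    return [
        sum(h[k] * v[i - k] for k in range(min(i + 1, len(h))))
        for i in range(len(v))
    ]


def evaluation_measure(x, h, v):
    """Assumed criterion: squared correlation with the target over filtered energy."""
    y = filter_zero_state(h, v)
    energy = sum(b * b for b in y)
    if energy == 0.0:
        return 0.0
    corr = sum(a * b for a, b in zip(x, y))
    return corr * corr / energy


def search_lag(past_excitation, x, h, lags):
    """Cut a frame-length vector out of the past excitation for each lag, keep the best."""
    n = len(x)
    best_lag, best_score = None, -1.0
    for lag in lags:
        start = len(past_excitation) - lag
        # Repeat the last `lag` samples periodically when the lag is shorter than n.
        v = [past_excitation[start + (i % lag)] for i in range(n)]
        score = evaluation_measure(x, h, v)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score
```

Evaluating one frame-length vector per candidate lag, rather than one per sub-frame, is the point the abstract makes about keeping the calculation amount down.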

    CODING DEVICE AND CODING METHOD
    83.
    Type: Patent application
    Status: In force

    Publication No.: US20090094024A1

    Publication date: 2009-04-09

    Application No.: US12282287

    Filing date: 2007-03-08

    IPC classification: G10L19/04 G10L19/00

    CPC classification: G10L19/24

    Abstract: A coding device is provided in which optimum coding in a higher layer is flexibly carried out based on the coding result of a lower layer, so that a quality audio signal can be delivered to users under limited conditions. In this coding device, a basic layer coding unit codes an input signal to generate a basic layer information source code and outputs a linear prediction coefficient (LPC) and a quantized LPC, which are parameters calculated during coding, to an expanded layer control unit. A basic layer decoding unit decodes the basic layer information source code. An adding unit reverses the polarity of the basic layer decoded signal, adds it to the input signal, and thereby calculates a difference signal. The expanded layer control unit generates expanded layer mode information indicative of a coding mode in an expanded layer based on the LPC and the quantized LPC. An expanded layer coding unit codes the difference signal obtained from the adding unit under control of the expanded layer control unit.
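
The two-layer signal path in this abstract can be sketched as below. This is a toy illustration under stated assumptions: uniform scalar quantizers stand in for the real basic and expanded layer codecs, the step sizes are arbitrary, and the mode control driven by the LPC and quantized LPC is left out. All function names are hypothetical.

```python
def quantize(signal, step):
    """Toy uniform quantizer standing in for a real layer codec (assumption)."""
    return [round(s / step) for s in signal]


def dequantize(codes, step):
    return [c * step for c in codes]


def encode_two_layers(signal, base_step=8, enh_step=2):
    # Basic layer: code the input, then decode it locally.
    base_code = quantize(signal, base_step)
    base_decoded = dequantize(base_code, base_step)
    # Adding unit: invert the polarity of the decoded signal and add it to the input.
    difference = [s + (-d) for s, d in zip(signal, base_decoded)]
    # Expanded layer: code the difference signal.
    enh_code = quantize(difference, enh_step)
    return base_code, enh_code, difference


def decode_two_layers(base_code, enh_code, base_step=8, enh_step=2):
    base = dequantize(base_code, base_step)
    enh = dequantize(enh_code, enh_step)
    return [b + e for b, e in zip(base, enh)]
```

A decoder that receives only `base_code` can still reconstruct a coarse signal, which is what makes the scheme scalable.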

    Adaptive equalizer for a coded speech signal
    84.
    Type: Granted patent
    Status: In force

    Publication No.: US07490036B2

    Publication date: 2009-02-10

    Application No.: US11254823

    Filing date: 2005-10-20

    IPC classification: G10L19/04

    CPC classification: G10L19/26

    Abstract: A speech communication system provides a speech encoder that generates a set of coded parameters representative of the desired speech signal characteristics. The speech communication system also provides a speech decoder that receives the set of coded parameters to generate reconstructed speech. The speech decoder includes an equalizer that computes a matching set of parameters from the reconstructed speech generated by the speech decoder, undoes the set of characteristics corresponding to the computed set of parameters, and imposes the set of characteristics corresponding to the coded set of parameters, thereby producing equalized reconstructed speech.

    Data processing apparatus
    85.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US07467083B2

    Publication date: 2008-12-16

    Application No.: US10239591

    Filing date: 2002-01-24

    IPC classification: G10L19/04 G10L19/10

    CPC classification: G10L19/12

    Abstract: The present invention relates to a data processing apparatus capable of obtaining high-quality sound data. A tap generation section 121 generates a prediction tap used for a process in a prediction section 125 by extracting decoded speech data in a predetermined positional relationship with subject data of interest within the decoded speech data obtained by decoding coded data by a CELP method, and by extracting an I code located in a subframe according to the position of the subject data in the subject subframe. Similarly to the tap generation section 121, a tap generation section 122 generates a class tap used for a process in a classification section 123. The classification section 123 performs classification on the basis of the class tap, and a coefficient memory 124 outputs a tap coefficient corresponding to the classification result. The prediction section 125 performs a linear prediction computation by using the prediction tap and the tap coefficient and outputs high-quality decoded speech data. The present invention can be applied to mobile phones for transmitting and receiving speech.

    Method and apparatus for formant tracking using a residual model
    86.
    Type: Granted patent
    Status: In force

    Publication No.: US07424423B2

    Publication date: 2008-09-09

    Application No.: US10404411

    Filing date: 2003-04-01

    IPC classification: G10L19/04

    CPC classification: G10L15/02 G10L25/15

    Abstract: A method of tracking formants defines a formant search space comprising sets of formants to be searched. Formants are identified for a first frame in the speech utterance by searching the entirety of the formant search space using the codebook, and for the remaining frames by searching the same space using both the codebook and the continuity constraint across adjacent frames. Under one embodiment, the formants are identified by mapping sets of formants into feature vectors and applying the feature vectors to a model. Formants are also identified by applying dynamic programming to search for the best sequence that optimally satisfies the continuity constraint required by the model.
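
The dynamic-programming search mentioned in this abstract can be illustrated with a small Viterbi-style sketch. It is a simplification under stated assumptions: each candidate is a single formant frequency rather than a set of formants, `local_cost` is a hypothetical stand-in for the codebook or model score of each candidate in each frame, and the continuity constraint is modeled as a squared frequency difference scaled by `weight`.

```python
def track_formants(candidates, local_cost, weight):
    """Return the candidate sequence with the lowest total cost.

    candidates: list of candidate frequencies (the search space, same for every frame)
    local_cost: local_cost[t][k] is the cost of candidate k in frame t
    weight:     scale of the continuity penalty between adjacent frames
    """
    n_cand = len(candidates)
    # First frame: the whole space is scored with the local cost only.
    cost = list(local_cost[0])
    back = []
    for t in range(1, len(local_cost)):
        new_cost, pointers = [], []
        for k in range(n_cand):
            # Best predecessor under accumulated cost plus continuity penalty.
            best_j = min(
                range(n_cand),
                key=lambda j: cost[j] + weight * (candidates[k] - candidates[j]) ** 2,
            )
            step = weight * (candidates[k] - candidates[best_j]) ** 2
            new_cost.append(cost[best_j] + step + local_cost[t][k])
            pointers.append(best_j)
        cost = new_cost
        back.append(pointers)
    # Trace the best path backwards from the cheapest final state.
    k = min(range(n_cand), key=lambda j: cost[j])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return [candidates[k] for k in path]
```

With a large enough `weight`, a frame whose local cost briefly favors a distant candidate is overruled by its neighbors, which is the effect the continuity constraint is meant to have.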

    Reduced computational complexity of bit allocation for perceptual coding
    87.
    Type: Granted patent
    Status: In force

    Publication No.: US07406412B2

    Publication date: 2008-07-29

    Application No.: US10829453

    Filing date: 2004-04-20

    IPC classification: G10L19/04

    CPC classification: G10L19/035

    Abstract: A process that allocates bits for quantizing spectral components in a perceptual coding system is performed more efficiently by obtaining an accurate estimate of the optimal value for one or more coding parameters that are used in the bit allocation process. In one implementation for a perceptual audio coding system, an accurate estimate of an offset from a calculated psychoacoustic masking curve is derived by selecting an initial value for the offset, calculating the number of bits that would be allocated if the initial offset were used for coding, and estimating the optimum value of the offset from a difference between this calculated number and the number of bits that are actually available for allocation.
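
The offset estimate described in this abstract can be sketched as a single correction step. The sketch rests on assumptions the abstract does not state: each quantizer bit buys about 6.02 dB of signal-to-noise ratio, bit demand is treated as fractional, and the correction is spread evenly over the components that currently receive bits. `bits_needed` and `estimate_offset` are hypothetical names.

```python
DB_PER_BIT = 6.02  # assumption: roughly 6.02 dB of SNR per quantizer bit


def bits_needed(levels_db, mask_db, offset_db):
    """Fractional bit demand per spectral component for a given offset from the mask."""
    return [
        max(0.0, (level - (mask + offset_db)) / DB_PER_BIT)
        for level, mask in zip(levels_db, mask_db)
    ]


def estimate_offset(levels_db, mask_db, bits_available, initial_offset_db=0.0):
    """Estimate the offset from the gap between calculated and available bits."""
    demand = bits_needed(levels_db, mask_db, initial_offset_db)
    active = sum(1 for b in demand if b > 0.0)
    if active == 0:
        return initial_offset_db
    surplus = sum(demand) - bits_available
    # Raising the offset by 1 dB removes 1 / DB_PER_BIT bits from each active component.
    return initial_offset_db + surplus * DB_PER_BIT / active
```

When no component crosses the zero-bit threshold between the initial and estimated offsets, this single step lands exactly on the bit budget; otherwise it gives a starting point that needs fewer refinement iterations than a blind search.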

    Hybrid speech coding and system
    88.
    Type: Granted patent
    Status: In force

    Publication No.: US07386444B2

    Publication date: 2008-06-10

    Application No.: US10769501

    Filing date: 2004-01-30

    Applicant: Jacek Stachurski

    Inventor: Jacek Stachurski

    IPC classification: G10L19/08 G10L19/04

    CPC classification: B41F27/1281 G01R23/20

    Abstract: Hybrid linear predictive speech coding system with phase alignment and predictive quantization. Zero-phase alignment of speech prior to waveform coding aligns synthesized speech frames of a waveform coder with frames synthesized with a parametric coder. Inter-frame interpolation of LP coefficients suppresses artifacts in the resultant synthesized speech frames.

    Encoding and decoding of overlapping audio signal values by differential encoding/decoding
    89.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US07376555B2

    Publication date: 2008-05-20

    Application No.: US10496710

    Filing date: 2002-11-13

    IPC classification: G10L19/04 H04B14/06 H04B1/66

    CPC classification: G10L19/022

    Abstract: Coding of a signal is provided, wherein a first set of values is provided related to subsequent times in a first time interval of the signal, a second set of values is provided related to subsequent times in a second time interval of the signal, the first time interval having an overlap with the second time interval, the overlap including at least two subsequent times of the second interval, wherein at least one of the values of the second set related to the at least two subsequent times in the overlap is encoded with reference to a value of the first set which is closer in time to the at least one value of the second set than any other value in the second set.
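
The idea of coding overlap values against the nearest value from the previous interval can be sketched as follows. This is a simplified illustration, not the claimed method: it assumes each set is a list of `(time, value)` pairs, codes every second-set value inside the overlap as a difference from the first-set value nearest in time, and sends the remaining values unchanged. The function names are hypothetical.

```python
def encode_second_set(first, second):
    """Code overlap values of `second` as differences from the nearest value in `first`."""
    overlap_end = max(t for t, _ in first)
    coded = []
    for t, value in second:
        if t <= overlap_end:
            # Reference: the first-set value closest in time to this value.
            _, ref = min(first, key=lambda p: abs(p[0] - t))
            coded.append((t, "diff", value - ref))
        else:
            coded.append((t, "abs", value))
    return coded


def decode_second_set(first, coded):
    """Invert encode_second_set using the already decoded first set."""
    decoded = []
    for t, kind, payload in coded:
        if kind == "diff":
            _, ref = min(first, key=lambda p: abs(p[0] - t))
            decoded.append((t, ref + payload))
        else:
            decoded.append((t, payload))
    return decoded
```

Because overlapping intervals describe the same stretch of signal, the differences in the overlap tend to be small, which is what makes them cheaper to code than the raw values.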

    Perfected device and method for the spatialization of sound
    90.
    Type: Granted patent
    Status: In force

    Publication No.: US07356465B2

    Publication date: 2008-04-08

    Application No.: US10748125

    Filing date: 2003-12-31

    IPC classification: G10L19/04

    CPC classification: H04S7/30 H04S2400/11

    Abstract: The invention relates to a computer device comprising a memory 108 for storing audio signals 114, in part pre-recorded, each corresponding to a source defined by spatial position data 116, and a processing module 110 for processing these audio signals in real time as a function of the spatial position data. The processing module 110 allows instantaneous power level parameters to be calculated on the basis of the audio signals 114, the corresponding sources being defined by instantaneous power level parameters. The processing module 110 comprises a selection module 120 for regrouping certain of the audio signals into a variable number of audio signal groups, and the processing module 110 is capable of calculating spatial position data which is representative of a group of audio signals as a function of the spatial position data 116 and instantaneous power level parameters for each corresponding source.
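
One plausible way to compute a representative position for a group of sources is a power-weighted centroid, sketched below. The abstract only says the position is a function of the source positions and instantaneous power levels, so the weighting rule, the mean-square power estimate, and the names `instantaneous_power` and `group_position` are all assumptions.

```python
def instantaneous_power(frame):
    """Mean-square power of one short frame of samples (assumed power estimate)."""
    return sum(s * s for s in frame) / len(frame)


def group_position(positions, frames):
    """Power-weighted centroid of the sources in one group.

    positions: list of (x, y, z) tuples, one per source
    frames:    list of sample frames, one per source, for the current instant
    """
    powers = [instantaneous_power(f) for f in frames]
    total = sum(powers)
    if total == 0.0:
        # All sources silent: fall back to the unweighted average position.
        n = len(positions)
        return tuple(sum(p[axis] for p in positions) / n for axis in range(3))
    return tuple(
        sum(w * p[axis] for w, p in zip(powers, positions)) / total
        for axis in range(3)
    )
```

Weighting by power pulls the group's position toward whichever source is currently loudest, so the spatialized group stays close to what a listener would perceive.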
