Coding of acoustic waveforms
    1.
    Granted invention patent
    Coding of acoustic waveforms (expired)

    Publication No.: US5054072A

    Publication date: 1991-10-01

    Application No.: US456183

    Filing date: 1989-12-15

    IPC classification: G10L19/02

    CPC classification: G10L19/02

    Abstract: Encoding techniques and devices are based on a sinusoidal speech representation model. In one aspect of the invention, a pitch-adaptive channel encoding technique for amplitude coding varies the channel spacing in accordance with the pitch of the speaker's voice. In another aspect of the invention, a phase synthesis technique locks rapidly varying phases into synchrony with the phase of the fundamental. Phase coding techniques that introduce a voice-dependent random phase and a pitch-adaptive quadratic phase dispersion are also performed.
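
A minimal sketch of what pitch-adaptive channel spacing could look like. The abstract does not give the spacing rule, so the rule below is an assumption: one channel per harmonic while the harmonics fit in a fixed channel budget, and evenly spaced channels otherwise. The function name `channel_centers` and the parameters `bandwidth_hz` and `max_channels` are hypothetical.

```python
def channel_centers(pitch_hz, bandwidth_hz=4000.0, max_channels=24):
    """Return channel center frequencies (Hz) whose spacing follows the pitch.

    Assumed rule (not taken from the patent): place one channel on each
    harmonic of the pitch while the harmonics fit in max_channels; otherwise
    fall back to max_channels evenly spaced channels over the band.
    """
    if pitch_hz <= 0:
        raise ValueError("pitch_hz must be positive")
    n_harmonics = int(bandwidth_hz // pitch_hz)
    if n_harmonics <= max_channels:
        spacing, n_channels = pitch_hz, n_harmonics
    else:
        spacing, n_channels = bandwidth_hz / max_channels, max_channels
    return [spacing * (k + 1) for k in range(n_channels)]


# A higher-pitched voice gets fewer, more widely spaced channels.
high_pitch = channel_centers(250.0)
low_pitch = channel_centers(100.0)
```

Under this assumed rule, the channel spacing widens as the pitch rises, so the same number of amplitude bits covers fewer channels for high-pitched speakers.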

    Processing of acoustic waveforms
    2.
    Granted invention patent
    Processing of acoustic waveforms (expired)

    Publication No.: US4885790A

    Publication date: 1989-12-05

    Application No.: US339957

    Filing date: 1989-04-18

    IPC classification: G10L19/02

    CPC classification: G10L19/02

    Abstract: A sinusoidal model for acoustic waveforms is applied to develop a new analysis/synthesis technique that characterizes a waveform by the amplitudes, frequencies, and phases of its component sine waves. These parameters are estimated from a short-time Fourier transform. Rapid changes in the highly resolved spectral components are tracked using the concept of "birth" and "death" of the underlying sine waves. The component values are interpolated from one frame to the next to yield a representation that is applied to a sine wave generator. The resulting synthetic waveform preserves the general waveform shape and is perceptually indistinguishable from the original. Furthermore, in the presence of noise, the perceptual characteristics of the waveform as well as the noise are maintained. The method and devices are particularly useful in speech coding, time-scale modification, frequency-scale modification, and pitch modification.
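
The analysis/synthesis idea can be sketched for a single frame as follows. This is an illustration under simplifying assumptions, not the patented procedure: it uses a rectangular window, a direct DFT instead of an FFT, and assumes the sine waves fall exactly on DFT bins. The names `analyze_frame`, `synthesize`, and `min_amplitude` are hypothetical, and frame-to-frame tracking and interpolation are not shown.

```python
import cmath
import math


def analyze_frame(frame, sample_rate, min_amplitude=0.01):
    """Estimate (amplitude, frequency_hz, phase) triples from spectral peaks."""
    n = len(frame)
    # Direct DFT of the positive-frequency bins (rectangular window).
    spectrum = [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]
    mags = [abs(c) for c in spectrum]
    peaks = []
    for k in range(1, n // 2):
        amplitude = 2.0 * mags[k] / n  # exact only for bin-centered sine waves
        is_local_max = mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]
        if is_local_max and amplitude >= min_amplitude:
            peaks.append((amplitude, k * sample_rate / n, cmath.phase(spectrum[k])))
    return peaks


def synthesize(peaks, n_samples, sample_rate):
    """Sine wave generator: sum the component cosines."""
    return [
        sum(a * math.cos(2.0 * math.pi * f * i / sample_rate + p) for a, f, p in peaks)
        for i in range(n_samples)
    ]
```

For frequencies that fall between bins, a practical analyzer would need a tapered window and some form of peak interpolation; that part is left out here.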

    Processing of acoustic waveforms
    3.
    Reissue patent
    Processing of acoustic waveforms (expired)

    Publication No.: USRE36478E

    Publication date: 1999-12-28

    Application No.: US631222

    Filing date: 1996-04-12

    CPC classification: G10L19/02

    Abstract: A sinusoidal model for acoustic waveforms is applied to develop a new analysis/synthesis technique that characterizes a waveform by the amplitudes, frequencies, and phases of its component sine waves. These parameters are estimated from a short-time Fourier transform. Rapid changes in the highly resolved spectral components are tracked using the concept of "birth" and "death" of the underlying sine waves. The component values are interpolated from one frame to the next to yield a representation that is applied to a sine wave generator. The resulting synthetic waveform preserves the general waveform shape and is perceptually indistinguishable from the original. Furthermore, in the presence of noise, the perceptual characteristics of the waveform as well as the noise are maintained. The method and devices are particularly useful in speech coding, time-scale modification, frequency-scale modification, and pitch modification.
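
The "birth" and "death" bookkeeping can be illustrated with a greedy nearest-frequency matcher. This is a simplified stand-in, not the matching procedure claimed in the patent; the function name `match_tracks` and the matching interval `max_delta_hz` are assumptions made for this sketch.

```python
def match_tracks(prev_freqs, curr_freqs, max_delta_hz=50.0):
    """Greedily match sine-wave frequencies between two consecutive frames.

    Returns (matches, births, deaths):
      matches: (prev_index, curr_index) pairs that continue a track
      births:  indices in curr_freqs that start a new track
      deaths:  indices in prev_freqs whose track ends
    """
    matches, deaths, used = [], [], set()
    for i, f_prev in enumerate(prev_freqs):
        best = None  # (distance, index) of the closest unused candidate
        for j, f_curr in enumerate(curr_freqs):
            if j in used:
                continue
            distance = abs(f_curr - f_prev)
            if distance <= max_delta_hz and (best is None or distance < best[0]):
                best = (distance, j)
        if best is None:
            deaths.append(i)  # no continuation within the interval
        else:
            used.add(best[1])
            matches.append((i, best[1]))
    births = [j for j in range(len(curr_freqs)) if j not in used]
    return matches, births, deaths


matches, births, deaths = match_tracks([200.0, 400.0, 600.0], [205.0, 610.0, 1200.0])
```

In this example the 200 Hz and 600 Hz tracks continue, the 400 Hz track dies, and the 1200 Hz component is born. A greedy pass like this can make a suboptimal choice when two tracks compete for the same peak, which a fuller implementation would need to resolve.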

    Audio pre-processing methods and apparatus
    4.
    Granted invention patent
    Audio pre-processing methods and apparatus (expired)

    Publication No.: US4856068A

    Publication date: 1989-08-08

    Application No.: US34204

    Filing date: 1987-04-02

    IPC classification: G10L13/00 G10L19/02

    CPC classification: G10L19/02

    Abstract: Sinusoidal estimation and phase adjustment of the original speech signal yield a lower peak-to-RMS ratio, which permits a lower threshold for dynamic range compression and clipping. A sinusoidal speech representation system is applied to the problem of speech dispersion by pre-processing the waveform prior to transmission to reduce the peak-to-RMS ratio of the waveform. The sinusoidal system first estimates and then removes the natural phase dispersion in the frequency components of the speech signal. Artificial dispersion based on pulse compression techniques is then introduced with little change in speech quality. The new phase dispersion allocation serves to pre-process the waveform prior to dynamic range compression and clipping, allowing considerably deeper thresholding than can be tolerated on the original waveform.
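
The effect of phase dispersion on the peak-to-RMS ratio can be illustrated with a toy harmonic signal. This is a sketch under stated assumptions: all harmonics have unit amplitude, the "natural" dispersion is taken as already removed (zero phases), and the artificial dispersion is a quadratic, chirp-like phase law chosen here for illustration. The abstract only says the dispersion is based on pulse compression techniques, so the specific law and the names `harmonic_frame` and `peak_to_rms` are hypothetical.

```python
import math


def harmonic_frame(phases, n_samples):
    """One pitch period of unit-amplitude harmonics k = 1..K with given phases."""
    return [
        sum(
            math.cos(2.0 * math.pi * k * n / n_samples + phase)
            for k, phase in enumerate(phases, start=1)
        )
        for n in range(n_samples)
    ]


def peak_to_rms(signal):
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    return max(abs(x) for x in signal) / rms


K = 20   # number of harmonics
N = 80   # samples per pitch period (e.g. 100 Hz pitch at 8 kHz sampling)

# All phases aligned: every harmonic peaks at n = 0, giving a sharp pulse.
aligned = harmonic_frame([0.0] * K, N)

# Assumed quadratic phase law: spreads the pulse energy over the period.
quadratic = [-math.pi * k * k / K for k in range(1, K + 1)]
dispersed = harmonic_frame(quadratic, N)

ratio_aligned = peak_to_rms(aligned)
ratio_dispersed = peak_to_rms(dispersed)
```

Both signals have the same magnitude spectrum and the same RMS value; only the phases differ. The dispersed version has a markedly lower peak, which is what leaves room for deeper compression and clipping thresholds.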

    Scalable and embedded codec for speech and audio signals
    5.
    Granted invention patent
    Scalable and embedded codec for speech and audio signals (in force)

    Publication No.: US09047865B2

    Publication date: 2015-06-02

    Application No.: US11889332

    Filing date: 2007-08-10

    Abstract: A system and method for processing audio and speech signals are disclosed, which provide compatibility over a range of communication devices operating at different sampling frequencies and/or bit rates. The analyzer of the system divides the input signal into different portions, at least one of which carries information sufficient to provide intelligible reconstruction of the input signal. The analyzer also encodes separate information about other portions of the signal in an embedded manner, so that a smooth transition can be achieved from low bit-rate to high bit-rate applications. Accordingly, communication devices operating at different sampling rates and/or bit rates can extract corresponding information from the output bit stream of the analyzer. In the present invention, embedded information generally relates to separate parameters of the input signal, or to additional resolution in the transmission of original signal parameters. Non-linear techniques for enhancing the overall performance of the system are also disclosed. Also disclosed is a novel method of improving the quantization of signal parameters. In a specific embodiment, the input signal is processed in two or more modes depending on the state of the signal in a frame. When the signal is determined to be in a transition state, the encoder provides phase information about N sinusoids, which the decoder uses to improve the quality of the output signal at low bit rates.
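
One way to picture "additional resolution in the transmission of original signal parameters" is an embedded scalar quantizer, where a coarse base index is refined by extra bits that a low-rate decoder may simply ignore. This is a generic illustration, not the codec's actual bit-stream format; `encode_embedded`, `decode_embedded`, `base_bits`, and `extra_bits` are hypothetical names.

```python
def encode_embedded(value, lo, hi, base_bits=4, extra_bits=4):
    """Quantize value in [lo, hi] and split the index into base and refinement."""
    total_levels = 1 << (base_bits + extra_bits)
    scaled = (value - lo) / (hi - lo)
    index = min(total_levels - 1, max(0, int(scaled * total_levels)))
    base = index >> extra_bits                     # most significant bits
    refinement = index & ((1 << extra_bits) - 1)   # least significant bits
    return base, refinement


def decode_embedded(base, refinement, lo, hi, base_bits=4, extra_bits=4):
    """Reconstruct from the base layer alone (refinement=None) or both layers."""
    if refinement is None:
        index, levels = base, 1 << base_bits
    else:
        index = (base << extra_bits) | refinement
        levels = 1 << (base_bits + extra_bits)
    # Reconstruct at the center of the quantization cell.
    return lo + (index + 0.5) / levels * (hi - lo)


base, refinement = encode_embedded(0.3, 0.0, 1.0)
coarse = decode_embedded(base, None, 0.0, 1.0)       # low bit-rate decoder
fine = decode_embedded(base, refinement, 0.0, 1.0)   # high bit-rate decoder
```

Because the refinement bits only subdivide the cell selected by the base bits, dropping them degrades precision without breaking the decode, which is the property an embedded stream relies on.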

Computationally efficient sine wave synthesis for acoustic waveform processing
    6.
    Granted invention patent
    Computationally efficient sine wave synthesis for acoustic waveform processing (expired)

    Publication No.: US4937873A

    Publication date: 1990-06-26

    Application No.: US179528

    Filing date: 1988-04-08

    IPC classification: G10L19/02 G10L19/08

    CPC classification: G10L19/093

    Abstract: Methods and apparatus are disclosed for reducing discontinuities between frames of sinusoidally modeled acoustic waveforms, such as speech, which occur when sampling at low frame rates. A Fast Fourier Transform-based overlap-add technique is applied to the amplitude, frequency, and phase components of sinusoidal waves after frame-to-frame sine wave matching has been performed. Matched sine wave amplitudes and frequencies are linearly interpolated, and the mid-point phase is estimated such that the mid-frame sine wave is the best fit to the most recent half-frame segments of the lagging and leading sine waves. Synthetic mid-frame sine waves are generated using the interpolated amplitude and frequency and the estimated phase values. Synthesized acoustic waveforms of high quality from original source waveforms can be produced in sinusoidal analysis/synthesis operations at coding frame rates of 50 Hz and lower. The methods and devices disclosed herein are particularly useful for computationally efficient coding and synthesis of speech waveforms.
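
The mid-frame interpolation and overlap-add steps can be sketched for a single matched sine wave. Several simplifications are assumed here: the cosines are evaluated directly rather than through an FFT, the overlap-add window is triangular, and the mid-point phase is taken as the average of the forward- and backward-extrapolated phases, which is a stand-in for the best-fit estimate the abstract describes. All function names are hypothetical.

```python
import math


def wrap_phase(phase):
    """Wrap an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(phase), math.cos(phase))


def mid_frame_params(lagging, leading, frame_len, sample_rate):
    """Interpolate (amplitude, frequency_hz, phase) at the frame mid-point."""
    a0, f0, p0 = lagging
    a1, f1, p1 = leading
    half = frame_len / 2.0
    # Extrapolate each frame's phase to the mid-point, then average them
    # along the shortest angular path (assumed estimator, see lead-in).
    forward = p0 + 2.0 * math.pi * f0 * half / sample_rate
    backward = p1 - 2.0 * math.pi * f1 * half / sample_rate
    phase = wrap_phase(forward + 0.5 * wrap_phase(backward - forward))
    return (0.5 * (a0 + a1), 0.5 * (f0 + f1), phase)


def overlap_add(first, second, hop, sample_rate):
    """Cross-fade two sine waves whose reference points are hop samples apart."""
    a0, f0, p0 = first
    a1, f1, p1 = second
    out = []
    for n in range(hop):
        fade = n / hop  # triangular window: weights sum to one
        s0 = a0 * math.cos(2.0 * math.pi * f0 * n / sample_rate + p0)
        s1 = a1 * math.cos(2.0 * math.pi * f1 * (n - hop) / sample_rate + p1)
        out.append((1.0 - fade) * s0 + fade * s1)
    return out
```

Inserting the synthetic mid-frame sine wave halves the overlap-add hop, so each cross-fade spans a shorter interval over which the parameters change less. For a stationary sine wave the two half-frame cross-fades reproduce the original exactly.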

    Scalable and embedded codec for speech and audio signals
    7.
    Granted invention patent
    Scalable and embedded codec for speech and audio signals (in force)

    Publication No.: US07272556B1

    Publication date: 2007-09-18

    Application No.: US09159481

    Filing date: 1998-09-23

    IPC classification: G10L21/00

    Abstract: A system and method for processing audio and speech signals are disclosed, which provide compatibility over a range of communication devices operating at different sampling frequencies and/or bit rates. The analyzer of the system divides the input signal into different portions, at least one of which carries information sufficient to provide intelligible reconstruction of the input signal. The analyzer also encodes separate information about other portions of the signal in an embedded manner, so that a smooth transition can be achieved from low bit-rate to high bit-rate applications. Accordingly, communication devices operating at different sampling rates and/or bit rates can extract corresponding information from the output bit stream of the analyzer. In the present invention, embedded information generally relates to separate parameters of the input signal, or to additional resolution in the transmission of original signal parameters. Non-linear techniques for enhancing the overall performance of the system are also disclosed. Also disclosed is a novel method of improving the quantization of signal parameters. In a specific embodiment, the input signal is processed in two or more modes depending on the state of the signal in a frame. When the signal is determined to be in a transition state, the encoder provides phase information about N sinusoids, which the decoder uses to improve the quality of the output signal at low bit rates.
