Method and apparatus for coding an information signal
    1.
    Granted invention patent (expired)

    Publication No.: US06141638A

    Publication date: 2000-10-31

    Application No.: US86149

    Filing date: 1998-05-28

    CPC classification: G10L19/18 G10L19/10

    Abstract: A speech coder (400) for coding an information signal varies the codebook configuration based on parameters inherent in the information signal. The speech coder (400) requires no additional overhead for sending mode parameters while allowing subframe resolution. The configurations vary not only with voicing level but also with pitch period, since different physiological traits yield different codebook configurations. A dispersion matrix (406) within the speech coder (400) facilitates a codebook search performed on vectors whose length can be less than a subframe length. Additionally, use of the dispersion matrix (406) allows the addition of random events for very slightly voiced speech, which incurs little computational overhead but produces a rich excitation.

Method and apparatus for synthesizing signals using transform-domain match-output extension
    2.
    Granted invention patent (expired)

    Publication No.: US6073100A

    Publication date: 2000-06-06

    Application No.: US828592

    Filing date: 1997-03-31

    IPC classification: G10L21/04 G10L9/00

    CPC classification: G10L21/04

    Abstract: A method of synthesizing audio signals provides outputs of high subjective quality which retain the semblance of natural origin. Unlike frequency scaling methods, the pitch of a signal can be modified independently of the spectrum envelope. A set of candidate input sections is defined based on input transform-domain signal representations. A match-output transform-domain section is formed using the result of a matching process which compares candidate input sections to a reference section. The reference section for this matching process is defined based on one or more previously formed match-output sections. Main-output transform-domain signal representations are formed based on one or more match-output sections, whereby such main-output transform-domain signal representations can be inverse-transformed and combined with the output time-domain signal. This method is referred to as "Transform-Domain Match-Output Extension" (TDMOX). One embodiment of the invention implements block-transform processing using an FFT algorithm. Matching processes search over ranges of frequency shifts, ranges of time shifts, and ranges of resampling factors. Selections are based on maximum cross-correlation, maximum sum of dot products, and minimum sum of squared differences, respectively. Applications include text-to-speech synthesis, audio editing, musical effects processing, real-time low-delay voice transformation, internet telephony, voice mail, karaoke, hearing aids, and film animation.
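
    One of the matching processes named in the abstract searches a range of time shifts and selects by maximum sum of dot products. The sketch below illustrates only that selection rule. It works on plain sample lists rather than on transform-domain representations, and the function name `best_time_shift` and its arguments are hypothetical, not taken from the patent.

```python
def best_time_shift(signal, reference, shifts):
    """Return the shift whose candidate section best matches the reference.

    A candidate section is signal[shift : shift + len(reference)].
    The score is the dot product between that section and the reference.
    Shifts whose section would run past the signal are skipped; if no
    shift is valid, max() raises ValueError.
    """
    width = len(reference)

    def score(shift):
        section = signal[shift:shift + width]
        return sum(a * b for a, b in zip(section, reference))

    valid = [s for s in shifts if s >= 0 and s + width <= len(signal)]
    return max(valid, key=score)


if __name__ == "__main__":
    signal = [0, 0, 1, 2, 1, 0, 0, 0]
    reference = [1, 2, 1]
    print(best_time_shift(signal, reference, range(0, 6)))
```

    The same loop could score candidates by squared difference instead, which is the criterion the abstract pairs with resampling factors.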

    Speech coding and decoding apparatus
    3.
    Reissued patent (expired)

    Publication No.: USRE36721E

    Publication date: 2000-05-30

    Application No.: US561751

    Filing date: 1995-11-22

    IPC classification: G10L9/00

    Abstract: A speech signal is input to an excitation signal generating section, a prediction filter and a prediction parameter calculator. The prediction parameter calculator calculates a predetermined number of prediction parameters (LPC parameters or reflection coefficients) by an autocorrelation method or covariance method, and supplies the acquired prediction parameters to a prediction parameter coder. The codes of the prediction parameters are sent to a decoder and a multiplexer. The decoder sends decoded values of the codes of the prediction parameters to the prediction filter and the excitation signal generating section. The prediction filter calculates a prediction residual signal, which is the difference between the input speech signal and the decoded prediction parameter, and sends it to the excitation signal generating section. The excitation signal generating section calculates the pulse interval and amplitude for each of a predetermined number of subframes based on the input speech signal, the prediction residual signal and the quantized value of the prediction parameter, and sends them to the multiplexer. The multiplexer combines these codes and the codes of the prediction parameters, and sends the results as an output signal of a coding apparatus to a transmission path or the like.
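
    The autocorrelation method mentioned in the abstract is commonly solved with the Levinson-Durbin recursion, which yields both LPC coefficients and reflection coefficients. The following is a minimal pure-Python sketch of that textbook procedure, plus a residual computed in the usual way as the input minus its linear prediction. The function names are hypothetical, quantization of the parameters is omitted, and nothing here is taken from the patent's own implementation.

```python
def autocorrelation(x, max_lag):
    """r[k] = sum over n of x[n] * x[n + k], for k = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations (assumes r[0] > 0).

    Returns (a, k, err) where a[j-1] weights x[n-j] in the predictor,
    k holds the reflection coefficients, and err is the final
    prediction-error energy.
    """
    a = [0.0] * (order + 1)
    k = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k[i] = acc / err
        new_a = a[:]
        new_a[i] = k[i]
        for j in range(1, i):
            new_a[j] = a[j] - k[i] * a[i - j]
        a = new_a
        err *= 1.0 - k[i] * k[i]
    return a[1:], k[1:], err


def prediction_residual(x, a):
    """e[n] = x[n] - sum over j of a[j] * x[n-1-j], zero history before n = 0."""
    out = []
    for n in range(len(x)):
        pred = sum(a[j] * x[n - 1 - j]
                   for j in range(len(a)) if n - 1 - j >= 0)
        out.append(x[n] - pred)
    return out


if __name__ == "__main__":
    x = [0.9 ** n for n in range(200)]  # decaying exponential test signal
    a, k, err = levinson_durbin(autocorrelation(x, 2), 2)
    e = prediction_residual(x, a)
    print(a, k, err, sum(v * v for v in e))
```

    For the decaying exponential used in the demo, a first-order predictor is already nearly exact, so the second coefficient comes out close to zero and almost all residual energy sits in the first sample.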

Spread sheet reading-out/collating apparatus, spread sheet reading-out/collating method, and a computer-readable recording medium with program making computer execute method stored therein
    4.
    Granted invention patent (expired)

    Publication No.: US6065023A

    Publication date: 2000-05-16

    Application No.: US14571

    Filing date: 1998-01-28

    Applicant: Nobuhide Yamazaki

    Inventor: Nobuhide Yamazaki

    CPC classification: G10L13/00

    Abstract: A spread sheet reading-out/collating apparatus in which a spread sheet preparation module, using a read-out range determining module, obtains a range to be read out from the position of a header cell specified by a read-out object specifying module, and outputs the cell data within that range, together with the display format, to a voice-generating data generation module. The voice-generating data generation module generates voice-generating data for text in which Chinese characters and Japanese characters are mixed, and a voice synthesis module outputs voice based on the voice-generating data.

    Counter homeostasis oscillation perturbation signals (CHOPS) detection
    5.
    Granted invention patent (expired)

    Publication No.: US6055501A

    Publication date: 2000-04-25

    Application No.: US108926

    Filing date: 1998-07-01

    IPC classification: G10L9/00

    CPC classification: G10L25/48

    Abstract: A method and apparatus for detecting counter homeostasis oscillation perturbation signals (CHOPS) found within the waveform of human speech that reflect either arousal in the autonomic nervous system or other biological processes. The apparatus is a speech analysis system for obtaining biofeedback information from human speech samples having variable duration. The speech analysis system comprises means for digitizing the human speech samples, storage means for receiving the digitized speech samples from the digitizing means and storing the digitized speech samples, processing means for detecting and analyzing CHOPS in the digitized speech samples, and display means for presenting the analyzed speech samples in a visual representation. The speech analysis system may further include transducer means for collecting and transducing human speech samples into electrical signals and input means for configuring the analysis parameters of the processing means. The present invention does not require any electrode or probe attachment from the speech analysis system to a subject. The method provides biofeedback from physiological indicators of stress using the speech analysis system. The method includes recording a human speech sample having variable duration with the transducer means, digitizing the human speech sample with the means for digitizing, storing the digitized speech sample in the storage means, determining CHOPS in the digitized speech sample with the processing means based on pre-determined parameters, and identifying relationships between the CHOPS in the digitized speech sample with the processing means.

    Synthesis of speech signals in the absence of coded parameters
    6.
    Granted invention patent (expired)

    Publication No.: US6014621A

    Publication date: 2000-01-11

    Application No.: US831841

    Filing date: 1997-04-02

    Applicant: Juin-Hwey Chen

    Inventor: Juin-Hwey Chen

    Abstract: A speech compression system called "Transform Predictive Coding", or TPC, provides for encoding 7 kHz wideband speech (16 kHz sampling) at a target bit-rate range of 16 to 32 kb/s (1 to 2 bits/sample). The system uses short-term and long-term prediction to remove the redundancy in speech. A prediction residual is transformed and coded in the frequency domain to take advantage of knowledge of human auditory perception. The TPC coder uses only open-loop quantization and therefore has a fairly low complexity. The speech quality of TPC is essentially transparent at 32 kb/s, very good at 24 kb/s, and acceptable at 16 kb/s.
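
    The bits-per-sample figures quoted in the abstract follow directly from dividing the bit rate by the 16 kHz sampling rate. A short Python check of that arithmetic is given below; the helper name `bits_per_sample` is hypothetical and introduced only for illustration.

```python
SAMPLE_RATE_HZ = 16_000  # sampling rate stated in the abstract


def bits_per_sample(bit_rate_bps, sample_rate_hz=SAMPLE_RATE_HZ):
    """Average number of coded bits available per input sample."""
    return bit_rate_bps / sample_rate_hz


if __name__ == "__main__":
    for rate_kbps in (16, 24, 32):
        print(rate_kbps, "kb/s ->", bits_per_sample(rate_kbps * 1000), "bits/sample")
```

    The intermediate 24 kb/s operating point therefore corresponds to 1.5 bits per sample.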

    Speech encoding method and apparatus
    7.
    Granted invention patent (expired)

    Publication No.: US6003001A

    Publication date: 1999-12-14

    Application No.: US882156

    Filing date: 1997-06-25

    Applicant: Yuji Maeda

    Inventor: Yuji Maeda

    CPC classification: G10L19/12

    Abstract: In encoding such as PSI-CELP, in which either an adaptive codebook or a fixed codebook is selected by switching, the waveform distortion caused by selection of the fixed codebook when the input speech frequency components change significantly is diminished. An output of an adaptive codebook 21 or an output of a fixed codebook 22 is selected by a changeover selection switch 26 and summed with an output of noise codebooks 23, 24 so as to be sent to a linear prediction synthesis filter 16. A switching control circuit 19 for controlling the switching of the changeover selection switch 26 operates in response to a prediction gain, supplied from a linear prediction analysis circuit 14, which is the ratio of the linear prediction residual energy to the initial signal energy. If the prediction gain is smaller than a pre-set threshold value, the switching control circuit 19 judges the input signal to be voiced and controls the changeover selection switch 26 to compulsorily select the output of the adaptive codebook 21.
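
    The switching rule described in the abstract can be sketched in a few lines of Python. The prediction gain is computed as the abstract defines it (residual energy divided by signal energy), and the adaptive codebook is forced when that ratio falls below a threshold. The function names, the string labels, the handling of an all-zero frame, and the threshold value in the demo are all assumptions made for illustration.

```python
def energy(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)


def prediction_gain(residual, signal):
    """Residual energy over signal energy, following the abstract's definition."""
    sig = energy(signal)
    if sig == 0.0:
        return 1.0  # assumption: an all-zero frame is treated as unpredictable
    return energy(residual) / sig


def select_codebook(residual, signal, threshold, searched_choice):
    """Force the adaptive codebook for frames judged voiced.

    searched_choice is whatever the normal codebook search would have
    picked ("adaptive" or "fixed"); it is kept unless the gain test fires.
    """
    if prediction_gain(residual, signal) < threshold:
        return "adaptive"
    return searched_choice


if __name__ == "__main__":
    # Small residual relative to the signal: judged voiced, adaptive forced.
    print(select_codebook([0.1, 0.1], [1.0, 1.0], 0.5, "fixed"))
    # Residual as large as the signal: the searched choice is kept.
    print(select_codebook([1.0, 1.0], [1.0, 1.0], 0.5, "fixed"))
```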

System and method for noise threshold adaptation for voice activity detection in nonstationary noise environments
    9.
    Granted invention patent (expired)

    Publication No.: US5991718A

    Publication date: 1999-11-23

    Application No.: US31726

    Filing date: 1998-02-27

    Applicant: David Malah

    Inventor: David Malah

    IPC classification: G10L25/78 G10L9/00

    CPC classification: G10L25/78 G10L2025/786

    Abstract: The system and method of the invention relate to voice detection technology for determining instants of time at which a snapshot of noise characteristics results in improved adaptation of noise floors used in voice detection. The approach is based on the "lower envelope" of the smoothed input signal power. Incorporation of this approach in a simple time domain VAD (Voice Activity Detector) results in an effective low-complexity system which, on the basis of simulations, gives good performance down to SNR values of about 0 dB. In the invention, the lower envelope also provides the updated value of the noise threshold during the presence of speech. The invention can also be embedded in other, more complex (e.g., frequency domain) VADs at low computational cost.
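
    To make the idea of a lower envelope of the smoothed input power concrete, here is a minimal time-domain VAD sketch in Python. The abstract does not say how the envelope is tracked, so the rule used here (drop immediately to the smoothed power, otherwise creep upward by a small factor per frame) and the constants `smooth`, `rise`, and `margin` are assumptions chosen for illustration only.

```python
def frame_power(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def vad_lower_envelope(frames, smooth=0.9, rise=1.01, margin=4.0):
    """Return one True/False speech decision per frame.

    smoothed: exponentially smoothed frame power.
    envelope: lower envelope of the smoothed power, used as the noise
              floor; it keeps updating while speech is present.
    A frame is flagged as speech when smoothed > margin * envelope.
    """
    decisions = []
    smoothed = None
    envelope = None
    for frame in frames:
        p = frame_power(frame)
        smoothed = p if smoothed is None else smooth * smoothed + (1 - smooth) * p
        if envelope is None or smoothed < envelope:
            envelope = smoothed  # follow downward moves at once
        else:
            envelope = min(envelope * rise, smoothed)  # creep upward slowly
        decisions.append(smoothed > margin * envelope)
    return decisions


if __name__ == "__main__":
    noise = [[0.1] * 80 for _ in range(20)]
    speech = [[1.0] * 80 for _ in range(20)]
    tail = [[0.1] * 80 for _ in range(60)]
    flags = vad_lower_envelope(noise + speech + tail)
    print(flags[:20], flags[20:40], flags[-1])
```

    Because the smoothed power decays gradually after speech ends, this sketch keeps flagging speech for a number of frames into the trailing noise, which acts as an implicit hangover.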

Method and system for efficiently avoiding partial matching in voice recognition
    10.
    Granted invention patent (expired)

    Publication No.: US5974381A

    Publication date: 1999-10-26

    Application No.: US995258

    Filing date: 1997-12-19

    Applicant: Syuji Kubota

    Inventor: Syuji Kubota

    CPC classification: G10L15/10 G10L2015/088

    Abstract: To avoid a predetermined amount of time and/or a certain amount of processing time prior to determining a number of frames for each speech input portion, a fast voice recognition system enables real-time frame counting based upon a comparison between a decreasing number of frames and an increasing time-dependent threshold. The real-time voice recognition also enables a substantially reduced rate of erroneous partial matching.