Automatic transmission of voice-to-text converted voice message
    1.
    Invention grant
    Status: Expired

    Publication No.: US07103154B1

    Publication date: 2006-09-05

    Application No.: US09007770

    Filing date: 1998-01-16

    Abstract: A voice messaging system includes an input device to accept a destination electronic messaging address, a voice-to-text converter to convert or transcribe a received voice message into a converted text message, and a processor to operate an electronic messaging program and to prepare the text message for automatic electronic transmission to the destination electronic messaging address. A method is also provided to convert a voice message into a text message and to transmit the same to a destination electronic messaging address. The method includes inputting a destination electronic messaging address. A received voice message is converted or transcribed into a text message and prepared as a text file. The prepared text file is automatically transmitted to the destination electronic messaging address. A log file including the text of the voice messages may be maintained with updates at a predetermined time interval. A database of the converted text messages may be generated and maintained to provide a searchable mechanism of archived messages.
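
    The flow described in this abstract (convert, prepare a text message, address it, log it) can be sketched roughly as follows. This is only an illustration, not the patented implementation: `transcribe` is a hypothetical stand-in for a real voice-to-text converter, the addresses are placeholders, and the message is built but never sent.

```python
from email.message import EmailMessage


def transcribe(voice_message: bytes) -> str:
    """Hypothetical stand-in for a voice-to-text converter (assumption)."""
    return voice_message.decode("utf-8")  # pretend the audio is already text


def prepare_message(voice_message: bytes, destination: str, sender: str) -> EmailMessage:
    """Convert a voice message to text and prepare it for electronic transmission."""
    text = transcribe(voice_message)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = destination
    msg["Subject"] = "Transcribed voice message"
    msg.set_content(text)
    return msg


def append_to_log(log: list, msg: EmailMessage) -> None:
    """Keep the text of each message in a log that can later be searched."""
    log.append(msg.get_content().strip())


log = []
message = prepare_message(b"Call me back at noon.", "user@example.com", "voicemail@example.com")
append_to_log(log, message)
```

    Actual transmission would hand `message` to a mail transport; that step is left out so the sketch has no network dependency.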

    Split vector quantization using unequal subvectors
    2.
    Invention grant
    Status: Expired

    Publication No.: US6134520A

    Publication date: 2000-10-17

    Application No.: US578441

    Filing date: 1995-12-26

    IPC classification: G10L19/02 G10L19/06 G10L5/00

    CPC classification: G10L19/038 G10L19/07

    Abstract: A 1200 b/s vocoder providing a high degree of speech intelligibility and natural voice quality includes a tenth-order linear prediction analyzer, a split vector quantizer for line spectral frequencies, circuitry providing voicing classification and pitch estimation, a differential pitch and gain quantizer and a multiplexer for producing an encoded word transmitted to a receptive demultiplexer. The vocoder provides a characteristic encoded word including a first codeword, a second codeword, a pitch codeword and a gain codeword, wherein the first and second codewords are selected from respective first and second codebooks having an equal number of codewords and wherein the first and second codewords represent unequal numbers of elements of respective first and second sub-vectors. A codebook populating method for a split vector quantizer vocoder is also utilized.
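
    A minimal sketch of split vector quantization with unequal sub-vectors, under stated assumptions: the ten line spectral frequencies are split 4/6 (the abstract does not give the split), the two toy codebooks hold two codewords each, and a plain squared-error search is used. None of the values come from the patent.

```python
def nearest(codebook, vector):
    """Return the index of the codeword closest to `vector` (squared Euclidean distance)."""
    def dist(codeword):
        return sum((c - v) ** 2 for c, v in zip(codeword, vector))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def split_vq_encode(lsf, codebook_a, codebook_b, split=4):
    """Quantize the first `split` elements with one codebook and the rest with the other."""
    assert len(codebook_a) == len(codebook_b)  # equal number of codewords
    return nearest(codebook_a, lsf[:split]), nearest(codebook_b, lsf[split:])


def split_vq_decode(indices, codebook_a, codebook_b):
    """Concatenate the two selected codewords to rebuild the full vector."""
    ia, ib = indices
    return codebook_a[ia] + codebook_b[ib]


# Toy codebooks: two codewords each, covering 4 and 6 elements respectively.
codebook_a = [[0.1, 0.2, 0.3, 0.4], [0.2, 0.3, 0.4, 0.5]]
codebook_b = [[0.5, 0.6, 0.7, 0.8, 0.9, 1.0], [0.6, 0.7, 0.8, 0.9, 1.0, 1.1]]

lsf = [0.19, 0.31, 0.4, 0.52, 0.5, 0.61, 0.7, 0.79, 0.9, 1.0]
indices = split_vq_encode(lsf, codebook_a, codebook_b)
reconstructed = split_vq_decode(indices, codebook_a, codebook_b)
```

    Because both codebooks have the same number of codewords, each index costs the same number of bits even though the sub-vectors differ in length.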

    Text-to-speech system and a method and apparatus for training the same based upon intonational feature annotations of input text
    3.
    Invention grant
    Status: Expired

    Publication No.: US6003005A

    Publication date: 1999-12-14

    Application No.: US978359

    Filing date: 1997-11-25

    Applicant: Julia Hirschberg

    Inventor: Julia Hirschberg

    CPC classification: G10L13/04

    Abstract: A method of training a TTS or other system to assign intonational features, such as intonational phrase boundaries, to input text that overcomes the shortcomings of the known methods is described. The method of training involves taking a set of predetermined text (not speech or a signal representative of speech) and having a human annotate it with intonational feature annotations. This results in annotated text. Next, the structure of the set of predetermined text is analyzed to generate information. This information is used, along with the intonational feature annotations, to generate a statistical representation. The statistical representation may then be stored and repeatedly used to generate synthesized speech from new sets of input text without training the TTS system further. The resulting trained system and use thereof are also part of the invention.
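
    The training idea (annotated text plus text-derived information yields a reusable statistical representation) might be sketched as below. The single feature used here, whether a word ends in punctuation, and the simple frequency table are assumptions chosen for brevity; the abstract does not specify the features or the statistical model.

```python
from collections import defaultdict


def features(words, i):
    """Text-derived feature for position i: does the word end in punctuation? (assumed)"""
    return words[i][-1] in ",.;:?!"


def train(annotated_sentences):
    """Estimate P(boundary | feature) from human-annotated text.

    Each sentence is a list of (word, has_boundary_after) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # feature -> [boundary count, total count]
    for sentence in annotated_sentences:
        words = [w for w, _ in sentence]
        for i, (_, boundary) in enumerate(sentence):
            entry = counts[features(words, i)]
            entry[0] += int(boundary)
            entry[1] += 1
    return {f: b / n for f, (b, n) in counts.items()}


def predict_boundaries(model, words, threshold=0.5):
    """Apply the stored statistics to new text without further training."""
    return [model.get(features(words, i), 0.0) >= threshold for i in range(len(words))]


training = [
    [("Well,", True), ("I", False), ("tried.", True)],
    [("Yes,", True), ("we", False), ("can", False), ("go.", True)],
]
model = train(training)
prediction = predict_boundaries(model, ["No,", "thanks."])
```

    Note that training uses only text and human annotations, never recorded speech, which is the point the abstract emphasizes.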

    Speech compression by speech recognition
    4.
    Invention grant
    Status: Expired

    Publication No.: US5987405A

    Publication date: 1999-11-16

    Application No.: US881435

    Filing date: 1997-06-24

    IPC classification: H04B1/66 G10L5/00

    CPC classification: G10L19/0018

    Abstract: A method of transmitting speech signals with reduced bandwidth requirements. With this invention an original speech signal is first converted to a textual representation, and a facsimile of the original speech is determined from the textual representation. Then a minimum error turn is derived from the difference between the original speech signal and the facsimile of the original speech signal. The minimum error turn is then compressed, and it is this compressed minimum error turn, along with the textual representation, that is transmitted on the communications medium. At the receiving end, the textual representation and the difference representation are split through a demultiplexer. The textual representation is then passed through a synthesizer while the difference representation is passed through a mapper. The synthesizer along with synthesis parameter storage converts the textual representation into a digital representation of speech, while the mapper modifies the received difference representation by applying sub or super sampling corrections.
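
    A toy round trip for the text-plus-difference scheme, under loud assumptions: `synthesize` is a hypothetical deterministic stand-in for a speech synthesizer, the recognized text is supplied by hand, samples are small integers, and `zlib` stands in for whatever compressor is actually used. The mapper's sub/super sampling corrections are not modeled.

```python
import struct
import zlib


def synthesize(text, length):
    """Hypothetical deterministic synthesizer: a crude 'facsimile' derived from the text."""
    codes = [ord(ch) for ch in text] or [0]
    return [codes[i % len(codes)] for i in range(length)]


def encode(samples, text):
    """Send the text plus the compressed difference between the original and the facsimile."""
    facsimile = synthesize(text, len(samples))
    residual = [s - f for s, f in zip(samples, facsimile)]
    packed = struct.pack(f"<{len(residual)}h", *residual)  # 16-bit signed integers
    return text, len(samples), zlib.compress(packed)


def decode(text, length, compressed):
    """Re-synthesize the facsimile at the receiver and add the difference back."""
    residual = struct.unpack(f"<{length}h", zlib.decompress(compressed))
    facsimile = synthesize(text, length)
    return [f + r for f, r in zip(facsimile, residual)]


original = [100, 105, 110, 108, 104, 99, 101, 107]
payload = encode(original, "hi")
restored = decode(*payload)
```

    The bandwidth saving depends on the difference signal being small and compressible, which in turn depends on how closely the synthesizer on both ends reproduces the original speech.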

    Voice-activated interactive speech recognition device and method
    5.
    Invention grant
    Status: Expired

    Publication No.: US5983186A

    Publication date: 1999-11-09

    Application No.: US700181

    Filing date: 1996-08-20

    CPC classification: G10L15/26 G10L2025/783

    Abstract: Techniques for implementing adaptable voice activation operations for interactive speech recognition devices and instruments. Specifically, such speech recognition devices and instruments include an input sound signal power or volume detector in communication with a central CPU for bringing the CPU out of an initial sleep state upon detection of a perceived voice that exceeds a predetermined threshold volume level and is continuously perceived for at least a certain period of time. If both these conditions are satisfied, the CPU is transitioned into an active mode so that the perceived voice can be analyzed against a set of registered key words to determine if a "power on" command or similar instruction has been received. If so, the CPU maintains an active state and normal speech recognition processing ensues until a "power off" command is received. However, if the perceived and analyzed voice cannot be recognized, it is deemed to be background noise and the minimum threshold is selectively updated to accommodate the volume level of the perceived but unrecognized voice. Other aspects include tailoring the volume level of the synthesized voice response according to the perceived volume level as detected by the input sound signal power detector, as well as modifying audible response volume in accordance with updated volume threshold levels.
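
    The wake-up logic can be pictured as a small state machine. The sketch below is an assumption-laden illustration: volumes are arbitrary integers, "a certain period of time" is counted in frames, `recognized_text` stands in for the output of a real recognizer, and the threshold update rule (raise it to the noise level) is one possible reading of "selectively updated".

```python
class VoiceActivator:
    """Toy state machine: sleep until sound stays above a threshold long enough."""

    def __init__(self, threshold=50, min_frames=3, keywords=("power on",)):
        self.threshold = threshold      # volume level that counts as voice
        self.min_frames = min_frames    # how long the level must be sustained
        self.keywords = set(keywords)   # registered key words
        self.loud_frames = 0
        self.active = False

    def feed(self, volume, recognized_text=None):
        """Process one frame; `recognized_text` stands in for a recognizer's output."""
        if self.active:
            if recognized_text == "power off":
                self.active = False
            return self.active
        self.loud_frames = self.loud_frames + 1 if volume > self.threshold else 0
        if self.loud_frames >= self.min_frames:
            self.loud_frames = 0
            if recognized_text in self.keywords:
                self.active = True
            else:
                # Unrecognized sound is treated as background noise:
                # raise the threshold to its volume level.
                self.threshold = max(self.threshold, volume)
        return self.active


device = VoiceActivator()
for level in (60, 62, 61):          # sustained noise, nothing recognized
    device.feed(level)
noise_threshold = device.threshold
for level in (70, 72, 75):          # louder, sustained, and a key word is heard
    state = device.feed(level, recognized_text="power on")
```

    After the noisy frames the threshold has risen, so the same background level no longer wakes the device, while a louder key word still does.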

    Methods and apparatus for decreasing the size of pattern recognition models by pruning low-scoring models from generated sets of models
    6.
    Invention grant
    Status: Expired

    Publication No.: US5950158A

    Publication date: 1999-09-07

    Application No.: US903532

    Filing date: 1997-07-30

    Applicant: Kuansan Wang

    Inventor: Kuansan Wang

    IPC classification: G06K9/62 G10L15/14 G10L5/00

    CPC classification: G10L15/144 G06K9/6228

    Abstract: Methods and apparatus for producing efficiently sized models suitable for pattern recognition purposes are described. Various embodiments are directed to the automated generation, evaluation, and selection of reduced size models from an initial model having a relatively large number of components, e.g., more components than can be stored for a particular intended application. To achieve model size reduction in an automated iterative manner, expectation maximization (EM) model training techniques are combined, in accordance with the present invention, with model size constraints. In one embodiment, a plurality of reduced size models are generated using a Lagrange multiplier from an input model and input size constraints. The plurality of reduced size models are stored in a buffer and scored using a likelihood scoring technique. The one of the reduced size models receiving the highest score may be selected as the reduced size model to be output or used as an input model during future iterations of the model size reduction process. The reduced size model to be used, e.g., for speech, image or other pattern recognition purposes, may be selected from the buffered models produced during multiple iterations of the model size reduction process.
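
    The generate, buffer, score, select loop can be illustrated with a one-dimensional Gaussian mixture. This sketch deliberately swaps in a simpler technique than the abstract describes: candidates are produced by dropping one component at a time rather than by constrained EM with a Lagrange multiplier, and no retraining is done between iterations. All numbers are made up.

```python
import math


def log_likelihood(model, data):
    """Total log-likelihood of 1-D data under a Gaussian mixture [(weight, mean, var), ...]."""
    total = 0.0
    for x in data:
        p = sum(w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in model)
        total += math.log(p)
    return total


def reduced_candidates(model):
    """Generate one reduced model per component by dropping it and renormalizing weights."""
    for i in range(len(model)):
        kept = model[:i] + model[i + 1:]
        scale = sum(w for w, _, _ in kept)
        yield [(w / scale, m, v) for w, m, v in kept]


def prune(model, data, max_components):
    """Iteratively keep the highest-scoring reduced model until the size constraint holds."""
    while len(model) > max_components:
        buffer = list(reduced_candidates(model))  # buffered candidate models
        model = max(buffer, key=lambda c: log_likelihood(c, data))
    return model


initial = [(0.45, 0.0, 1.0), (0.45, 5.0, 1.0), (0.10, 20.0, 1.0)]
data = [-0.5, 0.2, 0.4, 4.6, 5.1, 5.3]
small = prune(initial, data, max_components=2)
```

    Here the component centered at 20 explains none of the data, so the candidate without it scores highest and is selected.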

    Device and method for dubbing an audio-visual presentation which generates synthesized speech and corresponding facial movements
    7.
    Invention grant
    Status: Expired

    Publication No.: US5826234A

    Publication date: 1998-10-20

    Application No.: US760811

    Filing date: 1996-12-05

    Applicant: Bertil Lyberg

    Inventor: Bertil Lyberg

    Abstract: A device and method in which polyphones of speech in a first language are received and stored, and a movement pattern in a person's face and/or body is registered. The movement pattern is registered by measuring movement at a number of measuring points on the face/body of the speaker, with the measurements made at the same time that the polyphones are registered. In connection with translation of a person's speech from one language into another, the polyphones and corresponding movement patterns in the face are linked to a movement model of the face. An image of the real person's face is then pasted over the model, yielding a movement pattern that corresponds to the language. The invention consequently gives the impression that the person really speaks the language in question.
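
    One way to picture the stored data is a bank that keeps each polyphone together with the movement frames measured while it was spoken. The structure below is purely illustrative: the polyphone labels, the single "jaw" measuring point, and the sample values are invented, and rendering the face image over the model is out of scope.

```python
def register(bank, polyphone, audio, movement):
    """Store a polyphone with the movement measured while it was spoken.

    `movement` is a list of frames; each frame maps a measuring point to (x, y).
    """
    bank[polyphone] = {"audio": audio, "movement": movement}


def assemble(bank, polyphones):
    """Concatenate stored audio and movement frames for a target-language sequence."""
    audio, movement = [], []
    for p in polyphones:
        audio.extend(bank[p]["audio"])
        movement.extend(bank[p]["movement"])
    return audio, movement


bank = {}
register(bank, "ha", [0.1, 0.3], [{"jaw": (0, -2)}, {"jaw": (0, -4)}])
register(bank, "lo", [0.2, 0.1], [{"jaw": (0, -3)}, {"jaw": (0, -1)}])
audio, movement = assemble(bank, ["ha", "lo"])
```

    Keeping audio and movement under the same key is what lets the synthesized speech and the facial movement stay in step.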

    Speech coding apparatus having amplitude information set to correspond with position information
    8.
    Invention grant
    Status: Expired

    Publication No.: US5826226A

    Publication date: 1998-10-20

    Application No.: US722635

    Filing date: 1996-09-27

    Applicant: Kazunori Ozawa

    Inventor: Kazunori Ozawa

    CPC classification: G10L19/10

    Abstract: The invention provides a speech coding apparatus by which a good sound quality can be obtained even when the bit rate is low. The speech coding apparatus includes an excitation quantization circuit which quantizes an excitation signal using a plurality of pulses. The position of at least one of the pulses is represented by a number of bits determined in advance, and the amplitude of the pulse is determined in advance depending upon the position of the pulse.
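
    Because each pulse amplitude is fixed in advance by its position, only positions need to be coded. A small sketch of that idea follows; the frame length, the amplitude table, and the pulse positions are hypothetical values, not taken from the patent.

```python
def build_excitation(frame_length, pulse_positions, amplitude_table):
    """Place pulses in a frame; each amplitude is looked up from the pulse position."""
    excitation = [0.0] * frame_length
    for position in pulse_positions:
        excitation[position] += amplitude_table[position]
    return excitation


def bits_for_positions(frame_length, pulse_count):
    """Bits needed when only positions are sent, at a fixed number of bits per pulse."""
    bits_per_position = (frame_length - 1).bit_length()
    return pulse_count * bits_per_position


# Hypothetical amplitude pattern fixed in advance for an 8-sample frame.
amplitude_table = [1.0, -1.0, 0.5, -0.5, 1.0, -1.0, 0.5, -0.5]
excitation = build_excitation(8, [1, 4, 6], amplitude_table)
bits = bits_for_positions(8, 3)
```

    With three pulses in an 8-sample frame, 9 bits carry the whole excitation, since no separate amplitude bits are transmitted.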

    Learning vector quantization and a temporary memory such that the codebook contents are renewed when a first speaker returns
    9.
    Invention grant
    Status: Expired

    Publication No.: US5797118A

    Publication date: 1998-08-18

    Application No.: US512311

    Filing date: 1995-08-08

    Applicant: Akitoshi Saito

    Inventor: Akitoshi Saito

    Abstract: An encoding/decoding system employing vector quantization realizes a high quality encoding and decoding with decreased quantizing errors, employing a small-sized codebook which faithfully represents each of the inputted waveform vectors. An encoding/decoding system includes an encoding apparatus and a decoding apparatus, each having a codebook for storing information vectors representative of a predetermined number of signal patterns and indices that determine the information vectors. The encoding apparatus compares a vector representing an object signal to be quantized with each information vector in the codebook, selects an information vector that is closest to the vector and outputs an index for the information vector. The decoding apparatus obtains an information vector corresponding to the index obtained at the encoding apparatus side by referring to the codebook and decodes the object signal. The codebook utilizes a temporary memory connected thereto. The content of the codebook is temporarily moved to the temporary memory when the identity of the speaker changes. The contents of the temporary memory are read out when the original speaker returns to the system.
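
    The save-and-restore behaviour of the codebook can be sketched as a small class. Several details are assumptions made for illustration: the learning step simply nudges the selected vector toward the input, speakers are identified by a string key, and a new speaker starts from the initial codebook.

```python
class AdaptiveCodebook:
    """Vector-quantization codebook with a temporary memory keyed by speaker."""

    def __init__(self, initial_vectors):
        self.initial = [list(v) for v in initial_vectors]
        self.vectors = [list(v) for v in initial_vectors]
        self.temporary = {}          # speaker -> saved codebook contents
        self.speaker = None

    def encode(self, vector, rate=0.5):
        """Return the index of the closest vector, then nudge it toward the input."""
        index = min(range(len(self.vectors)),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(self.vectors[i], vector)))
        self.vectors[index] = [a + rate * (b - a) for a, b in zip(self.vectors[index], vector)]
        return index

    def change_speaker(self, speaker):
        """Move the current contents to temporary memory; restore them if the speaker returns."""
        if self.speaker is not None:
            self.temporary[self.speaker] = self.vectors
        saved = self.temporary.get(speaker)
        self.vectors = saved if saved is not None else [list(v) for v in self.initial]
        self.speaker = speaker


codebook = AdaptiveCodebook([[0.0, 0.0], [1.0, 1.0]])
codebook.change_speaker("A")
codebook.encode([0.2, 0.2])            # adapts entry 0 toward speaker A
adapted = [list(v) for v in codebook.vectors]
codebook.change_speaker("B")           # A's contents go to temporary memory
codebook.change_speaker("A")           # A returns: contents are read back
```

    The encoder and decoder would each need to run the same updates so that an index keeps pointing at the same vector on both sides.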

    Speech coding apparatus, linear prediction coefficient analyzing apparatus and noise reducing apparatus
    10.
    Invention grant
    Status: Expired

    Publication No.: US5774846A

    Publication date: 1998-06-30

    Application No.: US559667

    Filing date: 1995-11-20

    Applicant: Toshiyuki Morii

    Inventor: Toshiyuki Morii

    Abstract: A sample speech is analyzed by a speech analyzing unit to obtain sample characteristic parameters, and a coding distortion is calculated from the sample characteristic parameters in each of a plurality of coding modules. The sample characteristic parameters and the coding distortions are statistically processed by a statistical processing unit to obtain a coding module selecting rule. Thereafter, when a speech is analyzed by the speech analyzing unit to obtain characteristic parameters, an appropriate coding module is selected by a coding module selecting unit from the coding modules according to the coding module selecting rule on condition that a coding distortion for the characteristic parameters is minimized in the appropriate coding module. Thereafter, the characteristic parameters of the speech are coded in the appropriate coding module, and a coded speech is obtained. When the coded speech is decoded, a reproduced speech is obtained. Accordingly, because an appropriate coding module can be easily selected from a plurality of coding modules according to the coding module selecting rule, any allophone occurring in a reproduced speech can be prevented at a low calculation volume.
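
    The two phases (learn a selecting rule from sample distortions, then pick a module from the rule alone) might look like the sketch below. Everything concrete is an assumption: the "modules" are toy clipped uniform quantizers, and the rule is a nearest-centroid classifier standing in for the unspecified statistical processing.

```python
def quantize(params, module):
    """Toy coding module: uniform quantizer with a step size and an upper clipping limit."""
    step, limit = module
    return [min(round(p / step) * step, limit) for p in params]


def distortion(params, module):
    """Squared coding error of one module for one parameter vector."""
    return sum((p - q) ** 2 for p, q in zip(params, quantize(params, module)))


def learn_rule(samples, modules):
    """For each module, average the sample parameters on which it had the least distortion."""
    groups = {name: [] for name in modules}
    for params in samples:
        best = min(modules, key=lambda name: distortion(params, modules[name]))
        groups[best].append(params)
    return {name: [sum(col) / len(col) for col in zip(*vecs)]
            for name, vecs in groups.items() if vecs}


def select_module(rule, params):
    """Pick a module from the learned rule alone, without trying every module."""
    return min(rule, key=lambda name: sum((c - p) ** 2 for c, p in zip(rule[name], params)))


# Hypothetical modules: (step size, clipping limit).
modules = {"low": (0.25, 1.0), "high": (1.0, 8.0)}
samples = [[0.25, 0.5], [0.75, 0.25], [3.0, 4.0], [5.0, 2.0]]
rule = learn_rule(samples, modules)
choice = select_module(rule, [4.0, 3.0])
```

    At run time only the comparison against the stored centroids is needed, which is where the low calculation volume in the abstract would come from.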
