Method for Animating an Image Using Speech Data
    1.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20080259085A1

    Publication date: 2008-10-23

    Application No.: US12147840

    Filing date: 2008-06-27

    IPC classification: G06T15/70

    Abstract: A method for animating an image is useful for animating avatars using real-time speech data. According to one aspect, the method includes identifying an upper facial part and a lower facial part of the image (step 705); animating the lower facial part based on speech data that are classified according to a reduced vowel set (step 710); tilting both the upper facial part and the lower facial part using a coordinate transformation model (step 715); and rotating both the upper facial part and the lower facial part using an image warping model (step 720).

    Speech dialog method and device
    2.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20070055524A1

    Publication date: 2007-03-08

    Application No.: US11222215

    Filing date: 2005-09-08

    IPC classification: G10L15/18

    CPC classification: G10L15/22 G10L13/04

    Abstract: An electronic device (200) for speech dialog includes functions that receive (205, 105) an utterance that includes an instantiated variable (215), perform voice recognition (210, 115, 120) of the instantiated variable to determine a most likely set of acoustic states (220) and a corresponding sequence of phonemes with stress information (215), and determine prosodic characteristics (272, 274, 276, 130) for a synthesized value of the instantiated variable (236) from the sequence of phonemes with stress information and a set of stored prosody models. The electronic device generates (335, 140) a synthesized value of the instantiated variable using the most likely set of acoustic states and the prosodic characteristics of the instantiated variable.

    Orientation determination for handwritten characters for recognition thereof
    3.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20050041865A1

    Publication date: 2005-02-24

    Application No.: US10955581

    Filing date: 2004-09-30

    IPC classification: G06K9/32 G06K9/00

    CPC classification: G06K9/3283 G06K2209/01

    Abstract: According to one aspect of the invention, there is provided a method (20) and electronic device (1) for orientation determination and recognition of handwritten characters scribed on a touchscreen (5). The method (20) includes receiving (22) the handwritten character and then normalizing (23) the character to provide a scaled character that fits within a defined boundary. The scaled character comprises at least one line. A step of identifying (24) the lines of the scaled character as a vector is effected, and thereafter a step of rotating (26) rotates the scaled character from an initial orientation to a final orientation through a plurality of discrete orientations. A step of calculating (27) then calculates, for each of the discrete orientations, magnitudes of co-ordinate components of each vector, and a summing step (28) sums, for each of said discrete orientations, the co-ordinate components to provide a summed co-ordinate component for the scaled character at the corresponding discrete orientation. An assessing step (31) then assesses each of the summed co-ordinate components to determine a suitable orientation of the scaled character.
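
    The rotate, calculate, sum, and assess steps described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not say how the summed components are assessed, so the criterion used here (pick the orientation whose summed |x| + |y| is smallest, i.e. where strokes are most axis-aligned) is an assumption, as are the function names and the 15-degree step.

```python
import math

def summed_components(vectors, angle_deg):
    """Rotate every stroke vector by angle_deg and sum the magnitudes
    of the x and y components over all vectors."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    sum_x = sum_y = 0.0
    for dx, dy in vectors:
        rx = dx * cos_a - dy * sin_a
        ry = dx * sin_a + dy * cos_a
        sum_x += abs(rx)
        sum_y += abs(ry)
    return sum_x, sum_y

def best_orientation(vectors, step_deg=15):
    """Assumed assessment rule (not from the abstract): choose the discrete
    rotation in [0, 90) that minimizes the total summed component magnitude."""
    angles = range(0, 90, step_deg)
    return min(angles, key=lambda ang: sum(summed_components(vectors, ang)))

# A "plus" sign whose two strokes are tilted by 30 degrees.
tilt = math.radians(30)
tilted_plus = [
    (2 * math.cos(tilt), 2 * math.sin(tilt)),
    (-2 * math.sin(tilt), 2 * math.cos(tilt)),
]
correction = best_orientation(tilted_plus)
```

    Here a further 60-degree rotation brings both strokes onto the axes, so `correction` is 60. Because the sketch only scans a quarter turn, it cannot tell an upright character from one rotated by a multiple of 90 degrees.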

    Method and apparatus for non-speech activity reduction of a low bit rate digital voice message
    4.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US06370500B1

    Publication date: 2002-04-09

    Application No.: US09409187

    Filing date: 1999-09-30

    IPC classification: G10L19/00

    CPC classification: G10L19/012 G10L25/78

    Abstract: A technique used in a speech encoder (107) reduces non-speech activity of a low bit rate digital voice message. Speech model parameters that include quantized speech spectral parameter vectors are generated in a sequence of frames. A determination is made as to which frames of the sequence of frames are voiced frames and which frames are unvoiced frames. A consecutive sequence of unvoiced frames is identified (2330) as an unvoiced burst when a length, NUV, of the consecutive sequence of frames exceeds a predetermined length, Ns. A non-speech activity portion of the unvoiced burst is identified (2335-2365) and removed.
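
    The burst-identification step can be illustrated with a short sketch. It only covers the run-length test described above (a run of unvoiced frames longer than the threshold counts as an unvoiced burst); how the non-speech portion inside a burst is located and removed is not specified in the abstract and is not attempted here. The function name and the boolean-flag input format are assumptions.

```python
def find_unvoiced_bursts(voiced_flags, min_len):
    """Return (start, end) frame index pairs, end exclusive, for every run of
    consecutive unvoiced frames whose length exceeds min_len."""
    bursts = []
    run_start = None
    for i, voiced in enumerate(voiced_flags):
        if not voiced:
            if run_start is None:
                run_start = i          # a new unvoiced run begins
        elif run_start is not None:
            if i - run_start > min_len:
                bursts.append((run_start, i))
            run_start = None           # the run ended at a voiced frame
    # Handle a run that reaches the end of the message.
    if run_start is not None and len(voiced_flags) - run_start > min_len:
        bursts.append((run_start, len(voiced_flags)))
    return bursts

# 1 = voiced frame, 0 = unvoiced frame; threshold of 2 frames.
flags = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
bursts = find_unvoiced_bursts(flags, min_len=2)
```

    With this input the two-frame run at indices 1 to 2 is ignored, while the runs starting at frames 4 and 9 are reported as bursts.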

    Alphanumeric message composing method using telephone keypad

    Publication No.: US6137867A

    Publication date: 2000-10-24

    Application No.: US78733

    Filing date: 1998-05-14

    IPC classification: H04M11/00

    Abstract: An interactive method for composing an alphanumeric message by a caller using a telephone keypad includes storing (215) a lexical database (135) from which unigram probabilities, forward conditional probabilities, and backward conditional probabilities for a plurality of words can be recovered; storing a received sequence of key codes (405) representing a sequence in which keys on a telephone style keypad are keyed; generating a word trellis including candidate words (415) derived from the sequence and the lexical database; determining a most likely phrase (420) from the candidate words, the unigram probabilities, forward conditional probabilities, and backward conditional probabilities; generating a most likely message (425) from the most likely phrase and presenting the most likely message to the caller; and confirming that the most likely message is the alphanumeric message (430).
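
    The word-trellis idea can be sketched with a small Viterbi search. This is a simplified illustration under stated assumptions: the toy lexicon and all probability values are invented, only unigram and forward conditional (bigram) probabilities are used (the backward conditional probabilities mentioned in the abstract are omitted), and unseen word pairs fall back to a floored unigram score, which is a choice made here rather than anything taken from the patent.

```python
import math

# Standard letter layout of a telephone keypad.
KEYS = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
        "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_KEY = {ch: key for key, letters in KEYS.items() for ch in letters}

def key_code(word):
    """Key sequence a caller would press to spell the word."""
    return "".join(LETTER_TO_KEY[ch] for ch in word)

# Hypothetical toy lexical database (values invented for illustration).
UNIGRAM = {"good": 0.4, "home": 0.3, "gone": 0.2, "hood": 0.1,
           "call": 0.6, "ball": 0.4}
BIGRAM = {("call", "home"): 0.7, ("call", "good"): 0.1, ("ball", "gone"): 0.5}

def most_likely_phrase(key_words, floor=1e-3):
    """Viterbi search over a trellis with one column of candidates per keyed word.
    Assumes every key code matches at least one lexicon word."""
    trellis = [[w for w in UNIGRAM if key_code(w) == code] for code in key_words]
    best = {w: (math.log(UNIGRAM[w]), [w]) for w in trellis[0]}
    for column in trellis[1:]:
        new_best = {}
        for w in column:
            score, path = max(
                (prev_score + math.log(BIGRAM.get((p, w), floor * UNIGRAM[w])),
                 prev_path)
                for p, (prev_score, prev_path) in best.items()
            )
            new_best[w] = (score, path + [w])
        best = new_best
    return max(best.values())[1]

phrase = most_likely_phrase(["2255", "4663"])
```

    The key code 4663 is ambiguous between "good", "home", "gone", and "hood"; the preceding word resolves it, and `phrase` comes out as `["call", "home"]`.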

    Very low bit rate voice messaging system using asymmetric voice compression processing
    6.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US5781882A

    Publication date: 1998-07-14

    Application No.: US528455

    Filing date: 1995-09-14

    CPC classification: G10L19/0212 G10L25/27

    Abstract: An apparatus and method for processing a voice message to provide low bit rate speech transmission processes the voice message to generate speech parameters which are arranged into a two dimensional parameter matrix (502) including a sequence of parameter frames. The two dimensional parameter matrix (502) is transformed using a predetermined two dimensional matrix transformation function (414) to obtain a two dimensional transform matrix (506). Distance values representing distances between templates of a set of predetermined templates and the two dimensional transform matrix (506) are then derived. The distance values derived are identified by indexes identifying the templates of the set of predetermined templates. The distance values derived are compared, and an index corresponding to a template of the set of predetermined templates having a shortest distance is selected and then transmitted.
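
    The transform-then-match pipeline can be sketched in a few lines. The abstract only says that a "predetermined two dimensional matrix transformation function" is used; the unnormalized two-dimensional DCT-II below is an assumption chosen for illustration, as are the Euclidean distance measure and all function names.

```python
import math

def dct_1d(x):
    """Unnormalized DCT-II of a sequence."""
    n = len(x)
    return [sum(x[k] * math.cos(math.pi * (k + 0.5) * u / n) for k in range(n))
            for u in range(n)]

def dct_2d(matrix):
    """Apply the 1-D transform to every row, then to every column."""
    rows = [dct_1d(r) for r in matrix]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def nearest_template_index(param_matrix, templates):
    """Transform the parameter matrix, measure its distance to each stored
    template, and return the index of the closest one (the value to transmit)."""
    transformed = dct_2d(param_matrix)

    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2
                             for row_a, row_b in zip(a, b)
                             for x, y in zip(row_a, row_b)))

    distances = [distance(transformed, t) for t in templates]
    return min(range(len(templates)), key=distances.__getitem__)

# Hypothetical codebook: templates are stored in the transform domain.
codebook = [dct_2d(m) for m in ([[1, 2], [3, 4]],
                                [[5, 5], [5, 5]],
                                [[0, 1], [1, 0]])]
index = nearest_template_index([[4.8, 5.1], [5.0, 5.2]], codebook)
```

    Only `index` would need to be sent; a receiver holding the same codebook could look the template up and invert the transform.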

    MBE synthesizer for very low bit rate voice messaging systems
    7.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US5684926A

    Publication date: 1997-11-04

    Application No.: US592252

    Filing date: 1996-01-26

    CPC classification: G10L19/16 G10L19/09 G10L19/10

    Abstract: An MBE synthesizer (116) generates a segment of speech from compressed speech data received by a receiver (2004). The compressed speech data includes one or more indexes (2240, 2242) and pitch data (2248). The MBE synthesizer (116) includes the following: an excitation generator (2222) utilizing a transform function for generating transformed excitation components responsive to the pitch data (2248); a memory (3006) for storing a table of predetermined spectral vectors (2205) and associated predetermined voicing vectors (2203); a harmonic amplitude estimator (2209) that is responsive to the one or more predetermined spectral vectors identified by the received indexes (2240, 2242), that generates harmonic amplitude control signals, and that includes a peak detector (2503), a peak enhancer (2505), a valley detector (2507), and a valley enhancer (2509); and a multi-band voicing controller (2214), responsive to the predetermined voicing vectors which are associated with the one or more predetermined spectral vectors identified, for controlling a selection of the excitation components.

    Apparatus and method for coding excitation parameters in a very low bit rate voice messaging system
    8.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US5666350A

    Publication date: 1997-09-09

    Application No.: US603677

    Filing date: 1996-02-20

    IPC classification: G10L19/02 H04J3/17

    CPC classification: G10L19/0212 H04J3/17

    Abstract: An apparatus codes excitation parameters for very low bit rate voice messaging using a method that processes a voice message to generate speech parameters. The speech parameters are separated (316) to produce a first group of energy parameters and a second group of pitch and voicing parameters. Subsequently, the first group of energy parameters is encoded and compressed using a non-uniform root-mean-square scalar process (318) to create a first plurality of encoded data. Additionally, the second group of pitch and voicing parameters is compressed, encoded, and combined into a single parameter using a three slope vector encoding process (320) that creates a second plurality of encoded data. Finally, the first and second plurality of encoded data are multiplexed (322) to create a multiplexed signal for transmission, the multiplexed signal representing the voice message.

    AVATAR FOR A PORTABLE DEVICE
    9.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20090251484A1

    Publication date: 2009-10-08

    Application No.: US12062098

    Filing date: 2008-04-03

    IPC classification: G09G5/02

    CPC classification: H04M1/72544 H04M2250/52

    Abstract: A portable device comprises a data storage for storing avatar data defining a user avatar. The user avatar is formed by a plurality of visual objects. The portable device further comprises a camera for capturing an image. A visual characteristic processor is arranged to determine a first visual characteristic from the image and an avatar processor is arranged to set an object visual characteristic of an object of the plurality of visual objects in response to the first visual characteristic. The invention may allow improved customization of user avatars. For example, a color of an element of a user avatar may be adapted to a color of a real-life object simply by a user taking a picture thereof.
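
    The color example at the end of this abstract can be sketched as follows. It is a minimal illustration, assuming the "first visual characteristic" is the mean RGB color of the captured pixels and that the avatar is a plain dictionary mapping object names to colors; neither representation comes from the patent, and the function names are hypothetical.

```python
def average_color(pixels):
    """Mean RGB color of a non-empty list of (r, g, b) pixel tuples."""
    n = len(pixels)
    return tuple(round(sum(p[c] for p in pixels) / n) for c in range(3))

def apply_color(avatar, object_name, pixels):
    """Return a copy of the avatar in which one visual object takes on the
    average color of the captured image; other objects are left unchanged."""
    updated = dict(avatar)
    updated[object_name] = average_color(pixels)
    return updated

# Hypothetical avatar data and a tiny "photo" of a red object.
avatar = {"shirt": (0, 0, 0), "hair": (90, 60, 30)}
photo = [(200, 10, 10), (220, 30, 10), (210, 20, 10)]
customized = apply_color(avatar, "shirt", photo)
```

    After the call, the shirt color in `customized` is (210, 20, 10) and the hair color is untouched; the original `avatar` dictionary is not modified.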

    Digital signal processor for processing voice messages
    10.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US06691081B1

    Publication date: 2004-02-10

    Application No.: US09560110

    Filing date: 2000-04-28

    IPC classification: G10L11/06

    Abstract: A digital signal processor for processing data including voice messaging data that may have both voiced and unvoiced speech components utilizes computer routines stored in a memory used by the digital signal processor. The programmed computer routines provide for controlling at least a portion of a selective call receiver; receiving and decoding data received at the selective call receiver; comparing the addresses received at the selective call receiver with addresses stored in a memory location coupled to the digital signal processor; controlling voicing including both voiced and unvoiced speech components; and generating a pitch wave using an inverse discrete Fourier transform and resampling the pitch wave to provide a time domain voiced speech component.
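
    The last routine (inverse DFT followed by resampling) can be sketched as below. This is an illustration under assumptions rather than the patented implementation: harmonics are given zero phase, the spectrum is built to be conjugate-symmetric so the result is real, and the resampler uses circular linear interpolation, which the abstract does not specify. All function names are hypothetical.

```python
import cmath

def inverse_dft(spectrum):
    """Direct O(n^2) inverse discrete Fourier transform, real part only."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def pitch_wave(harmonic_amps, n):
    """One pitch cycle of n samples from harmonic amplitudes (zero phase).
    Requires fewer than n / 2 harmonics."""
    spectrum = [0j] * n
    for k, amp in enumerate(harmonic_amps, start=1):
        spectrum[k] += amp * n / 2        # positive-frequency bin
        spectrum[n - k] += amp * n / 2    # mirrored bin keeps the output real
    return inverse_dft(spectrum)

def resample_cycle(cycle, new_len):
    """Stretch or shrink one periodic cycle to new_len samples using
    circular linear interpolation (assumed resampling method)."""
    n = len(cycle)
    out = []
    for i in range(new_len):
        pos = i * n / new_len
        j = int(pos)
        frac = pos - j
        out.append((1 - frac) * cycle[j] + frac * cycle[(j + 1) % n])
    return out

cycle = pitch_wave([1.0, 0.5], 16)      # fundamental plus second harmonic
voiced = resample_cycle(cycle, 24)      # same shape at a longer pitch period
```

    Each sample of `cycle` equals the sum of the harmonic cosines at that position, and changing `new_len` changes the pitch period without recomputing the transform.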
