Method and apparatus for speech recognition using device usage pattern of user

    Publication No.: US09824686B2

    Publication date: 2017-11-21

    Application No.: US11878595

    Filing date: 2007-07-25

    CPC classification: G10L15/22 G10L2015/227

    Abstract: A method and apparatus for improving the performance of voice recognition in a mobile device are provided. The method of recognizing a voice includes: monitoring the usage pattern of a user of a device for inputting a voice; selecting predetermined words from among words stored in the device based on the result of the monitoring, and storing the selected words; and recognizing a voice based on an acoustic model and the predetermined words. In this way, a voice can be recognized by using a prediction of whom the user mainly calls. Also, by automatically modeling the device usage pattern of the user and applying the pattern to the vocabulary for voice recognition based on probabilities, the performance of voice recognition, as actually perceived by the user, can be enhanced.
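The word-selection step in this abstract can be illustrated with a minimal sketch. It assumes the usage pattern is a plain call log and that "selecting predetermined words" means keeping the most frequently called stored names; the function name `select_vocabulary` and the `top_n` parameter are hypothetical and do not come from the patent.

```python
from collections import Counter


def select_vocabulary(call_log, stored_names, top_n=3):
    """Return the top_n stored names that appear most often in the call log.

    Assumption: usage frequency stands in for the probability-based
    usage-pattern model mentioned in the abstract.
    """
    # Count only calls to names that are actually stored on the device.
    counts = Counter(name for name in call_log if name in stored_names)
    # most_common sorts by descending count.
    return [name for name, _ in counts.most_common(top_n)]


call_log = ["Ann", "Bob", "Ann", "Cid", "Ann", "Bob"]
stored_names = {"Ann", "Bob", "Cid", "Dee"}
active_vocabulary = select_vocabulary(call_log, stored_names, top_n=2)
```

A recognizer would then restrict or re-weight its search toward `active_vocabulary`; how that weighting is done is not specified here.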

    Speech recognition method and apparatus using lexicon group tree
    33. Granted invention patent; status: in force

    Publication No.: US07953594B2

    Publication date: 2011-05-31

    Application No.: US11342701

    Filing date: 2006-01-31

    IPC classification: G10L11/06

    CPC classification: G06F17/2765 G10L15/197

    Abstract: A method and an apparatus for selecting a vocabulary closest to an input speech from among lexicons stored in memory, wherein a centroid lexicon representing lexicons belonging to a predetermined lexicon group is generated. Two lexicons, having a longest distance therebetween in the lexicon group, are selected using the centroid lexicon from the lexicon group, and a node indicating the lexicon group branches based on the two selected lexicons. A node having low group similarity is selected from among current terminal nodes, including branch nodes, and the above procedure is repeatedly performed on a lexicon group indicated by the selected node.

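One branching step of the lexicon group tree can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: each lexicon is represented by a numeric feature vector, distance is Euclidean, and the two far-apart lexicons are found by taking the lexicon farthest from the centroid and then the lexicon farthest from that one. The names `centroid` and `split_group` are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def split_group(group):
    """Split a lexicon group (dict: name -> vector) into two child groups.

    Assumed heuristic: pivot_a is the lexicon farthest from the centroid,
    pivot_b is the lexicon farthest from pivot_a; every lexicon joins the
    child of its nearer pivot.
    """
    c = centroid(list(group.values()))
    pivot_a = max(group, key=lambda k: math.dist(group[k], c))
    pivot_b = max(group, key=lambda k: math.dist(group[k], group[pivot_a]))
    left, right = {}, {}
    for name, vec in group.items():
        if math.dist(vec, group[pivot_a]) <= math.dist(vec, group[pivot_b]):
            left[name] = vec
        else:
            right[name] = vec
    return left, right


group = {"a": [0, 0], "b": [1, 0], "c": [10, 0], "d": [12, 0]}
left, right = split_group(group)
```

Building the full tree would repeat `split_group` on whichever terminal node has the lowest group similarity; that selection criterion is left out of this sketch.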

    Multi-stage speech recognition apparatus and method
    34. Invention patent application; status: in force

    Publication No.: US20080208577A1

    Publication date: 2008-08-28

    Application No.: US11889665

    Filing date: 2007-08-15

    IPC classification: G10L15/00

    CPC classification: G10L15/32 G10L15/02 G10L15/16

    Abstract: Provided are a multi-stage speech recognition apparatus and method. The multi-stage speech recognition apparatus includes a first speech recognition unit performing initial speech recognition on a feature vector, which is extracted from an input speech signal, and generating a plurality of candidate words; and a second speech recognition unit rescoring the candidate words, which are provided by the first speech recognition unit, using a temporal posterior feature vector extracted from the speech signal.

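The two-stage flow (candidate generation, then rescoring) can be sketched as below. The abstract does not say how the second-stage score is combined with the first, so the linear interpolation, the `weight` parameter, and the function name `rescore` are assumptions made for illustration; the second-stage scorer is a stand-in for a model over temporal posterior features.

```python
def rescore(candidates, second_stage_score, weight=0.5):
    """Re-rank first-stage candidates with a second-stage score.

    candidates: list of (word, first_stage_score) pairs, higher is better.
    second_stage_score: callable mapping a word to its second-stage score.
    The linear interpolation used here is an assumed combination rule.
    """
    rescored = [
        (word, (1 - weight) * first + weight * second_stage_score(word))
        for word, first in candidates
    ]
    # Best combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical log-scores from each stage.
first_stage = [("fall", -0.8), ("call", -1.0)]
second_stage = {"fall": -2.0, "call": -0.2}
ranking = rescore(first_stage, second_stage.get)
```

In this example the first stage prefers "fall", while the combined score moves "call" to the top.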

    Method and apparatus for synthesizing speech from text
    35. Granted invention patent; status: in force

    Publication No.: US07369995B2

    Publication date: 2008-05-06

    Application No.: US10785113

    Filing date: 2004-02-25

    IPC classification: G10L13/02

    CPC classification: G10L13/07

    Abstract: A speech synthesis method, in which speech units are concatenated using a DB, wherein the speech units to be concatenated are determined and divided into a left speech unit and a right speech unit. The length of an interpolation region of each of the left and right speech units is variably determined. An extension is attached to a right boundary of the left speech unit and an extension to a left boundary of the right speech unit. The locations of pitch marks included in the extension of each of the left and right speech units are aligned so that the pitch marks can fit in the predetermined interpolation region. The left and right speech units are superimposed after fading out the left speech unit and fading in the right speech unit. Accordingly, a determination of whether extra-segmental data exists or not is made, and smoothing concatenation is performed using either an interpolation of existing data or an interpolation of extrapolated data depending on the result of the determination.

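The final superposition step (fade out the left unit, fade in the right unit, then add) can be sketched as a linear crossfade. This sketch covers only that step: it assumes the two units are plain lists of samples that already overlap by `overlap` samples, and it omits the pitch-mark alignment and the extrapolation of missing extra-segmental data described in the abstract. The function name `crossfade` is hypothetical.

```python
def crossfade(left, right, overlap):
    """Join two sample lists with a linear crossfade over `overlap` samples."""
    if overlap <= 0 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must be between 1 and the shorter unit length")
    head = left[:-overlap]   # untouched part of the left unit
    tail = right[overlap:]   # untouched part of the right unit
    mixed = []
    for i in range(overlap):
        # Weight of the right unit rises linearly; the left unit fades out.
        w = (i + 1) / (overlap + 1)
        mixed.append((1 - w) * left[len(left) - overlap + i] + w * right[i])
    return head + mixed + tail


joined = crossfade([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], overlap=2)
```

The result is `overlap` samples shorter than the two inputs laid end to end, because the overlapping region is merged rather than appended.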

    Flexible printed circuit board
    36. Invention patent application; status: in force

    Publication No.: US20080074853A1

    Publication date: 2008-03-27

    Application No.: US11819021

    Filing date: 2007-06-25

    IPC classification: H05K1/00

    Abstract: A flexible printed circuit board includes a first substrate portion having at least one first terminal, a second substrate portion in communication with the first substrate portion and having at least one circuit device, a connection substrate portion in communication with the second substrate portion, the connection substrate portion extending away from the second substrate portion in a same direction as the first substrate portion, and a third substrate portion in communication with the connection substrate portion, the third substrate portion having at least one second terminal.


    System and method for providing information using spoken dialogue interface
    37. Granted invention patent; status: in force

    Publication No.: US07225128B2

    Publication date: 2007-05-29

    Application No.: US10401695

    Filing date: 2003-03-31

    IPC classification: G10L15/00

    CPC classification: G10L15/22

    Abstract: There are provided a system and method for providing information using a spoken dialogue interface. The system includes a speech recognizer for transforming voice signals into sentences; a sentence analyzer for analyzing the sentences by their structural elements; a dialogue manager for extracting information on speech acts or intentions from the structural elements, and generating information on the system's speech acts or intentions for a response to the extracted information on speech acts or intentions; a sentence generator for generating sentences based on the information on the system's speech acts or intentions for the response; a speech synthesizer for synthesizing the generated sentences into voices; an information extractor for extracting information required for the response from the Internet in real time; and a user modeling means for analyzing and classifying users' tendencies. Information demanded by a user can be detected in real time and provided through a voice interface with versatile and familiar dialogues based on the user's tendencies.


    Method and apparatus for recognizing speech by measuring confidence levels of respective frames
    38. Invention patent application; status: in force

    Publication No.: US20060190259A1

    Publication date: 2006-08-24

    Application No.: US11355082

    Filing date: 2006-02-16

    IPC classification: G10L15/14

    CPC classification: G10L15/08 G10L15/142

    Abstract: Disclosed herein are a method and apparatus to recognize speech by measuring the confidence levels of respective frames. The method includes the operations of obtaining frequency features of a received speech signal for the respective frames having a predetermined length, calculating a keyword model-based likelihood and a filler model-based likelihood for each of the frames, calculating a confidence score based on the two types of likelihoods, and deciding whether the received speech signal corresponds to a keyword or a non-keyword based on the confidence scores. Also, the method includes the operation of transforming the confidence scores by applying transform functions of clusters, which include the confidence scores or are close to the confidence scores, to the confidence scores.

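The per-frame confidence idea can be sketched as follows. The abstract only says the confidence score is "based on the two types of likelihoods", so the specific choices here are assumptions: the score is the log-likelihood ratio (keyword minus filler), the decision compares the mean frame score with a threshold, and the cluster-based transform of the scores is omitted. The names `frame_confidences` and `is_keyword` are hypothetical.

```python
def frame_confidences(keyword_loglik, filler_loglik):
    """Per-frame confidence as an assumed log-likelihood ratio."""
    if len(keyword_loglik) != len(filler_loglik):
        raise ValueError("both models must score the same number of frames")
    return [k - f for k, f in zip(keyword_loglik, filler_loglik)]


def is_keyword(keyword_loglik, filler_loglik, threshold=0.0):
    """Accept the utterance as a keyword if the mean frame score is high enough."""
    scores = frame_confidences(keyword_loglik, filler_loglik)
    return sum(scores) / len(scores) > threshold


# Hypothetical per-frame log-likelihoods from the two models.
keyword_scores = [-1.0, -1.5, -0.5]
filler_scores = [-3.0, -2.5, -2.0]
accepted = is_keyword(keyword_scores, filler_scores)
```

Swapping the two score lists makes the filler model the better fit, so the same call would reject the utterance.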