Utterance state detection device and utterance state detection method
    1.
    Granted patent
    Status: in force

    Publication No.: US09099088B2

    Publication date: 2015-08-04

    Application No.: US13064871

    Filing date: 2011-04-21

    CPC classification: G10L17/26 G10L25/48

    Abstract: An utterance state detection device includes a user voice stream data input unit that acquires voice stream data of a user, a frequency element extraction unit that extracts high-frequency elements by frequency-analyzing the user voice stream data, a fluctuation degree calculation unit that calculates, for every unit time, a fluctuation degree of the extracted high-frequency elements, a statistic calculation unit that calculates a statistic for every certain interval based on a plurality of the fluctuation degrees within a certain period of time, and an utterance state detection unit that detects an utterance state of a specified user based on the statistic obtained from the user voice stream data of the specified user.
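    Illustrative sketch (not taken from the patent): the pipeline in the abstract can be pictured as four steps, namely high-band extraction, a per-frame fluctuation degree, a per-interval statistic, and a threshold decision. The abstract does not define any of these quantities, so everything concrete below is an assumption: the fluctuation degree is taken as the absolute frame-to-frame change in log high-band power, the statistic is the mean over a window, and the state labels, cutoff, and threshold direction are hypothetical.

```python
import cmath
import math
from statistics import mean


def high_band_power(frame, sample_rate, cutoff_hz):
    """Power in DFT bins at or above cutoff_hz (naive DFT, fine for short frames)."""
    n = len(frame)
    power = 0.0
    for k in range(n // 2 + 1):
        if k * sample_rate / n < cutoff_hz:
            continue  # skip low-frequency bins
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        power += abs(coeff) ** 2
    return power / n


def fluctuation_degrees(samples, sample_rate, frame_len, cutoff_hz):
    """Assumed definition: |change in log10 high-band power| between adjacent frames."""
    powers = [high_band_power(samples[i:i + frame_len], sample_rate, cutoff_hz)
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    logs = [math.log10(p + 1e-12) for p in powers]  # small offset avoids log(0)
    return [abs(b - a) for a, b in zip(logs, logs[1:])]


def interval_statistics(degrees, window):
    """Assumed statistic: mean fluctuation degree per non-overlapping window."""
    return [mean(degrees[i:i + window])
            for i in range(0, len(degrees) - window + 1, window)]


def detect_state(statistic, threshold):
    """Hypothetical decision rule and labels; the patent's actual rule may differ."""
    return "unusual" if statistic > threshold else "ordinary"
```

    With a steady high-frequency tone the fluctuation degrees stay near zero, so every interval is labelled "ordinary" under this assumed rule; a tone whose amplitude alternates from frame to frame yields large fluctuation degrees and is labelled "unusual".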

    Utterance state detection device and utterance state detection method
    2.
    Patent application
    Status: in force

    Publication No.: US20110282666A1

    Publication date: 2011-11-17

    Application No.: US13064871

    Filing date: 2011-04-21

    IPC classification: G10L17/00

    CPC classification: G10L17/26 G10L25/48

    Abstract: An utterance state detection device includes a user voice stream data input unit that acquires voice stream data of a user, a frequency element extraction unit that extracts high-frequency elements by frequency-analyzing the user voice stream data, a fluctuation degree calculation unit that calculates, for every unit time, a fluctuation degree of the extracted high-frequency elements, a statistic calculation unit that calculates a statistic for every certain interval based on a plurality of the fluctuation degrees within a certain period of time, and an utterance state detection unit that detects an utterance state of a specified user based on the statistic obtained from the user voice stream data of the specified user.

    Speech recognition device and method outputting or rejecting derived words
    3.
    Granted patent
    Status: in force

    Publication No.: US08903724B2

    Publication date: 2014-12-02

    Application No.: US13363411

    Filing date: 2012-02-01

    IPC classification: G10L15/08 G10L15/02

    CPC classification: G10L15/02 G10L2015/088

    Abstract: A speech recognition device includes: a speech recognition section that searches, by speech recognition, audio data stored in a first memory section to extract word-spoken portions in which each of a plurality of transferred words is spoken and that, of the extracted word-spoken portions, rejects the word-spoken portion for a word designated as a rejecting object; an acquisition section that obtains a derived word of a designated search target word, the derived word being generated in accordance with a derived word generation rule stored in a second memory section or read out from the second memory section; a transfer section that transfers the derived word and the search target word to the speech recognition section, the derived word being set as an outputting object or a rejecting object by the acquisition section; and an output section that outputs the word-spoken portions extracted and not rejected in the search.

    SPOKEN TERM DETECTION APPARATUS, METHOD, PROGRAM, AND STORAGE MEDIUM
    4.
    Patent application
    Status: in force

    Publication No.: US20110218805A1

    Publication date: 2011-09-08

    Application No.: US13039495

    Filing date: 2011-03-03

    IPC classification: G10L15/00

    CPC classification: G10L15/00

    Abstract: A spoken term detection apparatus includes a processor whose processing includes: a feature extraction process that extracts an acoustic feature from speech data accumulated in an accumulation part and stores the extracted acoustic feature in an acoustic feature storage part; a first calculation process that calculates a standard score from a similarity between the acoustic feature stored in the acoustic feature storage part and an acoustic model stored in an acoustic model storage part; a second calculation process that compares an acoustic model corresponding to an input keyword with the acoustic feature stored in the acoustic feature storage part to calculate a score of the keyword; and a retrieval process that retrieves speech data including the keyword from the speech data accumulated in the accumulation part, based on the score of the keyword calculated by the second calculation process and the standard score stored in a standard score storage part.
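    Illustrative sketch (not taken from the patent): the two-score idea can be pictured as computing a keyword-independent "standard score" once per frame, then judging each keyword score relative to it. The abstract gives no formulas, so the scalar features, the negative-squared-distance similarity, the one-frame-per-unit alignment, and the threshold are all toy assumptions, and every function name is hypothetical.

```python
def frame_score(feature, model_mean):
    """Toy similarity: negative squared distance between a feature and a model mean."""
    return -(feature - model_mean) ** 2


def standard_scores(features, models):
    """First calculation: best score over all acoustic models, per frame.

    This does not depend on any keyword, so it can be computed once and stored.
    """
    return [max(frame_score(f, m) for m in models.values()) for f in features]


def keyword_hits(features, std, models, keyword_units, threshold):
    """Second calculation plus retrieval.

    Slides the keyword's unit sequence over the stored features (one frame per
    unit, an assumed simplification) and reports start frames whose keyword
    score is within `threshold` of the stored standard score.
    """
    hits = []
    n = len(keyword_units)
    for start in range(len(features) - n + 1):
        keyword_score = sum(frame_score(features[start + i], models[unit])
                            for i, unit in enumerate(keyword_units))
        baseline = sum(std[start:start + n])
        if keyword_score - baseline >= threshold:
            hits.append(start)
    return hits
```

    Because the standard score is an upper bound on any keyword score over the same frames, the difference is at most zero, which makes a single threshold comparable across keywords and recordings in this toy setting.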

    Correction of matching results for speech recognition
    5.
    Granted patent
    Status: in force

    Publication No.: US07991614B2

    Publication date: 2011-08-02

    Application No.: US12558249

    Filing date: 2009-09-11

    IPC classification: G10L17/00 G10L11/06 G10L15/20

    CPC classification: G10L15/05

    Abstract: A speech recognition system includes the following: a feature calculating unit; a sound level calculating unit that calculates an input sound level in each frame; a decoding unit that matches the feature of each frame with an acoustic model and a linguistic model, and outputs a recognized word sequence; a start-point detector that determines a start frame of a speech section based on a reference value; an end-point detector that determines an end frame of the speech section based on a reference value; and a reference value updating unit that updates the reference value in accordance with variations in the input sound level. The start-point detector updates the start frame every time the reference value is updated. The decoding unit starts matching before being notified of the end frame and corrects the matching results every time it is notified of the start frame. The speech recognition system can suppress a delay in response time while performing speech recognition based on a proper speech section.
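    Illustrative sketch (not taken from the patent): the interplay between the reference value updating unit and the start-point detector can be pictured as re-running start detection whenever the reference value changes. The update rule used here (reference = running peak level minus a fixed drop, never below an initial value) is an assumption, as are all names; end-point detection and the decoder's correction step are omitted.

```python
def first_frame_at_or_above(levels, reference):
    """Index of the first frame whose level reaches the reference, or None."""
    for i, level in enumerate(levels):
        if level >= reference:
            return i
    return None


def track_start_frames(levels, initial_reference, drop_db):
    """Return (frame_index, reference, start_frame) for each start-frame change.

    Each returned event is a point where a decoder that has already started
    matching would be notified and would correct its results.
    """
    reference = initial_reference
    peak = None
    start = None
    events = []
    for i, level in enumerate(levels):
        updated = False
        if peak is None or level > peak:
            peak = level
            # Assumed rule: follow the loudest level seen so far.
            new_reference = max(initial_reference, peak - drop_db)
            if new_reference != reference:
                reference = new_reference
                updated = True
        if updated or start is None:
            new_start = first_frame_at_or_above(levels[:i + 1], reference)
            if new_start != start:
                start = new_start
                events.append((i, reference, start))
    return events
```

    For levels such as 30, 32, 45, 50, 70, 72, 71 with an initial reference of 40 and a drop of 20, the start frame is first set at frame 2 and then moved later twice as the reference rises, which is the behaviour that lets matching begin early and be corrected afterwards.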

    SPEECH RECOGNITION SYSTEM, SPEECH RECOGNITION PROGRAM, AND SPEECH RECOGNITION METHOD
    6.
    Patent application
    Status: in force

    Publication No.: US20100004932A1

    Publication date: 2010-01-07

    Application No.: US12558249

    Filing date: 2009-09-11

    IPC classification: G10L15/28

    CPC classification: G10L15/05

    Abstract: A speech recognition system includes the following: a feature calculating unit; a sound level calculating unit that calculates an input sound level in each frame; a decoding unit that matches the feature of each frame with an acoustic model and a linguistic model, and outputs a recognized word sequence; a start-point detector that determines a start frame of a speech section based on a reference value; an end-point detector that determines an end frame of the speech section based on a reference value; and a reference value updating unit that updates the reference value in accordance with variations in the input sound level. The start-point detector updates the start frame every time the reference value is updated. The decoding unit starts matching before being notified of the end frame and corrects the matching results every time it is notified of the start frame. The speech recognition system can suppress a delay in response time while performing speech recognition based on a proper speech section.

    Spoken term detection apparatus, method, program, and storage medium
    7.
    Granted patent
    Status: in force

    Publication No.: US08731926B2

    Publication date: 2014-05-20

    Application No.: US13039495

    Filing date: 2011-03-03

    CPC classification: G10L15/00

    Abstract: A spoken term detection apparatus includes a processor whose processing includes: a feature extraction process that extracts an acoustic feature from speech data accumulated in an accumulation part and stores the extracted acoustic feature in an acoustic feature storage part; a first calculation process that calculates a standard score from a similarity between the acoustic feature stored in the acoustic feature storage part and an acoustic model stored in an acoustic model storage part; a second calculation process that compares an acoustic model corresponding to an input keyword with the acoustic feature stored in the acoustic feature storage part to calculate a score of the keyword; and a retrieval process that retrieves speech data including the keyword from the speech data accumulated in the accumulation part, based on the score of the keyword calculated by the second calculation process and the standard score stored in a standard score storage part.

    Speech recognition system finding self-repair utterance in misrecognized speech without using recognized words
    8.
    Granted patent
    Status: lapsed

    Publication No.: US07672846B2

    Publication date: 2010-03-02

    Application No.: US11324463

    Filing date: 2006-01-04

    IPC classification: G10L15/04 G10L15/00

    CPC classification: G10L15/22 G10L2015/088

    Abstract: A voice recognition system and a voice processing system are provided in which a self-repair utterance can be input and recognized accurately, as in a conversation in which a human user makes a self-repair utterance. A signal processing unit converts speech voice data into a feature, a voice section detecting unit detects voice sections in the speech voice data, and a priority determining unit selects a voice section that includes a self-repair utterance from among the voice sections according to a priority criterion, without using any result of recognizing a speech vocabulary sequence. Priority criteria can include the length of the voice section, its signal-to-noise ratio, its chronological order, and the speech speed. A decoder calculates a matching score with a recognition vocabulary using the feature of the voice section and an acoustic model.
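    Illustrative sketch (not taken from the patent): a priority determining step that relies only on signal-level properties, never on recognized words, might look like the following. The abstract lists the criteria (length, signal-to-noise ratio, chronological order) but not how they are combined, so the rule here is an assumption: discard sections that are too short or too noisy, then prefer the latest one, on the premise that a self-repair follows the utterance it corrects. All names and default thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSection:
    start_s: float   # section start time in seconds
    end_s: float     # section end time in seconds
    snr_db: float    # estimated signal-to-noise ratio in dB

    @property
    def length_s(self):
        return self.end_s - self.start_s


def select_priority_section(sections, min_length_s=0.3, min_snr_db=10.0):
    """Pick the section to decode, using no recognition results.

    Returns None when no section passes the length and SNR filters.
    """
    candidates = [s for s in sections
                  if s.length_s >= min_length_s and s.snr_db >= min_snr_db]
    if not candidates:
        return None
    # Chronological order as the tie-breaker: the latest candidate wins.
    return max(candidates, key=lambda s: s.start_s)
```

    Only the selected section would then be passed to the decoder for matching against the recognition vocabulary.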

    Voice recognition system and voice processing system
    9.
    Patent application
    Status: lapsed

    Publication No.: US20070050190A1

    Publication date: 2007-03-01

    Application No.: US11324463

    Filing date: 2006-01-04

    IPC classification: G10L17/00

    CPC classification: G10L15/22 G10L2015/088

    Abstract: A voice recognition system and a voice processing system are provided in which, when a user makes a self-repair utterance, the utterance can be input and recognized accurately, as in a conversation between humans. The system includes a signal processing unit that converts speech voice data into a feature, a voice section detecting unit that detects voice sections in the speech voice data, a priority determining unit that selects a voice section to be given priority from among the detected voice sections according to a predetermined priority criterion, and a decoder that calculates a degree of matching with a recognition vocabulary using the feature of the voice section selected by the priority determining unit and an acoustic model. The priority determining unit uses as the predetermined priority criterion at least one selected from the group consisting of (1) a length of the voice section, (2) a power or an S/N ratio of the voice section, and (3) a chronological order of the voice section.

    Sound reproduction method, sound reproduction apparatus, sound data creation method, and sound data creation apparatus
    10.
    Granted patent
    Status: lapsed

    Publication No.: US06259793B1

    Publication date: 2001-07-10

    Application No.: US09030165

    Filing date: 1998-02-25

    IPC classification: H04B1/00

    CPC classification: H04S1/007

    Abstract: An apparatus for continuously reproducing plural sound data has a start end/terminal end determination unit for determining the start end and terminal end of each of the continued sound data, a fade-in/fade-out unit for carrying out a fade-in process at the start end of each of the plural sound data and/or a fade-out process at the terminal end of the same, a data output unit for continuously outputting the plural sound data which have been subjected to the fade-in process and/or fade-out process, and a reproduction unit for reproducing the outputted plural sound data. When the plural sound data are reproduced continuously, no noise is generated at the joint between adjacent sound data.
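    Illustrative sketch (not taken from the patent): fading each clip in at its start and out at its end before concatenation removes the level jump at the joint. The linear ramp, the sample-list representation, and the function names are assumptions; the abstract does not specify a fade curve.

```python
def fade(samples, fade_len, fade_in=True, fade_out=True):
    """Return a copy with a linear fade-in and/or fade-out applied."""
    out = list(samples)
    n = min(fade_len, len(out))
    for i in range(n):
        gain = i / n  # ramps from 0 toward 1 over the fade region
        if fade_in:
            out[i] *= gain
        if fade_out:
            out[-1 - i] *= gain
    return out


def join_clips(clips, fade_len):
    """Concatenate clips after fading each one at both ends."""
    joined = []
    for clip in clips:
        joined.extend(fade(clip, fade_len))
    return joined
```

    Joining a clip of constant +1.0 samples to a clip of constant -1.0 samples without fades produces a step of 2.0 at the joint, which is heard as a click; with the fades both sides of the joint are at zero.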
