Voice activity segmentation device, voice activity segmentation method, and voice activity segmentation program
    1.
    Type: granted patent
    Status: in force

    Publication No.: US09293131B2

    Publication date: 2016-03-22

    Application No.: US13814141

    Filing date: 2011-08-02

    CPC classification: G10L15/04 G10L25/78 G10L25/87

    Abstract: Provided is a noise-robust voice activity segmentation device which updates parameters used in the determination of voice-active segments without burdening the user, and also provided are a voice activity segmentation method and a voice activity segmentation program. The voice activity segmentation device comprises: a first voice activity segmentation means for determining a voice-active segment (first voice-active segment) and a voice-inactive segment (first voice-inactive segment) in a time-series of input sound by comparing a threshold value and a feature value of the time-series of the input sound; a second voice activity segmentation means for determining, after a reference speech acquired from a reference speech storage means has been superimposed on a time-series of the first voice-inactive segment, a voice-active segment and a voice-inactive segment in the time-series of the superimposed first voice-inactive segment by comparing the threshold value and a feature value of the time-series of the superimposed first voice-inactive segment; and a threshold value update means for updating the threshold value in such a way that a discrepancy rate between the determination result of the second voice activity segmentation means and a correct segmentation calculated from the reference speech is decreased.
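
    The update loop described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the feature (per-frame log energy), the frame length, the fixed step size, and all function names are assumptions made for illustration. The sketch segments the input with the current threshold, superimposes labelled reference speech on the frames judged voice-inactive, and keeps whichever nearby threshold gives the lowest discrepancy rate against the reference labels.

```python
import math

def frame_energy_db(samples, frame_len):
    """Log energy in dB for each non-overlapping frame (assumed feature)."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        feats.append(10.0 * math.log10(power + 1e-12))
    return feats

def segment(feats, threshold):
    """True marks a frame judged voice-active (feature above threshold)."""
    return [f > threshold for f in feats]

def discrepancy_rate(decisions, truth):
    """Fraction of frames where the decision disagrees with the reference."""
    return sum(d != t for d, t in zip(decisions, truth)) / len(truth)

def update_threshold(input_sound, reference, reference_labels,
                     threshold, frame_len=160, step=1.0):
    # First segmentation: collect the frames judged voice-inactive.
    first = segment(frame_energy_db(input_sound, frame_len), threshold)
    inactive = []
    for i, active in enumerate(first):
        if not active:
            inactive.extend(input_sound[i * frame_len:(i + 1) * frame_len])

    # Superimpose the reference speech on the voice-inactive samples.
    n = min(len(inactive), len(reference)) // frame_len * frame_len
    if n == 0:
        return threshold  # nothing to learn from in this input
    mixed = [a + b for a, b in zip(inactive[:n], reference[:n])]
    mixed_feats = frame_energy_db(mixed, frame_len)
    truth = reference_labels[:n // frame_len]

    # Second segmentation for each candidate; ties keep the current value.
    candidates = [threshold, threshold - step, threshold + step]
    return min(candidates,
               key=lambda t: discrepancy_rate(segment(mixed_feats, t), truth))
```

    Calling `update_threshold` repeatedly on successive stretches of input moves the threshold one step at a time toward lower discrepancy. A gradient-style update would serve the same purpose; the three-candidate search is used here only because it is easy to verify.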

    DATA PROCESSING DEVICE, COMPUTER PROGRAM THEREFOR AND DATA PROCESSING METHOD
    2.
    Type: patent application
    Status: in force

    Publication No.: US20120310866A1

    Publication date: 2012-12-06

    Application No.: US13520728

    Filing date: 2010-12-02

    IPC classification: G06F15/18

    Abstract: A plurality of pruning measures (PM) are calculated from a feature amount (CV) of test data (TD) which is input, a plurality of isopycnic surfaces (EC) are plotted and set on a threshold space (SS), a threshold curved surface (SC) in which a decrease in at least one of a plurality of pruning measures (PM) causes an increase in at least one thereof is generated using a portion of one isopycnic surface (EC) as a part, a hypothesis curved surface (HC) of subject data (CD) is generated on the threshold space (SS) to set a position intersecting the threshold curved surface (SC) to a pruning threshold (PS), and a plurality of hypotheses of the subject data (CD) are pruned. Thereby, there is provided a data processing device of which at least one of the recognition speed and the recognition accuracy is higher than in the related art.

    SPEECH PROCESSING DEVICE, METHOD, AND STORAGE MEDIUM
    3.
    Type: patent application
    Status: in force

    Publication No.: US20120116765A1

    Publication date: 2012-05-10

    Application No.: US13383527

    Filing date: 2010-06-04

    IPC classification: G10L15/04

    CPC classification: G10L15/04 G10L15/08

    Abstract: A speech recognition unit (102) includes a phrase determination unit (103) which determines a phrase boundary based on the comparison between the hypothetical word group generated by speech recognition and set words representing phrase boundaries. In this speech processing device, the speech recognition unit (102) outputs recognition results for each phrase based on a phrase boundary determined by the phrase determination unit (103).
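
    The boundary test in this abstract amounts to comparing recognized words against a preset word list. The Python sketch below shows that comparison on a finished word sequence. It is a simplification: the abstract describes the determination as part of the recognition unit, while this sketch assumes a complete hypothesis, assumes that a boundary word closes the phrase it appears in, and uses a hypothetical function name.

```python
def split_into_phrases(hypothesis_words, boundary_words):
    """Group recognized words into phrases.

    A phrase is closed whenever a word matches one of the preset
    boundary words (assumption: the boundary word ends the phrase).
    """
    phrases, current = [], []
    for word in hypothesis_words:
        current.append(word)
        if word in boundary_words:
            phrases.append(current)
            current = []
    if current:  # trailing words with no boundary word yet
        phrases.append(current)
    return phrases
```

    For example, `split_into_phrases(["well", "thanks", "please", "call", "me"], {"thanks"})` yields two phrases, so a result for "well thanks" could be output before the rest of the utterance is processed.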

    Speech processing device, method, and storage medium
    4.
    Type: granted patent
    Status: in force

    Publication No.: US09583095B2

    Publication date: 2017-02-28

    Application No.: US13383527

    Filing date: 2010-06-04

    IPC classification: G10L15/08 G10L15/04

    CPC classification: G10L15/04 G10L15/08

    Abstract: A speech recognition unit (102) includes a phrase determination unit (103) which determines a phrase boundary based on the comparison between the hypothetical word group generated by speech recognition and set words representing phrase boundaries. In this speech processing device, the speech recognition unit (102) outputs recognition results for each phrase based on a phrase boundary determined by the phrase determination unit (103).

    TEXT PROCESSING SYSTEM, TEXT PROCESSING METHOD, AND TEXT PROCESSING PROGRAM
    5.
    Type: patent application
    Status: under examination (published)

    Publication No.: US20130144609A1

    Publication date: 2013-06-06

    Application No.: US13814611

    Filing date: 2011-08-02

    IPC classification: G06F17/21

    Abstract: Provided is a text processing system capable of avoiding declining processing efficiency in analyses of text that does not contain breaks. This text processing system comprises: a linking means for generating linked data that links acquired text after a link object analysis result, which is the result of the analysis of text acquired prior to the acquired text; an analysis means for carrying out language analysis on the linked data, using at least a portion of the link object analysis result; and a determination means for determining a prescribed unit break included in the linked data, on the basis of the results of the analysis by the analysis means. The link object analysis result is the result of the analysis after the break that is determined by the determination means.
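
    The link, analyze, and determine cycle in this abstract can be sketched in Python as follows. This is an illustrative reading only: whitespace tokenization stands in for language analysis, the caller-supplied `is_break` predicate stands in for the break determination, and the class and attribute names are hypothetical. The point shown is that only the analysis result after the last determined break is carried over, so earlier text is never analyzed again.

```python
class IncrementalSegmenter:
    """Carries over the analysis result after the last determined break."""

    def __init__(self, is_break):
        self.is_break = is_break  # stand-in for the determination means
        self.carry = []           # analyzed tokens after the last break

    def feed(self, text):
        # Link: previous leftover analysis result + newly acquired text.
        # Only the new text is tokenized; the carried tokens are reused.
        tokens = self.carry + text.split()

        units, current = [], []
        for tok in tokens:
            current.append(tok)
            if self.is_break(tok):  # a prescribed unit ends here
                units.append(" ".join(current))
                current = []

        self.carry = current  # becomes the next link object analysis result
        return units
```

    With `is_break=lambda tok: tok == "over"`, feeding "alpha one over bravo" returns one completed unit and keeps "bravo" for the next call, where it is joined with the newly acquired text.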

    SPEECH RECOGNITION DEVICE, SPEECH RECOGNITION METHOD AND PROGRAM
    6.
    Type: patent application
    Status: under examination (published)

    Publication No.: US20130185068A1

    Publication date: 2013-07-18

    Application No.: US13823194

    Filing date: 2011-09-15

    IPC classification: G10L15/20

    Abstract: The present invention provides a speech recognition device that includes a threshold value candidate generation unit which extracts a feature indicating likeliness of being speech from a temporal sequence of input sound, and generates a plurality of threshold value candidates for discriminating between speech and non-speech; a speech determination unit which, by comparing the feature indicating likeliness of being speech with the plurality of threshold value candidates, determines respective speech sections, and outputs determination information as a result of the determination; a search unit which corrects each of the speech sections represented by the determination information, using a speech model and a non-speech model; and a parameter update unit which estimates a threshold value for determining a speech section, on the basis of distribution profiles of the feature respectively in utterance sections and in non-utterance sections, within each of the corrected speech sections, and makes an update with the threshold value.
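
    The abstract does not say how the threshold is derived from the two feature distributions. As one possible illustration, the Python sketch below places the threshold where the distance to each class mean, measured in that class's standard deviations, is equal. The estimator and the function name are assumptions, not the method claimed in the application.

```python
from statistics import fmean, pstdev

def estimate_threshold(utterance_feats, non_utterance_feats):
    """Threshold between two feature distributions (assumed estimator).

    Solves (t - mu_n) / sd_n == (mu_s - t) / sd_s for t, so the wider
    distribution pushes the threshold away from its own mean.
    """
    mu_s, sd_s = fmean(utterance_feats), pstdev(utterance_feats)
    mu_n, sd_n = fmean(non_utterance_feats), pstdev(non_utterance_feats)
    if sd_s + sd_n == 0.0:
        return (mu_s + mu_n) / 2.0  # degenerate case: plain midpoint
    return (mu_n * sd_s + mu_s * sd_n) / (sd_s + sd_n)
```

    When both distributions have the same spread the result is the midpoint of the two means, and when the non-utterance features vary more the threshold moves toward the utterance mean.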

    VOICE ACTIVITY SEGMENTATION DEVICE, VOICE ACTIVITY SEGMENTATION METHOD, AND VOICE ACTIVITY SEGMENTATION PROGRAM
    7.
    Type: patent application
    Status: in force

    Publication No.: US20130132078A1

    Publication date: 2013-05-23

    Application No.: US13814141

    Filing date: 2011-08-02

    IPC classification: G10L15/04

    CPC classification: G10L15/04 G10L25/78 G10L25/87

    Abstract: Provided is a noise-robust voice activity segmentation device which updates parameters used in the determination of voice-active segments without burdening the user, and also provided are a voice activity segmentation method and a voice activity segmentation program. The voice activity segmentation device comprises: a first voice activity segmentation means for determining a voice-active segment (first voice-active segment) and a voice-inactive segment (first voice-inactive segment) in a time-series of input sound by comparing a threshold value and a feature value of the time-series of the input sound; a second voice activity segmentation means for determining, after a reference speech acquired from a reference speech storage means has been superimposed on a time-series of the first voice-inactive segment, a voice-active segment and a voice-inactive segment in the time-series of the superimposed first voice-inactive segment by comparing the threshold value and a feature value of the time-series of the superimposed first voice-inactive segment; and a threshold value update means for updating the threshold value in such a way that a discrepancy rate between the determination result of the second voice activity segmentation means and a correct segmentation calculated from the reference speech is decreased.

    Data processing device, information storage medium storing computer program therefor and data processing method
    8.
    Type: granted patent
    Status: in force

    Publication No.: US09047562B2

    Publication date: 2015-06-02

    Application No.: US13520728

    Filing date: 2010-12-02

    Abstract: A plurality of pruning measures (PM) are calculated from a feature amount (CV) of test data (TD) which is input, a plurality of isopycnic surfaces (EC) are plotted and set on a threshold space (SS), a threshold curved surface (SC) in which a decrease in at least one of a plurality of pruning measures (PM) causes an increase in at least one thereof is generated using a portion of one isopycnic surface (EC) as a part, a hypothesis curved surface (HC) of subject data (CD) is generated on the threshold space (SS) to set a position intersecting the threshold curved surface (SC) to a pruning threshold (PS), and a plurality of hypotheses of the subject data (CD) are pruned. Thereby, there is provided a data processing device of which at least one of the recognition speed and the recognition accuracy is higher than in the related art.

    SPEECH PROCESSING APPARATUS, CONTROL METHOD THEREOF, STORAGE MEDIUM STORING CONTROL PROGRAM THEREOF, AND VEHICLE, INFORMATION PROCESSING APPARATUS, AND INFORMATION PROCESSING SYSTEM INCLUDING THE SPEECH PROCESSING APPARATUS
    10.
    Type: patent application
    Status: under examination (published)

    Publication No.: US20130311175A1

    Publication date: 2013-11-21

    Application No.: US13978671

    Filing date: 2011-12-03

    IPC classification: G10L21/0216

    Abstract: An apparatus of this invention is a speech processing apparatus that acquires pseudo speech from a mixture sound including desired speech and noise. The speech processing apparatus includes a first microphone that inputs a first mixture sound including desired speech and noise and outputs a first mixture signal, a second microphone that is opened to the same sound space as that of the first microphone, inputs a second mixture sound including the desired speech and the noise at a ratio different from the first mixture sound, and outputs a second mixture signal, a sound insulator that is disposed between the first microphone and the second microphone, and a noise suppression circuit that suppresses an estimated noise signal based on the first mixture signal and the second mixture signal and outputs a pseudo speech signal. With this arrangement, it is possible to, in a single sound space where desired speech and noise mix, correctly estimate the noise and reconstruct pseudo speech close to the desired speech.
