Granted invention patent
US09293131B2 Voice activity segmentation device, voice activity segmentation method, and voice activity segmentation program (in force)

  • Title: Voice activity segmentation device, voice activity segmentation method, and voice activity segmentation program
  • Title (Chinese): 语音活动分段设备,语音活动分割方法和语音活动分割程序
  • Application No.: US13814141
    Filing date: 2011-08-02
  • Publication No.: US09293131B2
    Publication date: 2016-03-22
  • Inventors: Takayuki Arakawa, Daisuke Tanaka
  • Applicants: Takayuki Arakawa, Daisuke Tanaka
  • Applicant address: JP Tokyo
  • Assignee: NEC CORPORATION
  • Current assignee: NEC CORPORATION
  • Current assignee address: JP Tokyo
  • Priority: JP2010-179180 20100810
  • International application: PCT/JP2011/068003 WO 20110802
  • International publication: WO2012/020717 WO 20120216
  • Main classification: G10L15/20
  • IPC classification: G10L15/20 G10L15/04 G10L25/87 G10L25/78
Abstract:
Provided is a noise-robust voice activity segmentation device which updates parameters used in the determination of voice-active segments without burdening the user, and also provided are a voice activity segmentation method and a voice activity segmentation program. The voice activity segmentation device comprises: a first voice activity segmentation means for determining a voice-active segment (first voice-active segment) and a voice-inactive segment (first voice-inactive segment) in a time-series of input sound by comparing a threshold value and a feature value of the time-series of the input sound; a second voice activity segmentation means for determining, after a reference speech acquired from a reference speech storage means has been superimposed on a time-series of the first voice-inactive segment, a voice-active segment and a voice-inactive segment in the time-series of the superimposed first voice-inactive segment by comparing the threshold value and a feature value of the time-series of the superimposed first voice-inactive segment; and a threshold value update means for updating the threshold value in such a way that a discrepancy rate between the determination result of the second voice activity segmentation means and a correct segmentation calculated from the reference speech is decreased.
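
The update loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not specify the feature value, the frame size, or the update rule, so the sketch assumes frame-wise log energy as the feature, non-overlapping frames, and a simple rule that tries the current threshold and one step either side, keeping whichever gives the lowest discrepancy rate. All function and parameter names (frame_log_energy, segment, discrepancy_rate, update_threshold, frame_len, step) are hypothetical.

```python
import math


def frame_log_energy(samples, frame_len):
    """Log energy of each non-overlapping frame (assumed feature value)."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        feats.append(math.log(energy + 1e-12))  # small offset avoids log(0)
    return feats


def segment(feats, threshold):
    """Mark a frame voice-active when its feature exceeds the threshold."""
    return [f > threshold for f in feats]


def discrepancy_rate(decisions, correct):
    """Fraction of frames where the decision differs from the correct label."""
    n = min(len(decisions), len(correct))
    if n == 0:
        return 0.0
    return sum(d != c for d, c in zip(decisions, correct)) / n


def update_threshold(input_sound, ref_speech, ref_labels, threshold,
                     frame_len=160, step=0.1):
    """One update pass; the candidate search is an assumed rule, not the patent's."""
    # First segmentation: split the input into active and inactive frames.
    first = segment(frame_log_energy(input_sound, frame_len), threshold)

    # Keep only the samples of the first voice-inactive frames (noise only).
    inactive = []
    for i, active in enumerate(first):
        if not active:
            inactive.extend(input_sound[i * frame_len:(i + 1) * frame_len])

    # Superimpose the reference speech on the inactive samples.
    n = min(len(inactive), len(ref_speech))
    n -= n % frame_len
    if n == 0:
        return threshold  # nothing to learn from in this pass
    mixed = [inactive[k] + ref_speech[k] for k in range(n)]

    # Second segmentation, scored against the known-correct labels.
    mixed_feats = frame_log_energy(mixed, frame_len)
    correct = ref_labels[:len(mixed_feats)]
    candidates = [threshold, threshold - step, threshold + step]
    return min(candidates,
               key=lambda t: discrepancy_rate(segment(mixed_feats, t), correct))
```

Because the correct labels come from the stored reference speech rather than from the user, each pass can be scored without manual annotation, which is the point the abstract makes about not burdening the user. In this sketch ties keep the current threshold, and a pass in which no frame is judged inactive leaves the threshold unchanged.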