SPEECH SYLLABLE/VOWEL/PHONE BOUNDARY DETECTION USING AUDITORY ATTENTION CUES
    1.
    Invention Application
    SPEECH SYLLABLE/VOWEL/PHONE BOUNDARY DETECTION USING AUDITORY ATTENTION CUES (In Force)

    Publication No.: US20150073794A1

    Publication Date: 2015-03-12

    Application No.: US14307426

    Filing Date: 2014-06-17

    Abstract: In syllable or vowel or phone boundary detection during speech, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more syllable or vowel or phone boundaries in the input window of sound can be detected by mapping the cumulative gist vector to one or more syllable or vowel or phone boundary characteristics using a machine learning algorithm.
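
    As a rough illustration of the pipeline this abstract describes, the Python sketch below approximates the auditory spectrum with a log-magnitude STFT, applies three toy two-dimensional spectro-temporal filters, and pools each resulting feature map over a coarse grid into an auditory gist vector before concatenating them into a cumulative gist vector. The filter shapes, grid size, and function names are illustrative assumptions, not the patented implementation.

```python
# Minimal sketch of the gist-vector pipeline, assuming an STFT-based stand-in
# for the auditory spectrum and hand-picked toy receptive filters.
import numpy as np
from scipy.signal import stft, convolve2d

def auditory_spectrum(samples, sample_rate):
    """Rough stand-in for an auditory spectrum: log-magnitude STFT."""
    _, _, spec = stft(samples, fs=sample_rate, nperseg=512, noverlap=384)
    return np.log1p(np.abs(spec))                        # (freq_bins, time_frames)

def spectro_temporal_filters():
    """Toy 2-D receptive filters: intensity, frequency contrast, temporal contrast."""
    intensity = np.ones((3, 3)) / 9.0                    # local average
    freq_contrast = np.array([[-1.0], [2.0], [-1.0]])    # vertical (frequency) edge
    temp_contrast = np.array([[-1.0, 2.0, -1.0]])        # horizontal (temporal) edge
    return [intensity, freq_contrast, temp_contrast]

def gist_vector(feature_map, grid=(4, 5)):
    """Average the feature map over a coarse grid and flatten to a gist vector."""
    rows = np.array_split(feature_map, grid[0], axis=0)
    cells = [np.array_split(r, grid[1], axis=1) for r in rows]
    return np.array([c.mean() for row in cells for c in row])

def cumulative_gist(samples, sample_rate):
    """Concatenate (augment) the gist vectors of all feature maps."""
    spec = auditory_spectrum(samples, sample_rate)
    maps = [convolve2d(spec, f, mode="same") for f in spectro_temporal_filters()]
    return np.concatenate([gist_vector(m) for m in maps])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    window = rng.standard_normal(16000)                  # one second of noise at 16 kHz
    print(cumulative_gist(window, 16000).shape)          # 3 filters x 20 grid cells = (60,)
```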


    Speech syllable/vowel/phone boundary detection using auditory attention cues
    2.
    Invention Grant
    Speech syllable/vowel/phone boundary detection using auditory attention cues (In Force)

    Publication No.: US09251783B2

    Publication Date: 2016-02-02

    Application No.: US14307426

    Filing Date: 2014-06-17

    Abstract: In syllable or vowel or phone boundary detection during speech, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more syllable or vowel or phone boundaries in the input window of sound can be detected by mapping the cumulative gist vector to one or more syllable or vowel or phone boundary characteristics using a machine learning algorithm.
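
    The abstract leaves the final mapping step open to any machine learning algorithm. The sketch below shows one way that step could look: a small classifier is trained on cumulative gist vectors labeled as boundary or non-boundary, then applied to held-out analysis windows. The MLP choice and the synthetic data are assumptions for illustration only.

```python
# Minimal sketch of mapping cumulative gist vectors to boundary labels,
# assuming synthetic 60-dimensional gist vectors and a small MLP classifier.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)

# Pretend gist vectors: 200 windows x 60 features; label 1 means the window
# contains a syllable/vowel/phone boundary, 0 means it does not.
gist_vectors = rng.standard_normal((200, 60))
labels = (gist_vectors[:, 0] + 0.1 * rng.standard_normal(200) > 0).astype(int)

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
clf.fit(gist_vectors[:150], labels[:150])

# Predict boundary presence for held-out windows.
print(clf.score(gist_vectors[150:], labels[150:]))
```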


    Emotion recognition using auditory attention cues extracted from users voice
    4.
    Invention Grant
    Emotion recognition using auditory attention cues extracted from users voice (In Force)

    Publication No.: US09020822B2

    Publication Date: 2015-04-28

    Application No.: US13655825

    Filing Date: 2012-10-19

    CPC classification number: G10L15/00 G10L25/63

    Abstract: Emotion recognition may be implemented on an input window of sound. One or more auditory attention features may be extracted from an auditory spectrum for the window using one or more two-dimensional spectro-temporal receptive filters. One or more feature maps corresponding to the one or more auditory attention features may be generated. Auditory gist features may be extracted from feature maps, and the auditory gist features may be analyzed to determine one or more emotion classes corresponding to the input window of sound. In addition, a bottom-up auditory attention model may be used to select emotionally salient parts of speech and execute emotion recognition only on the salient parts of speech while ignoring the rest of the speech signal.
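
    The salience-gating idea in this abstract can be sketched as follows: a bottom-up saliency score selects the most salient stretches of the input, and only those stretches are passed on to a multi-class emotion classifier. The saliency proxy (spectral flux), the threshold, and the classifier used here are assumptions for illustration, not the patented attention model.

```python
# Minimal sketch of salience-gated emotion classification, assuming spectral
# flux as the bottom-up saliency proxy and a logistic-regression classifier.
import numpy as np
from scipy.signal import stft
from sklearn.linear_model import LogisticRegression

def frame_saliency(samples, sample_rate):
    """Per-frame saliency proxy: spectral flux of the log-magnitude spectrum."""
    _, _, spec = stft(samples, fs=sample_rate, nperseg=512, noverlap=384)
    logmag = np.log1p(np.abs(spec))
    flux = np.sum(np.maximum(np.diff(logmag, axis=1), 0.0), axis=0)
    return flux / (flux.max() + 1e-9)

def salient_mask(samples, sample_rate, threshold=0.5):
    """Boolean mask over frames: True where a frame counts as salient."""
    return frame_saliency(samples, sample_rate) >= threshold

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    speech = rng.standard_normal(16000)                  # placeholder signal
    mask = salient_mask(speech, 16000)
    print(f"{mask.sum()} of {mask.size} frames kept for emotion classification")

    # Downstream: a classifier maps gist features of the salient stretches to
    # emotion classes (synthetic features and labels shown here).
    feats = rng.standard_normal((100, 40))
    emotions = rng.integers(0, 4, size=100)              # e.g. 4 emotion classes
    clf = LogisticRegression(max_iter=1000).fit(feats, emotions)
    print(clf.predict(feats[:5]))
```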

