Augmented multi-tier classifier for multi-modal voice activity detection

    Publication number: US09892745B2

    Publication date: 2018-02-13

    Application number: US13974453

    Filing date: 2013-08-23

    CPC classification number: G10L25/78 G06K9/00335 G10L15/24 G10L25/84

    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for detecting voice activity in a media signal in an augmented, multi-tier classifier architecture. A system configured to practice the method can receive, from a first classifier, a first voice activity indicator detected in a first modality for a human subject. Then, the system can receive, from a second classifier, a second voice activity indicator detected in a second modality for the human subject, wherein the first voice activity indicator and the second voice activity indicator are based on the human subject at a same time, and wherein the first modality and the second modality are different. The system can concatenate, via a third classifier, the first voice activity indicator and the second voice activity indicator with original features of the human subject, to yield a classifier output, and determine voice activity based on the classifier output.
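
    The data flow described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the two first-tier classifiers are hypothetical threshold rules, the audio and visual modalities are assumed for illustration (the abstract only requires that the two modalities differ), and the third classifier is an assumed linear scorer with hand-picked weights. All function names, thresholds, and feature values are illustrative assumptions.

```python
from typing import List, Sequence


def audio_classifier(audio_features: Sequence[float]) -> float:
    """Hypothetical first classifier: 1.0 if mean audio feature exceeds 0.5."""
    return 1.0 if sum(audio_features) / len(audio_features) > 0.5 else 0.0


def visual_classifier(visual_features: Sequence[float]) -> float:
    """Hypothetical second classifier: 1.0 if mean visual feature exceeds 0.3."""
    return 1.0 if sum(visual_features) / len(visual_features) > 0.3 else 0.0


def augmented_input(audio_features: Sequence[float],
                    visual_features: Sequence[float]) -> List[float]:
    """Concatenate both first-tier indicators with the original features.

    Both feature sets are assumed to describe the same subject at the same time.
    """
    a = audio_classifier(audio_features)
    v = visual_classifier(visual_features)
    return [a, v, *audio_features, *visual_features]


def third_classifier(x: Sequence[float],
                     weights: Sequence[float],
                     bias: float) -> float:
    """Assumed linear scorer over the augmented vector (illustrative only)."""
    if len(x) != len(weights):
        raise ValueError("weights must match the augmented vector length")
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def detect_voice_activity(audio_features: Sequence[float],
                          visual_features: Sequence[float],
                          weights: Sequence[float],
                          bias: float) -> bool:
    """Decide voice activity from the third classifier's output."""
    x = augmented_input(audio_features, visual_features)
    return third_classifier(x, weights, bias) > 0.0


# Hand-picked, purely illustrative parameters for a 2 + 2 + 2 element vector.
WEIGHTS = [1.0, 1.0, 0.5, 0.5, 0.5, 0.5]
BIAS = -2.0

speaking = detect_voice_activity([0.8, 0.6], [0.4, 0.5], WEIGHTS, BIAS)
silent = detect_voice_activity([0.1, 0.2], [0.0, 0.1], WEIGHTS, BIAS)
```

    The point of the sketch is the augmentation step: the third classifier sees the original features alongside the first-tier indicators, rather than the indicators alone. In practice the three classifiers would presumably be trained models rather than fixed rules.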

    2.
    Invention application
    AUGMENTED MULTI-TIER CLASSIFIER FOR MULTI-MODAL VOICE ACTIVITY DETECTION (in force)

    Publication number: US20150058004A1

    Publication date: 2015-02-26

    Application number: US13974453

    Filing date: 2013-08-23

    CPC classification number: G10L25/78 G06K9/00335 G10L15/24 G10L25/84

