Invention Grant
US09208778B2 System and method for combining frame and segment level processing, via temporal pooling, for phonetic classification (Active)

System and method for combining frame and segment level processing, via temporal pooling, for phonetic classification
Abstract:
Disclosed herein are systems, methods, and non-transitory computer-readable storage media for combining frame and segment level processing, via temporal pooling, for phonetic classification. A frame processor unit receives an input and extracts time-dependent features from the input. A plurality of pooling interface units generates a plurality of feature vectors by pooling the time-dependent features and selecting a plurality of time-dependent features according to a plurality of selection strategies. Next, a plurality of segmental classification units generates scores for the feature vectors. Each segmental classification unit (SCU) can be dedicated to a specific pooling interface unit (PIU) to form a PIU-SCU combination. Multiple PIU-SCU combinations can be further combined to form an ensemble of combinations, and the ensemble can be diversified by varying the pooling operations used by the PIU-SCU combinations. Based on the scores, the plurality of segmental classification units selects a class label and returns a result.
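The flow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the pooling functions (`pool_mean`, `pool_max`, `pool_thirds_mean`), the toy linear scorer standing in for a segmental classification unit, the score-summing rule used to combine the ensemble, and the phone labels and weights in the example. Frame-level features are assumed to be already extracted as a list of equal-length vectors.

```python
from statistics import mean

# Hypothetical pooling interface units (PIUs): each maps a variable-length
# list of frame feature vectors to one fixed-length segment vector.
def pool_mean(frames):
    return [mean(col) for col in zip(*frames)]

def pool_max(frames):
    return [max(col) for col in zip(*frames)]

def pool_thirds_mean(frames):
    """Mean-pool the first, middle, and last third, then concatenate."""
    n = len(frames)
    bounds = [0, n // 3, 2 * n // 3, n]
    pooled = []
    for start, end in zip(bounds, bounds[1:]):
        chunk = frames[start:end] or frames  # very short segment: use all frames
        pooled.extend(pool_mean(chunk))
    return pooled

def linear_scores(vector, weights):
    """Toy segmental classification unit (SCU): one dot-product score per label."""
    return {label: sum(w * x for w, x in zip(ws, vector))
            for label, ws in weights.items()}

def classify_segment(frames, ensemble):
    """Sum the scores of every PIU-SCU pair and return (best label, totals)."""
    totals = {}
    for pool, weights in ensemble:
        for label, score in linear_scores(pool(frames), weights).items():
            totals[label] = totals.get(label, 0.0) + score
    return max(totals, key=totals.get), totals

if __name__ == "__main__":
    # Three frames with two features each (made-up values).
    frames = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]]
    # Two PIU-SCU combinations with made-up weights for two phone labels.
    ensemble = [
        (pool_mean, {"aa": [1.0, 0.0], "iy": [0.0, 0.5]}),
        (pool_max, {"aa": [0.5, 0.0], "iy": [0.0, 1.0]}),
    ]
    print(classify_segment(frames, ensemble))
```

Pairing each classifier with its own pooling function mirrors the PIU-SCU combination in the abstract, and swapping or adding pooling functions is one way the ensemble could be diversified. Summing scores is only one possible combination rule.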