Patent search: ap:("Sony Computer Entertainment Inc.") AND inv:"Ozlem Kalinli-Akbacak"

1.

Invention application
SPEECH SYLLABLE/VOWEL/PHONE BOUNDARY DETECTION USING AUDITORY ATTENTION CUES [In force]

Publication (announcement) number: US20150073794A1

Publication (announcement) date: 2015-03-12

Application number: US14307426

Application date: 2014-06-17

Applicant: Sony Computer Entertainment, Inc.

Inventor: Ozlem Kalinli-Akbacak, Ruxin Chen

IPC: G10L15/05

CPC classification number: G10L15/05, G10L15/04, G10L15/16, G10L15/24, G10L15/34, G10L25/03

Abstract: In syllable or vowel or phone boundary detection during speech, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more syllable or vowel or phone boundaries in the input window of sound can be detected by mapping the cumulative gist vector to one or more syllable or vowel or phone boundary characteristics using a machine learning algorithm.
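The feature-extraction steps in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the three tiny kernels stand in for the two-dimensional spectro-temporal receptive filters, the grid-mean summary is an assumed way of forming a gist vector, "augmentation" is read as concatenation, and the function names (filter2d_valid, gist_vector, cumulative_gist) are hypothetical. The final machine learning step is left out.

```python
from typing import List

Matrix = List[List[float]]


def filter2d_valid(spectrum: Matrix, kernel: Matrix) -> Matrix:
    """Correlate a 2-D kernel with the spectrum (no padding) to get one feature map."""
    kh, kw = len(kernel), len(kernel[0])
    rows = len(spectrum) - kh + 1
    cols = len(spectrum[0]) - kw + 1
    return [
        [
            sum(kernel[i][j] * spectrum[r + i][c + j]
                for i in range(kh) for j in range(kw))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def gist_vector(feature_map: Matrix, grid_rows: int, grid_cols: int) -> List[float]:
    """Assumed gist summary: the mean of each cell of a grid_rows x grid_cols grid.

    The feature map must have at least grid_rows rows and grid_cols columns.
    """
    n_rows, n_cols = len(feature_map), len(feature_map[0])
    gist = []
    for gr in range(grid_rows):
        r0, r1 = gr * n_rows // grid_rows, (gr + 1) * n_rows // grid_rows
        for gc in range(grid_cols):
            c0, c1 = gc * n_cols // grid_cols, (gc + 1) * n_cols // grid_cols
            cell = [feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            gist.append(sum(cell) / len(cell))
    return gist


def cumulative_gist(spectrum: Matrix, kernels: List[Matrix],
                    grid_rows: int = 2, grid_cols: int = 2) -> List[float]:
    """Concatenate the gist vectors of all feature maps into one vector."""
    cumulative: List[float] = []
    for kernel in kernels:
        feature_map = filter2d_valid(spectrum, kernel)
        cumulative.extend(gist_vector(feature_map, grid_rows, grid_cols))
    return cumulative


# Toy "auditory spectrum": 4 frequency channels x 5 time frames with an onset at frame 3.
toy_spectrum = [[0.0, 0.0, 0.0, 1.0, 1.0] for _ in range(4)]
toy_kernels = [
    [[1.0]],          # intensity (placeholder)
    [[-1.0, 1.0]],    # temporal contrast (placeholder)
    [[-1.0], [1.0]],  # frequency contrast (placeholder)
]
print(cumulative_gist(toy_spectrum, toy_kernels))
```

In a full system the resulting vector would be passed to a trained model that maps it to boundary characteristics; any classifier choice here would be an assumption, so none is shown.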


2.

Invention grant
Speech syllable/vowel/phone boundary detection using auditory attention cues [In force]

Publication (announcement) number: US09251783B2

Publication (announcement) date: 2016-02-02

Application number: US14307426

Application date: 2014-06-17

Applicant: Sony Computer Entertainment, Inc.

Inventor: Ozlem Kalinli-Akbacak, Ruxin Chen

IPC: G10L15/04, G10L15/05, G10L15/16, G10L15/24, G10L15/34, G10L25/03

CPC classification number: G10L15/05, G10L15/04, G10L15/16, G10L15/24, G10L15/34, G10L25/03

Abstract: In syllable or vowel or phone boundary detection during speech, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more syllable or vowel or phone boundaries in the input window of sound can be detected by mapping the cumulative gist vector to one or more syllable or vowel or phone boundary characteristics using a machine learning algorithm.


3.

Invention grant
Combining auditory attention cues with phoneme posterior scores for phone/vowel/syllable boundary detection [In force]

Publication (announcement) number: US09672811B2

Publication (announcement) date: 2017-06-06

Application number: US13901426

Application date: 2013-05-23

Applicant: Sony Computer Entertainment Inc.

Inventor: Ozlem Kalinli-Akbacak

IPC: G10L15/02, G10L15/04, G10L25/03, G10L15/16, G10L25/30

CPC classification number: G10L15/04, G10L15/16, G10L25/03, G10L25/30

Abstract: Phoneme boundaries may be determined from a signal corresponding to recorded audio by extracting auditory attention features from the signal and extracting phoneme posteriors from the signal. The auditory attention features and phoneme posteriors may then be combined to detect boundaries in the signal.
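One way to picture the combination step is frame-level concatenation of the two feature streams. The sketch below assumes that reading. The abstract does not say how the combined features are turned into boundaries, so the detector here is a plainly labeled stand-in (a threshold on the change between consecutive frames), not the method of the patent. All names and values are hypothetical.

```python
from typing import List, Sequence


def combine_features(attention: Sequence[Sequence[float]],
                     posteriors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Concatenate per-frame auditory attention features with phoneme posteriors."""
    if len(attention) != len(posteriors):
        raise ValueError("both streams must have the same number of frames")
    return [list(a) + list(p) for a, p in zip(attention, posteriors)]


def detect_boundaries(combined: Sequence[Sequence[float]],
                      threshold: float) -> List[int]:
    """Stand-in detector: flag frame t when the Euclidean distance between
    the combined vectors of frames t-1 and t exceeds the threshold."""
    boundaries = []
    for t in range(1, len(combined)):
        dist = sum((x - y) ** 2
                   for x, y in zip(combined[t], combined[t - 1])) ** 0.5
        if dist > threshold:
            boundaries.append(t)
    return boundaries


# Toy data: 4 frames, 1 attention feature, posteriors over 2 phoneme classes.
toy_attention = [[0.1], [0.1], [0.9], [0.9]]
toy_posteriors = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
toy_combined = combine_features(toy_attention, toy_posteriors)
print(detect_boundaries(toy_combined, threshold=0.5))
```

A trained classifier operating on the combined vectors would normally replace the distance threshold; the threshold only keeps the example self-contained.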

4.

Invention grant
Emotion recognition using auditory attention cues extracted from users voice [In force]

Publication (announcement) number: US09020822B2

Publication (announcement) date: 2015-04-28

Application number: US13655825

Application date: 2012-10-19

Applicant: Sony Computer Entertainment Inc.

Inventor: Ozlem Kalinli-Akbacak

IPC: G10L21/00, G10L15/00, G10L25/63

CPC classification number: G10L15/00, G10L25/63

Abstract: Emotion recognition may be implemented on an input window of sound. One or more auditory attention features may be extracted from an auditory spectrum for the window using one or more two-dimensional spectro-temporal receptive filters. One or more feature maps corresponding to the one or more auditory attention features may be generated. Auditory gist features may be extracted from feature maps, and the auditory gist features may be analyzed to determine one or more emotion classes corresponding to the input window of sound. In addition, a bottom-up auditory attention model may be used to select emotionally salient parts of speech and execute emotion recognition only on the salient parts of speech while ignoring the rest of the speech signal.
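The last sentence of this abstract describes running recognition only on salient parts of the speech. A minimal sketch of that gating idea follows. It assumes the saliency scores have already been produced by some attention model and that a classifier is supplied by the caller. The function name, the threshold, and the toy labels are hypothetical placeholders, not details from the patent.

```python
from typing import Callable, List, Sequence, Tuple


def recognize_on_salient(windows: Sequence[Sequence[float]],
                         saliency: Sequence[float],
                         threshold: float,
                         classify: Callable[[Sequence[float]], str]
                         ) -> List[Tuple[int, str]]:
    """Classify only the windows whose saliency score reaches the threshold.

    Returns (window index, label) pairs; non-salient windows are skipped.
    """
    results = []
    for index, (window, score) in enumerate(zip(windows, saliency)):
        if score >= threshold:
            results.append((index, classify(window)))
    return results


def toy_classifier(window: Sequence[float]) -> str:
    """Placeholder classifier: labels a window by its mean value."""
    return "high" if sum(window) / len(window) > 0.5 else "low"


# Toy data: three windows, the second of which is not salient.
toy_windows = [[0.9, 0.8], [0.1, 0.2], [0.2, 0.1]]
toy_saliency = [0.7, 0.2, 0.9]
print(recognize_on_salient(toy_windows, toy_saliency, 0.5, toy_classifier))
```

Passing the classifier as a callable keeps the gating logic independent of whichever emotion model is actually used.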

