Segmentation of audio data for indexing of conversational speech for real-time or postprocessing applications

Invention Grant

US5655058A Segmentation of audio data for indexing of conversational speech for real-time or postprocessing applications (Expired)

Patent Title: Segmentation of audio data for indexing of conversational speech for real-time or postprocessing applications
Application No.: US226519

Application Date: 1994-04-12
Publication No.: US5655058A

Publication Date: 1997-08-05
Inventor: Vijay Balasubramanian, Francine R. Chen, Philip A. Chou, Donald G. Kimber, Alex D. Poon, Karon A. Weber, Lynn D. Wilcox
Applicant: Vijay Balasubramanian, Francine R. Chen, Philip A. Chou, Donald G. Kimber, Alex D. Poon, Karon A. Weber, Lynn D. Wilcox
Applicant Address: Stamford, CT
Assignee: Xerox Corporation
Current Assignee: Xerox Corporation
Current Assignee Address: Stamford, CT
Main IPC: G10L15/04
IPC: G10L15/04; G10L15/10; G10L15/14; G10L17/00; H04R3/00; G10L5/06; G10L9/00

Abstract:

A method for segmenting audio data, comprising speech from a plurality of individual speakers, according to speaker is provided. The method comprises providing individual HMMs for each individual speaker, each individual HMM including at least one state, and constructing a speaker network HMM by connecting the individual HMMs in parallel. The audio data is then divided into segments by determining a most likely sequence of states through the speaker network HMM, each of the segments being associated with one of the individual HMMs. Afterward, the speaker of each of the segments is identified. The segmented data may be used to form an index into the audio data according to speaker.
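
The decoding step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not specify features, emission models, or transition probabilities, so the scalar per-frame feature, the one-state Gaussian speaker models, the `stay_prob` parameter, and the function names below are all assumptions chosen for illustration. The sketch joins the speaker models in parallel, runs Viterbi decoding in the log domain to find the most likely state sequence, and merges runs of the same state into speaker-labeled segments.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian (assumed emission model, not from the patent)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def segment_by_speaker(frames, speakers, stay_prob=0.9):
    """Viterbi decoding over one-state speaker models connected in parallel.

    frames:    list of scalar features, one per audio frame (toy stand-in)
    speakers:  dict mapping speaker name -> (mean, variance) of its single state
    stay_prob: assumed probability of remaining with the same speaker
    Returns a list of (start_frame, end_frame_exclusive, speaker_name).
    """
    names = list(speakers)
    n = len(names)
    if not frames or n == 0:
        return []

    log_stay = math.log(stay_prob)
    # Remaining probability mass is split evenly over the other speakers.
    log_switch = math.log((1 - stay_prob) / (n - 1)) if n > 1 else float("-inf")

    def trans(i, j):
        return log_stay if i == j else log_switch

    # Initialization: uniform prior over speakers.
    score = [math.log(1.0 / n) + log_gauss(frames[0], *speakers[name])
             for name in names]
    back = []  # back[t-1][j] = best predecessor of state j at frame t

    for x in frames[1:]:
        new_score, ptr = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + trans(i, j))
            new_score.append(score[best_i] + trans(best_i, j)
                             + log_gauss(x, *speakers[names[j]]))
            ptr.append(best_i)
        score = new_score
        back.append(ptr)

    # Backtrace the most likely state sequence.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()

    # Merge consecutive frames with the same state into segments.
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((start, t, names[path[start]]))
            start = t
    return segments


if __name__ == "__main__":
    toy_frames = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2, 0.1, 0.0]
    toy_speakers = {"A": (0.0, 1.0), "B": (5.0, 1.0)}
    print(segment_by_speaker(toy_frames, toy_speakers))
```

The resulting list of segments is already a speaker index into the audio: each entry gives a frame range and the model it was assigned to. A high `stay_prob` discourages rapid switching between speakers, which is one simple way to keep segments from fragmenting.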

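IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/04	. Segmentation; word boundary detection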