-
Publication No.: US09595264B2
Publication date: 2017-03-14
Application No.: US14506955
Application date: 2014-10-06
Applicant: Avaya Inc.
Inventor: John Jacob, Keith Ponting, Wendy J. Holmes
CPC classification number: G10L19/00, G10L15/08, G10L25/51, H04M3/4936
Abstract: To detect events in an audio stream, frames of an audio signal (e.g., frames generated by a codec for a voice call or music stream) are received. Based on information in the frames, an index is used to look up an entry in a table associated with the codec. Each entry in the table indicates a likelihood that a frame matches a sound model element. The likelihood is used in the search for a sound bite, word, and/or phrase in the audio signal. The process of dynamic programming is used to find the combined likelihood for a match of the word, phrase, and/or sound bite to a region of the audio stream. Upon detection of the word, phrase, and/or sound bite in the audio stream, an event is generated, such as notifying a person or logging the event in a database.
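The lookup-and-search procedure described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation; it rests on several assumptions that do not come from the source: each frame has already been reduced to a single integer index, the table stores log-likelihood ratios against a background model (so positive values favor a match), the phrase is modeled as a left-to-right chain of sound model elements, and the names `detect_phrase`, `table`, and `threshold` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

NEG_INF = float("-inf")


def detect_phrase(
    frame_indices: Sequence[int],
    table: List[List[float]],
    threshold: float,
) -> Optional[Tuple[int, float]]:
    """Return (end_frame, score) of the best match, or None if no match.

    table[s][i] is the assumed score that a frame whose codec-derived
    index is i matches sound model element s (hypothetical layout).
    """
    n_elements = len(table)
    prev = [NEG_INF] * n_elements  # best scores at the previous frame
    best: Optional[Tuple[int, float]] = None

    for t, idx in enumerate(frame_indices):
        cur = [NEG_INF] * n_elements
        for s in range(n_elements):
            if s == 0:
                # Stay in the first element, or start a new match here.
                entry = max(prev[0], 0.0)
            else:
                # Stay in element s, or advance from element s - 1.
                entry = max(prev[s], prev[s - 1])
            if entry != NEG_INF:
                # Table lookup replaces any per-frame acoustic computation.
                cur[s] = entry + table[s][idx]

        # A match is complete when the last element has been reached.
        final = cur[-1]
        if final >= threshold and (best is None or final > best[1]):
            best = (t, final)
        prev = cur

    return best


# Two model elements, three possible frame indices (index 2 = background).
example_table = [
    [2.0, -1.0, -1.0],  # element 0 favors index 0
    [-1.0, 2.0, -1.0],  # element 1 favors index 1
]
match = detect_phrase([2, 2, 0, 0, 1, 1, 2], example_table, threshold=5.0)
no_match = detect_phrase([2, 2, 2], example_table, threshold=5.0)
```

In this sketch a caller would generate the event (for example, notifying a person or writing a database record) whenever the function returns a result rather than `None`.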
-
Publication No.: US20160098999A1
Publication date: 2016-04-07
Application No.: US14506955
Application date: 2014-10-06
Applicant: Avaya Inc.
Inventor: John Jacob, Keith Ponting, Wendy J. Holmes
IPC: G10L19/00
CPC classification number: G10L19/00, G10L15/08, G10L25/51, H04M3/4936
Abstract: To detect events in an audio stream, frames of an audio signal (e.g., frames generated by a codec for a voice call or music stream) are received. Based on information in the frames, an index is used to look up an entry in a table associated with the codec. Each entry in the table indicates a likelihood that a frame matches a sound model element. The likelihood is used in the search for a sound bite, word, and/or phrase in the audio signal. The process of dynamic programming is used to find the combined likelihood for a match of the word, phrase, and/or sound bite to a region of the audio stream. Upon detection of the word, phrase, and/or sound bite in the audio stream, an event is generated, such as notifying a person or logging the event in a database.
-