-
Publication No.: US07640161B2
Publication date: 2009-12-29
Application No.: US11748319
Filing date: 2007-05-14
Applicant: Robert W. Morris, Jon A. Arrowood, Marsal Gavalda, Peter S. Cardillo, Mark Finlay, Zahi Karam
Inventor: Robert W. Morris, Jon A. Arrowood, Marsal Gavalda, Peter S. Cardillo, Mark Finlay, Zahi Karam
IPC classification: G10L13/00
CPC classification: G10L15/26, G10L2015/025, G10L2015/0638, G10L2015/088
Abstract: An approach to improving the performance of a wordspotting system includes providing an interface for interactive improvement of a phonetic representation of a query based on an operator identifying true detections and false alarms in a data set.
-
Publication No.: US20120278079A1
Publication date: 2012-11-01
Application No.: US13097830
Filing date: 2011-04-29
IPC classification: G10L15/04
CPC classification: G10L15/02, G10L19/24, G10L2015/088
Abstract: An audio processing system makes use of a number of levels of compression or data reduction, thereby providing reduced storage requirements while maintaining a high accuracy of keyword detection in the original audio input.
-
Publication No.: US09361879B2
Publication date: 2016-06-07
Application No.: US12391395
Filing date: 2009-02-24
Applicant: Robert W. Morris, Jon A. Arrowood, Mark A. Clements, Kenneth King Griggs, Peter S. Cardillo, Marsal Gavalda
Inventor: Robert W. Morris, Jon A. Arrowood, Mark A. Clements, Kenneth King Griggs, Peter S. Cardillo, Marsal Gavalda
IPC classification: G10L15/10, G10L15/187, G10L25/54, G10L15/08
CPC classification: G10L15/10, G10L15/187, G10L25/54, G10L2015/088
Abstract: In one aspect, a method for processing media includes accepting a query. One or more language patterns are identified that are similar to the query. A putative instance of the query is located in the media. The putative instance is associated with a corresponding location in the media. The media in a vicinity of the putative instance is compared to the identified language patterns and data characterizing the putative instance of the query is provided according to the comparing of the media to the language patterns, for example, as a score for the putative instance that is determined according to the comparing of the media to the language patterns.
-
Publication No.: US08719022B2
Publication date: 2014-05-06
Application No.: US13097830
Filing date: 2011-04-29
IPC classification: G10L15/00
CPC classification: G10L15/02, G10L19/24, G10L2015/088
Abstract: An audio processing system makes use of a number of levels of compression or data reduction, thereby providing reduced storage requirements while maintaining a high accuracy of keyword detection in the original audio input.
-
Publication No.: US20110044447A1
Publication date: 2011-02-24
Application No.: US12545282
Filing date: 2009-08-21
CPC classification: H04M3/51, G06T11/206, G10L2015/088, H04M2201/38, H04M2203/357
Abstract: Techniques for processing data representative of text associated with one or more content sources to generate a specification of a set of keyphrases of interest; processing a first set of audio signals collected during a first time period to generate first data characterizing putative occurrences of one or more keyphrases of the set in the first set of audio signals; evaluating the first data to generate keyphrase-specific comparison values for the first set of audio signals; deriving first trending data between the first set of audio signals and a second set of audio signals based in part on an analysis of the keyphrase-specific comparison values for the first set of audio signals relative to stored keyphrase-specific baseline values; and generating a visual representation of at least some of the first trending data and causing the visual representation of the first trending data to be presented on a display terminal.
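The comparison of keyphrase-specific values against stored baselines described in the abstract of US20110044447A1 can be illustrated with a minimal sketch. The abstract does not say how comparison values or trends are computed; the hits-per-hour rate, the relative-change formula, and the names `comparison_values` and `trending` are all assumptions made for illustration only.

```python
def comparison_values(hit_counts, hours_of_audio):
    """Assumed comparison value: putative keyphrase hits per hour of audio."""
    return {phrase: count / hours_of_audio for phrase, count in hit_counts.items()}


def trending(current, baseline):
    """Assumed trend measure: relative change of each keyphrase versus its baseline."""
    trends = {}
    for phrase, value in current.items():
        base = baseline.get(phrase)
        if base:  # skip keyphrases with no (or zero) stored baseline
            trends[phrase] = (value - base) / base
    return trends


# Hypothetical counts of putative occurrences in the first set of audio signals.
first_period = comparison_values({"refund": 30, "upgrade": 10}, hours_of_audio=10.0)
# Hypothetical stored keyphrase-specific baseline values (hits per hour).
baseline = {"refund": 2.0, "upgrade": 1.0}
trends = trending(first_period, baseline)
```

In this sketch a positive value means the keyphrase is occurring more often than its baseline; the resulting dictionary is the kind of data a display component could chart.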
-
Publication No.: US20100217596A1
Publication date: 2010-08-26
Application No.: US12391395
Filing date: 2009-02-24
Applicant: Robert W. Morris, Jon A. Arrowood, Mark A. Clements, Kenneth King Griggs, Peter S. Cardillo, Marsal Gavalda
Inventor: Robert W. Morris, Jon A. Arrowood, Mark A. Clements, Kenneth King Griggs, Peter S. Cardillo, Marsal Gavalda
CPC classification: G10L15/10, G10L15/187, G10L25/54, G10L2015/088
Abstract: In one aspect, a method for processing media includes accepting a query. One or more language patterns are identified that are similar to the query. A putative instance of the query is located in the media. The putative instance is associated with a corresponding location in the media. The media in a vicinity of the putative instance is compared to the identified language patterns and data characterizing the putative instance of the query is provided according to the comparing of the media to the language patterns, for example, as a score for the putative instance that is determined according to the comparing of the media to the language patterns.
-
Publication No.: US20120284026A1
Publication date: 2012-11-08
Application No.: US13102175
Filing date: 2011-05-06
Applicant: Peter S. Cardillo, Marsal Gavalda
Inventor: Peter S. Cardillo, Marsal Gavalda
IPC classification: G10L17/00
Abstract: In an aspect, in general, a method for computer assisted speaker authentication in a voice communication session includes establishing a voice communication session between a first speaker and an agent, accepting a first voice signal from the first speaker, determining a voice characteristic measure of the first voice signal, including characterizing a similarity of the first voice signal to each of one or more stored characterizations of voice signals previously acquired from one or more known speakers, and providing an interface to the agent during the voice communication session between the agent and the first speaker, including presenting an indicator based on the determined voice characteristic measure to the agent.
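The similarity step in the abstract of US20120284026A1 can be sketched as follows. The abstract does not specify the voice representation or the similarity measure; fixed-length feature vectors, cosine similarity, the 0.8 threshold, and the names `cosine_similarity` and `voice_indicator` are assumptions chosen only to make the idea concrete.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def voice_indicator(caller_vec, enrolled, threshold=0.8):
    """Score the caller against each stored characterization (assumed non-empty)
    and return the best match plus a label that could be shown to the agent."""
    scores = {name: cosine_similarity(caller_vec, vec) for name, vec in enrolled.items()}
    best = max(scores, key=scores.get)
    label = "likely match" if scores[best] >= threshold else "no confident match"
    return best, scores[best], label


# Hypothetical stored characterizations of known speakers.
enrolled = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
best, score, label = voice_indicator([2.0, 0.0, 0.0], enrolled)
```

The returned label stands in for the indicator presented to the agent during the call; a real system would derive the vectors from audio rather than use hand-written values.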
-
Publication No.: US20120010736A1
Publication date: 2012-01-12
Application No.: US12833244
Filing date: 2010-07-09
Applicant: Peter S. Cardillo, Marsal Gavalda
Inventor: Peter S. Cardillo, Marsal Gavalda
CPC classification: G11B27/322, G06F16/41, G06K9/0055, G10L25/54, G11B27/28, H04N21/4394, H04N21/44016, H04N21/8106, H04N21/812
Abstract: A method for detecting sections of a known input in an unknown input includes processing the known input to form a series of discrete-valued feature values associated with corresponding time locations in the known input. Index data associating a plurality of the feature values each with one or more time locations in the known input is then formed. The unknown input is processed to form a series of discrete-valued feature values. A time offset between the unknown input and the known input is determined by determining time locations in the known input associated with the feature values of the unknown input. Determining the time offset may include maintaining a distribution of time offsets based on successive determined time locations of the feature values of the unknown input.
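The indexing and offset-distribution steps in the abstract of US20120010736A1 can be sketched in a few lines. This is an illustration under assumptions, not the patented implementation: feature values are taken to be small integers, one per frame, and the names `build_index` and `estimate_offset` are hypothetical.

```python
from collections import Counter, defaultdict


def build_index(known_features):
    """Map each discrete feature value to the time locations where it occurs
    in the known input."""
    index = defaultdict(list)
    for t, feature in enumerate(known_features):
        index[feature].append(t)
    return index


def estimate_offset(index, unknown_features):
    """Maintain a distribution (histogram) of candidate time offsets and
    return the most frequent offset together with its vote count."""
    offsets = Counter()
    for t_unknown, feature in enumerate(unknown_features):
        for t_known in index.get(feature, ()):
            offsets[t_known - t_unknown] += 1
    if not offsets:
        return None, 0
    return offsets.most_common(1)[0]


# Hypothetical feature sequences; the unknown input is a slice of the known one.
known = [3, 7, 1, 4, 9, 2, 8, 5, 6, 0]
unknown = [9, 2, 8, 5]
index = build_index(known)
offset, votes = estimate_offset(index, unknown)
```

Voting over offsets tolerates occasional feature mismatches: frames that disagree scatter their votes, while frames from a true match pile up on a single offset.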
-
Publication No.: US07769587B2
Publication date: 2010-08-03
Application No.: US12323601
Filing date: 2008-11-26
CPC classification: G10L15/187, G06F17/30681, G06F17/30746, G10L13/08, Y10S707/99933, Y10S707/99934, Y10S707/99935
Abstract: An improved method and apparatus is disclosed which uses probabilistic techniques to map an input search string with a prestored audio file, and recognize certain portions of a search string phonetically. An improved interface is disclosed which permits users to input search strings, linguistics, phonetics, or a combination of both, and also allows logic functions to be specified by indicating how far separated specific phonemes are in time.
-
Publication No.: US07475065B1
Publication date: 2009-01-06
Application No.: US11609127
Filing date: 2006-12-11
CPC classification: G10L15/187, G06F17/30681, G06F17/30746, G10L13/08, Y10S707/99933, Y10S707/99934, Y10S707/99935
Abstract: An improved method and apparatus is disclosed which uses probabilistic techniques to map an input search string with a prestored audio file, and recognize certain portions of a search string phonetically. An improved interface is disclosed which permits users to input search strings, linguistics, phonetics, or a combination of both, and also allows logic functions to be specified by indicating how far separated specific phonemes are in time.