Patent search: ap:("Florent Perronnin" OR "Roland Kuhn" OR "Patrick Nguyen" OR "Jean-Claude Junqua") AND inv:"Patrick Nguyen" (page 2)

11.

Patent application
Voice tagging, voice annotation, and speech recognition for portable devices with optional post processing [In force]

Publication number: US20050075881A1

Publication date: 2005-04-07

Application number: US10677174

Filing date: 2003-10-02

Applicants: Luca Rigazio, Robert Boman, Patrick Nguyen, Jean-Claude Junqua

Inventors: Luca Rigazio, Robert Boman, Patrick Nguyen, Jean-Claude Junqua

IPC classification: G10L15/26, G10L21/00

CPC classification: G06F17/30796, G10L15/26

Abstract: A media capture device has an audio input receptive of user speech relating to a media capture activity in close temporal relation to the media capture activity. A plurality of focused speech recognition lexica respectively relating to media capture activities are stored on the device, and a speech recognizer recognizes the user speech based on a selected one of the focused speech recognition lexica. A media tagger tags captured media with generated speech recognition text, and a media annotator annotates the captured media with a sample of the user speech that is suitable for input to a speech recognizer. Tagging and annotating are based on close temporal relation between receipt of the user speech and capture of the captured media. Annotations may be converted to tags during post processing, employed to edit a lexicon using letter-to-sound rules and spelled word input, or matched directly to speech to retrieve captured media.


12.

Granted patent
Voice tagging, voice annotation, and speech recognition for portable devices with optional post processing [In force]

Publication number: US07324943B2

Publication date: 2008-01-29

Application number: US10677174

Filing date: 2003-10-02

Applicants: Luca Rigazio, Robert Boman, Patrick Nguyen, Jean-Claude Junqua

Inventors: Luca Rigazio, Robert Boman, Patrick Nguyen, Jean-Claude Junqua

IPC classification: G10L21/00, H04N5/76

CPC classification: G06F17/30796, G10L15/26

Abstract: A media capture device has an audio input receptive of user speech relating to a media capture activity in close temporal relation to the media capture activity. A plurality of focused speech recognition lexica respectively relating to media capture activities are stored on the device, and a speech recognizer recognizes the user speech based on a selected one of the focused speech recognition lexica. A media tagger tags captured media with generated speech recognition text, and a media annotator annotates the captured media with a sample of the user speech that is suitable for input to a speech recognizer. Tagging and annotating are based on close temporal relation between receipt of the user speech and capture of the captured media. Annotations may be converted to tags during post processing, employed to edit a lexicon using letter-to-sound rules and spelled word input, or matched directly to speech to retrieve captured media.


13.

Patent application
Speech data mining for call center management [Under examination, published]

Publication number: US20050010411A1

Publication date: 2005-01-13

Application number: US10616006

Filing date: 2003-07-09

Applicants: Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua, Robert Boman

Inventors: Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua, Robert Boman

IPC classification: G10L15/26, G10L17/00, G10L15/00

CPC classification: G10L15/26, G10L17/00

Abstract: A speech data mining system for use in generating a rich transcription having utility in call center management includes a speech differentiation module differentiating between speech of interacting speakers, and a speech recognition module improving automatic recognition of speech of one speaker based on interaction with another speaker employed as a reference speaker. A transcript generation module generates a rich transcript based on recognized speech of the speakers. Focused, interactive language models improve recognition of a customer on a low quality channel using context extracted from speech of a call center operator on a high quality channel with a speech model adapted to the operator. Mined speech data includes number of interaction turns, customer frustration phrases, operator polity, interruptions, and/or contexts extracted from speech recognition results, such as topics, complaints, solutions, and resolutions. Mined speech data is useful in call center and/or product or service quality management.


14.

Granted patent
Methods and apparatus for blind channel estimation based upon speech correlation structure [In force]

Publication number: US06687672B2

Publication date: 2004-02-03

Application number: US10099428

Filing date: 2002-03-15

Applicants: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua

Inventors: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua

IPC classification: G10L15/08

CPC classification: G10L21/0208

Abstract: Methods and apparatus for blind channel estimation of a speech signal corrupted by a communication channel are provided. One method includes converting a noisy speech signal into either a cepstral representation or a log-spectral representation; estimating a correlation of the representation of the noisy speech signal; determining an average of the noisy speech signal; constructing and solving, subject to a minimization constraint, a system of linear equations utilizing a correlation structure of a clean speech training signal, the correlation of the representation of the noisy speech signal, and the average of the noisy speech signal; and selecting a sign of the solution of the system of linear equations to estimate an average clean speech signal in a processing window.


15.

Granted patent
Unsupervised speech model adaptation using reliable information among N-best strings [Lapsed]

Publication number: US06205426B1

Publication date: 2001-03-20

Application number: US09237170

Filing date: 1999-01-25

Applicants: Patrick Nguyen, Philippe Gelin, Jean-Claude Junqua

Inventors: Patrick Nguyen, Philippe Gelin, Jean-Claude Junqua

IPC classification: G10L15/14

CPC classification: G10L15/065

Abstract: The system performs unsupervised speech model adaptation using the recognizer to generate the N-best solutions for an input utterance. Each of these N-best solutions is tested by a reliable information extraction process. Reliable information is extracted by a weighting technique based on likelihood scores generated by the recognizer, or by a non-linear thresholding function. The system may be used in a single pass implementation or iteratively in a multi-pass implementation.
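As a rough illustration of the two reliability measures named in this abstract, the sketch below turns N-best log-likelihood scores into normalized weights and, alternatively, applies a hard threshold. The function names, the softmax normalization, and the `margin` value are assumptions made for illustration; the abstract does not specify the actual formulas.

```python
import math

def nbest_weights(log_likelihoods):
    """Assumed weighting: softmax over N-best log-likelihood scores (sums to 1)."""
    top = max(log_likelihoods)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in log_likelihoods]
    total = sum(exps)
    return [e / total for e in exps]

def threshold_weights(log_likelihoods, margin=2.0):
    """Assumed non-linear threshold: keep hypotheses within `margin` of the best score."""
    top = max(log_likelihoods)
    return [1.0 if top - s <= margin else 0.0 for s in log_likelihoods]

# Hypothetical scores for a 3-best list.
scores = [-120.0, -121.0, -130.0]
weights = nbest_weights(scores)
kept = threshold_weights(scores)
```

In an adaptation loop, such weights would scale how much each hypothesis contributes to the model update; that step is omitted here.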


16.

Granted patent
Speaker and environment adaptation based on linear separation of variability sources [In force]

Publication number: US06915259B2

Publication date: 2005-07-05

Application number: US09864838

Filing date: 2001-05-24

Applicants: Luca Rigazio, Patrick Nguyen, David Kryze, Jean-Claude Junqua

Inventors: Luca Rigazio, Patrick Nguyen, David Kryze, Jean-Claude Junqua

IPC classification: G10L15/06, G10L21/02

CPC classification: G10L15/07, G10L21/0208

Abstract: Linear approximation of the background noise is applied after feature extraction and prior to speaker adaptation to allow the speaker adaptation system to adapt the speech models to the enrolling user without distortion from background noise. The linear approximation is applied in the feature domain, such as in the cepstral domain. Any adaptation technique that is commutative in the feature domain may be used.


17.

Granted patent
System and method of media file access and retrieval using speech recognition [In force]

Publication number: US06907397B2

Publication date: 2005-06-14

Application number: US10245727

Filing date: 2002-09-16

Applicants: David Kryze, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua

Inventors: David Kryze, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua

IPC classification: G10L15/00, G06F17/30, G10L20060101, G10L11/00, G10L15/04, G10L15/06, G10L15/18, G10L15/26, G10L21/00

CPC classification: G06F17/30026, G10L15/183, G10L15/19, G10L15/26, Y10S707/99933, Y10S707/99934, Y10S707/99935

Abstract: An embedded device for playing media files is capable of generating a play list of media files based on input speech from a user. It includes an indexer generating a plurality of speech recognition grammars. According to one aspect of the invention, the indexer generates speech recognition grammars based on contents of a media file header of the media file. According to another aspect of the invention, the indexer generates speech recognition grammars based on categories in a file path for retrieving the media file to a user location. When a speech recognizer receives an input speech from a user while in a selection mode, a media file selector compares the input speech received while in the selection mode to the plurality of speech recognition grammars, thereby selecting the media file.
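The file-path aspect of this abstract can be pictured with a minimal sketch that derives candidate spoken phrases from the directory categories and file name of a media file. The function name `path_keywords`, the example path, and the underscore-to-space normalization are hypothetical; building an actual speech recognition grammar from these phrases is outside the scope of the sketch.

```python
from pathlib import PurePosixPath

def path_keywords(file_path):
    """Return lowercase phrases from the directory categories and the file name."""
    p = PurePosixPath(file_path)
    # Directory components act as categories; drop the root marker "/".
    parts = [part for part in p.parent.parts if part != "/"]
    # The file name without its extension is the final phrase.
    parts.append(p.stem)
    return [part.replace("_", " ").lower() for part in parts]

# Hypothetical media file location.
phrases = path_keywords("/music/Jazz/Miles_Davis/So_What.mp3")
```

Each phrase could then seed one grammar entry, so that saying a category or a title selects the matching files.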


18.

Granted patent
Medical ventilator with compressor heated exhalation filter [In force]

Publication number: US07926485B2

Publication date: 2011-04-19

Application number: US12323792

Filing date: 2008-11-26

Applicants: Patrick Nguyen, Gardner J. Kimm, Steve Han, Mabini M. Arcilla

Inventors: Patrick Nguyen, Gardner J. Kimm, Steve Han, Mabini M. Arcilla

IPC classification: F24J3/00, A62B7/10, A62B23/02

CPC classification: A61M16/1055, A61M16/0057, A61M16/0858, A61M16/1065, A61M16/204, A61M16/205, A61M2016/0027, A61M2016/0039, A61M2016/0042, A61M2205/3368, A61M2205/3633, A61M2205/3666, A61M2205/42

Abstract: A medical ventilator includes a pressure generator for increasing a pressure of gas that produces heat during the operation thereof. A heat sink spaced from the pressure generator is provided for absorbing heat from the pressure generator. A bacteria filter requiring heating in excess of an ambient temperature for the effective operation thereof is coupled in thermal communication with the heat sink. A heat pipe is coupled in thermal communication with the heat sink and the pressure generator for conveying at least part of heat produced by the pressure generator to the bacteria filter via the heat sink.


19.

Granted patent
Time-anchored posterior indexing of speech [In force]

Publication number: US07831425B2

Publication date: 2010-11-09

Application number: US11300735

Filing date: 2005-12-15

Applicants: Alejandro Acero, Asela J. Gunawardana, Ciprian I. Chelba, Erik W. Selberg, Frank Torsten B. Seide, Patrick Nguyen, Roger Peng Yu

Inventors: Alejandro Acero, Asela J. Gunawardana, Ciprian I. Chelba, Erik W. Selberg, Frank Torsten B. Seide, Patrick Nguyen, Roger Peng Yu

IPC classification: G10L15/04, G10L15/00, G10L15/28

CPC classification: G10L15/08, G10L15/05

Abstract: A computer-implemented method of indexing a speech lattice for search of audio corresponding to the speech lattice is provided. The method includes identifying at least two speech recognition hypotheses for a word which have time ranges satisfying a criteria. The method further includes merging the at least two speech recognition hypotheses to generate a merged speech recognition hypothesis for the word.
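The merging step in this abstract can be sketched as follows, under stated assumptions: the abstract leaves the time-range criterion open, so the sketch treats any overlap between two hypotheses for the same word as satisfying it, and it merges by taking the union of the time ranges and summing the posteriors. The `Hypothesis` class and `merge_hypotheses` function are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    word: str
    start: float      # seconds
    end: float        # seconds
    posterior: float  # recognizer's posterior probability for this word span

def merge_hypotheses(hyps, min_overlap=0.0):
    """Merge same-word hypotheses whose time ranges overlap by more than min_overlap."""
    merged = []
    for h in sorted(hyps, key=lambda x: (x.word, x.start)):
        last = merged[-1] if merged else None
        overlap = min(last.end, h.end) - max(last.start, h.start) if last else 0.0
        if last and last.word == h.word and overlap > min_overlap:
            # Assumed merge rule: extend the time range and add the posteriors.
            last.end = max(last.end, h.end)
            last.posterior += h.posterior
        else:
            merged.append(Hypothesis(h.word, h.start, h.end, h.posterior))
    return merged

# Hypothetical lattice entries: two overlapping "hello" spans and one later span.
lattice = [
    Hypothesis("hello", 0.00, 0.50, 0.4),
    Hypothesis("hello", 0.05, 0.55, 0.3),
    Hypothesis("hello", 2.00, 2.40, 0.9),
]
result = merge_hypotheses(lattice)
```

Merging in this way keeps one index entry per spoken occurrence of a word rather than one per lattice arc, which is the kind of compaction the abstract describes.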


20.

Granted patent
Bubble splitting for compact acoustic modeling [In force]

Publication number: US07328154B2

Publication date: 2008-02-05

Application number: US10639974

Filing date: 2003-08-13

Applicants: Ambroise Mutel, Patrick Nguyen, Luca Rigazio

Inventors: Ambroise Mutel, Patrick Nguyen, Luca Rigazio

IPC classification: G10L15/00

CPC classification: G10L15/063, G10L15/144, G10L2015/0631, G10L2015/0638

Abstract: An improved method is provided for constructing compact acoustic models for use in a speech recognizer. The method includes: partitioning speech data from a plurality of training speakers according to at least one speech related criteria (i.e., vocal tract length); grouping together the partitioned speech data from training speakers having a similar speech characteristic; and training an acoustic bubble model for each group using the speech data within the group.

