Patent search ap:"Andrej LJOLJE" Page 5

41.

Patent application
SYSTEM AND METHOD FOR HANDLING MISSING SPEECH DATA
Patent validity: In force

Publication number: US20100131264A1

Publication date: 2010-05-27

Application number: US12275920

Application date: 2008-11-21

Applicant: Andrej Ljolje, Alistair D. Conkie

Inventor: Andrej Ljolje, Alistair D. Conkie

IPC: G06F17/27, G10L15/18

CPC classification number: G10L15/1815, G06F17/27, G10L13/027, G10L15/02, G10L15/04, G10L15/20, G10L25/87, G10L2015/025

Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for handling missing speech data. The computer-implemented method includes receiving speech with a missing segment, generating a plurality of hypotheses for the missing segment, identifying a best hypothesis for the missing segment, and recognizing the received speech by inserting the identified best hypothesis for the missing segment. In another method embodiment, the final step is replaced with synthesizing the received speech by inserting the identified best hypothesis for the missing segment. In one aspect, the method further includes identifying a duration for the missing segment and generating the plurality of hypotheses of the identified duration for the missing segment. The step of identifying the best hypothesis for the missing segment can be based on speech context, a pronouncing lexicon, and/or a language model. Each hypothesis can have an identical acoustic score.
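
The selection step described in this abstract can be illustrated with a minimal Python sketch. It assumes the missing segment is a single word slot that is not at the start or end of the utterance, that every candidate receives the same acoustic score (as the abstract allows), and that a toy bigram table stands in for the language model. The names `BIGRAM_LOGPROB`, `UNSEEN_LOGPROB`, `best_hypothesis`, and `fill_missing`, and all numeric values, are hypothetical and not taken from the patent.

```python
# Hypothetical toy bigram language model: log P(next | prev).
# All values are illustrative assumptions.
BIGRAM_LOGPROB = {
    ("call", "me"): -0.5,
    ("call", "the"): -1.2,
    ("me", "tomorrow"): -0.7,
    ("the", "tomorrow"): -4.0,
}
UNSEEN_LOGPROB = -10.0  # assumed back-off score for unseen word pairs


def best_hypothesis(left, right, candidates, acoustic_score=0.0):
    """Pick the candidate word that best fits between `left` and `right`.

    Every candidate gets the same acoustic score, so only the
    language-model context decides the ranking.
    """
    def total(word):
        lm = (BIGRAM_LOGPROB.get((left, word), UNSEEN_LOGPROB)
              + BIGRAM_LOGPROB.get((word, right), UNSEEN_LOGPROB))
        return acoustic_score + lm

    return max(candidates, key=total)


def fill_missing(words, candidates):
    """Replace the single None entry in `words` with the best hypothesis.

    Assumes the None entry has a known word on both sides.
    """
    i = words.index(None)
    filled = list(words)
    filled[i] = best_hypothesis(words[i - 1], words[i + 1], candidates)
    return filled
```

With the toy table, `fill_missing(["call", None, "tomorrow"], ["me", "the"])` selects "me", because its combined bigram score (-1.2) is higher than that of "the" (-5.2).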

42.

Patent application
Systems and methods of providing modified media content
Patent validity: Lapsed

Publication number: US20080232775A1

Publication date: 2008-09-25

Application number: US11725979

Application date: 2007-03-20

Applicant: Andrej Ljolje

Inventor: Andrej Ljolje

IPC: H04N5/91, H04N7/173

CPC classification number: H04N5/783, G06F17/30787, G06F17/30796, G06F17/30843, G10L19/09, G10L21/04, G11B27/005, G11B27/034, H04N5/44513, H04N5/781, H04N5/85, H04N21/4305, H04N21/8549

Abstract: In an embodiment, a method of providing modified media content is disclosed and includes receiving media content that includes audio data and video data having a first number of video frames. The method also includes generating abstracted media content that includes portions of the video data and audio elements of the audio data, where the abstracted media content includes less than all of the video data and includes fewer video frames than the first number of video frames.

43.

Patent application
System and Method for Optimizing Speech Recognition and Natural Language Parameters with User Feedback
Patent validity: In force

Publication number: US20150348540A1

Publication date: 2015-12-03

Application number: US14287866

Application date: 2014-05-27

Applicant: Andrej LJOLJE, Diamantino Antonio CASEIRO, Mazin GILBERT, Vincent GOFFIN, Taniya Mishra

Inventor: Andrej LJOLJE, Diamantino Antonio CASEIRO, Mazin GILBERT, Vincent GOFFIN, Taniya Mishra

IPC: G10L15/18, G10L15/26

CPC classification number: G10L15/063, G10L15/01, G10L15/18, G10L15/26, G10L2015/0635

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.

44.

Granted patent
System and method for optimizing speech recognition and natural language parameters with user feedback
Patent validity: In force

Publication number: US08738375B2

Publication date: 2014-05-27

Application number: US13103665

Application date: 2011-05-09

Applicant: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra

Inventor: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra

IPC: G10L15/26

CPC classification number: G10L15/197, G10L15/063, G10L15/22

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.

45.

Granted patent
System and method for training adaptation-specific acoustic models for automatic speech recognition
Patent validity: In force

Publication number: US08600749B2

Publication date: 2013-12-03

Application number: US12633334

Application date: 2009-12-08

Applicant: Andrej Ljolje

Inventor: Andrej Ljolje

IPC: G10L15/06

CPC classification number: G10L15/144, G10L15/063

Abstract: Disclosed herein are systems, methods, and computer-readable storage media for training adaptation-specific acoustic models. A system practicing the method receives speech and generates a full size model and a reduced size model, the reduced size model starting with a single distribution for each speech sound in the received speech. The system finds speech segment boundaries in the speech using the full size model and adapts features of the speech data using the reduced size model based on the speech segment boundaries and an overall centroid for each speech sound. The system then recognizes speech using the adapted features of the speech. The model can be a Hidden Markov Model (HMM). The reduced size model can also be of a reduced complexity, such as having fewer mixture components than a model of full complexity. Adapting features of speech can include moving the features closer to an overall feature distribution center.
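
The last sentence of this abstract (moving features closer to an overall feature distribution center) can be pictured with a small Python sketch. It is only an illustration under stated assumptions: segment boundaries are taken as already found, each speech sound has one known overall centroid, and every frame in a segment is shifted by a fraction `alpha` of the gap between the segment mean and that centroid. The function name `adapt_features`, the `alpha` parameter, and the shifting rule itself are hypothetical; the patent abstract does not specify them.

```python
def adapt_features(frames, segments, centroids, alpha=0.5):
    """Shift each segment's frames toward the overall centroid of its sound.

    frames:    list of feature vectors (lists of floats), one per frame
    segments:  list of (start, end, sound) with `end` exclusive
    centroids: dict mapping sound label -> overall centroid vector
    alpha:     assumed interpolation factor (0 = no change, 1 = full shift)
    """
    adapted = [list(f) for f in frames]
    for start, end, sound in segments:
        seg = frames[start:end]
        dim = len(seg[0])
        # Mean of the frames inside this segment.
        seg_mean = [sum(f[d] for f in seg) / len(seg) for d in range(dim)]
        # Move the whole segment part of the way toward the centroid.
        shift = [alpha * (centroids[sound][d] - seg_mean[d]) for d in range(dim)]
        for t in range(start, end):
            adapted[t] = [frames[t][d] + shift[d] for d in range(dim)]
    return adapted
```

For example, two one-dimensional frames `[[0.0], [2.0]]` in a single segment with centroid `[3.0]` and `alpha=0.5` have a segment mean of 1.0, so both frames are shifted by 1.0.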

46.

Granted patent
Speech recognition based on pronunciation modeling
Patent validity: In force

Publication number: US08532993B2

Publication date: 2013-09-10

Application number: US13539996

Application date: 2012-07-02

Applicant: Andrej Ljolje

Inventor: Andrej Ljolje

IPC: G10L15/02, G10L15/04, G10L15/06, G10L15/20

CPC classification number: G10L15/187, G10L15/063

Abstract: A system and method for performing speech recognition is disclosed. The method comprises receiving an utterance, applying the utterance to a recognizer with a language model having pronunciation probabilities associated with unique word identifiers for words given their pronunciations and presenting a recognition result for the utterance. Recognition improvement is found by moving a pronunciation model from a dictionary to the language model.
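
One way to read "moving a pronunciation model from a dictionary to the language model" is sketched below in Python. This is an interpretation, not the patented method: it assumes a unigram language model, gives each (word, pronunciation) pair its own identifier such as `either#1`, and stores the log of P(word) times P(pronunciation | word) under that identifier. The function name `build_lm_entries`, the `#n` identifier scheme, and the example probabilities are all hypothetical.

```python
import math


def build_lm_entries(word_probs, pron_probs):
    """Fold pronunciation probabilities into unigram language-model entries.

    word_probs: dict word -> P(word)
    pron_probs: dict word -> list of (phone_string, P(pron | word))
    Returns a dict mapping a unique identifier per pronunciation variant
    to (phone_string, log probability).
    """
    entries = {}
    for word, variants in pron_probs.items():
        for i, (phones, p_pron) in enumerate(variants, start=1):
            uid = f"{word}#{i}"  # assumed identifier format
            entries[uid] = (phones, math.log(word_probs[word] * p_pron))
    return entries
```

A recognizer built this way would score `either#1` and `either#2` as separate language-model tokens, so the dictionary itself no longer needs to carry pronunciation weights.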

47.

Granted patent
Systems and methods of providing modified media content
Patent validity: In force

Publication number: US08428443B2

Publication date: 2013-04-23

Application number: US11716995

Application date: 2007-03-12

Applicant: Andrej Ljolje, Ann Syrdal, Alistair Conkie

Inventor: Andrej Ljolje, Ann Syrdal, Alistair Conkie

IPC: H04N5/783, G06F3/00

CPC classification number: H04N9/87, H04N5/765, H04N5/782, H04N21/4334, H04N21/4398, H04N21/440281, H04N21/4621

Abstract: A method of providing modified media content is disclosed that includes providing media content to a destination device via a network, where the media content comprises video data and audio data having a first viewing rate. The method further includes receiving data indicating a selection of a second viewing rate via the network and modifying the media content to produce modified media content having approximately the second viewing rate. The modified media content includes modified video data and modified audio data synchronized at approximately the second viewing rate.

48.

Patent application
SPEECH RECOGNITION BASED ON PRONUNCIATION MODELING
Patent validity: In force

Publication number: US20120271635A1

Publication date: 2012-10-25

Application number: US13539996

Application date: 2012-07-02

Applicant: Andrej Ljolje

Inventor: Andrej Ljolje

IPC: G10L15/04

CPC classification number: G10L15/187, G10L15/063

Abstract: A system and method for performing speech recognition is disclosed. The method comprises receiving an utterance, applying the utterance to a recognizer with a language model having pronunciation probabilities associated with unique word identifiers for words given their pronunciations and presenting a recognition result for the utterance. Recognition improvement is found by moving a pronunciation model from a dictionary to the language model.

49.

Granted patent
System and method for discriminative pronunciation modeling for voice search
Patent validity: In force

Publication number: US08296141B2

Publication date: 2012-10-23

Application number: US12274025

Application date: 2008-11-19

Applicant: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje

Inventor: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje

IPC: G10L15/04

CPC classification number: G10L15/063, G10L2015/025

Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by (1) identifying word and phone alignments and corresponding likelihood scores, and (2) discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function. The objective function can be maximum mutual information (MMI), maximum likelihood (MLE) training, minimum classification error (MCE) training, or other functions known to those of skill in the art. Speech utterances can be names. The speech utterances can be received as part of a multimodal search or input. The step of discriminatively adapting pronunciation weights can further include stochastically modeling pronunciations.
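
The normalization constraint in this abstract (pronunciation weights for a unit of speech summing to 1) and the idea of adapting weights after an error can be sketched in Python as follows. The update rule here is a simple perceptron-style stand-in chosen for brevity; it is not the MMI, MLE, or MCE objective the abstract names. The function names `normalize` and `adapt`, the `step` size, and the example pronunciations are hypothetical.

```python
def normalize(weights):
    """Scale the pronunciation weights of one unit so they sum to 1."""
    total = sum(weights.values())
    return {pron: w / total for pron, w in weights.items()}


def adapt(weights, correct_pron, competing_pron, step=0.1):
    """Reward the pronunciation from the correct alignment, penalize the
    one from the competing (misrecognized) hypothesis, then renormalize.

    This is an assumed, simplified update rule, not a named training
    criterion.
    """
    updated = dict(weights)
    updated[correct_pron] += step
    # Keep the penalized weight strictly positive.
    updated[competing_pron] = max(updated[competing_pron] - step, 1e-6)
    return normalize(updated)
```

Starting from two equally weighted pronunciations of one word (0.5 each), a single call to `adapt` with `step=0.1` yields weights of roughly 0.6 and 0.4, which still sum to 1.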

50.

Patent application
System and Method for Increasing Recognition Rates of In-Vocabulary Words By Improving Pronunciation Modeling
Patent validity: In force

Publication number: US20120078617A1

Publication date: 2012-03-29

Application number: US13311512

Application date: 2011-12-05

Applicant: Alistair D. Conkie, Mazin Gilbert, Andrej Ljolje

Inventor: Alistair D. Conkie, Mazin Gilbert, Andrej Ljolje

IPC: G06F17/21

CPC classification number: G06F17/277, G10L15/063, G10L15/187

Abstract: The present disclosure relates to systems, methods, and computer-readable media for generating a lexicon for use with speech recognition. The method includes receiving symbolic input as labeled speech data, overgenerating potential pronunciations based on the symbolic input, identifying potential pronunciations in a speech recognition context, and storing the identified potential pronunciations in a lexicon. Overgenerating potential pronunciations can include establishing a set of conversion rules for short sequences of letters, converting portions of the symbolic input into a number of possible lexical pronunciation variants based on the set of conversion rules, modeling the possible lexical pronunciation variants in one of a weighted network and a list of phoneme lists, and iteratively retraining the set of conversion rules based on improved pronunciations. Symbolic input can include multiple examples of a same spoken word. Speech data can be labeled explicitly or implicitly and can include words as text and recorded audio.
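
The overgeneration step can be illustrated with a short Python sketch. It assumes a small hand-written table of conversion rules for letter sequences of length one or two, a greedy longest-match segmentation of the word, and a plain list of phoneme strings as output. The rule table `RULES`, the functions `segment` and `overgenerate`, and the phoneme symbols are hypothetical; the later steps in the abstract (filtering candidates in a recognition context and retraining the rules) are not modeled.

```python
from itertools import product

# Hypothetical conversion rules: letter sequence -> candidate phonemes.
RULES = {
    "ch": ["ch", "k"],
    "c": ["k", "s"],
    "a": ["ae", "ey"],
    "t": ["t"],
}


def segment(word, rules, max_len=2):
    """Split `word` into rule keys using greedy longest match."""
    i, pieces = 0, []
    while i < len(word):
        for n in range(max_len, 0, -1):
            piece = word[i:i + n]
            if piece in rules:
                pieces.append(piece)
                i += len(piece)
                break
        else:
            raise KeyError(f"no rule covers {word[i]!r}")
    return pieces


def overgenerate(word, rules=RULES):
    """Return every pronunciation the rules allow for `word`."""
    options = [rules[piece] for piece in segment(word, rules)]
    return [" ".join(phones) for phones in product(*options)]
```

For the word "chat", the rules above produce four candidates: "ch ae t", "ch ey t", "k ae t", and "k ey t". A later recognition pass would be expected to keep only the variants that match recorded audio.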
