Patent search ap:"Andrej LJOLJE" Page 6

51.

Patent application
SYSTEM AND METHOD FOR PRONUNCIATION MODELING (legal status: in force)

Publication number: US20120065975A1

Publication date: 2012-03-15

Application number: US13302380

Filing date: 2011-11-22

Applicant: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal

Inventor: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal

IPC: G10L15/04

CPC classification number: G10L15/187, G10L15/183, G10L2015/025

Abstract: Systems, computer-implemented methods, and tangible computer-readable media for generating a pronunciation model. The method includes identifying a generic model of speech composed of phonemes, identifying a family of interchangeable phonemic alternatives for a phoneme in the generic model of speech, labeling the family of interchangeable phonemic alternatives as referring to the same phoneme, and generating a pronunciation model which substitutes each family for each respective phoneme. In one aspect, the generic model of speech is a vocal tract length normalized acoustic model. Interchangeable phonemic alternatives can represent a same phoneme for different dialectal classes. An interchangeable phonemic alternative can include a string of phonemes.

52.

Granted patent
System and method for increasing recognition rates of in-vocabulary words by improving pronunciation modeling (legal status: in force)

Publication number: US08095365B2

Publication date: 2012-01-10

Application number: US12328436

Filing date: 2008-12-04

Applicant: Alistair D. Conkie, Mazin Gilbert, Andrej Ljolje

Inventor: Alistair D. Conkie, Mazin Gilbert, Andrej Ljolje

IPC: G10L13/08

CPC classification number: G06F17/277, G10L15/063, G10L15/187

Abstract: The present disclosure relates to systems, methods, and computer-readable media for generating a lexicon for use with speech recognition. The method includes receiving symbolic input as labeled speech data, overgenerating potential pronunciations based on the symbolic input, identifying potential pronunciations in a speech recognition context, and storing the identified potential pronunciations in a lexicon. Overgenerating potential pronunciations can include establishing a set of conversion rules for short sequences of letters, converting portions of the symbolic input into a number of possible lexical pronunciation variants based on the set of conversion rules, modeling the possible lexical pronunciation variants in one of a weighted network and a list of phoneme lists, and iteratively retraining the set of conversion rules based on improved pronunciations. Symbolic input can include multiple examples of a same spoken word. Speech data can be labeled explicitly or implicitly and can include words as text and recorded audio.

53.

Granted patent
System and method of using acoustic models for automatic speech recognition which distinguish pre- and post-vocalic consonants (legal status: in force)

Publication number: US08015008B2

Publication date: 2011-09-06

Application number: US11930675

Filing date: 2007-10-31

Applicant: Yeon-Jun Kim, Alistair Conkie, Andrej Ljolje, Ann K. Syrdal

Inventor: Yeon-Jun Kim, Alistair Conkie, Andrej Ljolje, Ann K. Syrdal

IPC: G10L15/04

CPC classification number: G10L25/78, G10L15/02

Abstract: Disclosed are systems, methods and computer readable media for training acoustic models for an automatic speech recognition systems (ASR) system. The method includes receiving a speech signal, defining at least one syllable boundary position in the received speech signal, based on the at least one syllable boundary position, generating for each consonant in a consonant phoneme inventory a pre-vocalic position label and a post-vocalic position label to expand the consonant phoneme inventory, reformulating a lexicon to reflect an expanded consonant phoneme inventory, and training a language model for an automated speech recognition (ASR) system based on the reformulated lexicon.
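
The inventory expansion described in this abstract can be illustrated with a minimal Python sketch. It assumes the syllable boundaries are already known and that the first vowel in a syllable is its nucleus. The vowel set, the `_pre`/`_post` suffixes, and the function names are hypothetical choices for illustration and are not taken from the patent.

```python
# Hypothetical ARPAbet-style vowel set; an assumption, not from the patent.
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


def label_syllable(phonemes):
    """Tag each consonant in one syllable as pre- or post-vocalic."""
    # Index of the first vowel (treated as the nucleus); None if there is no vowel.
    nucleus = next((i for i, p in enumerate(phonemes) if p in VOWELS), None)
    labeled = []
    for i, p in enumerate(phonemes):
        if p in VOWELS or nucleus is None:
            labeled.append(p)              # vowels and vowel-less syllables stay unchanged
        elif i < nucleus:
            labeled.append(p + "_pre")     # consonant before the nucleus
        else:
            labeled.append(p + "_post")    # consonant after the nucleus
    return labeled


def reformulate_lexicon(lexicon):
    """Rewrite every syllabified pronunciation with position-labeled consonants."""
    return {word: [label_syllable(syl) for syl in syllables]
            for word, syllables in lexicon.items()}


# Toy lexicon: each word maps to a list of syllables, each a list of phonemes.
lexicon = {"cat": [["K", "AE", "T"]],
           "button": [["B", "AH"], ["T", "AH", "N"]]}
expanded = reformulate_lexicon(lexicon)
```

In this sketch the same consonant symbol yields two distinct units (for example `T_pre` and `T_post`), which is the sense in which the consonant inventory is expanded before training.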

54.

Patent application
SYSTEM AND METHOD FOR IMPROVED AUTOMATIC SPEECH RECOGNITION PERFORMANCE (legal status: in force)

Publication number: US20110137648A1

Publication date: 2011-06-09

Application number: US12631131

Filing date: 2009-12-04

Applicant: Andrej LJOLJE, Mazin GILBERT

Inventor: Andrej LJOLJE, Mazin GILBERT

IPC: G10L15/00

CPC classification number: G10L15/00, G10L15/285, G10L15/32

Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving automatic speech recognition performance. A system practicing the method identifies idle speech recognition resources and establishes a supplemental speech recognizer on the idle resources based on overall speech recognition demand. The supplemental speech recognizer can differ from a main speech recognizer, and, along with the main speech recognizer, can be associated with a particular speaker. The system performs speech recognition on speech received from the particular speaker in parallel with the main speech recognizer and the supplemental speech recognizer and combines results from the main and supplemental speech recognizer. The system recognizes the received speech based on the combined results. The system can use beam adjustment in place of or in combination with a supplemental speech recognizer. A scheduling algorithm can tailor a particular combination of speech recognition resources and release the supplemental speech recognizer based on increased demand.

55.

Patent application
AUTOMATIC DISCLOSURE DETECTION (legal status: in force)

Publication number: US20100332227A1

Publication date: 2010-12-30

Application number: US12490631

Filing date: 2009-06-24

Applicant: I. Dan MELAMED, Yeon-Jun KIM, Andrej LJOLJE, Bernard S. RENGER, David J. SMITH

Inventor: I. Dan MELAMED, Yeon-Jun KIM, Andrej LJOLJE, Bernard S. RENGER, David J. SMITH

IPC: G10L15/08

CPC classification number: G10L25/63, G06F17/2881, G06Q10/06395, G10L15/04, G10L15/18, G10L15/1822, G10L15/26, G10L15/265

Abstract: A method of detecting pre-determined phrases to determine compliance quality is provided. The method includes determining whether at least one of an event or a precursor event has occurred based on a comparison between pre-determined phrases and a communication between a sender and a recipient in a communications network, and rating the recipient based on the presence of the pre-determined phrases associated with the event or the presence of the pre-determined phrases associated with the precursor event in the communication.

56.

Patent application
SYSTEM AND METHOD FOR ADAPTING AUTOMATIC SPEECH RECOGNITION PRONUNCIATION BY ACOUSTIC MODEL RESTRUCTURING (legal status: in force)

Publication number: US20100312560A1

Publication date: 2010-12-09

Application number: US12480848

Filing date: 2009-06-09

Applicant: Andrej LJOLJE, Alistair D. CONKIE, Ann K. SYRDAL

Inventor: Andrej LJOLJE, Alistair D. CONKIE, Ann K. SYRDAL

IPC: G10L15/02

CPC classification number: G10L17/14, G10L15/063, G10L15/07, G10L15/14, G10L15/187, G10L15/265, G10L15/30, G10L2015/025

Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
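
The "weighted sum of acoustic models" idea in this abstract can be sketched in a few lines of Python. The abstract does not say how the weights are estimated, so this sketch assumes they come from normalized counts of the phonemes seen in the lattice. The one-dimensional Gaussians standing in for acoustic models, the counts, and all names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


# Hypothetical stand-ins for native-speech acoustic models: phoneme -> (mean, variance).
BASE_MODELS = {"IY": (2.0, 1.0), "IH": (0.0, 1.0)}


def estimate_weights(counts):
    """Normalize per-phoneme counts into weights that sum to 1 (an assumed estimator)."""
    total = sum(counts.values())
    return {phoneme: c / total for phoneme, c in counts.items()}


def custom_likelihood(x, weights):
    """Score x under a dictionary phoneme modeled as a weighted sum of base models."""
    return sum(w * gaussian_pdf(x, *BASE_MODELS[ph]) for ph, w in weights.items())


# Assumed lattice counts for positions where the dictionary expects IY:
# the new speaker produced something closest to IY 30 times and to IH 10 times.
weights_iy = estimate_weights({"IY": 30, "IH": 10})
score = custom_likelihood(1.0, weights_iy)
```

The dictionary entry for the word still says IY; only the model behind that symbol changes, which matches the abstract's statement that the pronouncing dictionary itself does not change.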

57.

Granted patent
Low latency real-time vocal tract length normalization (legal status: in force)

Publication number: US07567903B1

Publication date: 2009-07-28

Application number: US11034535

Filing date: 2005-01-12

Applicant: Vincent Goffin, Andrej Ljolje, Murat Saraclar

Inventor: Vincent Goffin, Andrej Ljolje, Murat Saraclar

IPC: G10L15/06, G10L15/10, G10L17/00, G10L13/00, G10L19/14

CPC classification number: G10L15/063, G10L15/10, G10L15/12, G10L17/04, G10L17/08

Abstract: A method and apparatus for performing speech recognition are provided. A Vocal Tract Length Normalized acoustic model for a speaker is generated from training data. Speech recognition is performed on a first recognition input to determine a first best hypothesis. A first Vocal Tract Length Normalization factor is estimated based on the first best hypothesis. Speech recognition is performed on a second recognition input using the Vocal Tract Length Normalized acoustic model to determine an other best hypothesis. An other Vocal Tract Length Normalization factor is estimated based on the other best hypothesis and at least one previous best hypothesis.

58.

Patent application
Systems and Methods of providing modified media content (legal status: in force)

Publication number: US20080235741A1

Publication date: 2008-09-25

Application number: US11725591

Filing date: 2007-03-19

Applicant: Andrej Ljolje, Ann Syrdal, Alistair Conkie

Inventor: Andrej Ljolje, Ann Syrdal, Alistair Conkie

IPC: H04N7/173

CPC classification number: H04N21/6373, H04N5/4401, H04N5/765, H04N5/775, H04N5/783, H04N7/56, H04N9/8063, H04N21/2335, H04N21/234381, H04N21/2393, H04N21/4307, H04N21/4325, H04N21/47202, H04N21/6587

Abstract: A method and system of providing media content is disclosed. In a particular embodiment, the method includes receiving media content from a content source at a set-top box device. The media content includes video data having a first playback rate and audio data having the first playback rate. The method further includes transforming the audio data via a non-linear transformation to produce modified audio data having a second playback rate, modifying the video data to produce modified video data having the second playback rate, and synchronizing the modified audio data and the modified video data to produce modified media content having the second playback rate. A network-based media content storage device and associated logic to provide adjusted rate audio content are also disclosed.

59.

Patent application
Systems and methods of providing modified media content (legal status: in force)

Publication number: US20080226256A1

Publication date: 2008-09-18

Application number: US11716995

Filing date: 2007-03-12

Applicant: Andrej Ljolje, Ann Syrdal, Alistair Conkie

Inventor: Andrej Ljolje, Ann Syrdal, Alistair Conkie

IPC: H04N5/91

CPC classification number: H04N9/87, H04N5/765, H04N5/782, H04N21/4334, H04N21/4398, H04N21/440281, H04N21/4621

Abstract: A method of providing modified media content is disclosed that includes providing media content to a destination device via a network, where the media content comprises video data and audio data have a first viewing rate. The method further includes receiving data indicating a selection of a second viewing rate via the network and modifying the media content to produce modified media content having approximately the second viewing rate. The modified media content includes modified video data and modified audio data synchronized at approximately the second viewing rate.
