Patent search: ap:("Dan Chazan" OR "Meir Zibulski" OR "Ron Hoory") AND inv:"Ron Hoory" (page 1)

1.

Granted invention patent
Fast frequency-domain pitch estimation
Legal status: In force

Publication number: US06587816B1

Publication date: 2003-07-01

Application number: US09617582

Filing date: 2000-07-14

Applicants: Dan Chazan, Meir Zibulski, Ron Hoory

Inventors: Dan Chazan, Meir Zibulski, Ron Hoory

IPC classification: G10L11/04

CPC classification: G10L25/90

Abstract: A method for estimating a pitch frequency of an audio signal includes computing a first transform of the signal to a frequency domain over a first time interval, and computing a second transform of the signal to the frequency domain over a second time interval, which contains the first time interval. A line spectrum of the signal is found, based on the first and second transforms, the spectrum including spectral lines having respective line amplitudes and line frequencies. A utility function that is periodic in the frequencies of the lines in the spectrum is then computed. This function is indicative, for each candidate pitch frequency in a given pitch frequency range, of a compatibility of the spectrum with the candidate pitch frequency. The pitch frequency of the speech signal is estimated responsive to the utility function.

2.

Granted invention patent
Method and system for low bit rate speech coding with speech recognition features and pitch providing reconstruction of the spectral envelope
Legal status: In force

Publication number: US06678655B2

Publication date: 2004-01-13

Application number: US10291590

Filing date: 2002-11-12

Applicants: Ron Hoory, Dan Chazan, Ezra Silvera, Meir Zibulski

Inventors: Ron Hoory, Dan Chazan, Ezra Silvera, Meir Zibulski

IPC classification: G10L19/12

CPC classification: G10L19/02, G10L15/02

Abstract: A method for encoding a digitized speech signal so as to generate data capable of being decoded as speech. A digitized speech signal is first converted to a series of feature vectors using, for example, known Mel-frequency cepstral coefficient (MFCC) techniques. At successive instances of time, a respective pitch value of the digitized speech signal is computed, and successive acoustic vectors, each containing the respective pitch value and feature vector, are compressed so as to derive therefrom a bit stream. A suitable decoder reverses the operation so as to extract the feature vectors and pitch values, thus allowing speech reproduction and playback. In addition, speech recognition is possible using the decompressed feature vectors, with no impairment of the recognition accuracy and no computational overhead.

3.

Granted invention patent
Feature-domain concatenative speech synthesis
Legal status: In force

Publication number: US07035791B2

Publication date: 2006-04-25

Application number: US09901031

Filing date: 2001-07-10

Applicants: Dan Chazan, Ron Hoory

Inventors: Dan Chazan, Ron Hoory

IPC classification: G10L11/04

CPC classification: G10L13/07, G10L25/18

Abstract: A method for speech synthesis includes receiving an input speech signal containing a set of speech segments, and estimating spectral envelopes of the input speech signal in a succession of time intervals during each of the speech segments. The spectral envelopes are integrated over a plurality of window functions in a frequency domain so as to determine elements of feature vectors corresponding to the speech segments. An output speech signal is reconstructed by concatenating the feature vectors corresponding to a sequence of the speech segments.

4.

Invention patent application
Speech synthesis using complex spectral modeling
Legal status: In force

Publication number: US20050131680A1

Publication date: 2005-06-16

Application number: US11046911

Filing date: 2005-01-31

Applicants: Dan Chazan, Ron Hoory, Zvi Kons, Slava Shechtman, Alexander Sorin

Inventors: Dan Chazan, Ron Hoory, Zvi Kons, Slava Shechtman, Alexander Sorin

IPC classification: G10L11/00, G10L11/04, G10L13/08, G10L19/02, G10L19/14

CPC classification: G10L13/08, G10L19/02

Abstract: A method for processing a speech signal includes dividing the speech signal into a succession of frames, identifying one or more of the frames as click frames, and extracting phase information from the click frames. The speech signal is encoded using the phase information. Methods are also provided for modeling phase spectra of voiced frames and click frames.

5.

Granted invention patent
Method and system for speech reconstruction from speech recognition features, pitch and voicing with resampled basis functions providing reconstruction of the spectral envelope
Legal status: In force

Publication number: US06725190B1

Publication date: 2004-04-20

Application number: US09432081

Filing date: 1999-11-02

Applicants: Dan Chazan, Gilad Cohen, Ron Hoory

Inventors: Dan Chazan, Gilad Cohen, Ron Hoory

IPC classification: G10L19/02

CPC classification: G10L13/07, G10L25/18

Abstract: A speech reconstruction method and system for converting a series of binned spectra, or functions thereof such as the Mel Frequency Cepstral Coefficients (MFCC), of an original digitized speech signal into a reconstructed speech signal, where each binned spectrum has a respective pitch value and voicing decision. The binned spectra are derived from the original digitized speech signal at successive instances by multiplying each estimate of the spectral envelope by a predetermined set of frequency domain window functions and computing the integrals thereof. At each respective time instance, harmonic frequencies and weights are generated according to the respective pitch value and voicing decision. Basis functions having bounded supports on the frequency axis are each sampled at all of the harmonic frequencies that lie within their support and multiplied by respective harmonic weights. The sampled basis functions are combined with respective phases, generated according to the pitch value, voicing decision and possibly the binned spectrum, resulting in a complex line spectrum corresponding to each basis function. Coefficients of the basis functions are generated, and each of the points of the respective complex line spectra is multiplied by the respective basis function coefficient. The complex line spectra are summed to generate, for each time instance, a single complex line spectrum with values for all harmonic frequencies. A time signal is generated from the complex line spectra computed at successive instances of time.

6.

Granted invention patent
Speech synthesis using complex spectral modeling
Legal status: In force

Publication number: US08280724B2

Publication date: 2012-10-02

Application number: US11046911

Filing date: 2005-01-31

Applicants: Dan Chazan, Ron Hoory, Zvi Kons, Slava Shechtman, Alexander Sorin

Inventors: Dan Chazan, Ron Hoory, Zvi Kons, Slava Shechtman, Alexander Sorin

IPC classification: G10L11/04, G10L19/14, G10L11/06, G10L19/06

CPC classification: G10L13/08, G10L19/02

Abstract: A method for processing a speech signal includes dividing the speech signal into a succession of frames, identifying one or more of the frames as click frames, and extracting phase information from the click frames. The speech signal is encoded using the phase information. Methods are also provided for modeling phase spectra of voiced frames and click frames.

7.

Granted invention patent
Accuracy improvement of spoken queries transcription using co-occurrence information
Legal status: In force

Publication number: US08650031B1

Publication date: 2014-02-11

Application number: US13194972

Filing date: 2011-07-31

Applicants: Jonathan Mamou, Abhinav Sethy, Bhuvana Ramabhadran, Ron Hoory, Paul Joseph Vozila, Nathan Bodenstab

Inventors: Jonathan Mamou, Abhinav Sethy, Bhuvana Ramabhadran, Ron Hoory, Paul Joseph Vozila, Nathan Bodenstab

IPC classification: G10L15/00, G10L15/26, G06F17/27, G10L21/00, G10L25/00, G10L21/06, G06F17/28, G10L13/00, G10L13/06, G10L19/12, G06F7/00, G06F17/30

CPC classification: G10L15/08, G06F7/00, G06F17/30, G10L15/1815, G10L15/265

Abstract: Techniques disclosed herein include systems and methods for voice-enabled searching. Techniques include a co-occurrence based approach to improve accuracy of the 1-best hypothesis for non-phrase voice queries, as well as for phrased voice queries. A co-occurrence model is used in addition to a statistical natural language model and acoustic model to recognize spoken queries, such as spoken queries for searching a search engine. Given an utterance and an associated list of automated speech recognition n-best hypotheses, the system rescores the different hypotheses using co-occurrence information. For each hypothesis, the system estimates a frequency of co-occurrence within web documents. Scores from a speech recognizer and a co-occurrence engine can be combined to select a best hypothesis with a lower word error rate.
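
The n-best rescoring idea in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: the function name `rescore_nbest`, the linear interpolation with `weight`, the `log1p` squashing of counts, and the toy count table are all hypothetical choices made for the example.

```python
import math

def rescore_nbest(hypotheses, cooccurrence_count, weight=0.5):
    """Pick the best hypothesis from an n-best list.

    hypotheses: list of (text, asr_log_score) pairs from a recognizer.
    cooccurrence_count: callable mapping text -> how often its words
        co-occur in some document collection (assumed to be available).
    weight: interpolation weight given to the co-occurrence score
        (an assumption; the abstract does not say how scores are combined).
    """
    best_text, best_score = None, -math.inf
    for text, asr_score in hypotheses:
        # log1p keeps a zero count finite and compresses large counts.
        co_score = math.log1p(cooccurrence_count(text))
        combined = (1 - weight) * asr_score + weight * co_score
        if combined > best_score:
            best_text, best_score = text, combined
    return best_text

# Toy data: the recognizer slightly prefers the wrong transcription.
TOY_COUNTS = {"new york pizza": 900, "new yolk pizza": 2}
nbest = [("new yolk pizza", -1.0), ("new york pizza", -1.2)]
best = rescore_nbest(nbest, lambda t: TOY_COUNTS.get(t, 0))
```

With `weight=0.0` the function falls back to the recognizer's own 1-best choice, which makes the effect of the co-occurrence term easy to compare.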

8.

Invention patent application
VOCAL SOURCE EXTRACTION BY MAXIMUM PHASE DETECTION
Legal status: In force

Publication number: US20130325455A1

Publication date: 2013-12-05

Application number: US13487275

Filing date: 2012-06-04

Applicants: Aharon Satt, Zvi Kons, Ron Hoory

Inventors: Aharon Satt, Zvi Kons, Ron Hoory

IPC classification: G10L11/04

CPC classification: G10L25/75, G10L25/03, G10L25/45

Abstract: Methods, apparatus and computer program products implement embodiments of the present invention that include receiving a time domain voice signal, and extracting a single pitch cycle from the received signal. The extracted single pitch cycle is transformed to a frequency domain, and the misclassified roots of the frequency domain are identified and corrected. Using the corrected roots, an indication of a maximum phase of the frequency domain is generated.

9.

Invention patent application
Dictionary lookup for mobile devices using spelling recognition
Legal status: Under examination (published)

Publication number: US20070016420A1

Publication date: 2007-01-18

Application number: US11176154

Filing date: 2005-07-07

Applicants: Ophir Azulai, Ron Hoory, Zohar Sivan

Inventors: Ophir Azulai, Ron Hoory, Zohar Sivan

IPC classification: G10L15/04, G10L15/00

CPC classification: G10L15/19

Abstract: A method for querying an electronic dictionary using letters of an alphabet enunciated by a user includes accepting a speech input from the user. The speech input includes a sequence of spelled letters enunciated by the user that spell a query word. The speech input is analyzed to determine one or more sequences of the letters that approximate the sequence of spelled letters. The one or more sequences of the letters are post-processed so as to produce a plurality of recognized words approximating the query word. The electronic dictionary is queried with the plurality of recognized words so as to retrieve a respective plurality of dictionary entries. A list of results including the plurality of recognized words and the respective plurality of dictionary entries is presented to the user.
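
The post-processing and lookup steps in this abstract can be sketched as follows. This is a rough illustration only: it swaps in generic fuzzy string matching from Python's standard `difflib` module for whatever post-processing the application actually describes, and the function name `lookup_spelled_word`, the `cutoff` value, and the toy dictionary are assumptions made for the example.

```python
import difflib

def lookup_spelled_word(letter_sequences, dictionary, max_words=3):
    """Map recognized letter sequences to dictionary entries.

    letter_sequences: candidate spellings from a (hypothetical) letter
        recognizer, e.g. ["cat", "kat"].
    dictionary: mapping of headword -> definition.
    Returns a list of (word, definition) pairs, best matches first.
    """
    words = list(dictionary)
    recognized = []
    for seq in letter_sequences:
        # Fuzzy matching stands in for the post-processing step.
        for word in difflib.get_close_matches(seq, words, n=max_words, cutoff=0.6):
            if word not in recognized:
                recognized.append(word)
    # Query the dictionary with every recognized word.
    return [(word, dictionary[word]) for word in recognized]

TOY_DICTIONARY = {
    "cat": "a small domesticated feline",
    "cap": "a soft flat hat",
    "dog": "a domesticated canine",
}
results = lookup_spelled_word(["cat", "kat"], TOY_DICTIONARY)
```

Returning several near matches rather than a single word mirrors the abstract's point that the user is shown a list of recognized words with their entries.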

10.

Granted invention patent
Voice transformation with encoded information
Legal status: In force

Publication number: US08930182B2

Publication date: 2015-01-06

Application number: US13049924

Filing date: 2011-03-17

Applicants: Shay Ben-David, Ron Hoory, Zvi Kons, David Nahamoo

Inventors: Shay Ben-David, Ron Hoory, Zvi Kons, David Nahamoo

IPC classification: G10L21/00, G10L25/90, G10L25/93, G10L21/003, G10L19/018

CPC classification: G10L21/003, G10L19/018

Abstract: A method, system, and computer program product for voice transformation are provided. The method includes transforming a source speech using transformation parameters, and encoding information on the transformation parameters in an output speech using steganography, wherein the source speech can be reconstructed using the output speech and the information on the transformation parameters. A method for reconstructing voice transformation is also provided including: receiving an output speech of a voice transformation system wherein the output speech is transformed speech which has encoded information on the transformation parameters using steganography; extracting the information on the transformation parameters; and carrying out an inverse transformation of the output speech to obtain an approximation of an original source speech.
