Abstract:
A digital audio synthesizer comprising: an input memory for receiving a sequence of digital data representative of the amplitude spectrum of an audio signal over consecutive, overlapping time windows; a calculator (120) arranged to receive as input a set of sketch digital data for a current window, comprising extrapolated amplitude data at the start of the window and zero values for the remainder of the window, and to establish in response a digital representation of the complex discrete Fourier transform of this set; a composer (130) arranged to combine the amplitude-spectrum input associated with the current window under consideration with the digital representation determined by the calculator, and to call the calculator (120) with the resulting data so as to establish a digital representation of the corresponding inverse complex discrete Fourier transform, which yields a set of estimated digital data relating to the current window under consideration; and an adder (140) for selectively accumulating the estimated digital data corresponding to the same time instant. The composer (130) is arranged to compute a set of auxiliary digital data (Xi(n)) by taking the current set of estimated digital data (z(n)) divided by a window function over each time window; the adder (140) is arranged to add the current set of estimated digital data, multiplied by the window function (H), to the previous value of the accumulation; and an extrapolator (110) is arranged to compute the set of sketch digital data for a current window from the set of auxiliary digital data of the previous window, multiplied selectively by the square of the window function.
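The pipeline above (extrapolator → DFT of the sketch → impose the input magnitude → inverse DFT → windowed overlap-add) can be sketched as one per-window step. This is a minimal illustration, not the patented implementation: the function name, the Hann window, and the way the extrapolator shifts the previous auxiliary data by the hop size are all assumptions.

```python
import numpy as np

def synthesize_frame(magnitude, prev_aux, window, hop):
    """One hypothetical iteration of the abstract's pipeline."""
    n = len(window)
    # Extrapolator (110): sketch = previous window's auxiliary data, weighted
    # by the squared window and shifted by the hop; the rest is zero-filled.
    sketch = np.zeros(n)
    sketch[: n - hop] = (prev_aux * window ** 2)[hop:]
    # Calculator (120): complex DFT of the sketch.
    spectrum = np.fft.rfft(sketch)
    # Composer (130): keep the sketch's phase, impose the input amplitude
    # spectrum, then call the calculator again for the inverse DFT.
    phase = np.angle(spectrum)
    z = np.fft.irfft(magnitude * np.exp(1j * phase), n)
    # Auxiliary data Xi(n): estimated data divided by the window function
    # (clamped to avoid division by zero at the window edges).
    aux = z / np.maximum(window, 1e-12)
    # Adder (140) contribution: estimated data multiplied by the window.
    return z * window, aux
```

In a full loop, the first return value would be overlap-added into the output buffer and the second fed to the extrapolator for the next window.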
Abstract:
A feature transform for speech recognition is described. An input speech utterance is processed to produce a sequence of representative speech vectors. A time-synchronous speech recognition pass is performed using a decoding search to determine a recognition output corresponding to the speech input. The decoding search includes, for each speech vector after a first threshold number of speech vectors, estimating a feature transform based on the preceding speech vectors in the utterance and partial decoding results of the decoding search. The current speech vector is then adjusted based on the current feature transform, and the adjusted speech vector is used in a current frame of the decoding search.
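The per-frame loop can be sketched as follows. This is only an illustration of the control flow: the "transform" here is a simple online mean/variance normalization estimated from the preceding vectors, standing in for whatever estimator (e.g. a CMLLR-style transform) the recognizer actually uses; the threshold value and function names are assumptions.

```python
import numpy as np

def decode_utterance(speech_vectors, threshold=10):
    """Hypothetical sketch: after `threshold` frames, estimate a transform
    from the preceding vectors and apply it to the current vector before it
    enters the current frame of the decoding search."""
    adjusted = []
    for t, x in enumerate(speech_vectors):
        if t >= threshold:
            history = np.asarray(speech_vectors[:t])
            mean = history.mean(axis=0)          # estimate from preceding
            std = history.std(axis=0) + 1e-8     # vectors in the utterance
            x = (x - mean) / std                 # adjust the current vector
        adjusted.append(x)                       # use in the current frame
    return adjusted
```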
Abstract:
An apparatus for processing an audio signal and a method thereof are disclosed. The present invention includes receiving, by an audio processing apparatus, coding identification information indicating whether to apply a first coding scheme or a second coding scheme to a current frame; when the coding identification information indicates that the second coding scheme is applied to the current frame, receiving window type information indicating a particular window for the current frame, from among a plurality of windows; identifying that a current window is a stop_start window based on the window type information, wherein the stop_start window follows one of a long_start window, a short window and a window of the first coding scheme for a previous frame, wherein the stop_start window is followed by one of a long_stop window, a short window and a window of the first coding scheme for a following frame, wherein the stop_start window includes a gentle-gentle stop_start window, a gentle-steep stop_start window, a steep-gentle stop_start window and a steep-steep stop_start window; when the first coding scheme is applied to the previous frame, applying one of the gentle-gentle stop_start window and the gentle-steep stop_start window to the current frame; and, when the first coding scheme is applied to the following frame, applying one of the gentle-gentle stop_start window and the steep-gentle stop_start window to the current frame, wherein: the gentle-gentle stop_start window comprises an ascending line with a first slope and a descending line with the first slope, the gentle-steep stop_start window comprises an ascending line with the first slope and a descending line with a second slope, the steep-gentle stop_start window comprises an ascending line with the second slope and a descending line with the first slope, the steep-steep stop_start window comprises an ascending line with the second slope and a descending line with the second slope, and the first slope is gentler than the second slope.
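The selection rule reduces to a small decision: the ascending slope is gentle when the previous frame uses the first coding scheme, and the descending slope is gentle when the following frame uses it. A minimal sketch (function and label names are assumptions mirroring the abstract's terminology):

```python
def select_stop_start_window(prev_scheme: str, next_scheme: str) -> str:
    """Hypothetical sketch of the window selection in the abstract.
    `prev_scheme`/`next_scheme` are 'first' or 'second', naming which
    coding scheme applies to the previous/following frame."""
    ascending = "gentle" if prev_scheme == "first" else "steep"
    descending = "gentle" if next_scheme == "first" else "steep"
    return f"{ascending}-{descending} stop_start window"
```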
Abstract:
An apparatus for processing an audio signal and a method thereof are disclosed. The present invention includes receiving, by an audio processing apparatus, coding identification information indicating whether to apply a first coding scheme or a second coding scheme to a current frame; when the coding identification information indicates that the second coding scheme is applied to the current frame, receiving window type information indicating a particular window for the current frame, from among a plurality of windows; identifying that a current window is a short window based on the window type information, wherein the short window has one fixed shape which comprises a plurality of short parts overlapped together; and applying the short window of the fixed shape to the current frame, wherein the short window follows one of a long_start window, a stop_start window and a window of the first coding scheme for a previous frame, and wherein the short window is followed by one of a long_stop window, the stop_start window and the window of the first coding scheme for a following frame.
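The fixed shape made of overlapped short parts can be illustrated by overlap-adding identical short sub-windows. This is only a sketch: the sine half-window shape, the part count, length, and hop are illustrative assumptions, not values from the specification.

```python
import numpy as np

def fixed_short_window(num_parts=8, part_len=256, hop=128):
    """Hypothetical construction of one fixed-shape window that 'comprises
    a plurality of short parts overlapped together' (per the abstract)."""
    # One short part: a sine window, common in transform audio coding.
    part = np.sin(np.pi * (np.arange(part_len) + 0.5) / part_len)
    total = hop * (num_parts - 1) + part_len
    window = np.zeros(total)
    for k in range(num_parts):
        # Overlap-add each identical short part at its hop offset.
        window[k * hop : k * hop + part_len] += part
    return window
```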
Abstract:
Methods and devices for encoding and decoding are provided. A source signal value is encoded by a quantization index determined using a partition into quantization cells. Decoding of the quantization index takes place by sampling a reconstruction probability distribution, thereby obtaining a reconstructed signal value, such that the reconstructed signal value lies in the same quantization cell as the source signal value. In one embodiment, encoding and decoding are such that their succession preserves the source signal distribution. In another embodiment, the partition and the reconstruction probability distribution are determined in such manner that the quantization error is minimized subject to a constraint on the relative entropy between the source signal and the reconstructed signal.
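The key property (the reconstructed value lies in the same quantization cell as the source value) can be sketched with a uniform scalar quantizer whose decoder samples a reconstruction distribution supported on the cell. The uniform cell partition and uniform sampling are illustrative assumptions; with a uniform source they also make encode-then-decode preserve the source distribution, matching the first embodiment.

```python
import numpy as np

def encode(x, step=1.0):
    """Quantization index: which uniform cell of width `step` contains x."""
    return int(np.floor(x / step))

def decode(index, step=1.0, rng=None):
    """Decode by sampling a reconstruction probability distribution whose
    support is exactly the quantization cell, so the reconstructed value
    always falls in the same cell as the source value."""
    rng = np.random.default_rng() if rng is None else rng
    return (index + rng.random()) * step
```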
Abstract:
Systems and methods for providing object-oriented audio are described. Audio objects can be created by associating sound sources with attributes of those sound sources, such as location, velocity, directivity, and the like. Audio objects can be used in place of or in addition to channels to distribute sound, for example, by streaming the audio objects over a network to a client device. The objects can define their locations in space with associated two or three dimensional coordinates. The objects can be adaptively streamed to the client device based on available network or client device resources. A renderer on the client device can use the attributes of the objects to determine how to render the objects. The renderer can further adapt the playback of the objects based on information about a rendering environment of the client device. Various examples of audio object creation techniques are also described.
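An audio object of the kind described (a sound source paired with attributes such as location, velocity, and directivity, plus adaptive streaming under constrained resources) can be sketched as a small data structure. All field and function names here are illustrative assumptions, not the described system's API.

```python
from dataclasses import dataclass

@dataclass
class AudioObject:
    """Hypothetical audio object: a sound source plus rendering attributes."""
    source_id: str
    position: tuple               # 2D or 3D coordinates in space
    velocity: tuple = (0.0, 0.0, 0.0)
    directivity: str = "omni"
    priority: int = 0             # could drive adaptive streaming decisions

def adapt_stream(objects, max_objects):
    """Sketch of adaptive streaming: when client or network resources are
    limited, keep only the highest-priority objects."""
    return sorted(objects, key=lambda o: o.priority, reverse=True)[:max_objects]
```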
Abstract:
A method and an apparatus for processing an audio signal are disclosed. In the present invention, an audio signal is encoded and decoded on the basis of the sound-source motion, reverberation characteristics, and semantic objects included in the audio signal, thereby enabling more faithful reproduction of audio and efficient search and editing of the same.
Abstract:
At least one exemplary embodiment is directed to a method and/or a device for voice operated control. The method can include measuring an ambient sound received from at least one Ambient Sound Microphone, measuring an internal sound received from at least one Ear Canal Microphone, detecting a spoken voice from a wearer of the earpiece based on an analysis of the ambient sound and the internal sound, and controlling at least one voice operation of the earpiece if the presence of spoken voice is detected. The analysis can be a non-difference comparison such as a correlation analysis, a cross-correlation analysis, and a coherence analysis.
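One such non-difference comparison can be sketched as a normalized cross-correlation between the two microphone signals: the wearer's own voice reaches both microphones coherently, so a high correlation peak suggests spoken voice. The threshold value and function name are illustrative assumptions.

```python
import numpy as np

def detect_wearer_voice(ambient, internal, threshold=0.7):
    """Hypothetical sketch of cross-correlation-based voice detection
    between the Ambient Sound Microphone and Ear Canal Microphone."""
    # Normalize each signal to zero mean, unit variance.
    a = (ambient - ambient.mean()) / (ambient.std() + 1e-12)
    b = (internal - internal.mean()) / (internal.std() + 1e-12)
    # Peak of the normalized cross-correlation over all lags.
    corr = np.correlate(a, b, mode="full") / len(a)
    return float(np.max(np.abs(corr))) >= threshold
```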
Abstract:
The disclosure relates to systems, methods and apparatus to convert speech to text and vice versa. One apparatus comprises a vocoder, a speech to text conversion engine, a text to speech conversion engine, and a user interface. The vocoder is operable to convert speech signals into packets and convert packets into speech signals. The speech to text conversion engine is operable to convert speech to text. The text to speech conversion engine is operable to convert text to speech. The user interface is operable to receive a user selection of a mode from among a plurality of modes, wherein a first mode enables the speech to text conversion engine, a second mode enables the text to speech conversion engine, and a third mode enables the speech to text conversion engine and the text to speech conversion engine.
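The three-mode selection can be sketched as a simple dispatch. The enum and function names are illustrative assumptions; only the mode semantics come from the abstract.

```python
from enum import Enum

class Mode(Enum):
    SPEECH_TO_TEXT = 1   # first mode: enables the speech-to-text engine
    TEXT_TO_SPEECH = 2   # second mode: enables the text-to-speech engine
    BOTH = 3             # third mode: enables both conversion engines

def enabled_engines(mode):
    """Hypothetical sketch: which conversion engines the user interface
    enables for the selected mode."""
    stt = mode in (Mode.SPEECH_TO_TEXT, Mode.BOTH)
    tts = mode in (Mode.TEXT_TO_SPEECH, Mode.BOTH)
    return {"speech_to_text": stt, "text_to_speech": tts}
```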
Abstract:
Provided are systems, methods and techniques for processing frame-based data. A frame of data, an indication that a transient occurs within the frame, and a location of the transient within the frame are obtained. Based on the indication of the transient, a block size is set for the frame, thereby effectively defining a plurality of equal-sized blocks within the frame. In addition, different window functions are selected for different ones of the plurality of equal-sized blocks based on the location of the transient, and the frame of data is processed by applying the selected window functions.
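The selection step can be sketched as: split the frame into equal-sized blocks and assign a different window function to the block containing the transient. The two window labels and the default block size are illustrative assumptions.

```python
def select_block_windows(frame_len, transient_pos, block_size=256):
    """Hypothetical sketch: return one window label per equal-sized block,
    choosing a 'transient' window for the block where the transient falls
    and a 'steady' window for the other blocks."""
    assert frame_len % block_size == 0, "frame must divide into equal blocks"
    transient_block = transient_pos // block_size
    return ["transient" if i == transient_block else "steady"
            for i in range(frame_len // block_size)]
```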