Patent search results for ipc:G10L25/45 (page 1)

1.

Invention application publication
SYSTEM AND METHOD FOR TRAINING SPEECH PROCESSING NEURAL NETWORKS FOR DYNAMIC LOADS [Pending (published)]

Publication No.: US20240347042A1

Publication date: 2024-10-17

Application No.: US18298473

Filing date: 2023-04-11

Applicant: Nuance Communications, Inc.

Inventors: Felix Weninger, Marco Gaudesi, Puming Zhan

IPC classification: G10L15/06, G10L15/04, G10L15/16, G10L15/22, G10L25/45

CPC classification: G10L15/063, G10L15/04, G10L15/16, G10L15/22, G10L25/45

Abstract: A method, computer program product, and computing system for dividing a speech signal into a plurality of chunks. A first context window is defined with a first period of past context for processing the plurality of chunks with a neural network of a speech processing system. The neural network is trained using the first context window. A second context window is defined with a second period of past context for processing the plurality of chunks with the neural network. The neural network is trained using the second context window.
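
The chunking step described in this abstract can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes the two context windows differ only in how many past samples each chunk may see, and the function name `chunk_with_past_context` and all sizes are hypothetical.

```python
from typing import List, Tuple


def chunk_with_past_context(
    num_samples: int, chunk_size: int, past_context: int
) -> List[Tuple[int, int, int]]:
    """Return (context_start, chunk_start, chunk_end) index triples.

    Each chunk covers [chunk_start, chunk_end); the network would also be
    allowed to look back to context_start (assumed behaviour, for illustration).
    """
    windows = []
    for chunk_start in range(0, num_samples, chunk_size):
        chunk_end = min(chunk_start + chunk_size, num_samples)
        context_start = max(0, chunk_start - past_context)
        windows.append((context_start, chunk_start, chunk_end))
    return windows


# Two hypothetical context-window settings over the same chunking.
short_context = chunk_with_past_context(num_samples=10, chunk_size=4, past_context=2)
long_context = chunk_with_past_context(num_samples=10, chunk_size=4, past_context=6)
```

Training once per setting, as the abstract describes, would then iterate over `short_context` and `long_context` in turn.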

2.

Invention application publication
METHODS FOR PHASE ECU F0 INTERPOLATION SPLIT AND RELATED CONTROLLER [Pending (published)]

Publication No.: US20240304192A1

Publication date: 2024-09-12

Application No.: US18646877

Filing date: 2024-04-26

Applicant: Telefonaktiebolaget LM Ericsson (publ)

Inventor: Martin SEHLSTEDT

IPC classification: G10L19/005, G06F17/14, G10L19/02, G10L25/18, G10L25/45, H04L65/75, H04L65/80

CPC classification: G10L19/005, G06F17/142, G10L19/02, G10L19/0204, G10L19/0212, G10L25/18, G10L25/45, H04L65/75, H04L65/80

Abstract: Controlling a concealment method for a lost audio frame associated with a received audio signal is provided. At least one bin vector of a spectral representation for at least one tone is obtained, wherein the at least one bin vector includes three consecutive bin values for the at least one tone. Whether each of the three consecutive bin values has a complex value or a real value is determined. Responsive to the determination, the three consecutive bin values are processed to estimate a frequency of the at least one tone based on whether each bin value has a complex value or a real value.
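
To make the idea of estimating a tone frequency from three consecutive bins concrete, here is a sketch using ordinary quadratic (parabolic) interpolation on bin magnitudes. This is a standard textbook technique swapped in for illustration, not the method claimed in the application; in particular it treats real and complex bins identically through `abs()`, whereas the abstract describes branching on that distinction. The function name and test signal are assumptions.

```python
import cmath
import math
from typing import Sequence


def estimate_tone_frequency(
    bins: Sequence[complex], peak: int, sample_rate: float, fft_size: int
) -> float:
    """Quadratic interpolation over the bins at peak-1, peak, peak+1."""
    # abs() accepts both real and complex bin values.
    a, b, c = (abs(bins[peak + i]) for i in (-1, 0, 1))
    denom = a - 2.0 * b + c
    offset = 0.0 if denom == 0.0 else 0.5 * (a - c) / denom
    return (peak + offset) * sample_rate / fft_size


# Hann-windowed test tone lying between two bins (1 bin = 1 Hz here).
fft_size, sample_rate, tone_hz = 64, 64.0, 10.3
signal = [
    0.5 * (1 - math.cos(2 * math.pi * n / fft_size))
    * math.sin(2 * math.pi * tone_hz * n / sample_rate)
    for n in range(fft_size)
]
# Direct DFT of the non-negative-frequency bins (fine for a tiny example).
spectrum = [
    sum(x * cmath.exp(-2j * math.pi * k * n / fft_size) for n, x in enumerate(signal))
    for k in range(fft_size // 2 + 1)
]
peak = max(range(1, fft_size // 2), key=lambda k: abs(spectrum[k]))
estimate = estimate_tone_frequency(spectrum, peak, sample_rate, fft_size)
```

The estimate lands between the peak bin and its larger neighbour, which is closer to the true tone than the peak bin alone.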

3.

Granted invention patent
Automated clinical documentation system and method [In force]

Publication No.: US12073361B2

Publication date: 2024-08-27

Application No.: US17955693

Filing date: 2022-09-29

Applicant: Microsoft Technology Licensing, LLC

Inventors: Christina Drexel, Ljubomir Milanovic

IPC classification: G16H15/00, G06F3/16, G06F40/117, G06F40/30, G06Q10/10, G06T7/20, G10L15/22, G10L15/26, G10L15/30, G10L25/45, G10L25/51, G16H10/20, G16H10/40, G16H10/60, G16H50/70, H04R1/40, H04R3/00

CPC classification: G06Q10/10, G06F3/165, G06F40/117, G06F40/30, G06T7/20, G10L15/22, G10L15/26, G10L15/30, G10L25/45, G10L25/51, G16H10/20, G16H10/40, G16H10/60, G16H15/00, G16H50/70, H04R1/406, H04R3/005, G06T2207/30196

Abstract: A method, computer program product, and computing system for obtaining encounter information of a patient encounter; processing the encounter information to generate an encounter transcript; and processing the encounter transcript to locate one or more procedural events within the encounter transcript.

4.

Reissue patent
Apparatus and method for generating audio subband values and apparatus and method for generating time-domain audio samples [In force]

Publication No.: USRE49999E1

Publication date: 2024-06-04

Application No.: US17845607

Filing date: 2022-06-21

Applicant: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Markus Schnell, Manfred Lutzky, Markus Lohwasser, Markus Schmidt, Marc Gayer, Michael Mellar, Bernd Edler, Markus Multrus, Gerald Schuller, Ralf Geiger, Bernhard Grill

IPC classification: G10L19/00, G10L19/02, G10L19/022, G10L25/45, H03H17/02, G10L21/038

CPC classification: G10L19/0204, G10L19/022, G10L25/45, H03H17/0266, G10L21/038

Abstract: An embodiment of an apparatus for generating audio subband values in audio subband channels includes an analysis windower for windowing a frame of time-domain audio input samples being in a time sequence extending from an early sample to a later sample using an analysis window function including a sequence of window coefficients to obtain windowed samples. The analysis window function includes a first number of window coefficients derived from a larger window function including a sequence of a larger second number of window coefficients, wherein the window coefficients of the window function are derived by an interpolation of window coefficients of the larger window function. The apparatus further includes a calculator for calculating the audio subband values using the windowed samples.
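
One simple way to derive a shorter window from a larger one by interpolation is to average each pair of adjacent coefficients, which halves the window length. The sketch below assumes that particular scheme purely for illustration; the abstract does not say which interpolation is used, and `derive_window` and the coefficient values are hypothetical.

```python
from typing import List, Sequence


def derive_window(large_window: Sequence[float]) -> List[float]:
    """Halve a window by linearly interpolating between adjacent coefficient pairs.

    Coefficient n of the result is the midpoint of coefficients 2n and 2n+1
    of the larger window (an assumed scheme, for illustration only).
    """
    if len(large_window) % 2:
        raise ValueError("expected an even number of coefficients")
    return [
        0.5 * (large_window[2 * n] + large_window[2 * n + 1])
        for n in range(len(large_window) // 2)
    ]


# Made-up asymmetric window with 8 coefficients, reduced to 4.
large = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 0.6, 0.2]
small = derive_window(large)
```

The benefit of this kind of derivation is that only the larger window needs to be stored; the smaller one can be computed on demand.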

5.

Invention application publication
SPECTRAL SHAPE ESTIMATION FROM MDCT COEFFICIENTS [Pending (published)]

Publication No.: US20240135936A1

Publication date: 2024-04-25

Application No.: US18524622

Filing date: 2023-11-30

Applicant: Telefonaktiebolaget LM Ericsson (publ)

Inventors: Martin SEHLSTEDT, Jonas SVEDBERG

IPC classification: G10L19/005, G06F17/14, G10L19/02, G10L25/18, G10L25/45, H04L65/75, H04L65/80

CPC classification: G10L19/005, G06F17/142, G10L19/02, G10L19/0204, G10L19/0212, G10L25/18, G10L25/45, H04L65/75, H04L65/80

Abstract: A method, decoder, and program code for controlling a concealment method for a lost audio frame are provided. A first audio frame and a second audio frame of the received audio signal are decoded to obtain modified discrete cosine transform (MDCT) coefficients. Values of a first spectral shape based upon the MDCT coefficients decoded from the first audio frame and values of a second spectral shape based upon the MDCT coefficients decoded from the second audio frame are determined, the spectral shapes each comprising a number of sub-bands. The values of the spectral shapes and frame energies of the first audio frame and second audio frame are transformed into representations of FFT based spectral analyses. A transient condition is detected based on the representations of the FFTs. Responsive to detecting the transient condition, the concealment method is modified by selectively adjusting a spectrum magnitude of a substitution frame spectrum.
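
A rough sketch of what a per-sub-band spectral shape and a transient check could look like is given below. It is a simplification under stated assumptions: equal-width sub-bands, shape defined as each band's share of frame energy, and a transient flagged when any band's share changes by more than a hypothetical `threshold` factor between frames. The abstract's conversion to FFT-based representations is not modelled here.

```python
from typing import List, Sequence


def spectral_shape(mdct: Sequence[float], num_bands: int) -> List[float]:
    """Share of frame energy in each of num_bands equal-width sub-bands.

    Coefficients beyond num_bands * band_width are ignored in this sketch.
    """
    band_width = len(mdct) // num_bands
    energies = [
        sum(c * c for c in mdct[b * band_width:(b + 1) * band_width])
        for b in range(num_bands)
    ]
    total = sum(energies)
    return [e / total if total > 0.0 else 0.0 for e in energies]


def is_transient(
    first_shape: Sequence[float],
    second_shape: Sequence[float],
    threshold: float = 4.0,  # hypothetical value, not from the source
    floor: float = 1e-9,     # avoids division by zero for empty bands
) -> bool:
    """Flag a transient when any band's energy share changes by > threshold."""
    for a, b in zip(first_shape, second_shape):
        a, b = max(a, floor), max(b, floor)
        if a / b > threshold or b / a > threshold:
            return True
    return False


flat_frame = [1.0] * 8
tilted_frame = [3.0, 3.0, 3.0, 3.0, 1.0, 1.0, 1.0, 1.0]
flat_shape = spectral_shape(flat_frame, num_bands=2)
tilted_shape = spectral_shape(tilted_frame, num_bands=2)
transient = is_transient(flat_shape, tilted_shape)
```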

6.

Reissue patent
Alias cancelling during audio coding mode transitions [In force]

Publication No.: USRE49813E1

Publication date: 2024-01-23

Application No.: US17589228

Filing date: 2022-01-31

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventors: Hyen-O Oh, Chang Heon Lee, Hong-Goo Kang, Jeungook Song

IPC classification: G10L19/00, G10L25/45, G10L21/00, G10L19/04, G10L19/022, G10L19/18, G10L19/005

CPC classification: G10L19/04, G10L19/005, G10L19/022, G10L19/18, G10L25/45

Abstract: An apparatus for processing an audio signal and method thereof are disclosed. The present invention includes receiving, by an audio processing apparatus, an audio signal including first data of a first block encoded with a rectangular coding scheme and second data of a second block encoded with a non-rectangular coding scheme; receiving a compensation signal corresponding to the second block; estimating a prediction of an aliasing part using the first data; and obtaining a reconstructed signal for the second block based on the second data, the compensation signal, and the prediction of the aliasing part.

7.

Granted invention patent
Spectral shape estimation from MDCT coefficients [In force]

Publication No.: US11862180B2

Publication date: 2024-01-02

Application No.: US17432260

Filing date: 2020-02-20

Applicant: Telefonaktiebolaget LM Ericsson (publ)

Inventors: Martin Sehlstedt, Jonas Svedberg

IPC classification: G10L19/005, G10L19/02, G10L25/18, H04L65/75, G10L25/45, H04L65/80, G06F17/14

CPC classification: G10L19/005, G06F17/142, G10L19/02, G10L19/0204, G10L19/0212, G10L25/18, G10L25/45, H04L65/75, H04L65/80

Abstract: A method, decoder, and program code for controlling a concealment method for a lost audio frame are provided. A first audio frame and a second audio frame of the received audio signal are decoded to obtain modified discrete cosine transform (MDCT) coefficients. Values of a first spectral shape based upon the MDCT coefficients decoded from the first audio frame and values of a second spectral shape based upon the MDCT coefficients decoded from the second audio frame are determined, the spectral shapes each comprising a number of sub-bands. The values of the spectral shapes and frame energies of the first audio frame and second audio frame are transformed into representations of FFT based spectral analyses. A transient condition is detected based on the representations of the FFTs. Responsive to detecting the transient condition, the concealment method is modified by selectively adjusting a spectrum magnitude of a substitution frame spectrum.

8.

Granted invention patent
Method for changing speed and pitch of speech and speech synthesis system [In force]

Publication No.: US11776528B2

Publication date: 2023-10-03

Application No.: US17380426

Filing date: 2021-07-20

Applicant: Xinapse Co., Ltd.

Inventors: Jinbeom Kang, Dong Won Joo, Yongwook Nam

IPC classification: G10L13/033, G10L21/013, G10L25/45, G10L21/043, G10L21/14

CPC classification: G10L13/0335, G10L21/013, G10L21/043, G10L21/14, G10L25/45

Abstract: This application relates to a method of synthesizing speech of which the speed and the pitch are changed. In one aspect, the method includes generating a spectrogram by performing a short-time Fourier transformation on a first speech signal based on a first hop length and a first window length, and generating, from the spectrogram, speech signals of sections having a second window length at intervals of a second hop length. A ratio between the first hop length and the second hop length may be set to be equal to the value of a playback rate and a ratio between the first window length and the second window length may be set to be equal to the value of a pitch change rate, thereby generating a second speech signal of which the speed and the pitch are changed.
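
The relationship between the two sets of STFT parameters can be shown with a little arithmetic. The sketch below assumes each ratio is taken as first length divided by second length (the abstract does not state the direction explicitly) and rounds to whole samples; the function name and the example values are hypothetical.

```python
from typing import Tuple


def second_stft_lengths(
    first_hop: int,
    first_window: int,
    playback_rate: float,
    pitch_change_rate: float,
) -> Tuple[int, int]:
    """Derive the second hop and window lengths from the two rates.

    Assumes first_hop / second_hop == playback_rate and
    first_window / second_window == pitch_change_rate.
    """
    second_hop = round(first_hop / playback_rate)
    second_window = round(first_window / pitch_change_rate)
    return second_hop, second_window


# Example: double the playback rate, halve the pitch change rate.
second_hop, second_window = second_stft_lengths(
    first_hop=256, first_window=1024, playback_rate=2.0, pitch_change_rate=0.5
)
```

With these inputs the second hop length is half the first and the second window length is twice the first.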

9.

Invention application publication
METHODS FOR PHASE ECU F0 INTERPOLATION SPLIT AND RELATED CONTROLLER [Pending (published)]

Publication No.: US20230298597A1

Publication date: 2023-09-21

Application No.: US18203280

Filing date: 2023-05-30

Applicant: Telefonaktiebolaget LM Ericsson (publ)

Inventor: Martin SEHLSTEDT

IPC classification: G10L19/005, G10L19/02, G10L25/18, H04L65/75, G10L25/45, H04L65/80, G06F17/14

CPC classification: G10L19/005, G10L19/0204, G10L19/0212, G10L25/18, H04L65/75, G10L19/02, G10L25/45, H04L65/80, G06F17/142

Abstract: Controlling a concealment method for a lost audio frame associated with a received audio signal is provided. At least one bin vector of a spectral representation for at least one tone is obtained, wherein the at least one bin vector includes three consecutive bin values for the at least one tone. Whether each of the three consecutive bin values has a complex value or a real value is determined. Responsive to the determination, the three consecutive bin values are processed to estimate a frequency of the at least one tone based on whether each bin value has a complex value or a real value.

10.

Invention application publication
NEURAL NETWORK-BASED AUDIO PACKET LOSS RESTORATION METHOD AND APPARATUS, AND SYSTEM [Pending (published)]

Publication No.: US20230245668A1

Publication date: 2023-08-03

Application No.: US17911733

Filing date: 2020-09-30

Applicant: ZHUHAI JIELI TECHNOLOGY CO., LTD

Inventors: Quanzhi XIAO, Yufeng YAN, Rongjun HUANG, Guiping FANG

IPC classification: G10L21/02, G10L25/30, G10L25/60, G10L25/45

CPC classification: G10L21/02, G10L25/30, G10L25/60, G10L25/45

Abstract: An audio packet loss repairing method, device and system based on a neural network. The method comprises: obtaining an audio data packet (S101), the audio data packet comprising a plurality of audio data frames, and the plurality of audio data frames at least comprising a plurality of voice signal frames; determining a position of a lost voice signal frame in the plurality of audio data frames to obtain position information of the lost frame (S103), the position comprising a first preset position or a second preset position; selecting, according to the position information of the lost frame, a neural network model for repairing the lost frame (S105), the neural network model comprising a first repairing model and a second repairing model; and sending the plurality of audio data frames to the selected neural network model so as to repair the lost voice signal frame (S107).
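
The control flow of steps S103 to S107 (locate the lost frame, pick a model by position, repair) can be sketched as a simple dispatch. Everything specific below is an assumption made for illustration: the abstract does not define the two preset positions, so the sketch treats "last frame in the packet" as the first position and any other position as the second, and the two repair functions are trivial stand-ins for the neural network models.

```python
from typing import Callable, Dict, List, Optional

Frame = Optional[List[float]]  # None marks the lost frame


def locate_lost_frame(frames: List[Frame]) -> str:
    """S103 (hypothetical rule): 'first' if the last frame is lost, else 'second'."""
    index = frames.index(None)
    if index == 0:
        raise ValueError("this sketch needs at least one frame before the loss")
    return "first" if index == len(frames) - 1 else "second"


def repeat_previous(frames: List[Frame]) -> List[float]:
    """Stand-in for the first repairing model: copy the preceding frame."""
    i = frames.index(None)
    return list(frames[i - 1])


def average_neighbours(frames: List[Frame]) -> List[float]:
    """Stand-in for the second repairing model: average both neighbours."""
    i = frames.index(None)
    return [(p + n) / 2 for p, n in zip(frames[i - 1], frames[i + 1])]


MODELS: Dict[str, Callable[[List[Frame]], List[float]]] = {
    "first": repeat_previous,
    "second": average_neighbours,
}


def repair(frames: List[Frame]) -> List[Frame]:
    """S105 and S107: select a model by position, then fill in the lost frame."""
    model = MODELS[locate_lost_frame(frames)]
    repaired = list(frames)
    repaired[frames.index(None)] = model(frames)
    return repaired


middle_loss = repair([[1.0, 2.0], None, [3.0, 4.0]])
tail_loss = repair([[1.0, 2.0], [3.0, 4.0], None])
```

The point of the dispatch is that a frame with context on both sides can use a model that sees the future, while a frame at the end of the packet cannot.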
