Patent search ap:("QUALCOMM Incorporated") AND inv:"Vahid MONTAZERI" Page 1

1.

Invention application
SYNTHESIZED SPEECH GENERATION [In force]

Publication (announcement) No.: US20220230623A1

Publication (announcement) date: 2022-07-21

Application No.: US17154372

Application date: 2021-01-21

Applicant: QUALCOMM Incorporated

Inventor: Kyungguen BYUN, Sunkuk MOON, Shuhua ZHANG, Vahid MONTAZERI, Lae-Hoon KIM, Erik VISSER

IPC: G10L13/047, G06N3/04, G10L13/033, G10L25/63, G10L19/02

Abstract: A device for speech generation includes one or more processors configured to receive one or more control parameters indicating target speech characteristics. The one or more processors are also configured to process, using a multi-encoder, an input representation of speech based on the one or more control parameters to generate encoded data corresponding to an audio signal that represents a version of the speech based on the target speech characteristics.

2.

Invention application
MACHINE-LEARNING BASED AUDIO SUBBAND PROCESSING [In force]

Publication (announcement) No.: US20250119704A1

Publication (announcement) date: 2025-04-10

Application No.: US18907310

Application date: 2024-10-04

Applicant: QUALCOMM Incorporated

Inventor: Vahid MONTAZERI, Rogerio Guedes ALVES, Erik VISSER

IPC: H04S7/00, H04R3/00, H04R3/04

Abstract: A device includes a memory configured to store audio data. The device also includes one or more processors configured to obtain, from first audio data, first subband audio data and second subband audio data. The first subband audio data is associated with a first frequency subband and the second subband audio data is associated with a second frequency subband. The one or more processors are also configured to use a first machine-learning model to process the first subband audio data to generate first subband noise suppressed audio data. The one or more processors are further configured to use a second machine-learning model to process the second subband audio data to generate second subband noise suppressed audio data. The one or more processors are also configured to generate output data based on the first subband noise suppressed audio data and the second subband noise suppressed audio data.
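
The flow in this abstract (split into two subbands, process each with its own model, recombine) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the moving-average band split, the function names, and the plain callables standing in for the two machine-learning models are hypothetical and are not taken from the application.

```python
def moving_average(signal, width=3):
    """Crude low-pass filter: causal moving average over up to `width` samples."""
    out = []
    for n in range(len(signal)):
        window = signal[max(0, n - width + 1): n + 1]
        out.append(sum(window) / len(window))
    return out


def split_subbands(signal, width=3):
    """Split into a low band and the complementary high band (low + high == signal)."""
    low = moving_average(signal, width)
    high = [s - l for s, l in zip(signal, low)]
    return low, high


def process_subbands(signal, low_model, high_model, width=3):
    """Run a separate (hypothetical) model on each subband and sum the results."""
    low, high = split_subbands(signal, width)
    low_clean = low_model(low)      # stand-in for the first machine-learning model
    high_clean = high_model(high)   # stand-in for the second machine-learning model
    return [a + b for a, b in zip(low_clean, high_clean)]


if __name__ == "__main__":
    noisy = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
    # Toy "models": keep the low band, attenuate the high band by half.
    print(process_subbands(noisy, lambda b: b, lambda b: [0.5 * v for v in b]))
```

Because the high band is defined as the residual of the low band, passing identity functions for both models returns the input signal, which makes the split easy to sanity-check.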

3.

Invention publication
LOW-LATENCY NOISE SUPPRESSION [Under examination, published]

Publication (announcement) No.: US20240331716A1

Publication (announcement) date: 2024-10-03

Application No.: US18611308

Application date: 2024-03-20

Applicant: QUALCOMM Incorporated

Inventor: Jacob Jon BEAN, Rogerio Guedes ALVES, Vahid MONTAZERI, Erik VISSER

IPC: G10L21/0224, G06F1/16, G10L21/0232, G10L21/0216

CPC classification number: G10L21/0224, G06F1/163, G10L21/0232, G10L2021/02166

Abstract: A device includes one or more processors configured to obtain audio data representing one or more audio signals. The audio data includes a first segment and a second segment subsequent to the first segment. The one or more processors are configured to perform one or more transform operations on the first segment to generate frequency-domain audio data. The one or more processors are configured to provide input data based on the frequency-domain audio data as input to one or more machine-learning models to generate a noise-suppression output. The one or more processors are configured to perform one or more reverse transform operations on the noise-suppression output to generate time-domain filter coefficients. The one or more processors are configured to perform time-domain filtering of the second segment using the time-domain filter coefficients to generate a noise-suppressed output signal.
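
The pipeline described here (transform the first segment, obtain a noise-suppression output, inverse-transform it into time-domain filter coefficients, then filter the second segment in the time domain) can be sketched as follows. This is only an illustration under stated assumptions: a naive DFT is used as the transform, `toy_gain_model` is a hypothetical stand-in for the machine-learning models, and all function names are invented for this example.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (forward transform of a segment)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def toy_gain_model(spectrum):
    """Hypothetical stand-in for the ML model: one real gain in [0, 1] per bin."""
    mags = [abs(c) for c in spectrum]
    peak = max(mags) or 1.0
    return [m / peak for m in mags]


def gains_to_fir(gains):
    """Inverse-transform per-bin gains into real time-domain filter coefficients."""
    return [c.real for c in idft(gains)]


def fir_filter(segment, coeffs):
    """Causal time-domain FIR filtering of a segment."""
    out = []
    for n in range(len(segment)):
        acc = 0.0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += h * segment[n - k]
        out.append(acc)
    return out


def suppress(first_segment, second_segment, model=toy_gain_model):
    """Derive coefficients from the first segment, apply them to the second."""
    gains = model(dft(first_segment))
    return fir_filter(second_segment, gains_to_fir(gains))
```

In this sketch the latency benefit comes from the last step: the second segment is filtered sample by sample in the time domain, so it does not have to wait for its own forward and inverse transforms.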

4.

Invention application
CONTEXT-BASED SPEECH ENHANCEMENT [In force]

Publication (announcement) No.: US20220310108A1

Publication (announcement) date: 2022-09-29

Application No.: US17209621

Application date: 2021-03-23

Applicant: QUALCOMM Incorporated

Inventor: Kyungguen BYUN, Shuhua ZHANG, Lae-Hoon KIM, Erik VISSER, Sunkuk MOON, Vahid MONTAZERI

IPC: G10L21/038

Abstract: A device to perform speech enhancement includes one or more processors configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and context data to generate output spectral data that represents a speech enhanced version of the input signal.

5.

Invention application
MACHINE-LEARNING BASED AUDIO SUBBAND PROCESSING [In force]

Publication (announcement) No.: US20250118318A1

Publication (announcement) date: 2025-04-10

Application No.: US18907321

Application date: 2024-10-04

Applicant: QUALCOMM Incorporated

Inventor: Vahid MONTAZERI, Rogerio Guedes ALVES, Erik VISSER

IPC: G10L21/0208, G10L21/0216, G10L21/038

Abstract: A device includes a memory configured to store audio data. The device also includes one or more processors configured to use a first machine-learning model to process first audio data to generate first spatial sector audio data. The first spatial sector audio data is associated with a first spatial sector. The one or more processors are also configured to use a second machine-learning model to process second audio data to generate second spatial sector audio data. The second spatial sector audio data is associated with a second spatial sector. The one or more processors are further configured to generate output data based on the first spatial sector audio data, the second spatial sector audio data, or both.

6.

Invention publication
CONTEXT-BASED SPEECH ENHANCEMENT [Under examination, published]

Publication (announcement) No.: US20230326477A1

Publication (announcement) date: 2023-10-12

Application No.: US18334641

Application date: 2023-06-14

Applicant: QUALCOMM Incorporated

Inventor: Kyungguen BYUN, Shuhua ZHANG, Lae-Hoon KIM, Erik VISSER, Sunkuk MOON, Vahid MONTAZERI

IPC: G10L21/0232, G10L21/038, G10L21/02

CPC classification number: G10L21/0232, G10L21/038, G10L21/02

Abstract: A device to perform speech enhancement includes one or more processors configured to process image data to detect at least one of an emotion, a speaker characteristic, or a noise type. The one or more processors are also configured to generate context data based at least in part on the at least one of the emotion, the speaker characteristic, or the noise type. The one or more processors are further configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and the context data to generate output spectral data that represents a speech enhanced version of the input signal.

7.

Invention application
NOISE SUPPRESSION USING TANDEM NETWORKS [In force]

Publication (announcement) No.: US20230026735A1

Publication (announcement) date: 2023-01-26

Application No.: US17382166

Application date: 2021-07-21

Applicant: QUALCOMM Incorporated

Inventor: Vahid MONTAZERI, Van NGUYEN, Hannes PESSENTHEINER, Lae-Hoon KIM, Erik VISSER, Rogerio Guedes ALVES

IPC: H04R3/04, H04R3/00, H04R5/04, H04R5/033, H04S7/00, H04S1/00, G06N3/08

Abstract: A device includes a memory configured to store instructions and one or more processors configured to execute the instructions. The one or more processors are configured to execute the instructions to receive audio data including a first audio frame corresponding to a first output of a first microphone and a second audio frame corresponding to a second output of a second microphone. The one or more processors are also configured to execute the instructions to provide the audio data to a first noise-suppression network and a second noise-suppression network. The first noise-suppression network is configured to generate a first noise-suppressed audio frame and the second noise-suppression network is configured to generate a second noise-suppressed audio frame. The one or more processors are further configured to execute the instructions to provide the noise-suppressed audio frames to an attention-pooling network. The attention-pooling network is configured to generate an output noise-suppressed audio frame.
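
The tandem arrangement in this abstract (two noise-suppression networks fed the same two-microphone audio, followed by attention pooling) can be sketched in a few lines. The sketch assumes a simple dot-product score against a query vector followed by a softmax; the abstract does not say how the attention-pooling network computes its weights, so `attention_pool`, `query`, and the callables `net_a` and `net_b` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Weight each candidate frame by softmax(query . frame) and sum them."""
    scores = [sum(q * v for q, v in zip(query, frame)) for frame in frames]
    weights = softmax(scores)
    length = len(frames[0])
    return [sum(w * frame[i] for w, frame in zip(weights, frames))
            for i in range(length)]


def tandem_suppress(mic1_frame, mic2_frame, net_a, net_b, query):
    """Feed both microphone frames to two networks, then pool their outputs."""
    audio = (mic1_frame, mic2_frame)
    frame_a = net_a(audio)  # stand-in for the first noise-suppression network
    frame_b = net_b(audio)  # stand-in for the second noise-suppression network
    return attention_pool([frame_a, frame_b], query)


if __name__ == "__main__":
    # Toy networks: each simply passes through one microphone's frame.
    out = tandem_suppress([1.0, 3.0], [3.0, 5.0],
                          net_a=lambda audio: list(audio[0]),
                          net_b=lambda audio: list(audio[1]),
                          query=[0.0, 0.0])
    print(out)  # a zero query gives equal weights, so this is the plain average
```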

8.

Invention application
MIXED ADAPTIVE AND FIXED COEFFICIENT NEURAL NETWORKS FOR SPEECH ENHANCEMENT [In force]

Publication (announcement) No.: US20210343306A1

Publication (announcement) date: 2021-11-04

Application No.: US17243434

Application date: 2021-04-28

Applicant: QUALCOMM Incorporated

Inventor: Erik VISSER, Vahid MONTAZERI, Shuhua ZHANG, Lae-Hoon KIM

IPC: G10L21/0208, G10L25/30, G06N3/08, G06N3/04

Abstract: Systems, methods and computer-readable media are provided for speech enhancement using a hybrid neural network. An example process can include receiving, by a first neural network portion of the hybrid neural network, audio data and reference data, the audio data including speech data, noise data, and echo data; filtering, by the first neural network portion, a portion of the audio data based on adapted coefficients of the first neural network portion, the portion of the audio data including the noise data and/or echo data; based on the filtering, generating, by the first neural network portion, filtered audio data including the speech data and an unfiltered portion of the noise data and/or echo data; and based on the filtered audio data and the reference data, extracting, by a second neural network portion of the hybrid neural network, the speech data from the filtered audio data.
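
The two-stage idea (a first portion whose coefficients adapt to remove part of the noise or echo using a reference signal, followed by a second portion with fixed coefficients) can be illustrated loosely. Note the substitutions: this sketch uses a classical normalized LMS (NLMS) filter in place of the adaptive neural network portion and a fixed two-tap smoother in place of the second neural network portion, and it omits the reference input of the second stage. All names and parameter values are hypothetical.

```python
def adaptive_stage(mic, reference, taps=1, mu=0.5, eps=1e-8):
    """NLMS echo canceller: adapts coefficients online, returns the residual."""
    w = [0.0] * taps
    residual = []
    for n in range(len(mic)):
        # Most recent `taps` reference samples, zero-padded at the start.
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        estimate = sum(wi * xi for wi, xi in zip(w, x))
        e = mic[n] - estimate
        norm = sum(xi * xi for xi in x) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        residual.append(e)
    return residual


def fixed_stage(filtered, coeffs=(0.5, 0.5)):
    """Fixed-coefficient stage: coefficients never change at run time."""
    out = []
    for n in range(len(filtered)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * filtered[n - k]
        out.append(acc)
    return out


def hybrid_enhance(mic, reference):
    """Adaptive stage removes most of the echo; fixed stage post-processes."""
    return fixed_stage(adaptive_stage(mic, reference))


if __name__ == "__main__":
    reference = [1.0, -1.0] * 20
    mic = [0.5 * r for r in reference]  # pure echo, no speech
    print(abs(hybrid_enhance(mic, reference)[-1]))  # residual echo after adaptation
```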
