Patent search ap:("QUALCOMM Incorporated") AND inv:"Shuhua ZHANG" Page 1

1.

Invention application
CLOUD-BASED PROCESSING USING LOCAL DEVICE PROVIDED SENSOR DATA AND LABELS
Status: Under examination (published)

Publication (announcement) number: US20170270406A1

Publication (announcement) date: 2017-09-21

Application number: US15273496

Application date: 2016-09-22

Applicant: QUALCOMM Incorporated

Inventor: Erik VISSER, Minho JIN, Lae-Hoon KIM, Raghuveer PERI, Shuhua ZHANG

IPC: G06N3/08, G06N3/04

CPC classification number: G06N3/08, G06N3/04, G06N3/0454

Abstract: A method of training a device specific cloud-based audio processor includes receiving sensor data captured from multiple sensors at a local device. The method also includes receiving spatial information labels computed on the local device using local configuration information. The spatial information labels are associated with the captured sensor data. Lower layers of a first neural network are trained based on the spatial information labels and sensor data. The trained lower layers are incorporated into a second, larger neural network for audio classification. The second, larger neural network may be retrained using the trained lower layers of the first neural network.
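The layer-reuse step in the abstract above (lower layers trained in a first network, then incorporated into a second, larger network) can be sketched as follows. This is a minimal illustration and not the method claimed in the application: the toy dense layers, the layer sizes, and the helper names `make_network` and `transfer_lower_layers` are all hypothetical, and no training is performed.

```python
import random

def make_layer(n_in, n_out, rng):
    """Build a toy dense layer as n_out rows of n_in random weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]

def make_network(layer_sizes, seed):
    """Build a list of layers from consecutive size pairs (hypothetical helper)."""
    rng = random.Random(seed)
    return [make_layer(a, b, rng) for a, b in zip(layer_sizes, layer_sizes[1:])]

def transfer_lower_layers(small_net, large_net, n_shared):
    """Return a copy of large_net whose first n_shared layers come from small_net."""
    for i in range(n_shared):
        src, dst = small_net[i], large_net[i]
        if len(src) != len(dst) or len(src[0]) != len(dst[0]):
            raise ValueError(f"layer {i} shapes differ")
    merged = [[row[:] for row in layer] for layer in large_net]
    for i in range(n_shared):
        merged[i] = [row[:] for row in small_net[i]]
    return merged

# First network: sensor inputs -> spatial-label outputs (sizes are assumptions).
first_net = make_network([4, 8, 8, 3], seed=0)
# Second, larger network: same two lower layers, plus extra upper layers
# standing in for the audio classifier.
second_net = make_network([4, 8, 8, 16, 16, 5], seed=1)
combined = transfer_lower_layers(first_net, second_net, n_shared=2)
```

In a real pipeline the first network would be trained on the sensor data and spatial labels before the copy, and the combined network could then be retrained as the abstract describes.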

2.

Invention publication
TRANSFORM AMBISONIC COEFFICIENTS USING AN ADAPTIVE NETWORK FOR PRESERVING SPATIAL DIRECTION
Status: Under examination (published)

Publication (announcement) number: US20230260525A1

Publication (announcement) date: 2023-08-17

Application number: US18138684

Application date: 2023-04-24

Applicant: QUALCOMM Incorporated

Inventor: Lae-Hoon KIM, Shankar THAGADUR SHIVAPPA, S M Akramus SALEHIN, Shuhua ZHANG, Erik VISSER

IPC: G10L19/038, H04R5/00, G10L19/002

CPC classification number: G10L19/038, H04R5/00, G10L19/002, H04S2420/11, H04R2430/21, G10L19/008

Abstract: A device includes a memory configured to store untransformed ambisonic coefficients at different time segments. The device includes one or more processors configured to obtain the untransformed ambisonic coefficients at the different time segments, where the untransformed ambisonic coefficients at the different time segments represent a soundfield at the different time segments. The one or more processors are configured to apply one adaptive network, based on a constraint that includes preservation of a spatial direction of one or more audio sources in the soundfield at the different time segments, to the untransformed ambisonic coefficients at the different time segments to generate transformed ambisonic coefficients at the different time segments, wherein the transformed ambisonic coefficients at the different time segments represent a modified soundfield at the different time segments, that was modified based on the constraint. The one or more processors are also configured to apply an additional adaptive network.
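One way to picture the direction-preservation constraint in the abstract above is to estimate a source direction from the coefficients before and after a transform and compare the two. The sketch below is an assumption-laden illustration, not the application's network or loss: it uses first-order coefficients only, an SN3D-style encoding with W equal to the source signal, a single source, and a direction estimate taken from the time-averaged product of W with X, Y, and Z. All function names are hypothetical.

```python
import math

def encode_foa(samples, azimuth, elevation):
    """Encode a mono signal as first-order ambisonic frames (W, X, Y, Z)."""
    cx = math.cos(azimuth) * math.cos(elevation)
    cy = math.sin(azimuth) * math.cos(elevation)
    cz = math.sin(elevation)
    return [(s, s * cx, s * cy, s * cz) for s in samples]

def estimate_direction(frames):
    """Estimate (azimuth, elevation) from the summed W*[X, Y, Z] vector."""
    ix = sum(w * x for w, x, _, _ in frames)
    iy = sum(w * y for w, _, y, _ in frames)
    iz = sum(w * z for w, _, _, z in frames)
    return math.atan2(iy, ix), math.atan2(iz, math.hypot(ix, iy))

def direction_error(before, after):
    """Angle in radians between the directions estimated from two soundfields."""
    (a1, e1), (a2, e2) = estimate_direction(before), estimate_direction(after)
    dot = (math.cos(e1) * math.cos(e2) * math.cos(a1 - a2)
           + math.sin(e1) * math.sin(e2))
    return math.acos(max(-1.0, min(1.0, dot)))

source = [math.sin(0.1 * n) for n in range(200)]
untransformed = encode_foa(source, math.radians(40), math.radians(10))

# Stand-in for a network output that only changes level: direction is kept.
scaled = [tuple(0.5 * c for c in frame) for frame in untransformed]
# Stand-in for an output that moved the source: direction is not kept.
moved = encode_foa(source, math.radians(70), math.radians(10))

kept_error = direction_error(untransformed, scaled)
moved_error = direction_error(untransformed, moved)
```

Under these assumptions `kept_error` is close to zero while `moved_error` is roughly half a radian, so a quantity like this could serve as a penalty term during training.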

3.

Invention application
CONTEXT-BASED SPEECH ENHANCEMENT
Status: Granted

Publication (announcement) number: US20220310108A1

Publication (announcement) date: 2022-09-29

Application number: US17209621

Application date: 2021-03-23

Applicant: QUALCOMM Incorporated

Inventor: Kyungguen BYUN, Shuhua ZHANG, Lae-Hoon KIM, Erik VISSER, Sunkuk MOON, Vahid MONTAZERI

IPC: G10L21/038

Abstract: A device to perform speech enhancement includes one or more processors configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and context data to generate output spectral data that represents a speech enhanced version of the input signal.

4.

Invention application
SYNTHESIZED SPEECH GENERATION
Status: Granted

Publication (announcement) number: US20220230623A1

Publication (announcement) date: 2022-07-21

Application number: US17154372

Application date: 2021-01-21

Applicant: QUALCOMM Incorporated

Inventor: Kyungguen BYUN, Sunkuk MOON, Shuhua ZHANG, Vahid MONTAZERI, Lae-Hoon KIM, Erik VISSER

IPC: G10L13/047, G06N3/04, G10L13/033, G10L25/63, G10L19/02

Abstract: A device for speech generation includes one or more processors configured to receive one or more control parameters indicating target speech characteristics. The one or more processors are also configured to process, using a multi-encoder, an input representation of speech based on the one or more control parameters to generate encoded data corresponding to an audio signal that represents a version of the speech based on the target speech characteristics.

5.

Invention application
SPATIAL AUDIO WIND NOISE DETECTION
Status: Granted

Publication (announcement) number: US20220199100A1

Publication (announcement) date: 2022-06-23

Application number: US17128544

Application date: 2020-12-21

Applicant: QUALCOMM Incorporated

Inventor: S M Akramus SALEHIN, Lae-Hoon KIM, Hannes PESSENTHEINER, Shuhua ZHANG, Sanghyun CHI, Erik VISSER, Shankar THAGADUR SHIVAPPA

IPC: G10L21/0232, H04R1/40, H04R3/00, H04S7/00, H04S3/00, G10L25/51, G10L21/0324

Abstract: A device includes one or more processors configured to obtain audio signals representing sound captured by at least three microphones and determine spatial audio data based on the audio signals. The one or more processors are further configured to determine a metric indicative of wind noise in the audio signals. The metric is based on a comparison of a first value and a second value. The first value corresponds to an aggregate signal based on the spatial audio data, and the second value corresponds to a differential signal based on the spatial audio data.
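The comparison of an aggregate signal with a differential signal described in the abstract above can be illustrated with a simplified two-channel sketch. The assumptions here are mine, not the application's: two microphone channels stand in for the spatial audio data (the abstract refers to at least three microphones), the aggregate is a plain sum, the differential is a plain difference, and the metric is their energy ratio. The function name `wind_metric` is hypothetical.

```python
import math
import random

def energy(signal):
    """Sum of squared samples."""
    return sum(v * v for v in signal)

def wind_metric(mic_a, mic_b):
    """Ratio of differential-signal energy to aggregate-signal energy."""
    aggregate = [a + b for a, b in zip(mic_a, mic_b)]
    differential = [a - b for a, b in zip(mic_a, mic_b)]
    # Small constant avoids division by zero on silent input.
    return energy(differential) / (energy(aggregate) + 1e-12)

# Acoustic source: a 100 Hz tone at 16 kHz reaching the second mic one sample later.
tone = [math.sin(2 * math.pi * 100 * n / 16000) for n in range(4001)]
acoustic_metric = wind_metric(tone[1:], tone[:-1])

# Wind-like noise: independent random signals at each mic (a modelling assumption).
rng = random.Random(0)
wind_a = [rng.gauss(0.0, 1.0) for _ in range(4000)]
wind_b = [rng.gauss(0.0, 1.0) for _ in range(4000)]
turbulent_metric = wind_metric(wind_a, wind_b)
```

A low-frequency acoustic source is nearly identical at closely spaced microphones, so the difference is small and the ratio stays near zero. Uncorrelated noise at each microphone gives sum and difference similar energy, so the ratio moves toward one; a threshold between the two regimes would flag wind.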

6.

Invention publication
CONTEXT-BASED SPEECH ENHANCEMENT
Status: Under examination (published)

Publication (announcement) number: US20230326477A1

Publication (announcement) date: 2023-10-12

Application number: US18334641

Application date: 2023-06-14

Applicant: QUALCOMM Incorporated

Inventor: Kyungguen BYUN, Shuhua ZHANG, Lae-Hoon KIM, Erik VISSER, Sunkuk MOON, Vahid MONTAZERI

IPC: G10L21/0232, G10L21/038, G10L21/02

CPC classification number: G10L21/0232, G10L21/038, G10L21/02

Abstract: A device to perform speech enhancement includes one or more processors configured to process image data to detect at least one of an emotion, a speaker characteristic, or a noise type. The one or more processors are also configured to generate context data based at least in part on the at least one of the emotion, the speaker characteristic, or the noise type. The one or more processors are further configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and the context data to generate output spectral data that represents a speech enhanced version of the input signal.

7.

Invention application
MIXED ADAPTIVE AND FIXED COEFFICIENT NEURAL NETWORKS FOR SPEECH ENHANCEMENT
Status: Granted

Publication (announcement) number: US20210343306A1

Publication (announcement) date: 2021-11-04

Application number: US17243434

Application date: 2021-04-28

Applicant: QUALCOMM Incorporated

Inventor: Erik VISSER, Vahid MONTAZERI, Shuhua ZHANG, Lae-Hoon KIM

IPC: G10L21/0208, G10L25/30, G06N3/08, G06N3/04

Abstract: Systems, methods and computer-readable media are provided for speech enhancement using a hybrid neural network. An example process can include receiving, by a first neural network portion of the hybrid neural network, audio data and reference data, the audio data including speech data, noise data, and echo data; filtering, by the first neural network portion, a portion of the audio data based on adapted coefficients of the first neural network portion, the portion of the audio data including the noise data and/or echo data; based on the filtering, generating, by the first neural network portion, filtered audio data including the speech data and an unfiltered portion of the noise data and/or echo data; and based on the filtered audio data and the reference data, extracting, by a second neural network portion of the hybrid neural network, the speech data from the filtered audio data.

8.

Invention application
TRANSFORM AMBISONIC COEFFICIENTS USING AN ADAPTIVE NETWORK
Status: Granted

Publication (announcement) number: US20210304777A1

Publication (announcement) date: 2021-09-30

Application number: US17210357

Application date: 2021-03-23

Applicant: QUALCOMM Incorporated

Inventor: Lae-Hoon KIM, Shankar THAGADUR SHIVAPPA, S M Akramus SALEHIN, Shuhua ZHANG, Erik VISSER

IPC: G10L19/038, G10L19/002, H04R5/00

Abstract: A device includes a memory configured to store untransformed ambisonic coefficients at different time segments. The device also includes one or more processors configured to obtain the untransformed ambisonic coefficients at the different time segments, where the untransformed ambisonic coefficients at the different time segments represent a soundfield at the different time segments. The one or more processors are also configured to apply one adaptive network, based on a constraint, to the untransformed ambisonic coefficients at the different time segments to generate transformed ambisonic coefficients at the different time segments, wherein the transformed ambisonic coefficients at the different time segments represent a modified soundfield at the different time segments, that was modified based on the constraint.
