-
Publication No.: US20200228915A1
Publication Date: 2020-07-16
Application No.: US16244875
Filing Date: 2019-01-10
Applicant: QUALCOMM Incorporated
Inventor: Dongmei WANG , Lae-Hoon KIM , Erik VISSER
Abstract: Methods, systems, computer-readable media, and apparatuses for HRTF profile selection are presented. In one example, a device prompts a user to follow a simple procedure to obtain measurements that are matched to a suitable high-resolution HRTF profile.
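One way such measurements could be matched to a stored profile is a nearest-neighbor lookup over a profile database. The sketch below assumes a toy database of anthropometric feature vectors; the feature set, names, and values are illustrative, not taken from the patent.

```python
import math

# Hypothetical HRTF profile database: profile id -> anthropometric
# feature vector (e.g., head width, pinna height, in cm).
PROFILES = {
    "profile_a": (14.5, 6.0),
    "profile_b": (15.8, 6.7),
    "profile_c": (13.9, 5.4),
}

def select_hrtf_profile(measurements, profiles=PROFILES):
    """Return the id of the profile whose features are nearest
    (Euclidean distance) to the user's measurements."""
    return min(
        profiles,
        key=lambda pid: math.dist(measurements, profiles[pid]),
    )
```

In practice the feature vector would come from the guided measurement procedure, and the matched id would select a high-resolution HRTF filter set.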
-
Publication No.: US20170308164A1
Publication Date: 2017-10-26
Application No.: US15645365
Filing Date: 2017-07-10
Applicant: QUALCOMM Incorporated
Inventor: Lae-Hoon KIM , Jongwon Shin , Erik Visser
IPC: G06F3/01 , H04N7/15 , H04M3/56 , G10L25/48 , G04G21/00 , G10L21/0216 , G06F1/16 , G01S3/808 , H04R1/40 , H04R3/00 , H04R29/00 , H04S7/00 , G10L17/00
CPC classification number: G06F3/013 , G01S3/8083 , G04G21/00 , G06F1/1613 , G06F3/011 , G10L17/00 , G10L25/48 , G10L2021/02166 , H04M3/568 , H04N7/15 , H04R1/406 , H04R3/005 , H04R29/005 , H04S7/304
Abstract: Disclosed is an application interface that takes into account the user's gaze direction relative to whoever is speaking in an interactive multi-participant environment in which audio-based contextual information and/or visual-based semantic information is presented. In these implementations, two different types of microphone array devices (MADs) may be used. The first type is a steerable microphone array (a.k.a. a steerable array), which is worn by a user in a known orientation with regard to the user's eyes; multiple users may each wear a steerable array. The second type is a fixed-location microphone array (a.k.a. a fixed array), which is placed in the same acoustic space as the users, one or more of whom are using steerable arrays.
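At its core, the gaze test reduces to comparing the user's gaze vector (known from the worn array's orientation) with the direction toward the active talker. A minimal 2-D sketch, with an assumed 15° tolerance cone:

```python
import math

def angle_between(gaze, target):
    """Angle in degrees between the gaze direction and the direction
    from the user to a sound source (2-D vectors, any magnitude)."""
    dot = gaze[0] * target[0] + gaze[1] * target[1]
    mag = math.hypot(*gaze) * math.hypot(*target)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / mag))))

def is_looking_at(gaze, speaker_dir, threshold_deg=15.0):
    """True if the speaker falls within a tolerance cone around the
    user's gaze; the threshold value is an assumption."""
    return angle_between(gaze, speaker_dir) <= threshold_deg
```

A real system would derive `speaker_dir` from direction-of-arrival estimates on the fixed array, which the sketch does not model.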
-
Publication No.: US20170270406A1
Publication Date: 2017-09-21
Application No.: US15273496
Filing Date: 2016-09-22
Applicant: QUALCOMM Incorporated
Inventor: Erik VISSER , Minho JIN , Lae-Hoon KIM , Raghuveer PERI , Shuhua ZHANG
CPC classification number: G06N3/08 , G06N3/04 , G06N3/0454
Abstract: A method of training a device specific cloud-based audio processor includes receiving sensor data captured from multiple sensors at a local device. The method also includes receiving spatial information labels computed on the local device using local configuration information. The spatial information labels are associated with the captured sensor data. Lower layers of a first neural network are trained based on the spatial information labels and sensor data. The trained lower layers are incorporated into a second, larger neural network for audio classification. The second, larger neural network may be retrained using the trained lower layers of the first neural network.
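The transfer step, reusing lower layers trained on device-specific spatial labels inside a larger classification network, can be sketched with layers represented as plain weight matrices (sizes and seeds below are arbitrary):

```python
import random

def make_layers(sizes, seed=0):
    """Build a dense-layer stack as (n_in x n_out) weight matrices;
    random values stand in for trained weights."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_in)]
        for n_in, n_out in zip(sizes, sizes[1:])
    ]

def transplant_lower_layers(small_net, large_net, n_lower):
    """Incorporate the first n_lower layers of the (trained) small
    network into the larger network; the copied layers' shapes must
    match the large network's lower layers."""
    return small_net[:n_lower] + large_net[n_lower:]
```

The larger network would then be retrained with the transplanted layers as initialization, which the sketch leaves out.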
-
Publication No.: US20240221752A1
Publication Date: 2024-07-04
Application No.: US18558991
Filing Date: 2022-05-05
Applicant: QUALCOMM Incorporated
Inventor: Jason FILOS , Xiaoxin ZHANG , Lae-Hoon KIM , Erik VISSER
CPC classification number: G10L15/24 , G06F3/017 , G10L15/08 , G10L15/22 , G10L2015/088 , G10L2015/223
Abstract: In an aspect, a user equipment receives, via a microphone, an utterance from a user and determines, using radio frequency sensing, that the user performed a gesture while making the utterance. The user equipment determines an object associated with the gesture and transmits an enhanced directive to an application programming interface (API) of a smart assistant device. The enhanced directive is determined based on the object, the gesture, and the utterance. The enhanced directive causes the smart assistant device to perform an action.
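A toy version of fusing the utterance with the sensed gesture: deictic words ("this", "that", "it") are replaced by the object the gesture indicates before the directive is packaged. The field names below are illustrative, not the patent's API schema.

```python
def build_enhanced_directive(utterance, gesture, pointed_object):
    """Resolve deictic references in the utterance using the object
    associated with the RF-sensed gesture, then package an enhanced
    directive for a smart assistant device's API."""
    words = [
        pointed_object if w.lower() in ("this", "that", "it") else w
        for w in utterance.split()
    ]
    return {
        "action": " ".join(words),
        "gesture": gesture,
        "target": pointed_object,
    }
```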
-
Publication No.: US20230326477A1
Publication Date: 2023-10-12
Application No.: US18334641
Filing Date: 2023-06-14
Applicant: QUALCOMM Incorporated
Inventor: Kyungguen BYUN , Shuhua ZHANG , Lae-Hoon KIM , Erik VISSER , Sunkuk MOON , Vahid MONTAZERI
IPC: G10L21/0232 , G10L21/038 , G10L21/02
CPC classification number: G10L21/0232 , G10L21/038 , G10L21/02
Abstract: A device to perform speech enhancement includes one or more processors configured to process image data to detect at least one of an emotion, a speaker characteristic, or a noise type. The one or more processors are also configured to generate context data based at least in part on the at least one of the emotion, the speaker characteristic, or the noise type. The one or more processors are further configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and the context data to generate output spectral data that represents a speech enhanced version of the input signal.
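Reduced to its simplest possible form, conditioning enhancement on context might look like choosing per-band suppression gains from the detected noise type; the gain tables below are invented for illustration and say nothing about the patent's multi-encoder transformer.

```python
# Assumed per-noise-type gains for (low, mid, high) spectral bands.
CONTEXT_GAINS = {
    "babble": (1.0, 0.5, 0.75),
    "traffic": (0.5, 0.75, 1.0),
}

def apply_context_gains(spectral_bands, noise_type):
    """Scale each band by a gain selected from the detected noise
    type: a crude stand-in for context-conditioned enhancement."""
    gains = CONTEXT_GAINS.get(noise_type, (1.0, 1.0, 1.0))
    return [b * g for b, g in zip(spectral_bands, gains)]
```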
-
Publication No.: US20230060774A1
Publication Date: 2023-03-02
Application No.: US17446498
Filing Date: 2021-08-31
Applicant: QUALCOMM Incorporated
Inventor: S M Akramus SALEHIN , Lae-Hoon KIM , Xiaoxin ZHANG , Erik VISSER
Abstract: A device includes one or more processors configured to determine, based on data descriptive of two or more audio environments, a geometry of a mutual audio environment. The one or more processors are also configured to process audio data, based on the geometry of the mutual audio environment, for output at an audio device disposed in a first audio environment of the two or more audio environments.
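One simple rule for deriving a mutual geometry from several environments is the per-axis minimum of their dimensions, so the shared virtual room fits inside every real one. This rule is an assumption for illustration; the abstract does not specify how the geometry is determined.

```python
def mutual_geometry(rooms):
    """Given (width, depth, height) tuples describing two or more
    audio environments, return the per-axis minimum as the geometry
    of the mutual audio environment."""
    return tuple(min(dims) for dims in zip(*rooms))
```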
-
Publication No.: US20230026735A1
Publication Date: 2023-01-26
Application No.: US17382166
Filing Date: 2021-07-21
Applicant: QUALCOMM Incorporated
Inventor: Vahid MONTAZERI , Van NGUYEN , Hannes PESSENTHEINER , Lae-Hoon KIM , Erik VISSER , Rogerio Guedes ALVES
Abstract: A device includes a memory configured to store instructions and one or more processors configured to execute the instructions. The one or more processors are configured to execute the instructions to receive audio data including a first audio frame corresponding to a first output of a first microphone and a second audio frame corresponding to a second output of a second microphone. The one or more processors are also configured to execute the instructions to provide the audio data to a first noise-suppression network and a second noise-suppression network. The first noise-suppression network is configured to generate a first noise-suppressed audio frame and the second noise-suppression network is configured to generate a second noise-suppressed audio frame. The one or more processors are further configured to execute the instructions to provide the noise-suppressed audio frames to an attention-pooling network. The attention-pooling network is configured to generate an output noise-suppressed audio frame.
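Attention pooling over the per-network suppressed frames can be illustrated with softmax weights over per-frame scores; how the scores themselves would be produced (e.g., by learned attention layers) is not shown.

```python
import math

def attention_pool(frames, scores):
    """Combine noise-suppressed frames from multiple networks into a
    single output frame using softmax attention weights."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    n = len(frames[0])
    return [sum(w * f[i] for w, f in zip(weights, frames))
            for i in range(n)]
```

With equal scores the pooling degenerates to a plain average; a large score gap lets one network's frame dominate.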
-
Publication No.: US20220180859A1
Publication Date: 2022-06-09
Application No.: US17115158
Filing Date: 2020-12-08
Applicant: QUALCOMM Incorporated
Inventor: Soo Jin PARK , Sunkuk MOON , Lae-Hoon KIM , Erik VISSER
IPC: G10L15/07 , G10L15/16 , G10L15/04 , G06F1/3231
Abstract: A device includes processors configured to determine, in a first power mode, whether an audio stream corresponds to speech of at least two talkers. The processors are configured to, based on determining that the audio stream corresponds to speech of at least two talkers, analyze, in a second power mode, audio feature data of the audio stream to generate a segmentation result. The processors are configured to perform a comparison of a plurality of user speech profiles to an audio feature data set, of a plurality of audio feature data sets of a talker-homogeneous audio segment, to determine whether the audio feature data set matches any of the user speech profiles. The processors are configured to, based on determining that the audio feature data set does not match any of the plurality of user speech profiles, generate a user speech profile based on the plurality of audio feature data sets.
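The profile-matching step amounts to comparing a segment's feature summary against enrolled profiles and enrolling a new profile on a miss. A sketch using cosine similarity, with an assumed similarity threshold and naming scheme:

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def match_or_enroll(segment_embedding, profiles, threshold=0.8):
    """Return the id of the best-matching user speech profile, or
    enroll the segment as a new profile when no similarity reaches
    the threshold. `profiles` maps id -> feature vector."""
    best = max(
        profiles,
        key=lambda pid: cosine(segment_embedding, profiles[pid]),
        default=None,
    )
    if best is not None and cosine(segment_embedding, profiles[best]) >= threshold:
        return best
    new_id = f"user_{len(profiles) + 1}"
    profiles[new_id] = list(segment_embedding)
    return new_id
```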
-
Publication No.: US20210343306A1
Publication Date: 2021-11-04
Application No.: US17243434
Filing Date: 2021-04-28
Applicant: QUALCOMM Incorporated
Inventor: Erik VISSER , Vahid MONTAZERI , Shuhua ZHANG , Lae-Hoon KIM
IPC: G10L21/0208 , G10L25/30 , G06N3/08 , G06N3/04
Abstract: Systems, methods and computer-readable media are provided for speech enhancement using a hybrid neural network. An example process can include receiving, by a first neural network portion of the hybrid neural network, audio data and reference data, the audio data including speech data, noise data, and echo data; filtering, by the first neural network portion, a portion of the audio data based on adapted coefficients of the first neural network portion, the portion of the audio data including the noise data and/or echo data; based on the filtering, generating, by the first neural network portion, filtered audio data including the speech data and an unfiltered portion of the noise data and/or echo data; and based on the filtered audio data and the reference data, extracting, by a second neural network portion of the hybrid neural network, the speech data from the filtered audio data.
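The adaptive first stage can be illustrated with a single normalized-LMS update, which filters the reference (e.g., far-end) signal and subtracts the resulting echo/noise estimate from the input; the second, neural extraction stage is not sketched. Step size and filter length below are arbitrary.

```python
def nlms_step(weights, reference, desired, mu=0.5, eps=1e-8):
    """One NLMS update: estimate the echo from the reference taps,
    subtract it from the microphone sample, and adapt the filter
    weights in proportion to the normalized error."""
    est = sum(w * x for w, x in zip(weights, reference))
    err = desired - est                       # residual after filtering
    norm = sum(x * x for x in reference) + eps
    new_w = [w + mu * err * x / norm for w, x in zip(weights, reference)]
    return err, new_w
```

With a stationary reference the error shrinks rapidly across updates, which is the behavior the trained "adapted coefficients" of the first network portion would exhibit.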
-
Publication No.: US20210304777A1
Publication Date: 2021-09-30
Application No.: US17210357
Filing Date: 2021-03-23
Applicant: QUALCOMM Incorporated
Inventor: Lae-Hoon KIM , Shankar THAGADUR SHIVAPPA , S M Akramus SALEHIN , Shuhua ZHANG , Erik VISSER
IPC: G10L19/038 , G10L19/002 , H04R5/00
Abstract: A device includes a memory configured to store untransformed ambisonic coefficients at different time segments. The device also includes one or more processors configured to obtain the untransformed ambisonic coefficients at the different time segments, where the untransformed ambisonic coefficients at the different time segments represent a soundfield at the different time segments. The one or more processors are also configured to apply an adaptive network, based on a constraint, to the untransformed ambisonic coefficients at the different time segments to generate transformed ambisonic coefficients at the different time segments, wherein the transformed ambisonic coefficients represent a modified soundfield at the different time segments, modified based on the constraint.
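A concrete instance of transforming ambisonic coefficients is a yaw rotation of a first-order frame: W and Z are invariant, and (X, Y) rotate as a 2-D vector. This is a standard first-order ambisonics identity used here for illustration, not the patent's adaptive network.

```python
import math

def rotate_foa_yaw(w, x, y, z, yaw_rad):
    """Rotate a first-order ambisonic frame (W, X, Y, Z channels)
    about the vertical axis by yaw_rad, producing the 'transformed'
    coefficients of a soundfield rotated in azimuth."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return w, x * c - y * s, y * c + x * s, z
```

Rotating a front-located source (X = 1, Y = 0) by 90° moves its energy into the Y channel, as expected for a source now at the side.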
-