Patent search ap:("QUALCOMM Incorporated") AND inv:"Sunkuk Moon" Page 2

11.

Granted invention patent
Speaker template update with embedding vectors based on distance metric [Active]

Publication (announcement) No.: US11017783B2

Publication (announcement) date: 2021-05-25

Application No.: US16296733

Application date: 2019-03-08

Applicant: QUALCOMM Incorporated

Inventor: Sunkuk Moon , Bicheng Jiang , Erik Visser

IPC: G10L17/04 , G10L17/08 , G10L17/18 , G10L17/22 , G10L17/06 , G10L17/02 , G10L17/00

Abstract: A device includes a processor configured to determine a feature vector based on an utterance and to determine a first embedding vector by processing the feature vector using a trained embedding network. The processor is configured to determine a first distance metric based on distances between the first embedding vector and each embedding vector of a speaker template. The processor is configured to determine, based on the first distance metric, that the utterance is verified to be from a particular user. The processor is configured to, based on a comparison of a first particular distance metric associated with the first embedding vector to a second distance metric associated with a first test embedding vector of the speaker template, generate an updated speaker template by adding the first embedding vector as a second test embedding vector and removing the first test embedding vector from test embedding vectors of the speaker template.
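
Illustrative note (not part of the patent record): the update flow described in the abstract above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented method. The abstract does not name the distance function, the form of the distance metric, or the replacement rule, so the cosine distance, the mean-distance metric, the `threshold` value, and the names `cosine_distance`, `mean_distance`, and `verify_and_update` are all hypothetical choices made for illustration. The trained embedding network is omitted; embeddings are plain lists of floats.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two vectors (an assumed distance function)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def mean_distance(vec, others):
    """Assumed distance metric: mean distance from vec to each vector in others."""
    return sum(cosine_distance(vec, o) for o in others) / len(others)


def verify_and_update(template, embedding, threshold=0.3):
    """Return (verified, template), replacing one stored vector when warranted.

    template: list of at least two embedding vectors (the speaker template).
    threshold: hypothetical verification cutoff, not taken from the source.
    """
    metric = mean_distance(embedding, template)
    if metric >= threshold:
        # Not verified: leave the template untouched.
        return False, template

    # Score each stored vector by its mean distance to the other stored vectors.
    scores = [
        mean_distance(v, template[:i] + template[i + 1:])
        for i, v in enumerate(template)
    ]
    worst = max(range(len(template)), key=lambda i: scores[i])

    # Assumed rule: swap in the new embedding only if it fits the template
    # better than the worst-fitting stored vector does.
    if metric < scores[worst]:
        updated = template[:worst] + template[worst + 1:] + [embedding]
        return True, updated
    return True, template


template = [[1.0, 0.0], [0.9, 0.1], [0.5, 0.5]]
verified, updated = verify_and_update(template, [1.0, 0.05])
rejected, unchanged = verify_and_update(template, [0.0, 1.0])
```

In this toy run the first utterance is accepted and displaces the stored vector `[0.5, 0.5]`, which is the poorest fit to the rest of the template, while the second is rejected and the template is returned unchanged.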

12.

Invention application
AUDIO ANALYTICS FOR NATURAL LANGUAGE PROCESSING [Under examination, published]

Publication (announcement) No.: US20190341026A1

Publication (announcement) date: 2019-11-07

Application No.: US15972011

Application date: 2018-05-04

Applicant: QUALCOMM Incorporated

Inventor: Erik Visser , Fatemeh Saki , Yinyi Guo , Sunkuk Moon , Lae-Hoon Kim , Ravi Choudhary

IPC: G10L15/18 , G10L25/63 , G06F3/16

Abstract: A device includes a memory configured to store category labels associated with categories of a natural language processing library. A processor is configured to analyze input audio data to generate a text string and to perform natural language processing on at least the text string to generate an output text string including an action associated with a first device, a speaker, a location, or a combination thereof. The processor is configured to compare the input audio data to audio data of the categories to determine whether the input audio data matches any of the categories and, in response to determining that the input audio data does not match any of the categories: create a new category label, associate the new category label with at least a portion of the output text string, update the categories with the new category label, and generate a notification indicating the new category label.

13.

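Invention application
ENHANCED SPEECH GENERATION [Under examination, published]

Publication (announcement) No.: US20180233127A1

Publication (announcement) date: 2018-08-16

Application No.: US15430791

Application date: 2017-02-13

Applicant: QUALCOMM Incorporated

Inventor: ERIK VISSER , Shuhua Zhang , Lae-Hoon Kim , Yinyi Guo , Sunkuk Moon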

IPC: G10L13/027 , G10L13/047 , G10L25/78 , G10L25/21 , G10L25/63 , G10L25/90 , G10L15/26

CPC classification number: G10L15/26 , G10L13/047 , G10L21/00 , G10L21/003 , G10L25/48

Abstract: In a particular aspect, an apparatus includes an audio sensor configured to receive an input audio signal. The apparatus also includes speech generative circuitry configured to generate a synthesized audio signal based at least partly on automatic speech recognition (ASR) data associated with the input audio signal and based on one or more parameters indicative of state information associated with the input audio signal.

14.

Granted invention patent
Audio processing using sound source representations [Active]

Publication (announcement) No.: US11869478B2

Publication (announcement) date: 2024-01-09

Application No.: US17655511

Application date: 2022-03-18

Applicant: QUALCOMM Incorporated

Inventor: Siddhartha Goutham Swaminathan , Sunkuk Moon , Shuhua Zhang , Erik Visser

IPC: G10K11/16 , G10K11/178 , H04R1/10 , H04R3/00

CPC classification number: G10K11/17881 , G10K11/17827 , H04R1/1083 , H04R3/005

Abstract: A device includes one or more processors configured to receive an input audio signal. The one or more processors are also configured to process the input audio signal based on a combined representation of multiple sound sources to generate an output audio signal. The combined representation is used to selectively retain or remove sounds of the multiple sound sources from the input audio signal. The one or more processors are further configured to provide the output audio signal to a second device.

15.

Invention publication
AUDIO PROCESSING USING SOUND SOURCE REPRESENTATIONS [Under examination, published]

Publication (announcement) No.: US20230298561A1

Publication (announcement) date: 2023-09-21

Application No.: US17655511

Application date: 2022-03-18

Applicant: Qualcomm Incorporated

Inventor: Siddhartha Goutham SWAMINATHAN , Sunkuk Moon , Shuhua Zhang , Erik Visser

IPC: G10K11/178 , H04R3/00 , H04R1/10

CPC classification number: G10K11/17881 , H04R3/005 , H04R1/1083 , G10K11/17827

Abstract: A device includes one or more processors configured to receive an input audio signal. The one or more processors are also configured to process the input audio signal based on a combined representation of multiple sound sources to generate an output audio signal. The combined representation is used to selectively retain or remove sounds of the multiple sound sources from the input audio signal. The one or more processors are further configured to provide the output audio signal to a second device.

16.

Granted invention patent
User speech profile management [Active]

Publication (announcement) No.: US11626104B2

Publication (announcement) date: 2023-04-11

Application No.: US17115158

Application date: 2020-12-08

Applicant: QUALCOMM Incorporated

Inventor: Soo Jin Park , Sunkuk Moon , Lae-Hoon Kim , Erik Visser

IPC: G10L17/00 , G10L15/07 , G06F1/3231 , G10L15/04 , G10L15/16

Abstract: A device includes processors configured to determine, in a first power mode, whether an audio stream corresponds to speech of at least two talkers. The processors are configured to, based on determining that the audio stream corresponds to speech of at least two talkers, analyze, in a second power mode, audio feature data of the audio stream to generate a segmentation result. The processors are configured to perform a comparison of a plurality of user speech profiles to an audio feature data set of a plurality of audio feature data sets of a talker-homogenous audio segment to determine whether the audio feature data set matches any of the user speech profiles. The processors are configured to, based on determining that the audio feature data set does not match any of the plurality of user speech profiles, generate a user speech profile based on the plurality of audio feature data sets.

17.

Granted invention patent
Multi-modal user interface [Active]

Publication (announcement) No.: US11348581B2

Publication (announcement) date: 2022-05-31

Application No.: US16685946

Application date: 2019-11-15

Applicant: QUALCOMM Incorporated

Inventor: Ravi Choudhary , Lae-Hoon Kim , Sunkuk Moon , Yinyi Guo , Fatemeh Saki , Erik Visser

IPC: G10L15/22 , G06F3/16 , G06F3/01 , G10L15/26 , G06F3/038 , G06F3/04883 , G06F3/0484 , G06F9/451 , G10L15/20

Abstract: A device for multi-modal user input includes a processor configured to process first data received from a first input device. The first data indicates a first input from a user based on a first input mode. The first input corresponds to a command. The processor is configured to send a feedback message to an output device based on processing the first data. The feedback message instructs the user to provide, based on a second input mode that is different from the first input mode, a second input that identifies a command associated with the first input. The processor is configured to receive second data from a second input device, the second data indicating the second input, and to update a mapping to associate the first input to the command identified by the second input.

18.

Granted invention patent
Shared speech processing network for multiple speech applications [Active]

Publication (announcement) No.: US11276415B2

Publication (announcement) date: 2022-03-15

Application No.: US16844836

Application date: 2020-04-09

Applicant: QUALCOMM Incorporated

Inventor: Lae-Hoon Kim , Sunkuk Moon , Erik Visser , Prajakt Kulkarni

IPC: G10L21/02 , H04R5/04 , H04R3/00 , G06N20/00 , H04L29/06 , G06K9/62 , H04L65/60 , H04L65/80

Abstract: A device to process speech includes a speech processing network that includes an input configured to receive audio data corresponding to audio captured by one or more microphones. The speech processing network also includes one or more network layers configured to process the audio data to generate an output representation of the audio data. The speech processing network includes an output configured to be coupled to multiple speech application modules to enable the output representation to be provided as a common input to each of the multiple speech application modules.

19.

Invention application
MULTI-MODAL USER INTERFACE [Active]

Publication (announcement) No.: US20210012770A1

Publication (announcement) date: 2021-01-14

Application No.: US16685946

Application date: 2019-11-15

Applicant: QUALCOMM Incorporated

Inventor: Ravi Choudhary , Lae-Hoon Kim , Sunkuk Moon , Yinyi Guo , Fatemeh Saki , Erik Visser

IPC: G10L15/22 , G06F3/16 , G06F3/01 , G10L15/26 , G06F3/0484 , G06F3/038 , G06F3/0488

Abstract: A device for multi-modal user input includes a processor configured to process first data received from a first input device. The first data indicates a first input from a user based on a first input mode. The first input corresponds to a command. The processor is configured to send a feedback message to an output device based on processing the first data. The feedback message instructs the user to provide, based on a second input mode that is different from the first input mode, a second input that identifies a command associated with the first input. The processor is configured to receive second data from a second input device, the second data indicating the second input, and to update a mapping to associate the first input to the command identified by the second input.

20.

Granted invention patent
Enhanced speech generation [Active]

Publication (announcement) No.: US10783890B2

Publication (announcement) date: 2020-09-22

Application No.: US16396311

Application date: 2019-04-26

Applicant: QUALCOMM Incorporated

Inventor: Erik Visser , Shuhua Zhang , Lae-Hoon Kim , Yinyi Guo , Sunkuk Moon

IPC: G10L15/26 , G10L13/047 , G10L25/48 , G10L21/00 , G10L21/003

Abstract: In a particular aspect, a speech generator includes a signal input configured to receive a first audio signal. The speech generator also includes at least one speech signal processor configured to generate a second audio signal based on information associated with the first audio signal and based further on automatic speech recognition (ASR) data associated with the first audio signal.
