Patent search ap:("Samsung Electronics Co. Page Ltd.") AND inv:"Taeyeon Ki"

1.

发明公开
EFFICIENT ADAPTATION OF SPOKEN LANGUAGE UNDERSTANDING BASED ON AUTOMATIC SPEECH RECOGNITION USING MULTI-TASK LEARNING 审中-公开

公开(公告)号：US20240304179A1

公开(公告)日：2024-09-12

申请号：US18596406

申请日：2024-03-05

Applicant: Samsung Electronics Co., Ltd.

Inventor： Euisung Kim , Aditya Jajodia , Cindy Sushen Tseng , Divya Neelagiri , Taeyeon Ki , Vijendra Raj Apsingekar

IPC: G10L15/06 , G10L25/30

CPC classification number: G10L15/063 , G10L25/30

Abstract: A method includes receiving, by an automatic speech recognition (ASR)-based spoken language understanding (SLU) model, an input utterance using an audio input device. The method also includes, for each token of the input utterance, generating, using a shared ASR encoder of the ASR-based SLU model, an acoustic representation of acoustic features of the token (the shared ASR encoder including a first adapter layer); determining, using an ASR decoder of the ASR-based SLU model, a text representation of the token using the acoustic representation and any previous tokens (the ASR decoder including a second adapter layer); combining, using a fusion model of the ASR-based SLU model, the text representation and the acoustic representation to generate a joint representation, and determining, using an SLU decoder of the ASR-based SLU model, a semantic label associated with the token based on the joint representation and any previous semantic labels.

2.

发明申请
JOINT END-TO-END SPOKEN LANGUAGE UNDERSTANDING AND AUTOMATIC SPEECH RECOGNITION 有权

公开(公告)号：US20250078824A1

公开(公告)日：2025-03-06

申请号：US18814275

申请日：2024-08-23

Applicant: Samsung Electronics Co., Ltd.

Inventor： Euisung Kim , Yun Tang , Taeyeon Ki , Divya Neelagiri , Vijendra Raj Apsingekar

IPC: G10L15/183 , G10L15/06

Abstract: A method includes receiving an utterance from an audio input device. The method also includes determining a context associated with the utterance. The method also includes providing the utterance as an input to a joint model for automatic speech recognition (ASR) and spoken language understanding (SLU), wherein the joint model operates in a single mode to perform both ASR and SLU or a dual mode to perform one of ASR or SLU depending on the context. The method also includes using an output of the joint model to perform an action requested in the utterance. The joint model is trained by training a shared encoder and a shared decoder using a text-to-text task and, after training the shared encoder and the shared decoder, training a speech encoder and the shared encoder using a speech self-supervised learning (SSL) learning task and a text-to-text task with a masked prediction loss.

3.

发明公开
CONTEXT-AWARE FALSE TRIGGER MITIGATION FOR AUTOMATIC SPEECH RECOGNITION (ASR) SYSTEMS OR OTHER SYSTEMS 审中-公开

公开(公告)号：US20240054999A1

公开(公告)日：2024-02-15

申请号：US18297509

申请日：2023-04-07

Applicant: Samsung Electronics Co., Ltd.

Inventor： Cindy Sushen Tseng , Srinivasa Rao Ponakala , Myungjong Kim , Taeyeon Ki , Vijendra Raj Apsingekar

IPC: G10L15/22 , G10L15/08 , G10L25/51

CPC classification number: G10L15/22 , G10L15/08 , G10L25/51 , G10L2015/223 , G10L2015/088

Abstract: A method includes obtaining an audio input and a location associated with an electronic device. The method also includes generating an audio embedding associated with the audio input. The method further includes determining a first difference between the audio embedding associated with the audio input and an audio embedding associated with a known user. The method also includes determining a second difference between the location associated with the electronic device and a known location associated with the known user. The method further includes generating, using a false trigger mitigation (FTM) system, a probability of the audio input including a false trigger for automatic speech recognition based on the audio input, the first difference, and the second difference. In addition, the method includes determining whether to perform automatic speech recognition based on the probability.

4.

发明公开
ONLINE SPEAKER DIARIZATION USING LOCAL AND GLOBAL CLUSTERING 审中-公开

公开(公告)号：US20230419979A1

公开(公告)日：2023-12-28

申请号：US18046041

申请日：2022-10-12

Applicant: Samsung Electronics Co., Ltd.

Inventor： Myungjong Kim , Taeyeon Ki , Vijendra Raj Apsingekar , Sungjae Park , SeungBeom Ryu , Hyuk Oh

IPC: G10L21/028 , G10L17/06 , G10L17/02

CPC classification number: G10L21/028 , G10L17/06 , G10L17/02

Abstract: A method includes obtaining at least a portion of an audio stream containing speech activity. At least the portion of the audio stream includes multiple segments. The method also includes, for each of the multiple segments, generating an embedding vector that represents the segment. The method further includes, within each of multiple local windows, clustering the embedding vectors into one or more clusters to perform speaker identification. Different clusters correspond to different speakers. The method also includes presenting at least one first sequence of speaker identities based on the speaker identification performed for the local windows. The method further includes, within each of multiple global windows, clustering the embedding vectors into one or more clusters to perform speaker identification. Each global window includes two or more local windows. In addition, the method includes presenting at least one second sequence of speaker identities based on the speaker identification performed for the global windows.

5.

发明授权
Method and apparatus for performing speaker diarization on mixed-bandwidth speech signals 有权

公开(公告)号：US12087307B2

公开(公告)日：2024-09-10

申请号：US17538604

申请日：2021-11-30

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventor： Myungjong Kim , Vijendra Raj Apsingekar , Aviral Anshu , Taeyeon Ki

IPC: G10L17/06 , G10L17/02 , G10L17/18 , G10L21/0272 , G10L21/0308

CPC classification number: G10L17/06 , G10L17/02 , G10L17/18 , G10L21/0272 , G10L21/0308

Abstract: An apparatus for processing speech data may include a processor configured to: separate an input speech into speech signals; identify a bandwidth of each of the speech signals; extract speaker embeddings from the speech signals based on the bandwidth of each of the speech signals, using at least one neural network configured to receive the speech signals and output the speaker embeddings; and cluster the speaker embeddings into one or more speaker clusters, each speaker cluster corresponding to a speaker identity.

6.

发明公开
SYSTEM AND METHOD FOR SPEAKER VERIFICATION FOR VOICE ASSISTANT 审中-公开

公开(公告)号：US20230419962A1

公开(公告)日：2023-12-28

申请号：US18047609

申请日：2022-10-18

Applicant: Samsung Electronics Co., Ltd.

Inventor： Myungjong Kim , Taeyeon Ki , Cindy Sushen Tseng , Srinivasa Rao Ponakala , Vijendra Raj Apsingekar

IPC: G10L15/22 , G10L15/08

CPC classification number: G10L15/22 , G10L2015/088 , G10L15/08

Abstract: A method includes obtaining audio data and identifying an utterance of a wake word or phrase in the audio data. The method also includes generating an embedding vector based on the utterance from the audio data and accessing a set of previously-generated vectors representing previous utterances of the wake word or phrase. The method further includes performing clustering on the embedding vector and the set of previously-generated vectors to identify a cluster including the embedding vector, where the identified cluster is associated with a speaker. The method also includes updating a speaker vector associated with the speaker based on the embedding vector and determining, using a speaker verification model, a similarity score between the updated speaker vector and the embedding vector. In addition, the method includes determining, based on the similarity score, whether a speaker providing the utterance matches the speaker associated with the identified cluster.

7.

发明授权
System and method for improving named entity recognition 有权

公开(公告)号：US12170079B2

公开(公告)日：2024-12-17

申请号：US17444367

申请日：2021-08-03

Applicant: Samsung Electronics Co., Ltd.

Inventor： Divya Neelagiri , Taeyeon Ki , Vijendra Raj Apsingekar

IPC: G10L15/06 , G10L15/18 , G10L15/26

Abstract: A method includes training a set of teacher models. Training the set of teacher models includes, for each individual teacher model of the set of teacher models, training the individual teacher model to transcribe unlabeled audio samples and predict a pseudo labeled dataset having multiple labels. At least some of the unlabeled audio samples contain named entity (NE) audio data. At least some of the labels include transcribed NE labels corresponding to the NE audio data. The method also includes correcting at least some of the transcribed NE labels using user-specific NE textual data. The method further includes retraining the set of teacher models based on the pseudo labeled dataset from a selected one of the teacher models, where the selected one of the teacher models predicts the pseudo labeled dataset more accurately than other teacher models of the set of teacher models.

8.

发明申请
SYSTEM AND METHOD FOR IMPROVING NAMED ENTITY RECOGNITION 有权

公开(公告)号：US20230040181A1

公开(公告)日：2023-02-09

申请号：US17444367

申请日：2021-08-03

Applicant: Samsung Electronics Co., Ltd.

Inventor： Divya Neelagiri , Taeyeon Ki , Vijendra Raj Apsingekar

IPC: G10L15/06 , G10L15/26 , G10L15/18

Abstract: A method includes training a set of teacher models. Training the set of teacher models includes, for each individual teacher model of the set of teacher models, training the individual teacher model to transcribe unlabeled audio samples and predict a pseudo labeled dataset having multiple labels. At least some of the unlabeled audio samples contain named entity (NE) audio data. At least some of the labels include transcribed NE labels corresponding to the NE audio data. The method also includes correcting at least some of the transcribed NE labels using user-specific NE textual data. The method further includes retraining the set of teacher models based on the pseudo labeled dataset from a selected one of the teacher models, where the selected one of the teacher models predicts the pseudo labeled dataset more accurately than other teacher models of the set of teacher models.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification