Patent search: ipc:"G10L17/02", page 1

1.

Granted invention patent
Machine learning for improving quality of voice biometrics [In force]

Publication No.: US12131740B2

Publication date: 2024-10-29

Application No.: US18331920

Filing date: 2023-06-08

Applicant: Capital One Services, LLC

Inventors: Bozhao Tan, Isabelle Alice Yvonne Moulinier, David Almquist, June Wu

IPC classification: G10L15/06, G10L17/02, G10L17/04, G10L17/06, H04M3/42

CPC classification: G10L17/04, G10L17/02, G10L17/06, H04M3/42221

Abstract: Methods and systems are disclosed herein for improving the quality of audio for use in a biometric. A biometric system may use machine learning to determine whether audio or a portion of the audio should be used as a biometric for a user. A sample of the user's voice may be used to generate a voice signature of the user. Portions of the audio that do not meet a similarity threshold when compared with the voice signature may be removed from the audio. Additionally or alternatively, interfering noises may be detected and removed from the audio to improve the quality of a voice biometric generated from the audio.
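The similarity-threshold step described in this abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes each audio portion has already been turned into a fixed-length embedding by some speaker model, that the voice signature is an embedding of the same length, and that cosine similarity is the comparison used. The function names and the 0.8 threshold are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def keep_matching_portions(
    portion_embeddings: Sequence[Sequence[float]],
    voice_signature: Sequence[float],
    threshold: float = 0.8,  # hypothetical value, not taken from the patent
) -> List[int]:
    """Return indices of portions whose similarity to the signature meets the threshold.

    Portions below the threshold are the ones that would be removed from the audio.
    """
    return [
        index
        for index, embedding in enumerate(portion_embeddings)
        if cosine_similarity(embedding, voice_signature) >= threshold
    ]


# Toy 2-dimensional embeddings: the middle portion points away from the signature.
signature = [1.0, 0.0]
portions = [[0.9, 0.1], [0.0, 1.0], [1.0, 0.2]]
kept = keep_matching_portions(portions, signature)
```

In this toy input the second portion is orthogonal to the signature, so only the first and third portions are kept.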

2.

Published invention application
MULTIMEDIA DATA RECORDING METHOD AND DEVICE [Under examination: published]

Publication No.: US20240312487A1

Publication date: 2024-09-19

Application No.: US18432914

Filing date: 2024-02-05

Applicant: Lenovo (Beijing) Limited

Inventors: Gang MA, Bo LIU

IPC classification: G11B27/036, G06V20/40, G10L17/02, G10L25/57, G10L25/63

CPC classification: G11B27/036, G06V20/41, G10L17/02, G10L25/57, G10L25/63

Abstract: A multimedia data recording method includes performing real-time analysis on multimedia data that includes simultaneously collected first audio data and image frame data to obtain voice content and a demonstration action of a target object, determining semantic correlation between the demonstration action and the voice content, performing video understanding on an image frame recording the demonstration action to convert the demonstration action to second audio data in response to the semantic correlation indicating that a content indicated by the demonstration action is inconsistent with the voice content, and dynamically inserting the second audio data into the first audio data to update the multimedia data.

3.

Published invention application
VOICE AUTHENTICATION DEVICE AND APPLIANCE [Under examination: published]

Publication No.: US20240312467A1

Publication date: 2024-09-19

Application No.: US18596879

Filing date: 2024-03-06

Applicant: ROHM CO., LTD.

Inventors: Koji TAMANO, Takahiro NISHIYAMA

IPC classification: G10L17/18, G10L17/02, G10L17/04, G10L17/24, G10L25/18

CPC classification: G10L17/18, G10L17/02, G10L17/04, G10L17/24, G10L25/18

Abstract: A voice authentication device for incorporation in an appliance including a voice conversion portion configured to convert voice from outside into a voice signal that is an electrical signal includes a voice registration portion configured to learn a parameter of an AI model based on the voice signal, and a voice verification portion configured to perform voice verification on input data based on the voice signal in accordance with an inference result yielded by the AI model having the learned parameter. The voice authentication is performed based on the voice registration portion and the voice verification portion.

4.

Published invention application
LABELING SUPPORT DEVICE, LABELING SUPPORT METHOD, AND PROGRAM [Under examination: published]

Publication No.: US20240303265A1

Publication date: 2024-09-12

Application No.: US18279592

Filing date: 2021-03-01

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Shota ORIHASHI, Masato SAWADA

IPC classification: G06F16/35, G10L15/26, G10L17/02

CPC classification: G06F16/353, G10L15/26, G10L17/02

Abstract: A label assignment support device according to the present disclosure includes a preliminary label estimation unit that assigns preliminary labels for each of a plurality of elements, a label assignment work screen output unit that generates a label assignment work screen for each of the plurality of elements and an update operation for labels assigned to the plurality of elements by a user, the label assignment work screen indicating each of the plurality of elements and labels assigned to each of the plurality of elements in association with each other, and a label update unit that, when a label assigned to one of the elements is updated by the update operation via the label assignment work screen, assigns the label after update to the one of the elements.

5.

Published invention application
DETERMINING PHRASES FOR USE IN A MULTI-STEP AUTHENTICATION PROCESS [Under examination: published]

Publication No.: US20240296211A1

Publication date: 2024-09-05

Application No.: US18116776

Filing date: 2023-03-02

Applicant: Oracle International Corporation

Inventors: Kent Arthur Spaulding, Kenneth Joseph Meltsner

IPC classification: G06F21/32, G06F40/30, G10L17/02, G10L17/04, G10L17/08, G10L17/24

CPC classification: G06F21/32, G06F40/30, G10L17/02, G10L17/04, G10L17/08, G10L17/24

Abstract: The present disclosure provides a multiple factor authentication process using text pass codes. A process performs a first verification of a user using an authentication credential transmitted via a first communication channel. Based on successfully performing the first verification, the process performs a second verification using a textual phrase transmitted to the user via a different communication channel. The words included in the textual phrase can be selected to avoid ambiguous pronunciations and spellings.
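As a rough illustration of the last sentence of this abstract, the sketch below builds a pass phrase from a candidate word list after excluding words flagged as ambiguous. The abstract does not say how ambiguity is detected, so the `ambiguous` set (for example, known homophones) is simply assumed to be supplied by the caller; `build_phrase` and its parameters are hypothetical names.

```python
import secrets
from typing import Iterable, List


def build_phrase(
    candidates: Iterable[str],
    ambiguous: Iterable[str],  # assumed to be curated elsewhere, e.g. a homophone list
    n_words: int = 4,
) -> List[str]:
    """Pick n_words distinct words at random, skipping any flagged as ambiguous."""
    blocked = {word.lower() for word in ambiguous}
    pool = sorted({word.lower() for word in candidates} - blocked)
    if len(pool) < n_words:
        raise ValueError("not enough unambiguous words to build the phrase")
    phrase = []
    for _ in range(n_words):
        # secrets.randbelow gives a cryptographically strong index into the pool;
        # popping the chosen word samples without replacement.
        phrase.append(pool.pop(secrets.randbelow(len(pool))))
    return phrase


words = ["apple", "their", "there", "river", "candle", "to", "two", "window", "garden"]
homophones = {"their", "there", "to", "two"}
phrase = build_phrase(words, homophones, n_words=3)
```

The phrase differs on every run; the only guarantees are that it has the requested length, contains no repeats, and draws no word from the excluded set.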

6.

Published invention application
Systems and Methods for Language Identification in Audio Content [Under examination: published]

Publication No.: US20240265925A1

Publication date: 2024-08-08

Application No.: US18165817

Filing date: 2023-02-07

Applicant: Spotify AB

Inventors: Xingran Zhu, Md Iftekhar Tanveer, Yang Janet Liu, Rosemary Ellen Jones, Oluseye Ojumu

IPC classification: G10L17/18, G10L17/02, G10L21/028

CPC classification: G10L17/18, G10L17/02, G10L21/028

Abstract: The various implementations described herein include methods and devices for identifying a language in audio content. In one aspect, a method includes obtaining audio content and generating a speaker embedding from the audio content. The method further includes determining, via a language identification model, a language of the audio content based on the speaker embedding.

7.

Granted invention patent
Multi-register-based speech detection method and related apparatus, and storage medium [In force]

Publication No.: US12051441B2

Publication date: 2024-07-30

Application No.: US17944067

Filing date: 2022-09-13

Applicant: Tencent Technology (Shenzhen) Company Limited

Inventors: Jimeng Zheng, Lianwu Chen, Weiwei Li, Zhiyi Duan, Meng Yu, Dan Su, Kaiyu Jiang

IPC classification: G10L25/84, G06T7/20, G10L17/02, G10L17/22, G10L21/028, G10L25/21

CPC classification: G10L25/84, G06T7/20, G10L17/02, G10L17/22, G10L21/028, G10L25/21, G06T2207/30201

Abstract: This application discloses a multi-sound area-based speech detection method and related apparatus, and a storage medium, which is applied to the field of artificial intelligence. The method includes: obtaining sound area information corresponding to N sound areas including multiple users speaking simultaneously; generating a control signal corresponding to each target detection sound area according to user information corresponding to the target detection sound area; processing multi-user speech input signals by using the control signals, to obtain a speech output signal corresponding to each target detection sound area; generating a speech detection result of the target detection sound area according to the speech output signal corresponding to the target detection sound area; and selecting, among the multiple users, a main speaker based on the user information, the speech output signals and speech detection results of multiple users in the N sound areas.

8.

Granted invention patent
System and method for secure data augmentation for speech processing systems [In force]

Publication No.: US12033614B2

Publication date: 2024-07-09

Application No.: US17840787

Filing date: 2022-06-15

Applicant: Nuance Communications, Inc.

Inventors: Dushyant Sharma, Patrick Aubrey Naylor, Francesco Nespoli

IPC classification: G10L13/08, G06F21/62, G06F40/166, G10L13/033, G10L17/02

CPC classification: G10L13/08, G06F21/6245, G06F40/166, G10L13/033, G10L17/02

Abstract: A method, computer program product, and computing system for receiving an input speech signal. A transcription of the input speech signal may be received. A speaker embedding may be extracted from the input speech signal. Acoustic properties from the input speech signal may be extracted. An obscured transcription may be generated from the transcription, where the obscured transcription includes obscured representations of sensitive content from the transcription. An obscured speech signal may be generated based upon, at least in part, the extracted speaker embedding and the obscured transcription, where the obscured speech signal includes obscured representations of sensitive content from the input speech signal. The obscured speech signal may be augmented based upon, at least in part, the extracted acoustic properties.

9.

Published invention application
ELECTRONIC DEVICE FOR IDENTIFYING SYNTHETIC VOICE AND CONTROL METHOD THEREOF [Under examination: published]

Publication No.: US20240194206A1

Publication date: 2024-06-13

Application No.: US18438225

Filing date: 2024-02-09

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Oleksandra SOKOL, Dmytro PROGONOV, Heorhii NAUMENKO, Kostiantyn VOLOBUIEV, Vasyl KUZNETSOV, Viacheslav DERKACH

IPC classification: G10L17/26, G10L17/02, G10L17/04, G10L17/06, G10L25/63

CPC classification: G10L17/26, G10L17/02, G10L17/04, G10L17/06, G10L25/63

Abstract: An electronic device includes a microphone, and at least one processor configured to, based on receiving voice data through the microphone, input the voice data into a non-semantic feature extractor model and acquire a non-semantic feature included in the voice data using the non-semantic feature extractor model, input the non-semantic feature into a synthetic voice classifier model and classify the voice data into a synthetic voice or a user voice using the synthetic voice classifier model, and provide a result of the classification, and the synthetic voice classifier model is a model that is transfer-learned based on the non-semantic feature extractor model.

10.

Granted invention patent
Voiceprint recognition method, apparatus and device, and storage medium [In force]

Publication No.: US12002473B2

Publication date: 2024-06-04

Application No.: US17617314

Filing date: 2020-12-24

Applicant: PING AN TECHNOLOGY (SHENZHEN) CO., LTD.

Inventors: Yuechao Guo, Yixuan Qiao, Yijun Tang, Jun Wang, Peng Gao, Guotong Xie

IPC classification: G10L17/02, G10L17/06, G10L17/10, G10L17/18, G10L17/20, G10L25/18, G10L25/30

CPC classification: G10L17/02, G10L17/06, G10L17/10, G10L17/18, G10L17/20, G10L25/18, G10L25/30

Abstract: A voiceprint recognition method includes: obtaining a target speech information set to be recognized that includes speech information corresponding to at least one object; extracting target feature information from the target speech information set by using a preset algorithm, and optimizing the target feature information based on a first loss function to obtain a first voiceprint recognition result; obtaining target speech channel information of a target speech channel, where the target speech channel information includes channel noise information, and the target speech channel is used to transmit the target speech information set; extracting target feature vectors in the channel noise information, and optimizing the target feature vectors based on a second loss function to obtain a second voiceprint recognition result; and fusing the first voiceprint recognition result and the second voiceprint recognition result to determine a final voiceprint recognition result.
