Patent search: ipc:G10L17/06, page 6

51.

Granted invention patent
Method, apparatus and device for voiceprint recognition of original speech, and storage medium [In force]

Publication number: US11798563B2

Publication date: 2023-10-24

Application number: US17617296

Filing date: 2020-08-26

Applicant: PING AN TECHNOLOGY (SHENZHEN) CO., LTD.

Inventors: Yuechao Guo, Yixuan Qiao, Yijun Tang, Jun Wang, Peng Gao, Guotong Xie

IPC classification: G10L17/06, G10L17/02, G10L17/18, G10L25/18, G10L25/21

CPC classification: G10L17/06, G10L17/02, G10L17/18, G10L25/18, G10L25/21

Abstract: A method for voiceprint recognition of an original speech is used to reduce information losses and system complexity of a model for data recognition of a speaker's original speech. The method includes: obtaining original speech data, and segmenting the original speech data based on a preset time length to obtain segmented speech data; performing tail-biting convolution processing and discrete Fourier transform on the segmented speech data through a preset convolution filter bank to obtain voiceprint feature data; pooling the voiceprint feature data through a preset deep neural network to obtain a target voiceprint feature; performing embedded vector transformation on the target voiceprint feature to obtain corresponding voiceprint feature vectors; and performing calculation on the voiceprint feature vectors through a preset loss function to obtain target voiceprint data, where the loss function includes a cosine similarity matrix loss function and a minimum mean square error matrix loss function.
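As a rough illustration of two steps named in the abstract above (segmenting speech by a preset time length, and comparing voiceprint feature vectors by cosine similarity), the following Python sketch may help. It is not the patented method: the function names, the plain-list representation of samples and vectors, and the choice to keep a shorter final segment are all assumptions, and the abstract does not say how the similarity matrix enters the loss function.

```python
import math


def segment(samples, sample_rate, seconds):
    """Split samples into consecutive chunks of a preset duration.

    Assumption: the final chunk is kept even if it is shorter.
    """
    size = int(sample_rate * seconds)
    if size <= 0:
        raise ValueError("segment length must cover at least one sample")
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarities between all (hypothetical) feature vectors."""
    return [[cosine(u, v) for v in vectors] for u in vectors]
```

For example, `segment(list(range(10)), sample_rate=4, seconds=1.0)` yields two chunks of four samples and a final chunk of two.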

52.

Granted invention patent
Method, apparatus, and non-transitory computer readable medium for processing audio of virtual meeting room [In force]

Publication number: US11798561B2

Publication date: 2023-10-24

Application number: US17566250

Filing date: 2021-12-30

Applicant: Fulian Precision Electronics (Tianjin) Co., LTD.

Inventors: Cheng-Yu Wang, Po-Cheng Chen, Yu-Te Lee

IPC classification: G10L17/06, G06T17/20, G10L17/02, G10L21/007, G06T7/20

CPC classification: G10L17/06, G06T7/20, G06T17/20, G10L17/02, G10L21/007, G06T2207/30201

Abstract: A method for processing audio generated in a virtual meeting room (VMR) includes setting a quantity of mesh vertexes according to seats in the VMR, obtaining first voiceprint information of a presenter, the first voiceprint information comprising a frequency, an amplitude, and a phase difference of an audio signal, adjusting the frequency or amplitude of the first voiceprint information according to the quantity of the mesh vertexes, and obtaining second voiceprint information; and determining a seat of the presenter in the VMR according to the second voiceprint information. An apparatus and a non-transitory computer readable medium for processing audio as above are also disclosed.

53.

Granted invention patent
Speaker separation based on real-time latent speaker state characterization [In force]

Publication number: US11790921B2

Publication date: 2023-10-17

Application number: US17169843

Filing date: 2021-02-08

Applicant: OTO Systems Inc.

Inventors: Valentin Alain Jean Perret, Nándor Kedves, Nicolas Lucien Perony

IPC classification: G10L17/06, G10L17/02, G10L17/04, G10L17/18, G06N3/04, G06N3/08, G06N3/049, G10L21/0272, G06N3/045

CPC classification: G10L17/06, G06N3/045, G06N3/049, G06N3/08, G10L17/02, G10L17/04, G10L17/18, G10L21/0272

Abstract: Systems, methods, and non-transitory computer-readable media can obtain a stream of audio waveform data that represents speech involving a plurality of speakers. As the stream of audio waveform data is obtained, a plurality of audio chunks can be determined. An audio chunk can be associated with one or more identity embeddings. The stream of audio waveform data can be segmented into a plurality of segments based on the plurality of audio chunks and respective identity embeddings associated with the plurality of audio chunks. A segment can be associated with a speaker included in the plurality of speakers. Information describing the plurality of segments associated with the stream of audio waveform data can be provided.

54.

Published invention application
MACHINE LEARNING FOR IMPROVING QUALITY OF VOICE BIOMETRICS [Under examination, published]

Publication number: US20230326464A1

Publication date: 2023-10-12

Application number: US18331920

Filing date: 2023-06-08

Applicant: Capital One Services, LLC

Inventors: Bozhao TAN, Isabelle Alice Yvonne MOULINIER, David ALMQUIST, June WU

IPC classification: G10L17/06, H04M3/42, G10L17/04, G10L17/02

CPC classification: G10L17/04, G10L17/02, G10L17/06, H04M3/42221

Abstract: Methods and systems are disclosed herein for improving the quality of audio for use in a biometric. A biometric system may use machine learning to determine whether audio or a portion of the audio should be used as a biometric for a user. A sample of the user’s voice may be used to generate a voice signature of the user. Portions of the audio that do not meet a similarity threshold when compared with the voice signature may be removed from the audio. Additionally or alternatively, interfering noises may be detected and removed from the audio to improve the quality of a voice biometric generated from the audio.
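The thresholding step described in the abstract above (dropping portions of audio that do not meet a similarity threshold against the voice signature) could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the abstract does not name a similarity measure, so cosine similarity is swapped in here, and `keep_matching_portions`, the `(label, embedding)` pairs, and the threshold value are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def keep_matching_portions(portions, signature, threshold):
    """Keep only portions whose embedding is similar enough to the signature.

    portions: list of (label, embedding) pairs; both are hypothetical stand-ins
    for audio portions and whatever representation a real system would use.
    """
    return [
        (label, embedding)
        for label, embedding in portions
        if cosine(embedding, signature) >= threshold
    ]


# Usage: the portion "b" points away from the signature and is dropped.
signature = [1.0, 0.0]
portions = [("a", [1.0, 0.1]), ("b", [0.0, 1.0]), ("c", [0.9, 0.2])]
kept = keep_matching_portions(portions, signature, threshold=0.8)
```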

55.

Published invention application
IN-EAR LIVENESS DETECTION FOR VOICE USER INTERFACES [Under examination, published]

Publication number: US20230306971A1

Publication date: 2023-09-28

Application number: US18325873

Filing date: 2023-05-30

Applicant: JVCKENWOOD Corporation

Inventors: Jan Jasper van den Berg

IPC classification: G10L17/24, G10L17/06, G10L17/04, A61B5/00, A61B5/107, G06Q20/40, G06F21/32, G06F16/635, A61B5/117, G06V40/70, G06V40/18, G06V40/12

CPC classification: G10L17/24, G10L17/06, G10L17/04, A61B5/6817, A61B5/1076, G06Q20/40145, G06F21/32, G06F16/636, A61B5/117, G06V40/70, G06V40/197, G06V40/1365, G16H50/50

Abstract: Introduced here are approaches to authenticating the identity of speakers based on the “liveness” of the input. To prevent spoofing, an authentication platform may establish the likelihood that a voice sample represents a recording of word(s) uttered by a speaker whose identity is to be authenticated and then, based on the likelihood, determine whether to authenticate the speaker.

56.

Published invention application
GESTURE AND VOICE CONTROLLED INTERFACE DEVICE [Under examination, published]

Publication number: US20230305633A1

Publication date: 2023-09-28

Application number: US18109315

Filing date: 2023-02-14

Applicant: Wearable Devices Ltd.

Inventors: Guy WAGNER, Leeor Langer, Asher Dahan

IPC classification: G06F3/01, G06F3/16, G06F3/0346, G10L17/22, G10L17/06

CPC classification: G06F3/017, G06F3/015, G06F3/0346, G06F3/167, G10L17/06, G10L17/22, G06F3/013

Abstract: A gesture and voice-controlled interface device comprising one or a plurality of gesture sensors for sensing gestures of a user; one or a plurality of audio sensors for sensing sounds made by the user; and a processor configured to obtain one or a plurality of sensed gestures from said one or a plurality of gesture sensors and to obtain one or a plurality of sensed sounds from said one or a plurality of audio sensors, to analyze the sensed gesture and sensed sounds to identify an input from the user, and to generate an output signal corresponding to the input to a controlled device.

57.

Granted invention patent
Customizable audio in prescription reminders [In force]

Publication number: US11762965B2

Publication date: 2023-09-19

Application number: US16808033

Filing date: 2020-03-03

Applicant: WALGREEN CO.

Inventors: Andrew Schweinfurth, Julija Alegra Petkus, Gunjan Dhanesh Bhow

IPC classification: G16H20/10, G16H10/60, G06Q10/10, G06F21/32, G16H40/20, G10L17/06, G10L25/51, G06Q50/26, G16H40/63, G10L15/22, G06F3/16, G16H80/00, G06F3/0481, G10L17/22, G10L17/00, G06F21/62

CPC classification: G06F21/32, G06F3/0481, G06F3/167, G06F21/6245, G06Q50/265, G10L15/22, G10L17/00, G10L17/06, G10L17/22, G10L25/51, G16H10/60, G16H20/10, G16H40/20, G16H40/63, G16H80/00, G10L2015/223, G10L2015/228

Abstract: Methods and systems may incorporate voice interaction and other audio interaction to facilitate access to prescription related information and processes. Particularly, voice/audio interactions may be utilized to achieve authentication to access prescription-related information and action capabilities. Additionally, voice/audio interactions may be utilized in performance of processes such as obtaining prescription refills and receiving reminders to consume prescription products.

58.

Granted invention patent
Biometric authentication through voice print categorization using artificial intelligence [In force]

Publication number: US11756555B2

Publication date: 2023-09-12

Application number: US17313040

Filing date: 2021-05-06

Applicant: NICE LTD.

Inventors: Natan Katz, Tal Haguel

IPC classification: G10L17/06, G10L17/04, G06N3/04, G10L19/00, G10L17/18

CPC classification: G10L17/06, G06N3/04, G10L17/04, G10L17/18, G10L19/00

Abstract: A system is provided to categorize voice prints during a voice authentication. The system includes a processor and a computer readable medium operably coupled thereto, to perform voice authentication operations which include receiving an enrollment of a user in the biometric authentication system, requesting a first voice print comprising a sample of a voice of the user, receiving the first voice print of the user during the enrollment, accessing a plurality of categorizations of the voice prints for the voice authentication, wherein each of the plurality of categorizations comprises a portion of the voice prints based on a plurality of similarity scores of distinct voice prints in the portion to a plurality of other voice prints, determining, using a hidden layer of a neural network, one of the plurality of categorizations for the first voice print, and encoding the first voice print with the one of the plurality of categorizations.

59.

Granted invention patent
Voice capturing method and voice capturing system [In force]

Publication number: US11749296B2

Publication date: 2023-09-05

Application number: US17485644

Filing date: 2021-09-27

Applicant: Realtek Semiconductor Corporation

Inventors: Chung-Shih Chu, Ming-Tang Lee, Chieh-Min Tsai

IPC classification: G10L21/057, G10L17/06, G10L21/0232, H04R1/40, H04R3/00, G06N20/00, G10L21/0216

CPC classification: G10L21/057, G10L17/06, G10L21/0232, H04R1/406, H04R3/005, G06N20/00, G10L2021/02166

Abstract: A voice capturing method includes the following operations: storing, by a buffer, voice data from a plurality of microphones; determining, by a processor, whether a target speaker exists and whether a direction of the target speaker changes according to the voice data and target speaker information; inserting a voice segment corresponding to a previous tracking direction into a current position in the voice data to generate fusion voice data when the target speaker exists and the direction of the target speaker changes from the previous tracking direction to a current tracking direction; performing, by the processor, a voice enhancement process on the fusion voice data according to the current tracking direction to generate enhanced voice data; performing, by the processor, a voice shortening process on the enhanced voice data to generate voice output data; and playing, by a playing circuit, the voice output data.

60.

Granted invention patent
Adapting hotword recognition based on personalized negatives [In force]

Publication number: US11749267B2

Publication date: 2023-09-05

Application number: US16953510

Filing date: 2020-11-20

Applicant: Google LLC

Inventors: Aleksandar Kracun, Matthew Sharifi

IPC classification: G10L15/22, G10L15/197, G10L17/06, G10L17/24, G10L15/30, G10L15/08

CPC classification: G10L15/22, G10L15/197, G10L15/30, G10L17/06, G10L17/24, G10L2015/088, G10L2015/223

Abstract: A method for adapting hotword recognition includes receiving audio data characterizing a hotword event detected by a first stage hotword detector in streaming audio captured by a user device. The method also includes processing, using a second stage hotword detector, the audio data to determine whether a hotword is detected by the second stage hotword detector in a first segment of the audio data. When the hotword is not detected by the second stage hotword detector, the method includes classifying the first segment of the audio data as containing a negative hotword that caused a false detection of the hotword event in the streaming audio by the first stage hotword detector. Based on the first segment of the audio data classified as containing the negative hotword, the method includes updating the first stage hotword detector to prevent triggering the hotword event in subsequent audio data that contains the negative hotword.
