Patent search: ipc:G10L17/06, page 1

1.

Invention publication
VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION (pending, published)

Publication (announcement) number: US20240363122A1

Publication (announcement) date: 2024-10-31

Application number: US18765108

Filing date: 2024-07-05

Applicant: GOOGLE LLC

Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw

IPC classification: G10L17/24, G10L15/26, G10L17/06, G10L21/028

CPC classification: G10L17/24, G10L15/26, G10L17/06, G10L21/028

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance. Additionally or alternatively, the text representation of the utterance can be processed to determine whether at least a portion of the text representation of the utterance captures a particular keyphrase. When the system determines the registered and/or verified user spoke the utterance and the system determines the text representation of the utterance captures the particular keyphrase, the system can cause a computing device to perform one or more actions corresponding to the particular keyphrase.
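The final gating step described in this abstract (act only when the speaker is verified and the transcript contains a keyphrase) can be illustrated with a minimal sketch. It is not the patented implementation: the speaker separation, speaker identification, and speech recognition models are assumed to have already produced a transcript and a speaker-match score, and the names `find_keyphrase`, `should_trigger`, and the `0.7` threshold are hypothetical.

```python
from typing import Optional, Sequence


def find_keyphrase(transcript: str, keyphrases: Sequence[str]) -> Optional[str]:
    """Return the first keyphrase that appears as a whole-word sequence in the transcript."""
    words = transcript.lower().split()
    for phrase in keyphrases:
        target = phrase.lower().split()
        n = len(target)
        # Slide a window of n words over the transcript and compare word by word.
        if n and any(words[i:i + n] == target for i in range(len(words) - n + 1)):
            return phrase
    return None


def should_trigger(
    transcript: str,
    speaker_score: float,
    keyphrases: Sequence[str],
    threshold: float = 0.7,  # assumed value; not taken from the abstract
) -> Optional[str]:
    """Return the matched keyphrase only if the speaker score also passes the threshold."""
    if speaker_score < threshold:
        return None
    return find_keyphrase(transcript, keyphrases)
```

Because the keyphrases are matched against text rather than learned by a model, adding a new keyphrase in this sketch only means extending the `keyphrases` list, which mirrors the "no retraining" property the abstract emphasizes.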

2.

Invention grant
User interface to select field of view of a camera in a smart glass (in force)

Publication (announcement) number: US12132983B2

Publication (announcement) date: 2024-10-29

Application number: US17859808

Filing date: 2022-07-07

Applicant: Meta Platforms Technologies, LLC

Inventors: Sapna Shroff, Sebastian Sztuk, Jun Hu, Johana Gabriela Coyoc Escudero

IPC classification: H04N23/60, G02B27/01, G02C7/10, G02C7/16, G06F3/01, G06F3/044, G06F3/16, G06F40/205, G06V10/25, G06V10/75, G06V20/50, G10L17/06, G10L17/22, H04N23/56, H04N23/62, H04N23/63, H04N23/66, H04N23/695, H04N23/90

CPC classification: H04N23/64, G02B27/0172, G02C7/101, G02C7/16, G06F3/013, G06F3/016, G06F3/017, G06F3/044, G06F3/167, G06F40/205, G06V10/25, G06V10/759, G06V20/50, G10L17/06, G10L17/22, H04N23/56, H04N23/62, H04N23/63, H04N23/66, H04N23/695, H04N23/90, G02B2027/0138, G02B2027/0141, G02B2027/0178

Abstract: A wearable device for use in immersive reality applications is provided. The wearable device includes eyepieces to provide a forward-image to a user, a first forward-looking camera mounted on the frame and having a field of view, a processor configured to identify a region of interest within the forward-image, and an interface device to indicate to the user that a field of view of the first forward-looking camera is misaligned with the region of interest. Methods of use of the device, a memory storing instructions and a processor to execute the instructions to cause the device to perform the methods of use, are also provided.

3.

Invention grant
Machine learning for improving quality of voice biometrics (in force)

Publication (announcement) number: US12131740B2

Publication (announcement) date: 2024-10-29

Application number: US18331920

Filing date: 2023-06-08

Applicant: Capital One Services, LLC

Inventors: Bozhao Tan, Isabelle Alice Yvonne Moulinier, David Almquist, June Wu

IPC classification: G10L15/06, G10L17/02, G10L17/04, G10L17/06, H04M3/42

CPC classification: G10L17/04, G10L17/02, G10L17/06, H04M3/42221

Abstract: Methods and systems are disclosed herein for improving the quality of audio for use in a biometric. A biometric system may use machine learning to determine whether audio or a portion of the audio should be used as a biometric for a user. A sample of the user's voice may be used to generate a voice signature of the user. Portions of the audio that do not meet a similarity threshold when compared with the voice signature may be removed from the audio. Additionally or alternatively, interfering noises may be detected and removed from the audio to improve the quality of a voice biometric generated from the audio.
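The similarity-threshold filtering mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: each audio portion is assumed to be already represented by a fixed-length embedding, cosine similarity is an assumed choice of comparison (the abstract does not name one), and `cosine`, `keep_matching_segments`, and the `0.8` threshold are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def keep_matching_segments(
    segments: Sequence[Tuple[str, Sequence[float]]],
    signature: Sequence[float],
    threshold: float = 0.8,  # assumed value; not taken from the abstract
) -> List[str]:
    """Return the ids of segments whose embedding is similar enough to the voice signature."""
    return [
        seg_id
        for seg_id, embedding in segments
        if cosine(embedding, signature) >= threshold
    ]
```

Segments whose ids are not returned would be the portions removed before the voice biometric is generated.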

4.

Invention publication
AUDIO SIGNAL PROCESSING METHOD AND APPARATUS, DEVICE AND STORAGE MEDIUM (pending, published)

Publication (announcement) number: US20240355335A1

Publication (announcement) date: 2024-10-24

Application number: US18685019

Filing date: 2022-11-08

Applicant: Zhejiang Alibaba Robot Co., Ltd.

Inventors: Xianliang Wang, Hongbin Suo

IPC classification: G10L17/06, G10L15/04, G10L15/06, G10L17/04

CPC classification: G10L17/06, G10L15/04, G10L15/063, G10L17/04, G10L2015/0631

Abstract: The present disclosure relates to an audio signal processing method and apparatus, a device and a storage medium. The present disclosure segments an audio signal to obtain multiple audio segments, clusters the multiple audio segments according to feature information of each audio segment to obtain one or more first sets, determines a first cluster center of each first set according to the feature information of the audio segments included in that first set, and clusters the multiple audio segments again according to the first cluster center of each first set to obtain one or more second sets, where audio segments in the same second set correspond to the same role label. In this way, the accuracy of unsupervised role separation based on single-channel speech is improved.
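The two-stage clustering in this abstract can be sketched in a few lines. The sketch makes several assumptions that the abstract does not specify: the first pass is a simple greedy radius-based grouping, cluster centers are arithmetic means, and the second pass assigns each segment to its nearest first-pass center. The names `first_pass`, `second_pass`, and `radius` are hypothetical.

```python
from typing import List, Sequence


def dist2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def first_pass(features: Sequence[Sequence[float]], radius: float) -> List[List[int]]:
    """Greedy grouping: join the first cluster whose seed segment lies within `radius`."""
    clusters: List[List[int]] = []
    for i, feature in enumerate(features):
        for members in clusters:
            if dist2(feature, features[members[0]]) <= radius ** 2:
                members.append(i)
                break
        else:
            clusters.append([i])  # no cluster was close enough, so start a new one
    return clusters


def second_pass(features: Sequence[Sequence[float]], clusters: List[List[int]]) -> List[int]:
    """Re-label every segment by its nearest first-pass cluster center."""
    centers = [mean([features[i] for i in members]) for members in clusters]
    return [
        min(range(len(centers)), key=lambda k: dist2(feature, centers[k]))
        for feature in features
    ]
```

With one-dimensional features `[0.0], [0.1], [0.2], [1.4], [2.0], [2.2]` and `radius=1.5`, the first pass groups the segment at `1.4` with the low cluster, while the second pass moves it to the high cluster because that center is closer. This is the kind of correction the second clustering step is meant to provide.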

5.

Invention grant
Voice analysis platform for voiceprint tracking and anomaly detection (in force)

Publication (announcement) number: US12120262B2

Publication (announcement) date: 2024-10-15

Application number: US17954422

Filing date: 2022-09-28

Applicant: Bank of America Corporation

Inventors: George Albero, Youshika C. Scott, Brian H. Corr, Thomas G. Frost, Scott Nielsen, Charlene L. Ramsue

IPC classification: H04M3/22, G10L17/04, G10L17/06, G10L25/90, H04M3/42, H04M3/51

CPC classification: H04M3/2281, G10L17/04, G10L17/06, G10L25/90, H04M3/42042, H04M3/5175, H04M3/5183, H04M2201/41

Abstract: Aspects of the disclosure relate to voiceprint tracking and anomaly detection. A computing platform may detect voice information from a call management system. The computing platform may establish voiceprints for employees and clients of an enterprise. The computing platform may detect a call between an employee and a caller attempting to access a client account. The computing platform may identify a first voiceprint corresponding to the employee and a second voiceprint corresponding to the caller. The computing platform may compare the second voiceprint to a known voiceprint corresponding to the client. Based on the comparison of the second voiceprint to the known voiceprint, the computing platform may determine that the second voiceprint does not match the known voiceprint. The computing platform may identify that the second voiceprint corresponds to another employee of the enterprise, and may send a security notification indicating potential unauthorized account access to an enterprise computing device.

6.

Invention grant
Electronic apparatus and controlling method thereof (in force)

Publication (announcement) number: US12114039B2

Publication (announcement) date: 2024-10-08

Application number: US17437281

Filing date: 2021-07-27

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Manmohan Singh Bohara

IPC classification: H04N21/4415, G10L17/06, G10L17/22, H04N21/422, H04N21/45, H04N21/472

CPC classification: H04N21/4415, G10L17/06, G10L17/22, H04N21/42203, H04N21/4532, H04N21/47202

Abstract: An electronic apparatus is provided. The electronic apparatus includes: a display; and a processor configured to: control the display to display a content based on one mode of a plurality of display modes, receive a user voice in real time while the content is being displayed, identify user's age information corresponding to the received user voice, identify whether or not the one mode is a kids mode when the identified user's age information is less than a threshold value, and change the one mode to the kids mode when it is identified that the one mode is not the kids mode.
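The mode-switching rule in this abstract reduces to a small decision function. The sketch below assumes the age estimate has already been derived from the user's voice. The mode names, the function `next_mode`, and the threshold of `13` are hypothetical, since the abstract states only that a threshold value exists.

```python
KIDS_MODE = "kids"  # hypothetical mode name


def next_mode(current_mode: str, estimated_age: float, age_threshold: float = 13) -> str:
    """Switch to kids mode when the estimated age is below the threshold; otherwise keep the mode."""
    if estimated_age < age_threshold and current_mode != KIDS_MODE:
        return KIDS_MODE
    return current_mode
```

For example, `next_mode("standard", 8)` returns `"kids"`, while `next_mode("standard", 30)` leaves the mode unchanged.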

7.

Invention publication
Matching Active Speaker Pose Between Two Cameras (pending, published)

Publication (announcement) number: US20240314427A1

Publication (announcement) date: 2024-09-19

Application number: US18634869

Filing date: 2024-04-12

Applicant: Hewlett-Packard Development Company, L.P.

Inventors: Jian David Wang, Xiangdong Wang, Varun Ajay Kulkarni

IPC classification: H04N23/60, G06T7/73, G10L17/06, G10L17/18, G10L25/57, H04N5/268, H04N23/611, H04R1/40, H04R3/00

CPC classification: H04N23/64, G06T7/73, G10L17/06, G10L17/18, G10L25/57, H04N5/268, H04N23/611, H04R1/406, H04R3/005, G06T2207/10016, G06T2207/20084, G06T2207/30201

Abstract: Described are multiple cameras in a conference room, each pointed in a different direction. A primary camera includes a microphone array to perform sound source localization (SSL). The SSL is used in combination with a video image to identify the speaker from among multiple individuals that appear in the video image. Pose information of the speaker is developed. Pose information of each individual identified in each other camera is developed. The speaker pose information is compared to the pose information of the individuals from the other cameras. The best match for each other camera is selected as the speaker in that camera. The speaker views of each camera are compared to determine the speaker view with the most frontal view of the speaker. That camera is selected to provide the video for provision to the far end.

8.

Invention grant
Service authentication through a voice assistant (in force)

Publication (announcement) number: US12063214B2

Publication (announcement) date: 2024-08-13

Application number: US16799867

Filing date: 2020-02-25

Applicant: VMware LLC

Inventors: Rohit Pradeep Shetty

IPC classification: G10L17/00, G10L17/06, G10L17/24, G10L25/84, H04L9/40

CPC classification: H04L63/0861, G10L17/06, G10L17/24, G10L25/84, H04L63/0838, H04L63/0846, H04L63/0853

Abstract: Disclosed are various approaches for authenticating a user through a voice assistant device and creating an association between the device and a user account. The request is associated with a network or federated service. The user can use a client device, such as a smartphone, to initiate an authentication flow. A passphrase provided to the client device can be captured by the client device and a voice assistant device. Audio captured by the client device and voice assistant device can be sent to an assistant connection service. The passphrase and an audio signature calculated from the audio can be validated. An association between the user account and the voice assistant device can then be created.

9.

Invention grant
Audio privacy protection for surveillance systems (in force)

Publication (announcement) number: US12051397B2

Publication (announcement) date: 2024-07-30

Application number: US17176697

Filing date: 2021-02-16

Applicant: Western Digital Technologies, Inc.

Inventors: Shaomin Xiong, Toshiki Hirano, Pritam Das, Ramy Ayad, Rajeev Nagabhirava

IPC classification: G10K11/175, G06F21/32, G06F21/62, G06V20/40, G06V20/52, G06V40/16, G10L17/06, G10L21/028, G10L25/57, H04N5/76, H04N5/77, H04N9/802, H04N9/804, H04N9/82

CPC classification: G10K11/1754, G06F21/32, G06V20/40, G06V40/172, G10L17/06, G10L21/028, G10L25/57, H04N5/76, G06V20/44

Abstract: Systems and methods for audio privacy in network video surveillance systems are described. A video camera may include an image sensor and a microphone to generate a video stream. Responsive to detecting a human speaking condition in the video stream, the audio data may be selectively modified to mask a human voice component of the audio data for storing and/or displaying the surveillance video stream.

10.

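Invention publication
COMMUNICATIONS AND CONTENT PLATFORM (pending, published)

Publication (announcement) number: US20240249727A1

Publication (announcement) date: 2024-07-25

Application number: US18605696

Filing date: 2024-03-14

Applicant: Prevail Legal, Inc.

Inventors: Robert FEIGENBAUM, Random BARES

IPC classification: G10L17/06, G10L15/26, H04L65/1069

CPC classification: G10L17/06, G10L15/26, H04L65/1069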

Abstract: A system and method that overcomes technological hurdles related to litigation-related management is disclosed. The technological hurdles were overcome with industry-transformative innovations in in-person, hybrid, and remote legal proceedings; court reporting; testimony management; trial preparation; and utilization of video evidence, to name several. These innovations resulted in many advantages, such as cloud-based testimony management, scalable digital transformation, dramatic savings in litigation costs, and fast turn-around on certified transcripts, to name several.
