Patent search results for ipc:G10L15/07 (page 1)

1.

Granted invention patent
Multiple wake words for systems with multiple smart assistants [In force]

Publication No.: US12131523B2

Publication date: 2024-10-29

Application No.: US17182951

Filing date: 2021-02-23

Applicant: Meta Platforms, Inc.

Inventors: Xiaohu Liu, Baiyang Liu, Rajen Subba

IPC classification: G06V10/82, G06F3/01, G06F3/16, G06F7/14, G06F9/451, G06F16/176, G06F16/22, G06F16/23, G06F16/242, G06F16/2455, G06F16/2457, G06F16/248, G06F16/33, G06F16/332, G06F16/338, G06F16/903, G06F16/9032, G06F16/9038, G06F16/904, G06F16/951, G06F16/9535, G06F18/2411, G06F40/205, G06F40/295, G06F40/30, G06F40/40, G06N3/006, G06N3/08, G06N7/01, G06N20/00, G06Q50/00, G06V10/764, G06V20/10, G06V40/20, G10L15/02, G10L15/06, G10L15/07, G10L15/16, G10L15/18, G10L15/183, G10L15/187, G10L15/22, G10L15/26, G10L17/06, G10L17/22, H04L5/02, H04L12/28, H04L41/00, H04L41/22, H04L43/0882, H04L43/0894, H04L51/02, H04L51/18, H04L51/216, H04L51/52, H04L67/306, H04L67/50, H04L67/5651, H04L67/75, H04W12/08, G10L13/00, G10L13/04, H04L51/046, H04L67/10, H04L67/53

CPC classification: G06V10/82, G06F3/011, G06F3/013, G06F3/017, G06F3/167, G06F7/14, G06F9/453, G06F16/176, G06F16/2255, G06F16/2365, G06F16/243, G06F16/24552, G06F16/24575, G06F16/24578, G06F16/248, G06F16/3323, G06F16/3329, G06F16/3344, G06F16/338, G06F16/90332, G06F16/90335, G06F16/9038, G06F16/904, G06F16/951, G06F16/9535, G06F18/2411, G06F40/205, G06F40/295, G06F40/30, G06F40/40, G06N3/006, G06N3/08, G06N7/01, G06N20/00, G06Q50/01, G06V10/764, G06V20/10, G06V40/28, G10L15/02, G10L15/063, G10L15/07, G10L15/16, G10L15/1815, G10L15/1822, G10L15/183, G10L15/187, G10L15/22, G10L15/26, G10L17/06, G10L17/22, H04L5/02, H04L12/2816, H04L41/20, H04L41/22, H04L43/0882, H04L43/0894, H04L51/02, H04L51/18, H04L51/216, H04L51/52, H04L67/306, H04L67/535, H04L67/5651, H04L67/75, H04W12/08, G06F2216/13, G10L13/00, G10L13/04, G10L2015/223, G10L2015/225, H04L51/046, H04L67/10, H04L67/53

Abstract: In one embodiment, a method includes, by a client system associated with a user: receiving, at the client system, a user input from the user; parsing, by the client system, the user input to identify a request to execute a function to be performed by an assistant system of several assistant systems associated with the client system; determining whether the user is authorized to access the assistant system by comparing a voiceprint of the user to several voiceprints stored on the client system; sending, from the client system to the assistant system in response to determining the user is authorized to access the assistant system, a request to set an assistant xbot of the assistant system into a listening mode; and receiving, at the client system from the assistant system, an indication that the assistant xbot is in listening mode.

2.

Granted invention patent
Method and system for conversation transcription with metadata [In force]

Publication No.: US12125487B2

Publication date: 2024-10-22

Application No.: US17450551

Filing date: 2021-10-11

Applicant: SoundHound, Inc.

Inventors: Kiersten L. Bradley, Ethan Coeytaux, Ziming Yin

IPC classification: G10L15/26, G06F40/134, G06F40/166, G06F40/284, G10L15/02, G10L15/06, G10L15/07

CPC classification: G10L15/26, G06F40/134, G06F40/166, G06F40/284, G10L15/02, G10L15/063, G10L15/07, G10L2015/0631

Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed and multiuser-editable transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript by one or more editors. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.

3.

Granted invention patent
Method and apparatus for decentralized supervised learning in NLP applications [In force]

Publication No.: US12112129B2

Publication date: 2024-10-08

Application No.: US17527167

Filing date: 2021-11-16

Applicant: Fujitsu Limited

Inventors: Nuria Garcia Santa, Kendrick Cetina

IPC classification: G10L15/16, G06F18/214, G06F40/169, G06F40/226, G06N3/04, G10L15/06, G10L15/07, G10L15/18, G06F40/279, G06F40/295, G10L15/183

CPC classification: G06F40/226, G06F18/214, G06F40/169, G06N3/04, G10L15/063, G10L15/075, G10L15/16, G10L15/18, G06F40/279, G06F40/295, G10L2015/0635, G10L15/1822, G10L15/183

Abstract: A method of training a neural network as a natural language processing (NLP) model comprises: inputting annotated training data to first architecture portions of the neural network, the first architecture portions being executed respectively in a plurality of distributed client computing devices in communication with a server computing device, the training data being derived from text data private to the client computing device in which the first architecture portion is executed, the server computing device having no access to any of the private text data; deriving from the training data, using the first architecture portions, weight matrices of numeric weights which are decoupled from the private text data; concatenating the weight matrices, in a second architecture portion of the neural network executed in the server computing device, to obtain a single concatenated weight matrix; and training, on the second architecture portion, the NLP model using the concatenated weight matrix.
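The data flow in this abstract (clients derive numeric matrices locally, and the server only ever sees and concatenates those matrices) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: `client_weight_matrix` is a hypothetical stand-in that hashes token counts rather than training real network layers, and the row-wise concatenation axis is a guess, since the abstract does not specify one.

```python
from typing import List

Matrix = List[List[float]]


def client_weight_matrix(private_texts: List[str], dim: int = 4) -> Matrix:
    """Hypothetical stand-in for a client-side 'first architecture portion'.

    A real client would train network layers; here each text is reduced to a
    normalized row of hashed token counts so that only numbers leave the client.
    """
    matrix: Matrix = []
    for text in private_texts:
        row = [0.0] * dim
        for token in text.lower().split():
            # Bucket each token by a simple character-code hash (illustrative only).
            row[sum(map(ord, token)) % dim] += 1.0
        total = sum(row) or 1.0
        matrix.append([value / total for value in row])
    return matrix


def concatenate_on_server(client_matrices: List[Matrix]) -> Matrix:
    """Stack the client matrices row-wise into one matrix (the axis is an assumption)."""
    return [row for matrix in client_matrices for row in matrix]


# Each client works only on its own private text; the server receives matrices.
client_a = client_weight_matrix(["patient reports mild fever", "no known allergies"])
client_b = client_weight_matrix(["contract renewal due in march"])
combined = concatenate_on_server([client_a, client_b])
print(len(combined), len(combined[0]))  # 3 4
```

The server-side training step on the concatenated matrix is omitted, because the abstract gives no detail about the second architecture portion.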

4.

Published invention application
METHOD AND SYSTEM FOR CONVERSATION TRANSCRIPTION WITH METADATA [Pending, published]

Publication No.: US20240331702A1

Publication date: 2024-10-03

Application No.: US18743562

Filing date: 2024-06-14

Applicant: SoundHound AI IP, LLC.

Inventors: Kiersten L. BRADLEY, Ethan COEYTAUX, Ziming YIN

IPC classification: G10L15/26, G06F40/134, G06F40/166, G06F40/284, G10L15/02, G10L15/06, G10L15/07

CPC classification: G10L15/26, G06F40/134, G06F40/166, G06F40/284, G10L15/02, G10L15/063, G10L15/07, G10L2015/0631

Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.

5.

Granted invention patent
Automatic synchronization for an offline virtual assistant [In force]

Publication No.: US12020696B2

Publication date: 2024-06-25

Application No.: US16659260

Filing date: 2019-10-21

Applicant: SoundHound, Inc.

Inventors: Karl Stahl

IPC classification: G10L15/00, G06F16/242, G06F40/253, G10L15/07, G10L15/19, G10L15/22, G10L15/30

CPC classification: G10L15/19, G06F16/243, G06F40/253, G10L15/07, G10L15/22, G10L15/30, G10L2015/223

Abstract: [Object] Technology is provided to enable a mobile terminal to function as a digital assistant even when the mobile terminal is in a state where it cannot communicate with a server apparatus.
[Solution] When a user terminal 200 receives a query A from a user, user terminal 200 sends query A to a server 100. Server 100 interprets the meaning of query A using a grammar A. Server 100 obtains a response to query A based on the meaning of query A and sends the response to user terminal 200. Server 100 further sends grammar A to user terminal 200. That is, server 100 sends to user terminal 200 a grammar used to interpret the query received from user terminal 200.
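The exchange described in this abstract can be sketched as follows, under stated assumptions: a grammar is modeled as a hypothetical mapping from regular-expression patterns to meaning labels, the `Server` and `Terminal` classes and their methods are invented for illustration, and the offline lookup is an inference from the stated object rather than something the abstract spells out.

```python
import re
from typing import Dict, Optional, Tuple

Grammar = Dict[str, str]  # assumed shape: regex pattern -> meaning label


def interpret(query: str, grammar: Grammar) -> Optional[str]:
    """Return the meaning of a query under one grammar, or None if nothing matches."""
    for pattern, meaning in grammar.items():
        if re.fullmatch(pattern, query):
            return meaning
    return None


class Server:
    def __init__(self, grammars: Dict[str, Grammar]) -> None:
        self.grammars = grammars

    def handle(self, query: str) -> Optional[Tuple[str, str, Grammar]]:
        """Interpret the query and return (response, grammar name, grammar used)."""
        for name, grammar in self.grammars.items():
            meaning = interpret(query, grammar)
            if meaning is not None:
                return f"response for {meaning}", name, grammar
        return None


class Terminal:
    def __init__(self, server: Server) -> None:
        self.server = server
        self.cached: Dict[str, Grammar] = {}  # grammars received from the server

    def ask(self, query: str, online: bool = True) -> Optional[str]:
        if online:
            result = self.server.handle(query)
            if result is None:
                return None
            response, name, grammar = result
            self.cached[name] = grammar  # keep the grammar for later offline use
            return response
        # Offline: fall back to grammars synchronized during earlier exchanges.
        for grammar in self.cached.values():
            meaning = interpret(query, grammar)
            if meaning is not None:
                return f"offline: understood {meaning}"
        return None


server = Server({"radio": {r"turn (on|off) the radio": "radio_power"}})
terminal = Terminal(server)
before_sync = terminal.ask("turn on the radio", online=False)
online_reply = terminal.ask("turn on the radio")
after_sync = terminal.ask("turn off the radio", online=False)
```

In this sketch the terminal cannot interpret the query offline until one online exchange has delivered the matching grammar; after that, a different query covered by the same grammar is understood without the server.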

6.

Granted invention patent
Translating a media asset with vocal characteristics of a speaker [In force]

Publication No.: US11997344B2

Publication date: 2024-05-28

Application No.: US17509401

Filing date: 2021-10-25

Applicant: Rovi Guides, Inc.

Inventors: Vijay Kumar, Rajendran Pichaimurthy, Madhusudhan Seetharam

IPC classification: G06F40/40, G10L13/027, G10L13/033, G10L15/07, G10L15/19, G10L25/63, H04N21/43, H04N21/81

CPC classification: H04N21/43072, G10L13/027, G10L15/07, G10L15/19, G10L25/63, H04N21/8106

Abstract: Systems and methods are described herein for generating alternate audio for a media stream. The media system receives media that is requested by the user. The media comprises a video and audio. The audio includes words spoken in a first language. The media system stores the received media in a buffer as it is received. The media system separates the audio from the buffered media and determines an emotional state expressed by spoken words of the first language. The media system translates the words spoken in the first language into words spoken in a second language. Using the translated words of the second language, the media system synthesizes speech having the emotional state previously determined. The media system then retrieves the video of the received media from the buffer and synchronizes the synthesized speech with the video to generate the media content in a second language.

7.

Published invention application
MULTI-DIALECT AND MULTILINGUAL SPEECH RECOGNITION [Pending, published]

Publication No.: US20240161732A1

Publication date: 2024-05-16

Application No.: US18418246

Filing date: 2024-01-20

Applicant: Google LLC

Inventors: Zhifeng Chen, Bo Li, Eugene Weinstein, Yonghui Wu, Pedro J. Moreno Mengibar, Ron J. Weiss, Khe Chai Sim, Tara N. Sainath, Patrick An Phu Nguyen

IPC classification: G10L15/00, G10L15/07, G10L15/16

CPC classification: G10L15/005, G10L15/07, G10L15/16, G10L2015/0631

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable media, for speech recognition using multi-dialect and multilingual models. In some implementations, audio data indicating audio characteristics of an utterance is received. Input features determined based on the audio data are provided to a speech recognition model that has been trained to output scores indicating the likelihood of linguistic units for each of multiple different languages or dialects. The speech recognition model can be one that has been trained using cluster adaptive training. Output that the speech recognition model generated in response to receiving the input features determined based on the audio data is received. A transcription of the utterance generated based on the output of the speech recognition model is provided.

8.

Granted invention patent
Multi-user configuration [In force]

Publication No.: US11983551B2

Publication date: 2024-05-14

Application No.: US18207053

Filing date: 2023-06-07

Applicant: Apple Inc.

Inventors: Taylor G. Carrigan, Patrick L. Coffman, David C. Graham

IPC classification: G06F9/451, G06F3/0481, G06F3/0484, G10L15/06, G10L15/07, G10L15/22, G10L17/00

CPC classification: G06F9/451, G06F3/0481, G06F3/0484, G10L15/063, G10L15/07, G10L15/22, G10L17/00, G10L2015/0638, G10L2015/223, G10L2015/225

Abstract: Examples of multi-user configuration are disclosed. An example method includes, at an electronic device: receiving a request; and in response to the request: if the voice input does not match a voice profile associated with an account associated with the electronic device: causing output of first information based on the request using a first account associated with the electronic device; if a setting of the electronic device has a first state, causing update of account data of the first account based on the request; and if the setting has a second state, forgoing causing update of the account data; and if the voice input matches a voice profile associated with an account associated with the electronic device: causing output of the first information using the account associated with the matching voice profile; and causing update of account data of the account based on the request.
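The branching in this abstract can be sketched as a single function, under stated assumptions: voice matching itself is left out and represented by a hypothetical `matched_profile` argument, the `Account` type and `handle_request` name are invented for illustration, and the two setting states are modeled as a boolean.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Account:
    name: str
    history: List[str] = field(default_factory=list)  # stands in for "account data"


def handle_request(
    request: str,
    matched_profile: Optional[str],
    accounts: Dict[str, Account],
    first_account: str,
    update_first_account_when_unmatched: bool,
) -> str:
    """Pick the account used for output and decide whether its data is updated."""
    if matched_profile is not None and matched_profile in accounts:
        # Voice matched a profile: use that account and always update it.
        account = accounts[matched_profile]
        account.history.append(request)
    else:
        # No match: fall back to the first account; update only if the setting allows.
        account = accounts[first_account]
        if update_first_account_when_unmatched:
            account.history.append(request)
    return f"{request} (via {account.name})"


accounts = {"alice": Account("alice"), "home": Account("home")}
guest_reply = handle_request("play jazz", None, accounts, "home", False)
alice_reply = handle_request("play jazz", "alice", accounts, "home", False)
```

With the setting in its second state (modeled as `False`), the unmatched request is answered through the first account without touching its data, while the matched request both answers through and updates the matching account.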

9.

Published invention application
Systems And Methods For Short- and Long-Term Dialog Management Between A Robot Computing Device/Digital Companion And A User [Pending, published]

Publication No.: US20240152705A1

Publication date: 2024-05-09

Application No.: US18414224

Filing date: 2024-01-16

Applicant: Embodied, Inc.

Inventors: Stefan A. Scherer, Mario E. Munich, Paolo Pirjanian, Kevin D. Saunders, Wilson Harron, Marissa Kohan

IPC classification: G06F40/35, G10L13/027, G10L15/07, G10L15/22, G10L15/26

CPC classification: G06F40/35, G10L13/027, G10L15/07, G10L15/22, G10L15/26, G10L2015/223

Abstract: Systems and methods for managing conversations between a robot computing device and a user are disclosed. Exemplary implementations may: initiate a first-time user experience sequence with the user; teach the user the robot computing device's capabilities and/or characteristics; initiate, utilizing a dialog manager, a conversation with the user; receive one or more command files from the user via one or more microphones; and generate conversation response files and communicate the generated conversation files to the dialog manager in response to the one or more received user global command files to initiate an initial conversation exchange.

10.

Granted invention patent
Systems and methods to briefly deviate from and resume back to amending a section of a note [In force]

Publication No.: US11934769B2

Publication date: 2024-03-19

Application No.: US18300120

Filing date: 2023-04-13

Applicant: Suki AI, Inc.

Inventors: Nithyanand Kota, Yashas Rao, Hao Ran Raymond Lin, Maneesh Dewan, Arunan Rabindran, Jatin Chhugani, Sudheer Tumu

IPC classification: G06F17/00, G06F40/166, G10L15/07, G10L15/08, G10L15/22, G10L15/26, G16H15/00

CPC classification: G06F40/166, G10L15/07, G10L15/083, G10L15/22, G10L15/26, G16H15/00, G10L2015/088

Abstract: Systems and methods to briefly deviate from and resume back to amending a section of a note are disclosed. Exemplary implementations may: obtain audio information representing sound captured by an audio section of a client computing platform, such sound including speech from a user associated with the client computing platform; effectuate presentation of a graphical user interface that includes sections of the note; analyze the audio information to determine which individual ones of the spoken inputs are the primary spoken input or the deviant spoken input; determine, based on the analysis, the section of the note to which the deviant spoken input is related; alternately amend, based on the determination, sections of the note by deviating from one section to another section and returning to the one section for continued population; and effectuate, via the user interface, presentation of the alternating amendments to the sections of the note.
