Patent search: cpc:"G10L13/033", page 1

1.

Granted invention patent
System and method for data augmentation of feature-based voice data [In force]

Publication No.: US12073818B2

Publication date: 2024-08-27

Application No.: US17197740

Filing date: 2021-03-10

Applicant: Microsoft Technology Licensing, LLC

Inventors: Dushyant Sharma, Patrick A. Naylor, James W. Fosburgh, Do Yeong Kim

IPC classification: G10L13/02, G06F3/16, G06N5/02, G06N20/00, G10K15/08, G10L13/033, G10L15/02, G10L15/06, G10L15/065, G10L21/0224, G10L25/03, H04S7/00

CPC classification: G10L13/02, G06F3/165, G06N5/02, G06N20/00, G10K15/08, G10L13/033, G10L15/02, G10L15/063, G10L15/065, G10L21/0224, G10L25/03, H04S7/30, H04S7/302, H04S7/303

Abstract: A method, computer program product, and computing system for receiving feature-based voice data. One or more data augmentation characteristics may be received. One or more augmentations of the feature-based voice data may be generated, via a machine learning model, based upon, at least in part, the feature-based voice data and the one or more data augmentation characteristics.

2.

Published invention application
LATENT SPACE EDITING AND NEURAL ANIMATION TO GENERATE HYPERREAL SYNTHETIC FACES [Under examination, published]

Publication No.: US20240212249A1

Publication date: 2024-06-27

Application No.: US18089487

Filing date: 2022-12-27

Applicant: Metaphysic.AI

Inventors: Chris Ume, Jo Plaete, Martin Adams, Thomas Graham

IPC classification: G06T13/40, G06N20/00, G06T13/20, G06T19/00, G10L13/033

CPC classification: G06T13/40, G06N20/00, G06T13/205, G06T19/006, G10L13/033

Abstract: Using latent space manipulation and neural animation to generate hyperreal synthetic faces is described. A machine learning model(s) may be trained to generate a synthetic face of a subject featured in unaltered video content based at least in part on video data of an actor making a mouth-generated sound or a three-dimensional (3D) model of a face of the subject that has been animated in accordance with the mouth-generated sound. Latent space manipulation and neural animation may be used with the trained machine learning model(s) to generate instances of the synthetic face, and the instances of the synthetic face can be used to create altered video content featuring the subject with the synthetic face making the mouth-generated sound.

3.

Granted invention patent
System and method for data augmentation of feature-based voice data [In force]

Publication No.: US12014722B2

Publication date: 2024-06-18

Application No.: US17197587

Filing date: 2021-03-10

Applicant: Microsoft Technology Licensing, LLC

Inventors: Dushyant Sharma, Patrick A. Naylor, James W. Fosburgh

IPC classification: G10L13/02, G06F3/16, G06N5/02, G06N20/00, G10K15/08, G10L13/033, G10L15/02, G10L15/06, G10L15/065, G10L21/0224, G10L25/03, H04S7/00

CPC classification: G10L13/02, G06F3/165, G06N5/02, G06N20/00, G10K15/08, G10L13/033, G10L15/02, G10L15/063, G10L15/065, G10L21/0224, G10L25/03, H04S7/30, H04S7/302, H04S7/303

Abstract: A method, computer program product, and computing system for receiving feature-based voice data associated with a first acoustic domain. One or more gain-based augmentations may be performed on at least a portion of the feature-based voice data, thus defining gain-augmented feature-based voice data.

4.

Granted invention patent
Multi-source based knowledge data for artificial intelligence characters [In force]

Publication No.: US12002470B1

Publication date: 2024-06-04

Application No.: US18401544

Filing date: 2023-12-31

Applicant: Theai, Inc.

Inventors: Ilya Gelfenbeyn, Mikhail Ermolenko, Kylan Gibbs, Kirill Ryzhov, Nathan Yu

IPC classification: G10L15/00, G06F16/332, G10L13/033, G10L15/183, G10L15/22, G10L15/30, G06F40/30, G10L15/18

CPC classification: G10L15/22, G06F16/3329, G10L13/033, G10L15/183, G10L15/30, G06F40/30, G10L15/1822

Abstract: Systems and methods for providing multi-source based knowledge data for Artificial Intelligence (AI) characters are provided. An example method includes providing a plurality of data sources; receiving, from a user, at least one word during a conversation between the user and an AI character; ascertaining a speech style of the AI character; analyzing the at least one word to determine a type of information needed to generate a reply to the user; selecting, based on the type of information, at least one data source from the plurality of data sources; generating, based on the at least one word, one or more queries; sending the one or more queries to the at least one data source; receiving one or more responses from the at least one data source; forming, based on the one or more responses and the speech style of the AI character, the reply for providing to the user.

5.

Published invention application
SYSTEM FOR REPLY GENERATION [Under examination, published]

Publication No.: US20240096236A1

Publication date: 2024-03-21

Application No.: US18038520

Filing date: 2021-11-09

Applicant: ROLLS-ROYCE PLC

Inventors: Stuart Brian MOSS, Muhannad Abdul Rahman ALOMARI, James Frederick Sebastian ARNEY

IPC classification: G09B21/00, G06F3/01, G10L13/033, G10L15/06, G10L15/18, G10L15/22

CPC classification: G09B21/00, G06F3/013, G10L13/033, G10L15/063, G10L15/18, G10L15/22

Abstract: A device for generating conversational replies, including a processor with a memory; a speech input module, a user input module; a natural language processing module including one or more encoder-decode modules; the device being configured to: record portions of a conversation through the speech input module, use a speech recognition module to identify words in the conversation, and when one or more words have been recognised: generate one or more responses based on the one or more words using the natural language processing module; selecting a group of the context sensitive responses, prompt the user via the user input module to select a response from the group, output the selected response.

6.

Granted invention patent
Digital assistant voice input integration [In force]

Publication No.: US11915696B2

Publication date: 2024-02-27

Application No.: US17379777

Filing date: 2021-07-19

Applicant: Microsoft Technology Licensing, LLC

Inventors: Derek Liddell, Francis Zhou, Cheng-Yi Yen

IPC classification: G10L15/22, G06F3/16, G10L13/033, G10L15/24, G10L15/26

CPC classification: G10L15/22, G06F3/167, G10L13/033, G10L2015/223, G10L2015/227, G10L2015/228, G10L15/24, G10L15/26

Abstract: A digital assistant supported on devices such as smartphones, tablets, personal computers, game consoles, etc. includes an extensibility client that exposes an interface and service that enables third party applications to be integrated with the digital assistant so the application user experiences are rendered using the native voice of the digital assistant. Specific voice inputs associated with a given application may be registered by developers using a manifest that is loaded when the application is launched on the device so that voice inputs from the device user can be mapped by the digital assistant extensibility client to the appropriate application as input events for consumption. In typical implementations, the manifest is arranged as a declarative document that streamlines application development and provides a seamless user experience by enabling customization of third party applications to integrate the digital assistant's voice and behaviors within the user experience of the application's domain.

7.

Granted invention patent
Text-based virtual object animation generation method, apparatus, storage medium, and terminal [In force]

Publication No.: US11908451B2

Publication date: 2024-02-20

Application No.: US18024021

Filing date: 2021-08-09

Applicants: MOFA (SHANGHAI) INFORMATION TECHNOLOGY CO., LTD.; SHANGHAI MOVU TECHNOLOGY CO., LTD.

Inventors: Congyi Wang, Yu Chen, Jinxiang Chai

IPC classification: G10L13/10, G06T13/00, G10L13/033, G10L13/047, G10L15/02, G10L15/26

CPC classification: G10L13/10, G06T13/00, G10L13/033, G10L13/047, G10L15/02, G10L15/26, G10L2013/105

Abstract: A text-based virtual object animation generation includes acquiring text information, where the text information includes an original text of a virtual object animation to be generated; analyzing an emotional feature of the text information; performing speech synthesis according to the emotional feature, a rhyme boundary, and the text information to obtain audio information, where the audio information includes emotional speech obtained by conversion based on the original text; and generating a corresponding virtual object animation based on the text information and the audio information, where the virtual object animation is synchronized in time with the audio information.

8.

Granted invention patent
Conversational agent response determined using a sentiment [In force]

Publication No.: US11900938B2

Publication date: 2024-02-13

Application No.: US17867161

Filing date: 2022-07-18

Applicant: Google LLC

Inventors: Johnny Chen, Thomas L. Dean, Qiangfeng Peter Lau, Sudeep Gandhe, Gabriel Schine

IPC classification: G10L15/00, G10L15/22, G10L17/22, H04L67/104, G10L15/26, G10L13/00, G06F16/332, G10L15/18, G10L13/033, G10L15/30, G10L13/08, G06F21/62

CPC classification: G10L15/22, G06F16/3329, G06F21/6245, G10L13/00, G10L13/033, G10L13/08, G10L15/1815, G10L15/1822, G10L15/26, G10L15/30, G10L17/22, H04L67/104, G10L2015/223, G10L2015/228

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for handing off a user conversation between computer-implemented agents. One of the methods includes receiving, by a computer-implemented agent specific to a user device, a digital representation of speech encoding an utterance, determining, by the computer-implemented agent, that the utterance specifies a requirement to establish a communication with another computer-implemented agent, and establishing, by the computer-implemented agent, a communication between the other computer-implemented agent and the user device.

9.

Published invention application
DEVICES AND METHODS FOR A SPEECH-BASED USER INTERFACE [Under examination, published]

Publication No.: US20240029706A1

Publication date: 2024-01-25

Application No.: US18479785

Filing date: 2023-10-02

Applicant: Google LLC

Inventors: Ioannis Agiomyrgiannakis, Fergus James Henderson

IPC classification: G10L13/033, G06F3/16, G10L13/10

CPC classification: G10L13/033, G06F3/167, G10L13/10, G10L2021/0135

Abstract: A device may identify a plurality of sources for outputs that the device is configured to provide. The plurality of sources may include at least one of a particular application in the device, an operating system of the device, a particular area within a display of the device, or a particular graphical user interface object. The device may also assign a set of distinct voices to respective sources of the plurality of sources. The device may also receive a request for speech output. The device may also select a particular source that is associated with the requested speech output. The device may also generate speech having particular voice characteristics of a particular voice assigned to the particular source.

10.

Granted invention patent
Systems and methods for dynamically improving user intelligibility of synthesized speech in a work environment [In force]

Publication No.: US11817078B2

Publication date: 2023-11-14

Application No.: US18328189

Filing date: 2023-06-02

Applicant: Vocollect, Inc.

Inventors: James Hendrickson, Debra Drylie Stiffey, Duane Littleton, John Pecorari, Arkadiusz Slusarczyk

IPC classification: G10L13/00, G10L13/02, G10L13/033

CPC classification: G10L13/02, G10L13/033

Abstract: A method and apparatus that dynamically adjust operational parameters of a text-to-speech engine in a speech-based system are disclosed. A voice engine or other application of a device provides a mechanism to alter the adjustable operational parameters of the text-to-speech engine. In response to one or more environmental conditions, the adjustable operational parameters of the text-to-speech engine are modified to increase the intelligibility of synthesized speech.
