Patent search: ap:("Adobe Inc." OR "Princeton University") AND inv:"Adam Finkelstein" (page 1)

1.

Granted invention patent
Using a predictive model to automatically enhance audio having various audio quality issues [Active]

Publication No.: US11514925B2

Publication date: 2022-11-29

Application No.: US16863591

Filing date: 2020-04-30

Applicants: Adobe Inc., THE TRUSTEES OF PRINCETON UNIVERSITY

Inventors: Zeyu Jin, Jiaqi Su, Adam Finkelstein

IPC classification: G10L21/0364, G10L25/30, G10L25/18, G06N3/08, G06N3/04

Abstract: Operations of a method include receiving a request to enhance a new source audio. Responsive to the request, the new source audio is input into a prediction model that was previously trained. Training the prediction model includes providing a generative adversarial network including the prediction model and a discriminator. Training data is obtained including tuples of source audios and target audios, each tuple including a source audio and a corresponding target audio. During training, the prediction model generates predicted audios based on the source audios. Training further includes applying a loss function to the predicted audios and the target audios, where the loss function incorporates a combination of a spectrogram loss and an adversarial loss. The prediction model is updated to optimize that loss function. After training, based on the new source audio, the prediction model generates a new predicted audio as an enhanced version of the new source audio.
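The combined loss mentioned in the abstract above can be illustrated with a minimal pure-Python sketch. This is not the patented implementation: the naive DFT, the log-magnitude L1 distance, the `-log(score)` generator term, and the `adv_weight` parameter are all assumptions chosen only to show how a spectrogram term and an adversarial term might be summed.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_size=8, hop=4):
    """Naive DFT magnitude per frame (no window); for illustration only."""
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        mags = []
        for k in range(frame_size // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n, x in enumerate(frame))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def spectrogram_loss(predicted, target):
    """Mean absolute difference between log-magnitude spectrograms (assumed form)."""
    spec_p = magnitude_spectrogram(predicted)
    spec_t = magnitude_spectrogram(target)
    diffs = [abs(math.log1p(p) - math.log1p(t))
             for frame_p, frame_t in zip(spec_p, spec_t)
             for p, t in zip(frame_p, frame_t)]
    return sum(diffs) / len(diffs)


def adversarial_loss(discriminator_score):
    """Generator-side term: small when the discriminator rates the prediction as real.

    discriminator_score is assumed to be a probability in (0, 1].
    """
    return -math.log(discriminator_score)


def combined_loss(predicted, target, discriminator_score, adv_weight=0.1):
    """Weighted sum of both terms; adv_weight is a hypothetical hyperparameter."""
    return (spectrogram_loss(predicted, target)
            + adv_weight * adversarial_loss(discriminator_score))
```

In this sketch the discriminator is reduced to a single score supplied by the caller; a real training loop would compute that score with a second network and update both models in alternation.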

2.

Invention application
TEXT-BASED INSERTION AND REPLACEMENT IN AUDIO NARRATION [Under examination, published]

Publication No.: US20190130894A1

Publication date: 2019-05-02

Application No.: US15796292

Filing date: 2017-10-27

Applicants: Adobe Inc., The Trustees of Princeton University

Inventors: Zeyu Jin, Gautham J. Mysore, Stephen DiVerdi, Jingwan Lu, Adam Finkelstein

IPC classification: G10L13/08, G10L13/07, G10L13/04, G10L15/02

CPC classification: G10L13/08, G06F17/24, G10L13/00, G10L13/04, G10L13/06, G10L13/07, G10L15/02, G10L21/00, G10L2021/0135, G11B27/022

Abstract: Systems and techniques are disclosed for synthesizing a new word or short phrase such that it blends seamlessly in the context of insertion or replacement in an existing narration. In one such embodiment, a text-to-speech synthesizer is utilized to say the word or phrase in a generic voice. Voice conversion is then performed on the generic voice to convert it into a voice that matches the narration. An editor and interface are described that support fully automatic synthesis, selection among a candidate set of alternative pronunciations, fine control over edit placements and pitch profiles, and guidance by the editor's own voice.

3.

Invention publication
DISCONTINUITY MODELING OF COMPUTING FUNCTIONS [Under examination, published]

Publication No.: US20240281577A1

Publication date: 2024-08-22

Application No.: US18170735

Filing date: 2023-02-17

Applicants: Adobe Inc., The Trustees of Princeton University

Inventors: Connelly Stuart Barnes, Yuting Yang, Adam Finkelstein, Andrew Bensley Adams

IPC classification: G06F30/27, G06N3/084

CPC classification: G06F30/27, G06N3/084

Abstract: Discontinuity modeling techniques of computing functions of a program are described. In one example, a program has a computing function that includes a discontinuity. An input is received by the data modeling system that identifies an axis. A plurality of samples is then generated by the data modeling system along the axis based on an output of the program. The samples are then used as a basis by the data modeling system to generate a data model that models the discontinuity. The data model includes, in one example, one or more gradients and models the discontinuity using a 1D box kernel.
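As a rough illustration of why a 1D box kernel is useful around a discontinuity, the sketch below differentiates a toy step function after box smoothing. It relies only on the standard identity that the derivative of a box-filtered function equals the difference of two samples at the ends of the box, divided by the box width. The function names and the toy program are hypothetical and are not taken from the patent.

```python
def step_program(x, threshold=0.5):
    """A toy program whose output jumps at x == threshold."""
    return 1.0 if x >= threshold else 0.0


def box_smoothed_gradient(program, x, half_width):
    """Gradient of `program` convolved with a 1D box kernel of width 2 * half_width.

    Differentiating the box-filtered function reduces to a difference of two
    samples taken along the chosen axis, one at each end of the box.
    """
    right = program(x + half_width)
    left = program(x - half_width)
    return (right - left) / (2.0 * half_width)
```

The ordinary derivative of `step_program` is zero almost everywhere, so it carries no signal about where the jump is. The box-smoothed gradient is nonzero whenever the box straddles the threshold, which is what makes it usable for optimization.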

4.

Granted invention patent
Pose selection and animation of characters using video data and training techniques [Active]

Publication No.: US11361467B2

Publication date: 2022-06-14

Application No.: US16692450

Filing date: 2019-11-22

Applicants: Adobe Inc., Princeton University

Inventors: Wilmot Li, Hijung Shin, Adam Finkelstein, Nora Willett

IPC classification: G06T13/00, G06T7/73, G06T13/80, G06T7/60, G06K9/62, G06T13/40, G06V20/40, G06V40/10

Abstract: This disclosure generally relates to character animation. More specifically, this disclosure relates to pose selection using data analytics techniques applied to training data, and generating 2D animations of illustrated characters using performance data and the selected poses. An example process or system includes extracting sets of joint positions from a training video including the subject, grouping the plurality of frames into frame groups using the sets of joint positions for each frame, identifying a representative frame for each frame group using the frame groups, clustering the frame groups into clusters using the representative frames, outputting a visualization of the clusters at a user interface, and receiving a selection of a cluster for animation of the subject.
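The grouping, representative-frame, and clustering steps named in the abstract above could be sketched as follows. This is a simplified illustration, not the claimed method: joint positions are assumed to be already extracted as flat coordinate lists, and the distance threshold, the mean-pose representative, and the greedy clustering rule are all hypothetical choices.

```python
import math


def pose_distance(a, b):
    """Euclidean distance between two flattened joint-position vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def group_frames(poses, threshold):
    """Group consecutive frames that stay within `threshold` of the group's first frame."""
    groups = [[0]]
    for i in range(1, len(poses)):
        if pose_distance(poses[i], poses[groups[-1][0]]) <= threshold:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups


def representative_frame(group, poses):
    """Pick the frame in the group that is closest to the group's mean pose."""
    dims = len(poses[group[0]])
    mean = [sum(poses[i][d] for i in group) / len(group) for d in range(dims)]
    return min(group, key=lambda i: pose_distance(poses[i], mean))


def cluster_groups(groups, poses, threshold):
    """Greedy clustering: a group joins the first cluster whose seed representative is close."""
    reps = [representative_frame(g, poses) for g in groups]
    clusters = []  # each cluster is a list of group indices; the first entry is the seed
    for gi, rep in enumerate(reps):
        for cluster in clusters:
            if pose_distance(poses[rep], poses[reps[cluster[0]]]) <= threshold:
                cluster.append(gi)
                break
        else:
            clusters.append([gi])
    return clusters
```

For six two-coordinate poses that visit one pose, move to another, and then return, `group_frames` yields three frame groups and `cluster_groups` merges the first and third into one cluster.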

5.

Granted invention patent
Text-based insertion and replacement in audio narration [Active]

Publication No.: US10347238B2

Publication date: 2019-07-09

Application No.: US15796292

Filing date: 2017-10-27

Applicants: Adobe Inc., The Trustees of Princeton University

Inventors: Zeyu Jin, Gautham J. Mysore, Stephen DiVerdi, Jingwan Lu, Adam Finkelstein

IPC classification: G10L13/08, G10L15/02, G10L13/04, G10L13/07

Abstract: Systems and techniques are disclosed for synthesizing a new word or short phrase such that it blends seamlessly in the context of insertion or replacement in an existing narration. In one such embodiment, a text-to-speech synthesizer is utilized to say the word or phrase in a generic voice. Voice conversion is then performed on the generic voice to convert it into a voice that matches the narration. An editor and interface are described that support fully automatic synthesis, selection among a candidate set of alternative pronunciations, fine control over edit placements and pitch profiles, and guidance by the editor's own voice.

6.

Invention application
USING A PREDICTIVE MODEL TO AUTOMATICALLY ENHANCE AUDIO HAVING VARIOUS AUDIO QUALITY ISSUES [Active]

Publication No.: US20210343305A1

Publication date: 2021-11-04

Application No.: US16863591

Filing date: 2020-04-30

Applicants: Adobe Inc., THE TRUSTEES OF PRINCETON UNIVERSITY

Inventors: Zeyu Jin, Jiaqi Su, Adam Finkelstein

IPC classification: G10L21/02, G10L25/30, G10L25/18, G06N3/04, G06N3/08

Abstract: Operations of a method include receiving a request to enhance a new source audio. Responsive to the request, the new source audio is input into a prediction model that was previously trained. Training the prediction model includes providing a generative adversarial network including the prediction model and a discriminator. Training data is obtained including tuples of source audios and target audios, each tuple including a source audio and a corresponding target audio. During training, the prediction model generates predicted audios based on the source audios. Training further includes applying a loss function to the predicted audios and the target audios, where the loss function incorporates a combination of a spectrogram loss and an adversarial loss. The prediction model is updated to optimize that loss function. After training, based on the new source audio, the prediction model generates a new predicted audio as an enhanced version of the new source audio.

7.

Invention application
POSE SELECTION AND ANIMATION OF CHARACTERS USING VIDEO DATA AND TRAINING TECHNIQUES [Active]

Publication No.: US20210158593A1

Publication date: 2021-05-27

Application No.: US16692471

Filing date: 2019-11-22

Applicants: Adobe Inc., Princeton University

Inventors: Wilmot Li, Hijung Shin, Adam Finkelstein, Nora Willett

IPC classification: G06T13/80, G06T7/73, G06T7/20, G06T13/40

Abstract: This disclosure generally relates to character animation. More specifically, but not by way of limitation, this disclosure relates to pose selection using data analytics techniques applied to training data, and generating 2D animations of illustrated characters using performance data and the selected poses. An example process or system includes obtaining a selection of training poses of the subject and a set of character poses, obtaining a performance video of the subject, wherein the performance video includes a plurality of performance frames that include poses performed by the subject, grouping the plurality of performance frames into groups of performance frames, assigning a selected training pose from the selection of training poses to each group of performance frames using the clusters of training frames, generating a sequence of character poses based on the groups of performance frames and their assigned training poses, and outputting the sequence of character poses.

8.

Invention application
POSE SELECTION AND ANIMATION OF CHARACTERS USING VIDEO DATA AND TRAINING TECHNIQUES [Active]

Publication No.: US20210158565A1

Publication date: 2021-05-27

Application No.: US16692450

Filing date: 2019-11-22

Applicants: Adobe Inc., Princeton University

Inventors: Wilmot Li, Hijung Shin, Adam Finkelstein, Nora Willett

IPC classification: G06T7/73, G06T13/80, G06K9/00, G06T7/60, G06K9/62, G06T13/40

Abstract: This disclosure generally relates to character animation. More specifically, this disclosure relates to pose selection using data analytics techniques applied to training data, and generating 2D animations of illustrated characters using performance data and the selected poses. An example process or system includes extracting sets of joint positions from a training video including the subject, grouping the plurality of frames into frame groups using the sets of joint positions for each frame, identifying a representative frame for each frame group using the frame groups, clustering the frame groups into clusters using the representative frames, outputting a visualization of the clusters at a user interface, and receiving a selection of a cluster for animation of the subject.

9.

Granted invention patent
Real-time speaker-dependent neural vocoder [Active]

Publication No.: US10770063B2

Publication date: 2020-09-08

Application No.: US16108996

Filing date: 2018-08-22

Applicants: Adobe Inc., The Trustees of Princeton University

Inventors: Zeyu Jin, Gautham J. Mysore, Jingwan Lu, Adam Finkelstein

IPC classification: G10L15/16, G06F17/14, G10L15/22, G06N3/08, G06N3/04

Abstract: Techniques are disclosed for a recursive deep-learning approach to speech synthesis using a repeatable structure that splits an input tensor into a left half and a right half, similar to the operation of the Fast Fourier Transform, performs a 1-D convolution on each respective half, sums the results, and then applies a post-processing function. The repeatable structure may be utilized in a series configuration to operate as a vocoder or perform other speech processing functions.
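The repeatable structure described in the abstract above (split, convolve each half, sum, post-process) can be sketched in a few lines. The sketch makes strong simplifying assumptions that are not in the source: a single channel, a kernel size of 1 so each convolution is just a scalar weight, ReLU as the post-processing function, and hypothetical function names.

```python
def fft_like_layer(x, w_left, w_right, bias=0.0):
    """Split x into halves, apply a size-1 convolution to each, sum, then apply ReLU."""
    half = len(x) // 2
    left, right = x[:half], x[half:2 * half]
    summed = [w_left * l + w_right * r + bias for l, r in zip(left, right)]
    return [max(0.0, s) for s in summed]


def stack_layers(x, weights):
    """Apply layers in series; each layer halves the sequence length."""
    for w_left, w_right in weights:
        x = fft_like_layer(x, w_left, w_right)
    return x
```

Because each layer halves its input, a window of N samples collapses to a single value after log2(N) layers, which mirrors the recursive halving of the Fast Fourier Transform that the abstract refers to.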

10.

Granted invention patent
Pose selection and animation of characters using video data and training techniques [Active]

Publication No.: US11282257B2

Publication date: 2022-03-22

Application No.: US16692471

Filing date: 2019-11-22

Applicants: Adobe Inc., Princeton University

Inventors: Wilmot Li, Hijung Shin, Adam Finkelstein, Nora Willett

IPC classification: G06T13/00, G06T13/80, G06T7/73, G06T7/20, G06T13/40

Abstract: This disclosure generally relates to character animation. More specifically, but not by way of limitation, this disclosure relates to pose selection using data analytics techniques applied to training data, and generating 2D animations of illustrated characters using performance data and the selected poses. An example process or system includes obtaining a selection of training poses of the subject and a set of character poses, obtaining a performance video of the subject, wherein the performance video includes a plurality of performance frames that include poses performed by the subject, grouping the plurality of performance frames into groups of performance frames, assigning a selected training pose from the selection of training poses to each group of performance frames using the clusters of training frames, generating a sequence of character poses based on the groups of performance frames and their assigned training poses, and outputting the sequence of character poses.
