Using a predictive model to automatically enhance audio having various audio quality issues

    Publication No.: US11514925B2

    Publication Date: 2022-11-29

    Application No.: US16863591

    Filing Date: 2020-04-30

    Abstract: Operations of a method include receiving a request to enhance a new source audio. Responsive to the request, the new source audio is input into a prediction model that was previously trained. Training the prediction model includes providing a generative adversarial network including the prediction model and a discriminator. Training data is obtained including tuples of source audios and target audios, each tuple including a source audio and a corresponding target audio. During training, the prediction model generates predicted audios based on the source audios. Training further includes applying a loss function to the predicted audios and the target audios, where the loss function incorporates a combination of a spectrogram loss and an adversarial loss. The prediction model is updated to optimize that loss function. After training, based on the new source audio, the prediction model generates a new predicted audio as an enhanced version of the new source audio.
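The combined objective described in the abstract (a spectrogram loss plus an adversarial loss) can be sketched roughly as follows. This is a minimal illustration, not the patent's disclosed implementation: the function names, the simplified unwindowed short-time FFT, and the `adv_weight` of 0.1 are all illustrative assumptions.

```python
import numpy as np

def spectrogram(audio, frame_len=256, hop=128):
    # Magnitude spectrogram via a simplified short-time FFT (no windowing).
    frames = [audio[i:i + frame_len]
              for i in range(0, len(audio) - frame_len + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1))

def combined_loss(predicted, target, disc_score, adv_weight=0.1):
    # Spectrogram loss: L1 distance between magnitude spectrograms of the
    # predicted audio and the target audio.
    spec_loss = np.mean(np.abs(spectrogram(predicted) - spectrogram(target)))
    # Adversarial loss: non-saturating generator loss, penalizing predictions
    # the discriminator scores as unlikely to be real (disc_score in (0, 1]).
    adv_loss = -np.log(disc_score + 1e-8)
    return spec_loss + adv_weight * adv_loss
```

Under this sketch, a predicted audio identical to the target with a fully convinced discriminator yields a near-zero loss, while noisy predictions or low discriminator scores raise it.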

    USING A PREDICTIVE MODEL TO AUTOMATICALLY ENHANCE AUDIO HAVING VARIOUS AUDIO QUALITY ISSUES

    Publication No.: US20210343305A1

    Publication Date: 2021-11-04

    Application No.: US16863591

    Filing Date: 2020-04-30

    Abstract: Operations of a method include receiving a request to enhance a new source audio. Responsive to the request, the new source audio is input into a prediction model that was previously trained. Training the prediction model includes providing a generative adversarial network including the prediction model and a discriminator. Training data is obtained including tuples of source audios and target audios, each tuple including a source audio and a corresponding target audio. During training, the prediction model generates predicted audios based on the source audios. Training further includes applying a loss function to the predicted audios and the target audios, where the loss function incorporates a combination of a spectrogram loss and an adversarial loss. The prediction model is updated to optimize that loss function. After training, based on the new source audio, the prediction model generates a new predicted audio as an enhanced version of the new source audio.

    POSE SELECTION AND ANIMATION OF CHARACTERS USING VIDEO DATA AND TRAINING TECHNIQUES

    Publication No.: US20210158593A1

    Publication Date: 2021-05-27

    Application No.: US16692471

    Filing Date: 2019-11-22

    Abstract: This disclosure generally relates to character animation. More specifically, but not by way of limitation, this disclosure relates to pose selection using data analytics techniques applied to training data, and generating 2D animations of illustrated characters using performance data and the selected poses. An example process or system includes obtaining a selection of training poses of the subject and a set of character poses, obtaining a performance video of the subject, wherein the performance video includes a plurality of performance frames that include poses performed by the subject, grouping the plurality of performance frames into groups of performance frames, assigning a selected training pose from the selection of training poses to each group of performance frames using the clusters of training frames, generating a sequence of character poses based on the groups of performance frames and their assigned training poses, and outputting the sequence of character poses.
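The assignment step in this abstract, matching each group of performance frames to one of the selected training poses, can be sketched as a nearest-neighbor lookup. The feature representation (for example, mean joint positions per group) and the function name are assumptions for illustration; the patent's actual assignment criterion may differ.

```python
import numpy as np

def assign_training_poses(group_features, training_pose_features):
    # group_features: (n_groups, d) feature vector summarizing each group of
    # performance frames (e.g. mean joint positions across the group).
    # training_pose_features: (n_poses, d) features of the selected poses.
    # Returns, for each group, the index of the nearest training pose.
    dists = np.linalg.norm(group_features[:, None]
                           - training_pose_features[None], axis=2)
    return dists.argmin(axis=1)
```

For example, a group whose mean joint positions sit near training pose 0 is assigned index 0, and the resulting index sequence drives the output sequence of character poses.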

    POSE SELECTION AND ANIMATION OF CHARACTERS USING VIDEO DATA AND TRAINING TECHNIQUES

    Publication No.: US20210158565A1

    Publication Date: 2021-05-27

    Application No.: US16692450

    Filing Date: 2019-11-22

    Abstract: This disclosure generally relates to character animation. More specifically, this disclosure relates to pose selection using data analytics techniques applied to training data, and generating 2D animations of illustrated characters using performance data and the selected poses. An example process or system includes extracting sets of joint positions from a training video including the subject, grouping the plurality of frames into frame groups using the sets of joint positions for each frame, identifying a representative frame for each frame group using the frame groups, clustering the frame groups into clusters using the representative frames, outputting a visualization of the clusters at a user interface, and receiving a selection of a cluster for animation of the subject.
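The grouping pipeline this abstract describes (per-frame joint positions, frame groups, representative frames) can be approximated with a plain k-means pass over flattened joint coordinates. Everything here is an illustrative assumption: the patent does not specify k-means, and the function name, iteration count, and representative-frame rule (frame nearest the cluster center) are sketch choices.

```python
import numpy as np

def cluster_frames(joint_positions, n_clusters=3, iters=20, seed=0):
    # joint_positions: (n_frames, n_joints * 2) flattened 2D joint coordinates
    # extracted from the training video, one row per frame.
    rng = np.random.default_rng(seed)
    # Initialize cluster centers from randomly chosen frames.
    centers = joint_positions[rng.choice(len(joint_positions), n_clusters,
                                         replace=False)]
    for _ in range(iters):
        # Assign each frame to its nearest cluster center.
        dists = np.linalg.norm(joint_positions[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the mean of its assigned frames.
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = joint_positions[labels == k].mean(axis=0)
    # Representative frame per cluster: the frame closest to its center
    # (None for a cluster that ended up empty).
    reps = []
    for k in range(n_clusters):
        idx = np.flatnonzero(labels == k)
        if idx.size == 0:
            reps.append(None)
            continue
        d = np.linalg.norm(joint_positions[idx] - centers[k], axis=1)
        reps.append(int(idx[np.argmin(d)]))
    return labels, reps
```

The representative frames could then be rendered in a user interface as the cluster visualization from which a user selects a cluster for animation.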

    Pose selection and animation of characters using video data and training techniques

    Publication No.: US11282257B2

    Publication Date: 2022-03-22

    Application No.: US16692471

    Filing Date: 2019-11-22

    Abstract: This disclosure generally relates to character animation. More specifically, but not by way of limitation, this disclosure relates to pose selection using data analytics techniques applied to training data, and generating 2D animations of illustrated characters using performance data and the selected poses. An example process or system includes obtaining a selection of training poses of the subject and a set of character poses, obtaining a performance video of the subject, wherein the performance video includes a plurality of performance frames that include poses performed by the subject, grouping the plurality of performance frames into groups of performance frames, assigning a selected training pose from the selection of training poses to each group of performance frames using the clusters of training frames, generating a sequence of character poses based on the groups of performance frames and their assigned training poses, and outputting the sequence of character poses.