-
Publication No.: US11461649B2
Publication date: 2022-10-04
Application No.: US16823538
Filing date: 2020-03-19
Applicant: Adobe Inc.
Inventor: Jongpil Lee, Nicholas J. Bryan, Justin J. Salamon, Zeyu Jin
Abstract: In implementations of searching for music, a music search system can receive a music search request that includes a music file including music content. The music search system can also receive a selected musical attribute from a plurality of musical attributes. The music search system includes a music search application that can generate musical features of the music content, where a respective one or more of the musical features correspond to a respective one of the musical attributes. The music search application can then compare the musical features that correspond to the selected musical attribute to audio features of audio files, and determine similar audio files to the music file based on the comparison of the musical features to the audio features of the audio files.
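The attribute-selective comparison described in this abstract can be pictured with a small sketch. It is only an illustration under assumptions, not the patented implementation: the attribute names, the idea that each attribute owns a fixed slice of one feature vector, the cosine similarity measure, and the file names are all hypothetical.

```python
import math

# Hypothetical layout: each musical attribute owns one slice of the feature vector.
ATTRIBUTE_SLICES = {
    "genre": slice(0, 2),
    "mood": slice(2, 4),
    "tempo": slice(4, 6),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_features, catalog, attribute, top_k=2):
    """Rank catalog files by similarity on the selected attribute's features only."""
    part = ATTRIBUTE_SLICES[attribute]
    scored = [
        (cosine(query_features[part], features[part]), name)
        for name, features in catalog.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Made-up audio features for three catalog files and one query file.
catalog = {
    "a.wav": [1.0, 0.0, 0.2, 0.9, 0.5, 0.5],
    "b.wav": [0.0, 1.0, 0.9, 0.1, 0.5, 0.5],
    "c.wav": [0.9, 0.1, 0.9, 0.2, 0.1, 0.9],
}
query = [1.0, 0.1, 1.0, 0.0, 0.3, 0.7]

by_genre = search(query, catalog, "genre")
by_mood = search(query, catalog, "mood")
```

With these made-up vectors, the same query file returns a different ranking depending on which attribute is selected, which is the behavior the abstract describes.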
-
Publication No.: US20190318726A1
Publication date: 2019-10-17
Application No.: US16108996
Filing date: 2018-08-22
Applicant: Adobe Inc., The Trustees of Princeton University
Inventor: Zeyu Jin, Gautham J. Mysore, Jingwan Lu, Adam Finkelstein
Abstract: Techniques for a recursive deep-learning approach for performing speech synthesis using a repeatable structure that splits an input tensor into a left half and right half similar to the operation of the Fast Fourier Transform, performs a 1-D convolution on each respective half, performs a summation and then applies a post-processing function. The repeatable structure may be utilized in a series configuration to operate as a vocoder or perform other speech processing functions.
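A minimal sketch of the repeatable split, convolve, sum, post-process structure follows. It makes several simplifying assumptions that the abstract does not state: the input is a single-channel list whose length is a power of two, each 1-D convolution has kernel size 1 (so it reduces to a scalar multiply), and the post-processing function is a ReLU. All function names are hypothetical.

```python
def relu(values):
    """Assumed post-processing function: clamp negatives to zero."""
    return [max(0.0, v) for v in values]


def conv1x1(values, weight):
    """Kernel-size-1, single-channel 1-D convolution: scale every sample."""
    return [weight * v for v in values]


def split_sum_layer(x, w_left, w_right, bias=0.0):
    """Split x into halves, convolve each half, sum them, then post-process."""
    half = len(x) // 2
    left, right = x[:half], x[half:]
    summed = [
        l + r + bias
        for l, r in zip(conv1x1(left, w_left), conv1x1(right, w_right))
    ]
    return relu(summed)


def stack(x, layer_weights):
    """Apply the layer in series; each layer halves the sequence length."""
    for w_left, w_right in layer_weights:
        x = split_sum_layer(x, w_left, w_right)
    return x


x = [1.0, 2.0, 3.0, 4.0, -5.0, 6.0, 7.0, 8.0]
out = stack(x, [(1.0, 1.0), (1.0, 0.5), (0.5, 0.5)])
```

Three layers in series reduce the eight input samples to a single output value, mirroring how repeated halving in the Fast Fourier Transform covers the whole input in a logarithmic number of stages.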
-
Publication No.: US11830481B2
Publication date: 2023-11-28
Application No.: US17538683
Filing date: 2021-11-30
Applicant: Adobe Inc.
Inventor: Maxwell Morrison, Zeyu Jin, Nicholas Bryan, Juan Pablo Caceres Chomali, Lucas Rencker
IPC: G10L15/18, G10L25/90, G10L15/187, G10L15/02, G10L15/04, G10L15/16, G10L21/0208
CPC classification number: G10L15/1807, G10L15/02, G10L15/04, G10L15/16, G10L15/187, G10L21/0208, G10L25/90, G10L2015/025, G10L2021/02082
Abstract: Methods are performed by one or more processing devices for correcting prosody in audio data. A method includes operations for accessing subject audio data in an audio edit region of the audio data. The subject audio data in the audio edit region potentially lacks prosodic continuity with unedited audio data in an unedited audio portion of the audio data. The operations further include predicting, based on a context of the unedited audio data, phoneme durations including a respective phoneme duration of each phoneme in the unedited audio data. The operations further include predicting, based on the context of the unedited audio data, a pitch contour comprising at least one respective pitch value of each phoneme in the unedited audio data. Additionally, the operations include correcting prosody of the subject audio data in the audio edit region by applying the phoneme durations and the pitch contour to the subject audio data.
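One way to picture the final "applying" step is as a per-phoneme edit plan: a time-stretch factor derived from the predicted duration and a pitch shift derived from the predicted pitch. The sketch below assumes the predictions are already available and that one pitch value per phoneme is enough; the data layout and the function name `prosody_edit_plan` are hypothetical, and the prediction models themselves are not shown.

```python
import math


def prosody_edit_plan(phonemes, predicted_durations, predicted_pitch_hz):
    """Return (symbol, stretch_factor, pitch_shift_semitones) for each phoneme.

    phonemes: list of (symbol, current_duration_s, current_pitch_hz) tuples
    describing the audio in the edit region (assumed input format).
    """
    plan = []
    for (symbol, duration, pitch), new_duration, new_pitch in zip(
        phonemes, predicted_durations, predicted_pitch_hz
    ):
        stretch = new_duration / duration  # > 1 lengthens, < 1 shortens
        semitones = 12.0 * math.log2(new_pitch / pitch)  # equal-tempered shift
        plan.append((symbol, stretch, semitones))
    return plan


# Made-up values: two phonemes in an edit region and their predicted prosody.
edit_region = [("HH", 0.05, 110.0), ("AY", 0.20, 220.0)]
plan = prosody_edit_plan(edit_region, [0.10, 0.10], [220.0, 220.0])
```

A downstream time-stretching and pitch-shifting routine, not sketched here, would consume such a plan to resynthesize the edit region.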
-
Publication No.: US11636342B2
Publication date: 2023-04-25
Application No.: US17959011
Filing date: 2022-10-03
Applicant: Adobe Inc.
Inventor: Jongpil Lee, Nicholas J. Bryan, Justin J. Salamon, Zeyu Jin
Abstract: In implementations of searching for music, a music search system can receive a music search request that includes a music file including music content. The music search system can also receive a selected musical attribute from a plurality of musical attributes. The music search system includes a music search application that can generate musical features of the music content, where a respective one or more of the musical features correspond to a respective one of the musical attributes. The music search application can then compare the musical features that correspond to the selected musical attribute to audio features of audio files, and determine similar audio files to the music file based on the comparison of the musical features to the audio features of the audio files.
-
Publication No.: US20230097356A1
Publication date: 2023-03-30
Application No.: US17959011
Filing date: 2022-10-03
Applicant: Adobe Inc.
Inventor: Jongpil Lee, Nicholas J. Bryan, Justin J. Salamon, Zeyu Jin
Abstract: In implementations of searching for music, a music search system can receive a music search request that includes a music file including music content. The music search system can also receive a selected musical attribute from a plurality of musical attributes. The music search system includes a music search application that can generate musical features of the music content, where a respective one or more of the musical features correspond to a respective one of the musical attributes. The music search application can then compare the musical features that correspond to the selected musical attribute to audio features of audio files, and determine similar audio files to the music file based on the comparison of the musical features to the audio features of the audio files.
-
Publication No.: US11514925B2
Publication date: 2022-11-29
Application No.: US16863591
Filing date: 2020-04-30
Applicant: Adobe Inc., THE TRUSTEES OF PRINCETON UNIVERSITY
Inventor: Zeyu Jin, Jiaqi Su, Adam Finkelstein
IPC: G10L21/0364, G10L25/30, G10L25/18, G06N3/08, G06N3/04
Abstract: Operations of a method include receiving a request to enhance a new source audio. Responsive to the request, the new source audio is input into a prediction model that was previously trained. Training the prediction model includes providing a generative adversarial network including the prediction model and a discriminator. Training data is obtained including tuples of source audios and target audios, each tuple including a source audio and a corresponding target audio. During training, the prediction model generates predicted audios based on the source audios. Training further includes applying a loss function to the predicted audios and the target audios, where the loss function incorporates a combination of a spectrogram loss and an adversarial loss. The prediction model is updated to optimize that loss function. After training, based on the new source audio, the prediction model generates a new predicted audio as an enhanced version of the new source audio.
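The combined loss mentioned in this abstract can be sketched as a weighted sum of a spectrogram term and an adversarial term. The sketch below is illustrative only and rests on assumptions the abstract does not specify: the spectrogram term is a mean absolute difference between magnitude spectrograms computed with a naive DFT over non-overlapping frames, the adversarial term is the generator-side negative log of a discriminator probability, and `adv_weight` is a hypothetical weighting parameter.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_size=4):
    """Naive DFT magnitudes over non-overlapping frames (no windowing)."""
    spectrogram = []
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        frame = signal[start:start + frame_size]
        spectrogram.append([
            abs(sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            ))
            for k in range(frame_size // 2 + 1)
        ])
    return spectrogram


def spectrogram_loss(predicted, target):
    """Mean absolute difference between the two magnitude spectrograms."""
    pred_spec = magnitude_spectrogram(predicted)
    target_spec = magnitude_spectrogram(target)
    diffs = [
        abs(a - b)
        for pred_frame, target_frame in zip(pred_spec, target_spec)
        for a, b in zip(pred_frame, target_frame)
    ]
    return sum(diffs) / len(diffs)


def adversarial_loss(discriminator_score):
    """Assumed generator-side loss: -log of the probability that the
    discriminator assigns to the predicted audio being a real target."""
    return -math.log(max(discriminator_score, 1e-12))


def combined_loss(predicted, target, discriminator_score, adv_weight=0.1):
    """Spectrogram loss plus weighted adversarial loss."""
    return (spectrogram_loss(predicted, target)
            + adv_weight * adversarial_loss(discriminator_score))


# Made-up target audio and a slightly degraded prediction.
target = [0.0, 1.0, 0.0, -1.0] * 2
predicted = [0.1, 0.9, 0.0, -0.8] * 2
loss = combined_loss(predicted, target, discriminator_score=0.5)
```

In an actual training loop the discriminator score would come from a trained network and the loss would be minimized by gradient descent on the prediction model; both are outside this sketch.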
-
Publication No.: US20190130894A1
Publication date: 2019-05-02
Application No.: US15796292
Filing date: 2017-10-27
Applicant: Adobe Inc., The Trustees of Princeton University
Inventor: Zeyu Jin, Gautham J. Mysore, Stephen DiVerdi, Jingwan Lu, Adam Finkelstein
CPC classification number: G10L13/08, G06F17/24, G10L13/00, G10L13/04, G10L13/06, G10L13/07, G10L15/02, G10L21/00, G10L2021/0135, G11B27/022
Abstract: Systems and techniques are disclosed for synthesizing a new word or short phrase such that it blends seamlessly in the context of insertion or replacement in an existing narration. In one such embodiment, a text-to-speech synthesizer is utilized to say the word or phrase in a generic voice. Voice conversion is then performed on the generic voice to convert it into a voice that matches the narration. An editor and interface are described that support fully automatic synthesis, selection among a candidate set of alternative pronunciations, fine control over edit placements and pitch profiles, and guidance by the editor's own voice.