Singing Voice Separation with Deep U-Net Convolutional Networks

    Publication No.: US20250087232A1

    Publication date: 2025-03-13

    Application No.: US18956913

    Filing date: 2024-11-22

    Applicant: Spotify AB

    Abstract: A system, method and computer product for training a neural network system. The method comprises applying an audio signal to the neural network system, the audio signal including a vocal component and a non-vocal component. The method also comprises comparing an output of the neural network system to a target signal, and adjusting at least one parameter of the neural network system to reduce a result of the comparing, for training the neural network system to estimate one of the vocal component and the non-vocal component. In one example embodiment, the system comprises a U-Net architecture. After training, the system can estimate vocal or instrumental components of an audio signal, depending on which type of component the system is trained to estimate.
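
The training procedure in this abstract (apply the mixture, compare the output to a target, adjust parameters to reduce the comparison result) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the "network" is a toy with one learnable soft-mask value per frequency bin rather than a U-Net, the comparison is a mean absolute (L1) difference, the update is plain gradient descent, and the magnitudes are made-up numbers. The abstract does not specify any of these choices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(theta, mixture):
    # Toy stand-in for the network: one soft mask value per frequency bin,
    # applied to the mixture magnitude (NOT a U-Net; illustration only).
    return [sigmoid(t) * x for t, x in zip(theta, mixture)]

def l1_loss(output, target):
    # "Comparing an output to a target signal": mean absolute difference
    # (an assumed choice of comparison).
    return sum(abs(o - y) for o, y in zip(output, target)) / len(target)

def train_step(theta, mixture, target, lr=0.5):
    # "Adjusting at least one parameter to reduce a result of the comparing":
    # one (sub)gradient descent step on the L1 loss.
    n = len(target)
    new_theta = []
    for t, x, y in zip(theta, mixture, target):
        s = sigmoid(t)
        diff = s * x - y
        sign = (diff > 0) - (diff < 0)
        grad = sign * x * s * (1.0 - s) / n
        new_theta.append(t - lr * grad)
    return new_theta

# Hypothetical magnitudes for four frequency bins of a single frame.
mixture = [1.0, 0.8, 0.5, 0.2]       # vocal + non-vocal
vocal_target = [0.9, 0.1, 0.4, 0.0]  # isolated vocal component

theta = [0.0] * len(mixture)
loss_before = l1_loss(forward(theta, mixture), vocal_target)
for _ in range(200):
    theta = train_step(theta, mixture, vocal_target)
loss_after = l1_loss(forward(theta, mixture), vocal_target)

print(f"loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

Swapping `vocal_target` for the non-vocal magnitudes would train the same toy model to estimate the instrumental component instead, mirroring the abstract's remark that the estimate depends on which component the system is trained on.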

    SINGING VOICE SEPARATION WITH DEEP U-NET CONVOLUTIONAL NETWORKS

    Publication No.: US20210256995A1

    Publication date: 2021-08-19

    Application No.: US17135127

    Filing date: 2020-12-28

    Applicant: Spotify AB

    Abstract: A system, method and computer product for training a neural network system. The method comprises applying an audio signal to the neural network system, the audio signal including a vocal component and a non-vocal component. The method also comprises comparing an output of the neural network system to a target signal, and adjusting at least one parameter of the neural network system to reduce a result of the comparing, for training the neural network system to estimate one of the vocal component and the non-vocal component. In one example embodiment, the system comprises a U-Net architecture. After training, the system can estimate vocal or instrumental components of an audio signal, depending on which type of component the system is trained to estimate.

    Automatic isolation of multiple instruments from musical mixtures

    Publication No.: US11568256B2

    Publication date: 2023-01-31

    Application No.: US17205296

    Filing date: 2021-03-18

    Applicant: Spotify AB

    Abstract: A system, method and computer product for training a neural network system. The method comprises inputting an audio signal to the system to generate plural outputs f(X, Θ). The audio signal includes one or more of vocal content and/or musical instrument content, and each output f(X, Θ) corresponds to a respective one of the different content types. The method also comprises comparing individual outputs f(X, Θ) of the neural network system to corresponding target signals. For each compared output f(X, Θ), at least one parameter of the system is adjusted to reduce a result of the comparing performed for the output f(X, Θ), to train the system to estimate the different content types. In one example embodiment, the system comprises a U-Net architecture. After training, the system can estimate various different types of vocal and/or instrument components of an audio signal, depending on which type of component(s) the system is trained to estimate.
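
The multi-output variant in this abstract, where each output f(X, Θ) is compared to its own target and parameters are adjusted per compared output, can be sketched in the same spirit. The sketch is hypothetical throughout: the content-type names, the per-bin soft-mask model, the L1 comparison, the learning rate, and the numbers are illustrative assumptions, and the toy keeps a separate parameter list per content type instead of a U-Net.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, mixture):
    # One output per content type: a per-bin soft mask times the mixture.
    # (Toy stand-in for the plural outputs f(X, Theta); not a U-Net.)
    return {name: [sigmoid(t) * x for t, x in zip(theta, mixture)]
            for name, theta in params.items()}

def l1(output, target):
    return sum(abs(o - y) for o, y in zip(output, target)) / len(target)

def train_step(params, mixture, targets, lr=0.5):
    # For each compared output, adjust that output's parameters to
    # reduce its own comparison result (one subgradient step each).
    updated = {}
    for name, theta in params.items():
        target = targets[name]
        n = len(target)
        new_theta = []
        for t, x, y in zip(theta, mixture, target):
            s = sigmoid(t)
            diff = s * x - y
            sign = (diff > 0) - (diff < 0)
            new_theta.append(t - lr * sign * x * s * (1.0 - s) / n)
        updated[name] = new_theta
    return updated

# Hypothetical magnitudes for four frequency bins of a single frame.
mixture = [1.0, 0.8, 0.5, 0.2]
targets = {
    "vocals": [0.9, 0.1, 0.4, 0.0],
    "drums": [0.1, 0.6, 0.1, 0.15],
}

params = {name: [0.0] * len(mixture) for name in targets}
before = {n: l1(out, targets[n]) for n, out in forward(params, mixture).items()}
for _ in range(200):
    params = train_step(params, mixture, targets)
after = {n: l1(out, targets[n]) for n, out in forward(params, mixture).items()}

for name in targets:
    print(f"{name}: {before[name]:.4f} -> {after[name]:.4f}")
```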

    Singing voice separation with deep u-net convolutional networks

    Publication No.: US10923141B2

    Publication date: 2021-02-16

    Application No.: US16055870

    Filing date: 2018-08-06

    Applicant: Spotify AB

    Abstract: A system, method and computer product for training a neural network system. The method comprises applying an audio signal to the neural network system, the audio signal including a vocal component and a non-vocal component. The method also comprises comparing an output of the neural network system to a target signal, and adjusting at least one parameter of the neural network system to reduce a result of the comparing, for training the neural network system to estimate one of the vocal component and the non-vocal component. In one example embodiment, the system comprises a U-Net architecture. After training, the system can estimate vocal or instrumental components of an audio signal, depending on which type of component the system is trained to estimate.

    SINGING VOICE SEPARATION WITH DEEP U-NET CONVOLUTIONAL NETWORKS

    Publication No.: US20200043516A1

    Publication date: 2020-02-06

    Application No.: US16055870

    Filing date: 2018-08-06

    Applicant: Spotify AB

    Abstract: A system, method and computer product for training a neural network system. The method comprises applying an audio signal to the neural network system, the audio signal including a vocal component and a non-vocal component. The method also comprises comparing an output of the neural network system to a target signal, and adjusting at least one parameter of the neural network system to reduce a result of the comparing, for training the neural network system to estimate one of the vocal component and the non-vocal component. In one example embodiment, the system comprises a U-Net architecture. After training, the system can estimate vocal or instrumental components of an audio signal, depending on which type of component the system is trained to estimate.

    AUTOMATIC ISOLATION OF MULTIPLE INSTRUMENTS FROM MUSICAL MIXTURES

    Publication No.: US20200042879A1

    Publication date: 2020-02-06

    Application No.: US16521756

    Filing date: 2019-07-25

    Applicant: Spotify AB

    Abstract: A system, method and computer product for training a neural network system. The method comprises inputting an audio signal to the system to generate plural outputs f(X, Θ). The audio signal includes one or more of vocal content and/or musical instrument content, and each output f(X, Θ) corresponds to a respective one of the different content types. The method also comprises comparing individual outputs f(X, Θ) of the neural network system to corresponding target signals. For each compared output f(X, Θ), at least one parameter of the system is adjusted to reduce a result of the comparing performed for the output f(X, Θ), to train the system to estimate the different content types. In one example embodiment, the system comprises a U-Net architecture. After training, the system can estimate various different types of vocal and/or instrument components of an audio signal, depending on which type of component(s) the system is trained to estimate.
