SYSTEMS AND METHODS FOR LYRICS ALIGNMENT
摘要:
A method includes obtaining lyrics text and audio for a media item and generating, using a first encoder, a first plurality of embeddings representing symbols that appear in the lyrics text for the media item. The method includes generating, using a second encoder, a second plurality of embeddings representing an acoustic representation of the audio for the media item. The method includes determining respective similarities between embeddings of the first plurality of embeddings and embeddings of the second plurality of embeddings and aligning the lyrics text and the audio for the media item based on the respective similarities. The method includes, while streaming the audio for the media item, providing, for display, the aligned lyrics text with the streamed audio.
公开/授权文献
信息查询
0/0