Abstract:
Unsupervised learning algorithms for audio source separation, such as non-negative matrix factorization (NMF) and principal component analysis (PCA), can be understood as factorizations of a data matrix subject to different constraints. These algorithms yield components that have a relevant structure and capture homogeneous musical events. The invention presents an automatic fusion method that merges these components into tracks associated with the different instruments present in the sound source.
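The factorization view of NMF can be illustrated with a minimal sketch. It assumes a non-negative magnitude spectrogram V (frequency by time) and uses the standard multiplicative updates for the Frobenius-norm objective; the function name nmf, the rank, the iteration count, and the toy data are illustrative assumptions, not details taken from the abstract. The fusion step that merges components into instrument tracks is not shown.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate non-negative V (freq x time) as W @ H with W, H >= 0.

    Uses multiplicative updates for the squared Frobenius error.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps   # spectral templates (columns)
    H = rng.random((rank, n_time)) + eps   # activations over time (rows)
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay >= 0.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relative_error(V, W, H):
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Toy "spectrogram" built from two non-negative components (synthetic data).
rng = np.random.default_rng(1)
V = rng.random((16, 2)) @ rng.random((2, 40))

W1, H1 = nmf(V, rank=2, n_iter=1)
W, H = nmf(V, rank=2, n_iter=200)
err_start = relative_error(V, W1, H1)
err_end = relative_error(V, W, H)
```

Each column of W paired with the matching row of H is one component; a grouping of such components into tracks would operate on these pairs.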
Abstract:
A method is provided that comprises segmenting an audio source file; optimizing a model based upon probability; and separating the audio source file.
Abstract:
A method is provided for splitting a digital signal using prosodic features included in the signal. The method includes calculating onset value locations in the signal, where the onset values correspond to stress accents in the signal. Moreover, the method includes splitting, using a processor, the signal into a sequence of prosodic unit candidates by superimposing the stress accent locations on the signal, and processing the sequence so that it includes only true prosodic units.
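The first two steps, locating onsets and splitting the signal at them, can be sketched as follows. The abstract does not say how onset values are calculated, so this sketch assumes a simple rise in short-time energy as the onset measure; the function names, frame size, and threshold are hypothetical. The final filtering of candidates into true prosodic units is not shown.

```python
import numpy as np

def onset_locations(signal, frame=256, threshold=0.5):
    """Return sample indices where short-time energy rises sharply.

    Assumption: an energy rise stands in for the onset value of the abstract.
    """
    n_frames = len(signal) // frame
    frames = signal[: n_frames * frame].reshape(n_frames, frame)
    energy = (frames ** 2).sum(axis=1)
    # Keep only increases in energy from one frame to the next.
    rise = np.maximum(np.diff(energy, prepend=energy[0]), 0.0)
    if rise.max() == 0:
        return np.array([], dtype=int)
    rise = rise / rise.max()
    return np.flatnonzero(rise > threshold) * frame

def split_at(signal, locations):
    """Cut the signal at the given sample indices into candidate units."""
    return np.split(signal, locations)

# Synthetic signal: silence with two bursts starting at samples 1024 and 3072.
signal = np.zeros(4096)
signal[1024:1536] = 1.0
signal[3072:3584] = 1.0

onsets = onset_locations(signal)
candidates = split_at(signal, onsets)
```

On this synthetic input the detected onsets coincide with the two burst starts, so the signal is cut into three candidate segments.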
Abstract:
A source separation system is provided. The system includes a plurality of sources that are subjected to automatic source separation through the joint use of segmental information and spatial diversity. The system further includes an automatically provided set of spectral shapes, derived from the automatic source separation, that represents spectral diversity. The system still further includes a plurality of mixing parameters derived from the set of spectral shapes. Within a sampling range, a triplet is processed, wherein a reconstruction of the Short-Time Fourier Transform (STFT) corresponding to a source triplet among the set of triplets is performed.
Abstract:
A method for determining user liveness is provided that includes calculating, by a computing device, a spectral property difference between voice biometric data captured from a user and user record voice biometric data. The user and the computing device constitute a user-computing device pair, and the voice biometric data is captured by the computing device during a verification transaction. Moreover, the method includes inputting the spectral property difference into a machine learning algorithm, calculating an output score with the machine learning algorithm, and determining that the voice biometric data was captured from a live user when the output score satisfies a threshold score.
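The pipeline of spectral difference, score, and threshold can be sketched as below. The abstract names neither the spectral property nor the machine learning algorithm, so everything specific here is an assumption: band-averaged log magnitude as the spectral property, a logistic model as the algorithm, untrained placeholder weights, and "satisfies" read as "greater than or equal to".

```python
import numpy as np

def band_log_spectrum(samples, n_bands=8):
    """Mean log-magnitude in n_bands equal-width frequency bands (assumed feature)."""
    mag = np.abs(np.fft.rfft(samples))
    bands = np.array_split(mag, n_bands)
    return np.log(np.array([b.mean() for b in bands]) + 1e-12)

def liveness_score(captured, enrolled, weights, bias):
    """Logistic score computed from the spectral property difference."""
    diff = band_log_spectrum(captured) - band_log_spectrum(enrolled)
    return 1.0 / (1.0 + np.exp(-(weights @ diff + bias)))

def is_live(score, threshold=0.5):
    # Assumption: the score "satisfies" the threshold when it is >= threshold.
    return score >= threshold

# Synthetic stand-ins for enrolled and captured voice data.
rng = np.random.default_rng(0)
enrolled = rng.standard_normal(2048)
captured = rng.standard_normal(2048)

weights = -np.ones(8)   # placeholder values, not trained on any data
bias = 0.0

score_same = liveness_score(enrolled, enrolled, weights, bias)
score_other = liveness_score(captured, enrolled, weights, bias)
decision_same = is_live(score_same)
```

In practice the weights and threshold would come from training and calibration on labeled live and spoofed captures; the values above only make the sketch executable.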