Abstract:
Disclosed herein is a pitch estimation apparatus and associated methods for estimating a fundamental frequency of an audio signal from a fundamental frequency probability density function by modeling the audio signal as a weighted mixture of a plurality of tone models corresponding respectively to harmonic structures of individual fundamental frequencies, so that the fundamental frequency probability density function of the audio signal is given as a distribution of respective weights of the plurality of the tone models.
Abstract:
In a pitch estimation apparatus, a function estimation part estimates a fundamental frequency probability density function of an audio signal by repeating a weight calculation process and an estimated shape specification process. The weight calculation process calculates a weight of each tone model of each fundamental frequency based on an estimated shape of each tone model of each fundamental frequency. The estimated shape indicates a degree of dominancy of a corresponding tone model in a total harmonic structure of the audio signal. The estimated shape specification process specifies each estimated shape of each tone model based on an amplitude spectrum of the audio signal, the harmonic structure of each tone model and the weight of each tone model. A similarity analysis part calculates a similarity index value indicating a degree of similarity between each tone model and corresponding estimated shape. A weight correction part reduces a weight of a tone model of a certain fundamental frequency having the similarity index value indicating that the tone model and the corresponding estimated shape are not similar to each other.
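The similarity analysis and weight correction steps described above can be illustrated with a minimal sketch. The abstract does not say how the similarity index is computed or by how much a weight is reduced, so the cosine similarity, the `threshold` of 0.9, and the `penalty` factor of 0.1 below are assumptions chosen for illustration, and all function names are hypothetical.

```python
import math

def estimated_shapes(spectrum, tone_models, weights):
    """Split the spectrum among tone models in proportion to weight * model value.

    Each returned shape is normalized to sum to 1 (assumed convention).
    """
    n_bins = len(spectrum)
    # Value of the weighted mixture at each frequency bin.
    mix = [sum(w * m[x] for w, m in zip(weights, tone_models)) for x in range(n_bins)]
    shapes = []
    for w, m in zip(weights, tone_models):
        shape = [spectrum[x] * w * m[x] / mix[x] if mix[x] > 0 else 0.0
                 for x in range(n_bins)]
        total = sum(shape)
        shapes.append([v / total for v in shape] if total > 0 else shape)
    return shapes

def cosine_similarity(a, b):
    """Similarity index (assumed): cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def correct_weights(weights, tone_models, shapes, threshold=0.9, penalty=0.1):
    """Scale down weights whose tone model does not resemble its estimated shape."""
    corrected = [
        w * (penalty if cosine_similarity(m, s) < threshold else 1.0)
        for w, m, s in zip(weights, tone_models, shapes)
    ]
    total = sum(corrected)
    return [w / total for w in corrected] if total > 0 else corrected
```

In this sketch, a tone model that explains only one stray spectral peak ends up with an estimated shape concentrated on that peak, which scores a low similarity against the full harmonic structure and so has its weight reduced.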
Abstract:
A sound analysis apparatus stores sound source structure data defining a constraint on one or more sounds that can be simultaneously generated by a sound source of an input audio signal. A form estimation part selects fundamental frequencies of one or more sounds likely to be contained in the input audio signal with peaked weights from various fundamental frequencies during sequential updating and optimizing of weights of tone models corresponding to the various fundamental frequencies, so that the sounds of the selected fundamental frequencies satisfy the sound source structure data, and creates form data specifying the selected fundamental frequencies. A previous distribution imparting part imparts a previous distribution to the weights of the tone models corresponding to the various fundamental frequencies so as to emphasize weights corresponding to the fundamental frequencies specified by the form data created by the form estimation part.
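As a rough sketch of the two parts described above, the following assumes that the sound source constraint is simply a maximum number of simultaneous sounds and that the previous distribution is blended linearly into the weights with a strength `beta`. Both choices, and the names `select_form` and `impart_prior`, are hypothetical; the abstract does not specify the form of the constraint or of the blending.

```python
def select_form(weights, max_sounds):
    """Pick indices of locally peaked weights, strongest first, up to max_sounds.

    max_sounds stands in for the sound source structure data (assumed).
    """
    last = len(weights) - 1
    peaks = [
        k for k in range(len(weights))
        if (k == 0 or weights[k] > weights[k - 1])
        and (k == last or weights[k] > weights[k + 1])
    ]
    peaks.sort(key=lambda k: weights[k], reverse=True)
    return sorted(peaks[:max_sounds])

def impart_prior(weights, form, beta=1.0):
    """Blend a distribution concentrated on the selected indices into the weights."""
    if not form:
        return list(weights)
    selected = set(form)
    prior = [1.0 / len(form) if k in selected else 0.0 for k in range(len(weights))]
    return [(w + beta * p) / (1.0 + beta) for w, p in zip(weights, prior)]
```

If the input weights sum to 1, the blended weights also sum to 1, with the selected fundamental frequencies emphasized and the others scaled down.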
Abstract:
A sound analysis apparatus employs tone models which are associated with various fundamental frequencies and each of which simulates a harmonic structure of a performance sound generated by a musical instrument, then defines a weighted mixture of the tone models to simulate frequency components of the performance sound, further sequentially updates and optimizes weight values of the respective tone models so that a frequency distribution of the weighted mixture of the tone models corresponds to a distribution of the frequency components of the performance sound, and estimates the fundamental frequency of the performance sound based on the optimized weight values.
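The weight optimization described above can be sketched as an expectation-maximization style update. This is a minimal illustration, not the patented procedure: it assumes linear frequency bins, fixed tone models whose harmonics decay as 1/h, and a spectrum normalized to sum to 1. The names `harmonic_tone_model`, `update_weights`, and `estimate_f0` are hypothetical.

```python
def harmonic_tone_model(f0_bin, n_bins, n_harmonics=4):
    """Toy tone model: mass 1/h at the h-th integer multiple of f0_bin, normalized."""
    model = [0.0] * n_bins
    for h in range(1, n_harmonics + 1):
        b = f0_bin * h
        if b < n_bins:
            model[b] += 1.0 / h
    total = sum(model)
    return [m / total for m in model]

def update_weights(spectrum, tone_models, weights):
    """One update: share each bin among the models, then re-sum the shares per model."""
    new_weights = [0.0] * len(weights)
    for x, s in enumerate(spectrum):
        mix = sum(w * m[x] for w, m in zip(weights, tone_models))
        if mix <= 0.0:
            continue  # no model explains this bin
        for k, (w, m) in enumerate(zip(weights, tone_models)):
            new_weights[k] += s * w * m[x] / mix
    total = sum(new_weights)
    return [w / total for w in new_weights] if total > 0 else list(weights)

def estimate_f0(spectrum, candidates, n_iter=50):
    """Return the candidate bin with the largest optimized weight, plus all weights."""
    total = sum(spectrum)
    if total <= 0:
        raise ValueError("spectrum must contain some energy")
    spectrum = [s / total for s in spectrum]
    models = [harmonic_tone_model(c, len(spectrum)) for c in candidates]
    weights = [1.0 / len(candidates)] * len(candidates)
    for _ in range(n_iter):
        weights = update_weights(spectrum, models, weights)
    best = max(range(len(candidates)), key=lambda k: weights[k])
    return candidates[best], weights
```

For example, `estimate_f0(harmonic_tone_model(5, 64), [5, 7, 10])` feeds in a spectrum built from the bin-5 model and lets the update shift weight away from the octave candidate at bin 10, which shares only some of its harmonics with the spectrum.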
Abstract:
A character value of a sound signal is extracted for each unit portion, and degrees of similarity between the character values of the individual unit portions are calculated and arranged in a matrix configuration. Each column of the matrix holds the degrees of similarity acquired by comparing, for each of the unit portions, the sound signal and a delayed sound signal obtained by delaying the sound signal by a time difference equal to an integral multiple of a time length of the unit portion, and the matrix has a plurality of such columns in association with different time differences. A repetition probability is calculated for each of the columns corresponding to the different time differences in the matrix. A plurality of peaks in a distribution of the repetition probabilities are identified. A loop region in the sound signal is identified by collating a reference matrix with the degree-of-similarity matrix.
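The matrix construction and peak search described above can be sketched as follows. This is an illustration under assumptions: the character value of each unit portion is taken to be a feature vector, the degree of similarity is a cosine similarity, and the repetition probability of a column is approximated by the mean similarity in that column. None of these choices comes from the abstract, the function names are hypothetical, and the final collation with a reference matrix is not shown.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def lag_similarity_matrix(features, max_lag):
    """matrix[i][lag - 1] compares unit i with unit i - lag; None where undefined."""
    matrix = []
    for i in range(len(features)):
        row = [
            cosine(features[i], features[i - lag]) if i >= lag else None
            for lag in range(1, max_lag + 1)
        ]
        matrix.append(row)
    return matrix

def repetition_probabilities(matrix, max_lag):
    """Mean similarity per column, used here as a stand-in for repetition probability."""
    probs = []
    for col in range(max_lag):
        vals = [row[col] for row in matrix if row[col] is not None]
        probs.append(sum(vals) / len(vals) if vals else 0.0)
    return probs

def peak_lags(probs):
    """Lags (1-based) whose value exceeds both neighbours; edge columns are skipped."""
    return [
        i + 1 for i in range(1, len(probs) - 1)
        if probs[i] > probs[i - 1] and probs[i] > probs[i + 1]
    ]
```

For a feature sequence that repeats every four unit portions, the columns for lags 4 and 8 stand out as peaks in this sketch.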
Abstract:
It is an object of the present invention to provide an improved technique for searching for a tone data set of a phrase constructed in a rhythm pattern that satisfies a predetermined condition of similarity to a rhythm pattern intended by a user. The user inputs a rhythm pattern via a rhythm input device. An input rhythm pattern storage section stores the input rhythm pattern into a RAM on the basis of clock signals output from a bar line clock output section and trigger data included in the input rhythm pattern. A rhythm pattern search section searches through a rhythm database for a tone data set presenting the highest degree of similarity to the stored input rhythm pattern. A performance processing section causes a sound output section to audibly output the searched-out tone data set.
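The search step described above can be sketched as a nearest-pattern lookup. The abstract does not define the similarity measure, so this example assumes that each rhythm pattern is a non-empty list of onset positions normalized to the range 0 to 1 within one bar, and that similarity is the inverse of a symmetric nearest-onset distance. The database contents and the names `pattern_distance` and `search_rhythm` are hypothetical.

```python
def pattern_distance(query, candidate):
    """Symmetric sum of distances from each onset to its nearest onset in the other pattern.

    Both patterns must be non-empty lists of positions within one bar (0.0 to 1.0).
    """
    def one_way(src, dst):
        return sum(min(abs(s - d) for d in dst) for s in src)
    return one_way(query, candidate) + one_way(candidate, query)

def search_rhythm(query, database):
    """Return the name of the stored pattern closest to the query (highest similarity)."""
    return min(database, key=lambda name: pattern_distance(query, database[name]))

# Hypothetical rhythm database keyed by tone data set name.
RHYTHM_DB = {
    "four_on_floor": [0.0, 0.25, 0.5, 0.75],
    "backbeat": [0.25, 0.75],
    "offbeat": [0.125, 0.375, 0.625, 0.875],
}
```

A slightly mistimed input such as `[0.02, 0.26, 0.49, 0.77]` is closest to the `four_on_floor` entry under this distance.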
Abstract:
A similarity assessment apparatus is provided for assessing a performance sound based on a model performance sound. In the apparatus, a probability density function generating unit divides data of a performance sound into a sequence of frames each having a predetermined temporal length, and generates a probability density function of a fundamental frequency for each frame of the performance sound. A probability density function providing portion provides a probability density function of a fundamental frequency for each frame of the model performance sound. A similarity assessment unit compares the generated probability density function of a frame of the performance sound with the provided probability density function of a frame of the model performance sound so as to assess a similarity between the performance sound and the model performance sound.
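The frame-by-frame comparison described above can be sketched as follows. The abstract does not name a comparison measure, so this example assumes the Bhattacharyya coefficient between two discrete probability density functions defined over the same fundamental frequency bins, averaged over the frames that both sequences share. The function names are hypothetical.

```python
import math

def bhattacharyya(p, q):
    """Overlap between two discrete distributions: 1.0 if identical, 0.0 if disjoint."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def assess_similarity(performance_pdfs, model_pdfs):
    """Average per-frame overlap between performance and model F0 distributions.

    Each argument is a list of frames; each frame is a list of probabilities
    over the same fundamental frequency bins (assumed layout).
    """
    n = min(len(performance_pdfs), len(model_pdfs))
    if n == 0:
        return 0.0
    # zip stops at the shorter sequence, matching the n used for the average.
    total = sum(bhattacharyya(p, q) for p, q in zip(performance_pdfs, model_pdfs))
    return total / n
```

Comparing whole distributions rather than single pitch estimates lets a frame with an ambiguous pitch still contribute a partial score.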