Abstract:
The present invention relates to a method of synthesizing a first sound signal based on a second sound signal, the first sound signal having a required first fundamental frequency and the second sound signal having a second fundamental frequency. The method comprises the steps of: a) determining required pitch bell locations in the time domain of the first sound signal, the pitch bell locations being spaced one period of the first fundamental frequency apart; b) providing pitch bells by windowing the second sound signal at pitch bell locations in the time domain of the second sound signal, these pitch bell locations being spaced one period of the second fundamental frequency apart; c) randomly selecting a pitch bell from the provided pitch bells for each of the required pitch bell locations; and d) performing an overlap-and-add operation on the selected pitch bells to synthesize the first sound signal.
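Steps a) to d) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the claimed implementation: both periods are given as whole numbers of samples, each pitch bell is a raised-cosine (Hann) window spanning two periods of the second signal, and the names `hann`, `extract_pitch_bells` and `synthesize` are hypothetical.

```python
import math
import random


def hann(n):
    """Raised-cosine window of length n (assumed window shape)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def extract_pitch_bells(source, source_period):
    """Step b: window the source at locations one source period apart.

    Each bell covers two source periods, centered on its location.
    """
    window = hann(2 * source_period + 1)
    bells = []
    for center in range(source_period, len(source) - source_period, source_period):
        segment = source[center - source_period:center + source_period + 1]
        bells.append([s * w for s, w in zip(segment, window)])
    return bells


def synthesize(source, source_period, target_period, out_len, rng):
    """Steps a, c and d: place randomly chosen bells one target period apart."""
    bells = extract_pitch_bells(source, source_period)
    if not bells:
        raise ValueError("source is too short to yield a pitch bell")
    out = [0.0] * out_len
    half = source_period
    # Step a: required locations, one target period apart.
    for center in range(half, out_len - half, target_period):
        bell = rng.choice(bells)  # Step c: random selection.
        for k, value in enumerate(bell):  # Step d: overlap and add.
            out[center - half + k] += value
    return out
```

For example, `synthesize(source, 80, 100, 1000, random.Random(0))` re-spaces bells taken from a signal with an 80-sample period so that they recur every 100 samples, which lowers the fundamental frequency of the output.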
Abstract:
In order to play back waveform data at a variable performance tempo by using waveform data that complies with a desired reference tempo, the present invention performs timeline expansion/contraction control on the waveform data to be played back, according to the relationship between the performance tempo and the reference tempo. The present invention also determines whether to limit the playback of the waveform data according to the relationship between the performance tempo and the reference tempo. When playback is to be limited, the present invention either stops playback of the waveform data, or reduces the resolution of playback processing and continues playback of the waveform data. The present invention stops playback of the waveform data when, for example, the relationship between the performance tempo and the reference tempo is one in which the waveform data would be played back at a performance tempo that would cause a processing delay or a deterioration of sound quality. As a result, it is possible to preemptively prevent a system freeze and to solve problems such as the generation of music that has a slower tempo than the desired performance tempo, the generation of music in which sound cuts out intermittently due to noise, or a significant reduction in sound quality.
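The decision logic described above can be sketched as a small function. This is a hedged illustration only: the abstract gives no numeric limits, so the thresholds `degrade_above` and `stop_above`, the choice to limit only fast tempos, and the name `plan_playback` are all assumptions.

```python
def plan_playback(performance_tempo, reference_tempo,
                  degrade_above=1.5, stop_above=2.0):
    """Return (stretch_factor, action) for one playback request.

    stretch_factor scales the duration of the waveform data: values below
    1.0 contract the timeline, values above 1.0 expand it.
    The two thresholds are hypothetical; the source does not specify them.
    """
    if performance_tempo <= 0 or reference_tempo <= 0:
        raise ValueError("tempos must be positive")
    tempo_ratio = performance_tempo / reference_tempo
    stretch_factor = reference_tempo / performance_tempo
    if tempo_ratio > stop_above:
        action = "stop"  # playback is limited by stopping it
    elif tempo_ratio > degrade_above:
        action = "reduce_resolution"  # playback continues at lower resolution
    else:
        action = "normal"
    return stretch_factor, action
```

With these assumed thresholds, a reference tempo of 120 played at 200 yields `"reduce_resolution"`, and the same data played at 300 yields `"stop"`.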
Abstract:
The invention concerns digital audio processing and in particular the detection of periods where samples can be deleted or repeated unobtrusively so as to change the average sample-rate or to provide time delay modification. Differences between succeeding sample values are evaluated and compared with a threshold and samples are deleted or repeated where two or more consecutive sample value differences are less than the said threshold value.
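The threshold test can be sketched as follows. This is a minimal illustration, assuming that the magnitude of each difference is compared with the threshold and that a sample becomes a candidate once it ends a run of at least two small differences; the function names are hypothetical.

```python
def find_edit_points(samples, threshold, min_run=2):
    """Return indices where a sample could be deleted or repeated.

    An index qualifies when it ends a run of at least min_run consecutive
    sample-to-sample differences whose magnitude is below threshold.
    """
    points = []
    run = 0
    for i in range(1, len(samples)):
        if abs(samples[i] - samples[i - 1]) < threshold:
            run += 1
            if run >= min_run:
                points.append(i)
        else:
            run = 0
    return points


def delete_sample(samples, index):
    """Drop one sample, shortening the signal by one."""
    return samples[:index] + samples[index + 1:]


def repeat_sample(samples, index):
    """Duplicate one sample, lengthening the signal by one."""
    return samples[:index + 1] + samples[index:]
```

Deleting at a returned index shortens the stream by one sample and repeating lengthens it by one, so applying either operation at a chosen rate changes the average sample rate.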
Abstract:
A method for reproducing speech signals at a controlled speed whereby rate conversion of the time axis may be facilitated, and a method for synthesizing the speech whereby pitch conversion can be realized by a simplified structure based on the encoded speech data without changing the phoneme. With the speech reproducing method, an encoding unit 2 discriminates whether an input speech signal is voiced or unvoiced. Based on the results of discrimination, the encoding unit 2 performs sinusoidal synthesis and encoding for a signal portion found to be voiced, while performing vector quantization by closed-loop search for an optimum vector, using an analysis-by-synthesis method, for a portion found to be unvoiced, in order to find encoded parameters. The decoding unit 3 compands the time axis of the encoded parameters, obtained at every pre-set frame, at a period modification unit 4, which modifies the output period of the parameters to create modified encoded parameters associated with different time points corresponding to the pre-set frames. A speech synthesis unit 6 synthesizes the voiced speech portion and the unvoiced speech portion based on the modified encoded parameters. With the speech synthesizing unit, an encoded bit stream or encoded data is outputted by an encoded data outputting unit 301. Of these data, at least pitch data and amplitude data of the spectral envelope are sent to a waveform synthesis unit 303 via a data conversion unit 302, where the number of amplitude data of the spectral envelope is changed, without changing the shape of the spectral envelope, depending on a desired pitch value. The waveform synthesis unit 303 synthesizes the speech waveform based on the converted spectral envelope data and pitch data.
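One way to change the number of spectral-envelope amplitude values while keeping the envelope's shape is to resample the envelope by interpolation. The sketch below uses linear interpolation between the first and last amplitude values; this is an assumption for illustration, since the abstract does not state how the data conversion unit performs the conversion, and `resample_envelope` is a hypothetical name.

```python
def resample_envelope(amplitudes, new_count):
    """Resample an envelope to new_count points by linear interpolation.

    The first and last points are preserved, so the overall shape is kept
    while the number of amplitude values changes.
    """
    if not amplitudes or new_count < 1:
        raise ValueError("need at least one amplitude and new_count >= 1")
    if len(amplitudes) == 1 or new_count == 1:
        return [amplitudes[0]] * new_count
    scale = (len(amplitudes) - 1) / (new_count - 1)
    out = []
    for k in range(new_count):
        pos = k * scale                    # position on the original grid
        lo = min(int(pos), len(amplitudes) - 1)
        hi = min(lo + 1, len(amplitudes) - 1)
        frac = pos - lo
        out.append(amplitudes[lo] * (1 - frac) + amplitudes[hi] * frac)
    return out
```

A lower desired pitch places more harmonics under the same envelope, so `new_count` would grow; a higher pitch would shrink it.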
Abstract:
The reproducing apparatus of the invention reproduces a plurality of band signals which have been subjected to band division. It includes a time-scale modifier, which receives the plurality of band signals and time-axis compresses the respective band signals at the same ratio, thereby outputting a plurality of time-axis compressed band signals, and a synthesis filter bank for synthesizing the plurality of time-axis compressed band signals.
Abstract:
A method of communicating speech comprising time-warping a residual low band speech signal to an expanded or compressed version of the residual low band speech signal, time-warping a high band speech signal to an expanded or compressed version of the high band speech signal, and merging the time-warped low band and high band speech signals to give an entire time-warped speech signal. In the low band, the residual low band speech signal is synthesized after time-warping of the residual low band signal while in the high band, an unwarped high band signal is synthesized before time-warping of the high band speech signal. The method may further comprise classifying speech segments and encoding the speech segments. The encoding of the speech segments may be one of code-excited linear prediction, noise-excited linear prediction or 1/8 frame (silence) coding.
Abstract:
The present invention provides a method and system for processing an audio signal. According to an exemplary method, an audio signal such as a digital voice signal is received and divided into one or more individual unit cycles. An audio speed conversion operation is enabled by repeating or removing one or more of the individual unit cycles. In particular, repeating one or more of the individual unit cycles decreases audio speed, and removing one or more of the individual unit cycles increases audio speed.
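The repeat-or-remove idea can be sketched as below. This is a simplified illustration, assuming fixed-length unit cycles and a rule that acts on every n-th cycle; a real system would more likely derive cycle boundaries from the signal itself. The names `split_into_cycles` and `change_speed` are hypothetical.

```python
def split_into_cycles(samples, cycle_len):
    """Divide the signal into consecutive unit cycles of cycle_len samples."""
    return [samples[i:i + cycle_len] for i in range(0, len(samples), cycle_len)]


def change_speed(samples, cycle_len, every, mode):
    """Repeat ('slower') or remove ('faster') every `every`-th unit cycle."""
    if mode not in ("slower", "faster"):
        raise ValueError("mode must be 'slower' or 'faster'")
    out = []
    for n, cycle in enumerate(split_into_cycles(samples, cycle_len), start=1):
        if n % every == 0:
            if mode == "slower":
                out.extend(cycle)
                out.extend(cycle)  # repeated cycle lengthens the audio
            # in 'faster' mode the cycle is dropped, shortening the audio
        else:
            out.extend(cycle)
    return out
```

With 12 samples, 3-sample cycles and `every=2`, the `"slower"` mode yields 18 samples and the `"faster"` mode yields 6.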
Abstract:
An apparatus for processing audio signals is provided with a memory for storing the audio signals. The audio signals are written into the memory at write addresses. The audio signals are read from the memory in accordance with read addresses at a speed lower than the speed at which the audio signals are written into the memory. It is determined whether the amount of audio signals stored in the memory and not yet read therefrom is increasing. The write addresses are then updated when the amount of audio signals not yet read is increasing. When small signals, whose levels are lower than a reference level, are detected among the audio signals, updating of the write addresses for those small signals may be halted.