Abstract:
A sentence or singing voice is synthesized as natural speech close to the human voice. To this end, singing metrical data are formed in a tag processing unit 211 in a singing synthesis unit 212 in a speech synthesis apparatus 200, based on singing data and an analyzed text portion. A language analysis unit 213 performs language processing on text portions other than the singing data. For a text portion that this language processing determines to be registered in a natural metrical dictionary, the corresponding natural metrical data are selected and their parameters are adjusted in a metrical data adjustment unit 222, based on phonemic segment data of a phonemic segment storage unit 223 in the metrical data adjustment unit 222. For a text portion not registered in the natural metrical dictionary, a phonemic symbol string is generated in a natural metrical dictionary storage unit 214, after which metrical data are generated in a metrical generating unit 221. A waveform generating unit 224 concatenates the necessary phonemic segment data, based on the natural metrical data, the metrical data and the singing metrical data, to generate speech waveform data.
Abstract:
A signal processing system comprises a delta-sigma modulation unit supplied with an input analog signal for producing one-bit data as a result of a delta-sigma modulation, an arithmetic unit supplied with the output one-bit data of the delta-sigma modulation unit for applying a predetermined arithmetic operation thereon, an integration unit for integrating an output of the arithmetic unit to produce multiple-bit data as an output, a comparator unit supplied with the multiple-bit data from the integration unit at a first input port and multiple-bit reference data at a second input port for producing one-bit data as a result of comparison, a feedback unit supplied with the output of the comparator unit for producing the multiple-bit reference data based upon the one-bit data produced by the comparator unit such that said reference data predicts the multiple-bit data supplied to the first input port, and a digital-to-analog conversion unit supplied with said output one-bit data of the comparator unit for converting the same to an analog output signal.
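As a rough illustration of the first stage only, the following sketch shows a first-order delta-sigma modulator that turns samples into one-bit data. The abstract does not specify the modulator order or structure, so the first-order loop, the function names, and the use of +1/-1 for the one-bit values are all assumptions made for this example; the arithmetic, integration, comparator, and feedback units are not modeled.

```python
def delta_sigma_modulate(samples):
    """First-order delta-sigma modulator (assumed structure).

    Maps input samples in [-1, 1] to one-bit data encoded as +1 / -1.
    """
    integrator = 0.0
    feedback = 0.0  # previous output bit fed back into the loop
    bits = []
    for x in samples:
        # Accumulate the error between the input and the previous output.
        integrator += x - feedback
        # One-bit quantizer.
        bit = 1 if integrator >= 0.0 else -1
        bits.append(bit)
        feedback = float(bit)
    return bits


def bit_average(bits):
    """Crude decoder: the mean of the bitstream approximates the input level."""
    return sum(bits) / len(bits)
```

For a constant input, the density of +1 bits tracks the input level, so `bit_average(delta_sigma_modulate([0.5] * 1000))` comes out close to 0.5.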
Abstract:
A speech synthesis apparatus and a speech synthesis method, in which a waveform of a desired formant shape may be generated with a small volume of computing operations. A voiced sound generating unit of the speech synthesis apparatus includes n single formant generating units, an adder for summing their outputs to generate a one-pitch waveform, a one-pitch buffer unit, and a waveform overlapping unit for overlapping a number of the one-pitch waveforms as the one-pitch waveform is shifted by one pitch period each time. Each single formant generating unit is supplied with three parameters, namely a center frequency of a formant representing the formant position, a formant bandwidth, and a formant gain, and reads out the band characteristics waveform from a band characteristics waveform storage unit at a readout interval derived from the bandwidth wn, to effect expansion along the time axis. The resulting waveform is multiplied by a sine wave of the center frequency to output a pitch waveform representing the characteristics of that formant.
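The signal flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the apparatus itself: the stored band characteristics waveform is taken to be a Hann-shaped table, the readout interval is assumed proportional to the bandwidth relative to a hypothetical reference bandwidth `ref_bandwidth_hz`, and all function names are invented for the example.

```python
import math

# Hypothetical stored band-characteristics waveform (assumed Hann shape).
TABLE_SIZE = 256
BAND_TABLE = [
    0.5 - 0.5 * math.cos(2 * math.pi * i / (TABLE_SIZE - 1))
    for i in range(TABLE_SIZE)
]


def single_formant(center_hz, bandwidth_hz, gain, length,
                   sample_rate=16000, ref_bandwidth_hz=100.0):
    """One formant: time-scaled envelope multiplied by a sine at the center frequency."""
    # Assumed rule: a narrower bandwidth gives a smaller readout step,
    # which stretches the stored waveform along the time axis.
    step = bandwidth_hz / ref_bandwidth_hz
    out = []
    for n in range(length):
        idx = int(n * step)
        env = BAND_TABLE[idx] if idx < TABLE_SIZE else 0.0
        out.append(gain * env * math.sin(2 * math.pi * center_hz * n / sample_rate))
    return out


def one_pitch_waveform(formants, length, sample_rate=16000):
    """Adder: sum the outputs of the single formant generators."""
    total = [0.0] * length
    for center_hz, bandwidth_hz, gain in formants:
        wave = single_formant(center_hz, bandwidth_hz, gain, length, sample_rate)
        for n, v in enumerate(wave):
            total[n] += v
    return total


def overlap(one_pitch, pitch_period, n_periods):
    """Overlap-add copies of the one-pitch waveform, each shifted by one pitch period."""
    out = [0.0] * (pitch_period * (n_periods - 1) + len(one_pitch))
    for k in range(n_periods):
        start = k * pitch_period
        for n, v in enumerate(one_pitch):
            out[start + n] += v
    return out
```

Because each formant costs one table lookup and one sine multiplication per sample, the per-sample work stays small, which is in line with the low computing volume the abstract aims for.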
Abstract:
Identification information, access information indicating that the main body of a mail message has not been accessed or read at a destination terminal, information about the sender, information about the receiver, time and date information, and subject information are collated with one another and stored and managed as transmitted mail managing information in a mail box. When a mail message is transmitted, the transmitted mail message includes a return mail program that is actuated when the transmitted mail message is accessed and read at the destination terminal and that returns to a server, as a return mail message, an acknowledgement that the transmitted mail message has been accessed and read together with the identification information for the transmitted mail message. When information as to whether the transmitted mail message has been accessed and read is obtained from the received return mail message, the identification information is extracted from the received return mail message, and the access information for the transmitted mail message corresponding to that identification information is changed to information indicating that the transmitted mail message has been accessed and read.
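The bookkeeping described above might look like the following sketch. The class names, field names, and the dictionary layout of the return mail message (`mail_id`) are hypothetical, chosen only to illustrate how the access information could be flipped once a return mail message arrives.

```python
from dataclasses import dataclass


@dataclass
class SentMailRecord:
    """Transmitted mail managing information for one message (hypothetical fields)."""
    sender: str
    receiver: str
    sent_at: str
    subject: str
    accessed: bool = False  # access information: False means not yet accessed or read


class MailBox:
    """Stores records keyed by the identification information of each sent message."""

    def __init__(self):
        self._records = {}

    def register(self, mail_id, record):
        self._records[mail_id] = record

    def handle_return_mail(self, return_mail):
        """Extract the identification information and mark that message as read.

        Returns True if a matching record was found and updated.
        """
        mail_id = return_mail.get("mail_id")  # assumed key name
        record = self._records.get(mail_id)
        if record is None:
            return False
        record.accessed = True
        return True

    def is_accessed(self, mail_id):
        return self._records[mail_id].accessed
```

A sender-side client would call `register` at transmission time and `handle_return_mail` whenever a return mail message is received.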
Abstract:
A voice-generating information making apparatus comprises: a talking way data storing section for storing therein talking way data comprising character string information grouped according to the character string information, a character string input unit for inputting a character string (consisting of a control section, an application storing section, a key entry section, and a display section), a retrieving unit for retrieving a group having the same character string information as the inputted character string, a voice tone data storing section for storing therein a plurality of voice tone data, a voice synthesizing section for synthesizing a voice, a voice selecting unit for selecting a desired voice from the synthesized voice, and a voice-generating document storing section for storing therein talking way data corresponding to the selected voice as a voice-generating document in correlation to the inputted character string.
Abstract:
Voice-generating information comprising discrete voice data for velocity or pitch of a voice is made by dispensing the discrete data so that the voice data is not dependent on a time lag between phonemes and at the same time is expressed at a level relative to a reference. This information includes data on plural types of voice tone and is stored in a voice-generating information storing section. Voice tone data indicating sound parameters for each voice element, such as a phoneme, for each voice tone type is stored in a voice tone storing section. Under control of a control section, the voice tone data corresponding to the type of voice tone in the voice-generating information stored in the voice-generating information storing section is selected from the plurality of voice tone data stored in the voice tone storing section. Meter patterns, which occur successively in the direction of a time axis, are developed according to the voice-generating information. A voice waveform is synthesized according to the meter patterns and the selected voice tone data, and the voice is output from a speaker.
Abstract:
A rule-based speech synthesis apparatus by which concatenation distortion may be kept below a preset value without dependency on the utterance, wherein a parameter correction unit reads out a target parameter for a vowel from a target parameter storage, responsive to the phonemes at a leading end and at a trailing end of a speech element and to acoustic feature parameters output from a speech element selector, and accordingly corrects the acoustic feature parameters of the speech element. The parameter correction unit corrects the parameters so that the parameters at the leading and trailing ends of the speech element are equal to the target parameter for the vowel of the corresponding phoneme, and outputs the corrected parameters.
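One way to picture the correction is the sketch below. The abstract only states that the end parameters are made equal to the vowel targets; how the correction is spread over the interior frames is not given, so the linear interpolation of the two end offsets used here is purely an assumption, as are the function name and the use of a single scalar parameter per frame.

```python
def correct_parameters(frames, target_lead, target_trail):
    """Shift a parameter track so its two ends match the vowel targets.

    frames: one scalar acoustic feature value per frame of the speech element.
    The end offsets are interpolated linearly across the element (assumed rule).
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    lead_offset = target_lead - frames[0]
    trail_offset = target_trail - frames[-1]
    last = len(frames) - 1
    corrected = []
    for i, value in enumerate(frames):
        w = i / last  # 0.0 at the leading end, 1.0 at the trailing end
        corrected.append(value + (1.0 - w) * lead_offset + w * trail_offset)
    return corrected
```

If two adjacent speech elements are corrected toward the same vowel target, their parameters agree at the joint, which is how the concatenation distortion is kept small regardless of the utterance.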
Abstract:
An information communication system having host and remote terminal devices, and a method for generating a voice, in which one voice tone data is selected from a plurality of types of stored voice tone data according to received voice-generating information. The voice is reproduced by generating a voice waveform according to a meter pattern and the selected voice tone data. The discrete voice data may be presented for either one or both of velocity and pitch of a voice, correlated to a time lag between discrete voice data. The discrete data is dispensed so that each voice data is not dependent on a time lag between phonemes and at the same time is expressed at a level relative to a reference value. Voice tone data indicating a sound parameter for each voice element, such as a phoneme, for each voice tone type is stored in a voice tone data storing section in a terminal device. File information is transferred from a host device to a terminal device according to a request from the terminal device, and the terminal device reads out the voice tone data specified by the voice-generating information in the file information from the voice tone data storing section. A voice is synthesized according to the voice tone data and the voice-generating information.
Abstract:
In a speaker verification system, a detecting part detects a speech section of an input speech signal by using time-series acoustic parameters thereof. A segmentation part calculates individuality information for segmentation by using the time-series acoustic parameters within the speech section, and segments the input speech section into a plurality of blocks based on the individuality information. A feature extracting part extracts features of an unknown speaker for every segmented block by using the time-series acoustic parameters. A distance calculating part calculates a distance between the features of the speaker extracted by the feature extracting part and reference features stored in a memory. A decision part decides whether or not the unknown speaker is a real speaker by comparing the calculated distance with a predetermined threshold value. Segmentation is performed by calculating a primary moment of the spectrum over a block and finding successive values that satisfy a predetermined criterion.
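Two of the building blocks above can be sketched briefly. The primary moment is computed here as the first moment (centroid) of a magnitude spectrum in bin units. The distance measure is not named in the abstract, so the Euclidean distance used below is an assumption, as are the function names. The segmentation criterion itself is unspecified and is not modeled.

```python
import math


def primary_moment(spectrum):
    """First moment (centroid) of a magnitude spectrum, in bin units."""
    total = sum(spectrum)
    if total == 0:
        return 0.0
    return sum(k * a for k, a in enumerate(spectrum)) / total


def distance(features, reference):
    """Euclidean distance between extracted and reference features (assumed measure)."""
    return math.sqrt(sum((f - r) ** 2 for f, r in zip(features, reference)))


def verify(features, reference, threshold):
    """Accept the speaker when the distance does not exceed the threshold."""
    return distance(features, reference) <= threshold
```

In a full system, `primary_moment` would be evaluated frame by frame within the detected speech section, and `verify` would be applied to the features extracted from the resulting blocks.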