Abstract:
A voice recognition method and apparatus in which an input voice is recognized by comparing it with voice standard patterns to find the most similar pattern. Voice standard patterns are stored in a memory. A voice is input. Voice duration lengths and distances are calculated by matching the input voice against the standard patterns. The distance is corrected in accordance with the voice duration length, either so that the voice duration length having the best matching result is used as a reference, or so that the distance becomes smaller as the voice duration length becomes longer. A recognition result is determined in accordance with the corrected distance. The matching is executed by a word spotting method. The input voice to be matched and the voice standard patterns are expressed by voice characteristic parameters.
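The duration-based correction described above might be sketched as follows. The abstract does not give a formula, so the scaling rule here (raw distance multiplied by the ratio of the reference duration to the candidate's duration) is purely an assumption, as are the names `correct_distances` and `recognize` and the sample numbers.

```python
def correct_distances(candidates):
    """candidates: list of (word, raw_distance, duration_in_frames).

    Assumed rule: the duration of the candidate with the smallest raw
    distance is the reference, and each raw distance is scaled by
    reference_duration / duration, so longer matches get smaller distances.
    """
    reference_duration = min(candidates, key=lambda c: c[1])[2]
    return [
        (word, distance * reference_duration / duration)
        for word, distance, duration in candidates
    ]


def recognize(candidates):
    """Return the word with the smallest corrected distance."""
    return min(correct_distances(candidates), key=lambda c: c[1])[0]


# Hypothetical word-spotting results: a short word embedded in a longer one.
spotted = [("six", 10.0, 20), ("sixteen", 11.0, 40)]
print(correct_distances(spotted))
print(recognize(spotted))
```

With these made-up values the longer candidate wins after correction even though its raw distance is slightly larger, which is the kind of effect the correction appears intended to produce.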
Abstract:
An apparatus and method for recognizing speech includes a memory for storing data representing a reference pattern composed of the combination of a word reference pattern and a silence pattern, and a calculator for calculating the differences between data representing the reference pattern and data representing input speech. The use of such a silence pattern in the reference pattern permits a word such as "other" to be distinguished from the word "mother".
Abstract:
A method for encoding syllables of a language, particularly the Japanese language, and for facilitating the extraction of sound codes from the input syllables, for voice recognition or voice synthesis, includes the step of providing a syllable classifying table in which each syllable is represented by an upper byte code indicating the consonant part of the syllable and a lower byte code indicating the non-consonant part of the syllable. The consonants constitute a first category of data classified by phonetic features, while the non-consonants constitute a second category of data classified by phonetic features, so that consonant or non-consonant sounds can be extracted by searching only the first or the second category. Diphthongs are encoded in such a manner that, when their codes are divided by the number of vowels contained in the second category, those containing the same vowel leave the same remainder, which corresponds to the code of that vowel, so that a vowel can be extracted from a diphthong by a simple division.
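A minimal sketch of the two-byte encoding and the remainder-based vowel extraction follows. The actual code values are not given in the abstract, so the tables below (five vowels coded 0 to 4, diphthong codes chosen so that the remainder modulo 5 equals the vowel code, and a handful of consonant codes) are hypothetical.

```python
# Hypothetical second-category (non-consonant) codes: vowels are 0..4 and
# each diphthong code leaves its vowel's code as remainder modulo 5.
VOWELS = ["a", "i", "u", "e", "o"]
NON_CONSONANT_CODES = {"a": 0, "i": 1, "u": 2, "e": 3, "o": 4,
                       "ya": 5, "yu": 7, "yo": 9}

# Hypothetical first-category (consonant) codes; "" means no consonant.
CONSONANT_CODES = {"": 0, "k": 1, "s": 2}


def encode(consonant, non_consonant):
    """Pack the consonant code into the upper byte, the rest into the lower."""
    return (CONSONANT_CODES[consonant] << 8) | NON_CONSONANT_CODES[non_consonant]


def consonant_code(code):
    """Extract the upper byte (consonant part)."""
    return code >> 8


def vowel_of(code):
    """Extract the vowel from the lower byte by taking the remainder."""
    return VOWELS[(code & 0xFF) % len(VOWELS)]


kyo = encode("k", "yo")
print(hex(kyo), consonant_code(kyo), vowel_of(kyo))
```

Because the two parts live in separate bytes, a lookup for consonants never has to inspect the lower byte, and the vowel of a diphthong falls out of a single modulo operation.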
Abstract:
A speech recognition method and apparatus in which, in a first stage, a speech section is sliced in units of words by word spotting and candidate words are selected. Next, in a second stage, matching is conducted in units of phonemes. Consequently, selection of the candidate words and slicing of the speech section can be performed concurrently, and narrowing of the candidate words is facilitated. Furthermore, since reference phoneme patterns under a plurality of environments are prepared, input speech can be recognized under a larger number of conditions using a smaller amount of data than when reference word patterns under a plurality of environments are prepared.
Abstract:
Speech recognition is achieved using a normalized cumulative distance. A normalized Dynamic Programming (DP) value is calculated by dividing a cumulative path distance by an optimal integral path length. The path length is calculated iteratively by adding 2 if the warping path is diagonal or by adding 3 if the warping path is horizontal or vertical. Distance may be calculated by measuring a difference between input power and average power. The power difference is weighted by a coefficient λ between 0 and 1. A Mahalanobis distance is then weighted by (1 - λ) and added to the weighted power difference.
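The normalization and the weighted distance described above can be sketched as below. The path-length increments (2 for a diagonal step, 3 for a horizontal or vertical step) and the λ weighting come from the abstract; everything else is an assumption, including the function names, the choice of the predecessor with the smallest cumulative distance, a starting path length of 0, and the use of a precomputed local distance matrix.

```python
def combined_distance(power_diff, mahalanobis, lam):
    """Weight the power difference by lam and the Mahalanobis distance by 1 - lam."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie between 0 and 1")
    return lam * power_diff + (1.0 - lam) * mahalanobis


def normalized_dp(local):
    """local[i][j]: local distance between input frame i and reference frame j.

    Returns the cumulative distance at the end point divided by the path
    length accumulated along the chosen warping path.
    """
    rows, cols = len(local), len(local[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    length = [[0] * cols for _ in range(rows)]
    dist[0][0] = local[0][0]
    for i in range(rows):
        for j in range(cols):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i > 0 and j > 0:  # diagonal step adds 2 to the path length
                candidates.append((dist[i - 1][j - 1], length[i - 1][j - 1] + 2))
            if i > 0:            # vertical step adds 3
                candidates.append((dist[i - 1][j], length[i - 1][j] + 3))
            if j > 0:            # horizontal step adds 3
                candidates.append((dist[i][j - 1], length[i][j - 1] + 3))
            best_dist, best_len = min(candidates)
            dist[i][j] = best_dist + local[i][j]
            length[i][j] = best_len
    # Guard for a single-cell matrix, where no step has been taken.
    if length[-1][-1] == 0:
        return dist[-1][-1]
    return dist[-1][-1] / length[-1][-1]


print(combined_distance(2.0, 4.0, 0.25))
print(normalized_dp([[1, 5], [5, 1]]))
```

Dividing by the path length makes scores comparable across reference patterns of different lengths, which a raw cumulative distance is not.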
Abstract:
The speech processing apparatus and method includes a microphone, an analyzer, a selector, and a memory. The microphone converts input speech into an electrical signal representing speech data. The analyzer converts the speech data into non-linear frequency converted speech data in accordance with a non-linear frequency conversion. The selector selects a coefficient of the non-linear frequency conversion suitable for each of the phonemes or frames of the speech. The memory stores the speech data.
Abstract:
A method and apparatus for reading out a feature parameter and a driver sound source stored in a VCV (vowel-consonant-vowel) speech segment file, sequentially connecting the readout parameter and the readout sound source information in accordance with a predetermined rule, and supplying connected data to a speech synthesizer, thereby generating a speech output, includes a memory for storing the average power of each vowel, and a power controller for controlling the apparatus to normalize a VCV speech segment so that powers at both ends of each VCV segment coincide with the average power of each vowel.
Abstract:
A voice processing apparatus capable of varying the speed of speech, in which a voice of a predetermined duration is represented by feature parameters and propriety information indicating whether a change in the speech speed is permitted or not. During voice synthesis, the speech speed is varied by skipping or repeating only the feature parameters for which the variation in speech speed is permitted by the associated propriety information.
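A minimal sketch of the frame-level speed change follows. The abstract only says that permitted frames are skipped or repeated; the specific policy used here (drop every other permitted frame to speed up, duplicate each permitted frame to slow down), the function name `change_speed`, and the sample frame labels are assumptions.

```python
def change_speed(frames, faster):
    """frames: list of (feature_parameters, may_change) pairs.

    Frames whose flag is False are always emitted exactly once. Frames whose
    flag is True are thinned out when faster is True and doubled otherwise.
    """
    output = []
    skip_next = False
    for params, may_change in frames:
        if not may_change:
            output.append(params)
        elif faster:
            # Drop every other permitted frame.
            if not skip_next:
                output.append(params)
            skip_next = not skip_next
        else:
            # Repeat each permitted frame once.
            output.extend([params, params])
    return output


# Hypothetical utterance: consonant frames are protected, vowel frames are not.
utterance = [("k", False), ("a1", True), ("a2", True), ("a3", True), ("t", False)]
print(change_speed(utterance, faster=True))
print(change_speed(utterance, faster=False))
```

Restricting the change to flagged frames leaves short, timing-sensitive segments untouched while the overall duration still changes.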
Abstract:
When gas is explosively burnt within a combustion chamber 5, a striking piston is impulsively driven to drive out a fastener from a nose portion. A gas tube 26 is provided between the combustion chamber 5 and a feed cylinder 21. The feed cylinder 21 accommodates a feed piston that reciprocally moves a feed claw 23, which engages with and disengages from connected fasteners, in a nail feed direction for feeding toward the nose portion side and in an evacuation direction opposite thereto. A check valve 31 having a restriction hole 33 is disposed partway along the gas tube 26 so as to be opened and closed. The check valve 31 is normally urged by a spring in the closing direction and opens against the force of the spring under the gas pressure from the combustion chamber.
Abstract:
The present invention provides a camera data transfer system comprising an imaging means for generating and outputting image data, and a camera interface means including a first holding means for holding the image data of the preceding frame, a second holding means for holding the present image data, a comparing means for comparing the contents of the first and second holding means, and a bus interface means for controlling the input/output of data to and from a bus. In the camera data transfer system, the data is transferred through the bus interface means to a frame memory of a display means connected to the bus, so that the data in the frame memory is displayed. Further, when the comparison by the comparing means finds that the data held in the first holding means and the data held in the second holding means do not coincide, only the portion corresponding to the mismatched data is transferred to the frame memory of the display means, and the data at the corresponding place in the first holding means is rewritten.
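The compare-and-transfer step can be sketched in software as follows. The abstract describes hardware holding means and a bus interface; modelling them as Python lists, comparing element by element, and the name `transfer_changes` are all assumptions made for illustration.

```python
def transfer_changes(previous, current, frame_memory):
    """Copy only the elements that differ between two frames.

    previous: contents of the first holding means (preceding frame)
    current: contents of the second holding means (present frame)
    frame_memory: display frame memory, updated in place

    Returns the indices that were transferred. The matching entries of
    previous are rewritten so that it is ready for the next comparison.
    """
    changed = []
    for index, (old, new) in enumerate(zip(previous, current)):
        if old != new:
            frame_memory[index] = new   # transfer only the mismatched data
            previous[index] = new       # rewrite the first holding means
            changed.append(index)
    return changed


previous_frame = [1, 2, 3, 4]
present_frame = [1, 9, 3, 7]
display_memory = [1, 2, 3, 4]
print(transfer_changes(previous_frame, present_frame, display_memory))
print(display_memory)
```

Only the changed positions cross the bus, so a mostly static scene costs far less bandwidth than resending every frame in full.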