Abstract:
The present invention records a loudness-level-reference segment of audio when creating speech audio files and audio files that include background sounds. The speech audio files can then be combined with the background-sound audio files in any desired combination. When the files are combined, their relative audio levels are matched by matching the loudness-level-reference segments with each other. Any of a variety of known digital signal processing techniques can be used to normalize the component audio files. The combined audio files, which contain speech and background sounds (e.g., ambient noise) at matching relative audio levels, can be used to test and/or train a speech recognition engine or a speech processing system.
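The level-matching step can be illustrated with a minimal sketch. It assumes RMS level as the loudness measure and plain lists of float samples as the audio representation; the abstract names neither, and the function names `rms` and `match_and_mix` are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    if not samples:
        raise ValueError("empty segment")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def match_and_mix(speech, speech_ref, background, background_ref):
    """Scale the background so its reference segment matches the speech
    reference segment in RMS level, then sum the two signals.

    Assumption: RMS stands in for whichever loudness measure is actually used.
    Returns the gain applied to the background and the mixed samples.
    """
    bg_level = rms(background_ref)
    if bg_level == 0.0:
        raise ValueError("background reference segment is silent")
    gain = rms(speech_ref) / bg_level

    # Mix sample by sample, treating the shorter signal as zero-padded.
    length = max(len(speech), len(background))
    mixed = []
    for i in range(length):
        s = speech[i] if i < len(speech) else 0.0
        b = background[i] * gain if i < len(background) else 0.0
        mixed.append(s + b)
    return gain, mixed
```

Because each file carries its own reference segment, the gain can be computed for any speech and background pair without re-recording either one.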
Abstract:
A computer system includes a central processing unit (CPU) for providing computer output signals, a non-computational base transceiver coupled with the CPU, and a non-computational remote transceiver. The base transceiver comprises an interface for communicating with the CPU; a modulator for receiving the computer output signals and providing modulated signals representing said computer output signals; and a transmitter for receiving said modulated signals and transmitting them via wireless media. The remote transceiver comprises a receiver for receiving signals via wireless media and a transmitter for transmitting signals via wireless media.
Abstract:
A method of automatically selecting processing parameters for encoding digital content. Metadata containing the genre of the digital content is received. The compression level selected for encoding the digital content is received. An algorithm selected for encoding the digital content is received. A previously defined table is then indexed by the genre of the content, the selected compression level, and the selected algorithm, and the processing parameters for encoding the digital content are retrieved. In accordance with another aspect of the invention, an apparatus is described to carry out the above method.
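The table lookup described in this abstract can be sketched as a dictionary keyed by genre, compression level, and algorithm. Everything specific here is an assumption made for illustration: the table contents, the parameter names (`bitrate_kbps`, `lowpass_hz`), the algorithm labels, and the function name `select_parameters` are all hypothetical.

```python
# Hypothetical, previously defined table. Keys are
# (genre, compression level, algorithm); values are illustrative only.
PARAMETER_TABLE = {
    ("speech", "high", "codec_a"): {"bitrate_kbps": 32, "lowpass_hz": 8000},
    ("speech", "low", "codec_a"): {"bitrate_kbps": 64, "lowpass_hz": 12000},
    ("classical", "high", "codec_a"): {"bitrate_kbps": 128, "lowpass_hz": 16000},
    ("classical", "low", "codec_b"): {"bitrate_kbps": 256, "lowpass_hz": 20000},
}


def select_parameters(metadata, compression_level, algorithm, table=PARAMETER_TABLE):
    """Index the table by genre, compression level, and algorithm.

    `metadata` is assumed to be a mapping with a "genre" entry.
    Raises KeyError when the table has no entry for the combination.
    """
    key = (metadata["genre"], compression_level, algorithm)
    if key not in table:
        raise KeyError(f"no processing parameters defined for {key}")
    return table[key]
```

A call such as `select_parameters({"genre": "speech"}, "high", "codec_a")` retrieves the stored parameters without any user tuning, which is the point of the automatic selection.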