Abstract:
A text-to-speech system (100) uses differential voice coding (230, 416) to compress a database of digitized speech waveform segments (210). A seed waveform (535) is used to precondition each speech waveform prior to encoding, so that encoding yields a seeded preconditioned encoded speech token (550). The seed portion (541) may be removed and the preconditioned encoded speech token portion (542) may be stored in a database for text-to-speech synthesis. When speech is to be synthesized, the appropriate speech waveform for the present sound is requested, and the seed portion is prepended to the preconditioned encoded speech token for differential decoding.
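The seed mechanism can be illustrated with a minimal sketch. It assumes plain first-order delta coding as a stand-in for the differential voice coder, which the abstract does not specify. The seed samples and the names `SEED`, `encode_segment`, and `decode_segment` are hypothetical and chosen only for illustration.

```python
from typing import List

# Hypothetical seed waveform shared by encoder and decoder (assumption).
SEED: List[int] = [0, 40, 80, 40]


def delta_encode(samples: List[int]) -> List[int]:
    """First-order differential coding: each sample minus its predecessor."""
    prev = 0
    out = []
    for s in samples:
        out.append(s - prev)
        prev = s
    return out


def delta_decode(deltas: List[int]) -> List[int]:
    """Inverse of delta_encode: running sum of the differences."""
    prev = 0
    out = []
    for d in deltas:
        prev += d
        out.append(prev)
    return out


def encode_segment(segment: List[int]) -> List[int]:
    """Precondition the coder with the seed, encode, then drop the seed portion."""
    seeded_token = delta_encode(SEED + segment)
    return seeded_token[len(SEED):]  # only this part would be stored


def decode_segment(token: List[int]) -> List[int]:
    """Prepend the encoded seed portion, decode, then discard the seed samples."""
    seed_portion = delta_encode(SEED)
    decoded = delta_decode(seed_portion + token)
    return decoded[len(SEED):]


segment = [42, 45, 39, 30]
token = encode_segment(segment)
restored = decode_segment(token)
```

Because every segment is encoded from the same seed state, each stored token can be decoded on its own once the seed portion is restored, and the seed is kept once rather than with every segment.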
Abstract:
A hands-free digital push-to-talk device (102) includes a digital background noise suppressor (302), a digital voice activity detector (304), an audio buffer (306), and a decision handler (308), embedded inside the digital signal processor (222) of the device (102). Audio is buffered until the decision handler (308) determines that speech is present on an audio stream fed to the voice activity detector (304). The decision handler (308) makes the decision by assigning a weighted value to each voice activity detector (304) determination, the weighted value varying with the state of the device (102) and with the temporal distance of the determination from the present time.
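The weighted decision can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the weighting function, so the exponential decay with age, the per-state gains, the threshold, the history length, and the names `STATE_GAIN`, `DECAY`, `THRESHOLD`, `HISTORY`, and `speech_present` are all hypothetical.

```python
from typing import Sequence

# All constants below are illustrative assumptions, not values from the source.
STATE_GAIN = {"idle": 1.0, "transmitting": 1.5}  # weight scale per device state
DECAY = 0.5       # weight multiplier per frame of age (temporal distance)
THRESHOLD = 1.5   # score at or above which speech is declared present
HISTORY = 4       # number of most recent determinations considered


def speech_present(vad_flags: Sequence[bool], state: str) -> bool:
    """Combine recent voice activity determinations into one decision.

    vad_flags holds per-frame determinations, oldest first and newest last.
    Each positive determination contributes a weight that depends on the
    device state and shrinks with its distance from the present frame.
    """
    recent = list(vad_flags)[-HISTORY:]
    score = 0.0
    for age, flag in enumerate(reversed(recent)):  # age 0 is the newest frame
        if flag:
            score += STATE_GAIN[state] * DECAY ** age
    return score >= THRESHOLD


# A single positive frame is not enough while idle, but two in a row are.
one_frame_idle = speech_present([False, True], "idle")
two_frames_idle = speech_present([True, True], "idle")
one_frame_tx = speech_present([False, True], "transmitting")
```

In this sketch, buffered audio would be released only once `speech_present` returns true. The buffering itself is not modeled here.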