Abstract:
Several embodiments of a digital speech signal enhancer are described that use an artificial neural network that produces clean speech coding parameters based on noisy speech coding parameters as its input features. A vocoder parameter generator produces the noisy speech coding parameters from a noisy speech signal. A vocoder model generator processes the clean speech coding parameters into estimated clean speech spectral magnitudes. In one embodiment, a magnitude modifier modifies an original frequency spectrum of the noisy speech signal using the estimated clean speech spectral magnitudes to produce an enhanced frequency spectrum, and a synthesis block converts the enhanced frequency spectrum into the time domain as an output speech sequence. Other embodiments are also described.
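The magnitude modifier and synthesis block in the abstract above can be illustrated for a single frame. This is only a minimal sketch under stated assumptions, not the patented implementation: the abstract does not say how the spectrum is modified, so the sketch assumes the estimated clean magnitudes replace the noisy magnitudes while the noisy phase is kept, and it uses a plain real FFT without windowing or overlap-add. The function names `modify_magnitude` and `enhance_frame` are hypothetical, and the neural network and vocoder stages are not modeled.

```python
import numpy as np


def modify_magnitude(noisy_spectrum, est_clean_mag):
    """Assumed magnitude modifier: keep the noisy phase, swap in the
    estimated clean magnitude for every frequency bin."""
    phase = np.angle(noisy_spectrum)
    return est_clean_mag * np.exp(1j * phase)


def enhance_frame(noisy_frame, est_clean_mag):
    """Analyze one time-domain frame, modify its magnitudes, and
    synthesize the result back into the time domain."""
    noisy_spectrum = np.fft.rfft(noisy_frame)        # original frequency spectrum
    enhanced = modify_magnitude(noisy_spectrum, est_clean_mag)
    return np.fft.irfft(enhanced, n=len(noisy_frame))  # synthesis block
```

In a full system, `est_clean_mag` would come from the vocoder model generator; here it is simply an array with one non-negative value per `rfft` bin.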
Abstract:
An audio system includes one or more loudspeaker cabinets, each having loudspeakers. Sensing logic determines an acoustic environment of the loudspeaker cabinets. The sensing logic may include an echo canceller. A low frequency filter corrects an audio program based on the acoustic environment of the loudspeaker cabinets. The system outputs an omnidirectional sound pattern, which may be low frequency sound, to determine the acoustic environment. The system may produce a directional pattern superimposed on an omnidirectional pattern, if the acoustic environment is in free space. The system may aim ambient content toward a wall and direct content away from the wall, if the acoustic environment is not in free space. The sensing logic automatically determines the acoustic environment upon initial power up and when position changes of loudspeaker cabinets are detected. Accelerometers may detect position changes of the loudspeaker cabinets.
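The rendering decision in the abstract above can be written out as a small sketch. It covers only the choice logic: how the sensing logic classifies the environment (for example, from the echo canceller) is not modeled, and the names `Environment`, `choose_rendering`, and `should_resense`, along with the returned dictionary layout, are hypothetical and chosen for illustration.

```python
from enum import Enum


class Environment(Enum):
    FREE_SPACE = "free_space"
    NOT_FREE_SPACE = "not_free_space"  # e.g. cabinet placed near a wall


def choose_rendering(env):
    """Pick a sound pattern layout for the sensed acoustic environment."""
    if env is Environment.FREE_SPACE:
        # Directional pattern superimposed on an omnidirectional pattern.
        return {"patterns": ["omnidirectional", "directional"]}
    # Otherwise aim ambient content at the wall and direct content away from it.
    return {"ambient": "toward_wall", "direct": "away_from_wall"}


def should_resense(initial_power_up, position_changed):
    """Re-run environment sensing on initial power up or after a detected move
    (for example, one reported by an accelerometer)."""
    return initial_power_up or position_changed
```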
Abstract:
A speech recognition system for resolving impaired utterances can have a speech recognition engine configured to receive a plurality of representations of an utterance and concurrently to determine a plurality of highest-likelihood transcription candidates corresponding to each respective representation of the utterance. The recognition system can also have a selector configured to determine a most-likely accurate transcription from among the transcription candidates. As but one example, the plurality of representations of the utterance can be acquired by a microphone array, and beamforming techniques can generate independent streams of the utterance across various look directions using output from the microphone array.
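The selector stage described above can be sketched as follows. The abstract does not state the selection criterion, so this sketch assumes that each candidate carries a log-likelihood score that is comparable across streams and that the selector simply takes the highest one. The `Candidate` type and `select_transcription` function are hypothetical names; the recognition engine and beamformer are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    stream: str            # which representation (e.g. look direction) produced it
    text: str              # transcription hypothesis
    log_likelihood: float  # assumed comparable across streams


def select_transcription(candidates):
    """Return the candidate with the highest score among all streams."""
    if not candidates:
        raise ValueError("no transcription candidates to select from")
    return max(candidates, key=lambda c: c.log_likelihood)
```

For example, given one candidate per beamformed stream, `select_transcription(candidates).text` would be the transcription passed on as the most likely accurate one.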
Abstract:
Systems and methods for determining the operating condition of multiple microphones of an electronic device are disclosed. A system can include a plurality of microphones operative to receive signals, a microphone condition detector, and a plurality of microphone condition determination sources. The microphone condition detector can determine a condition for each of the plurality of microphones by using the received signals and accessing at least one microphone condition determination source.
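One way to picture a microphone condition detector is a level comparison across microphones. The abstract does not name the condition determination sources, so the source used here (each microphone's signal level relative to the median level of all microphones) is an assumption for illustration, as are the function name `detect_conditions`, the `drop_db` threshold, and the `"ok"`/`"suspect"` labels.

```python
import numpy as np


def detect_conditions(signals, drop_db=12.0):
    """signals maps a microphone name to a 1-D array of samples.

    A microphone is flagged "suspect" (e.g. possibly blocked) when its
    level is more than drop_db below the median level of all microphones.
    """
    levels = {
        name: 10.0 * np.log10(np.mean(np.square(x)) + 1e-12)  # mean power in dB
        for name, x in signals.items()
    }
    reference = np.median(list(levels.values()))
    return {
        name: "suspect" if level < reference - drop_db else "ok"
        for name, level in levels.items()
    }
```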