Abstract:
Methods and systems for modification of electronic system operation based on acoustic ambience classification are presented. In an example method, at least one audio signal present in a physical environment of a user is detected. The at least one audio signal is analyzed to extract at least one audio feature from the audio signal. The audio signal is classified based on the audio feature to produce at least one classification of the audio signal. Operation of an electronic system interacting with the user in the physical environment is modified based on the classification of the audio signal.
Abstract:
Streaming audio is received. The streaming audio includes a frame having a plurality of samples. An energy estimate is obtained for the plurality of samples. The energy estimate is compared to at least one threshold. In addition, a band-pass estimate of the signal is determined. An energy estimate is obtained for the band-passed plurality of samples. The two energy estimates are each compared to at least one threshold. Based upon the comparison operation, a determination is made as to whether speech is detected.
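The two-energy comparison described above can be sketched as follows. This is a minimal illustration, not the claimed method: the second-order band-pass filter, its centre frequency and Q, the threshold values, and the rule that both energies must exceed their thresholds are all assumptions chosen for the example, as are the function names.

```python
import math


def frame_energy(samples):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in samples) / len(samples)


def band_pass(samples, sample_rate, center_hz=1000.0, q=0.7):
    """Second-order band-pass filter with unity gain at center_hz.

    The centre frequency and Q are illustrative assumptions.
    """
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0          # b1 is zero for this filter
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in samples:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def speech_detected(frame, sample_rate, full_threshold=1e-3, band_threshold=1e-2):
    """Compare full-band and band-passed energy to one threshold each.

    Requiring both comparisons to pass is an assumed decision rule.
    """
    full = frame_energy(frame)
    band = frame_energy(band_pass(frame, sample_rate))
    return full > full_threshold and band > band_threshold
```

With these assumed settings, a frame dominated by low-frequency hum can pass the full-band check yet fail the band-passed check, which is the case the second energy estimate is meant to catch.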
Abstract:
Provided are methods and systems for enhancing the intelligibility of an audio (e.g., speech) signal rendered in a noisy environment, subject to a constraint on the power of the rendered signal. A quantitative measure of intelligibility is the mean probability of decoding the message correctly. The methods and systems simplify the procedure by approximating the maximization of the decoding probability with the maximization of the similarity of the spectral dynamics of the noisy speech to the spectral dynamics of the corresponding noise-free speech. The intelligibility enhancement procedures provided are based on this principle, and all have low computational cost and require little delay, thus facilitating real-time implementation.
Abstract:
A method on a mobile device (100) for processing an audio input is described. A trigger for the audio input is received. At least one parameter is determined for an audio processor (303) based on at least one input characteristic for the audio input. The audio input is routed to the audio processor (303) with the at least one parameter.
Abstract:
A method includes obtaining a speech sample from a pre-processing front end (120) of a first device, identifying at least one condition, and selecting a voice recognition speech model from a database of speech models (160), the selected voice recognition speech model trained under the at least one condition. The method may include performing voice recognition on the speech sample using the selected speech model. A device includes a microphone signal pre-processing front end (120) and operating-environment logic (130), operatively coupled to the pre-processing front end (120). The operating-environment logic (130) is operative to identify at least one condition. A voice recognition configuration selector (140) is operatively coupled to the operating-environment logic (130), and is operative to receive information related to the at least one condition from the operating-environment logic (130) and to provide voice recognition logic (150) with an identifier (135) for a voice recognition speech model trained under the at least one condition.
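The selection step can be pictured as a lookup from identified conditions to a model identifier. The sketch below is hypothetical throughout: the `SpeechModel` type, the condition labels, the database contents, and the "most specific match wins" rule are assumptions for illustration and do not come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechModel:
    identifier: str                 # handed to the recognizer to pick the model
    trained_conditions: frozenset   # conditions the model was trained under


# Hypothetical database of speech models; entries are made up for this example.
MODEL_DATABASE = [
    SpeechModel("generic", frozenset()),
    SpeechModel("car_noise", frozenset({"in_vehicle"})),
    SpeechModel("car_noise_suppressed",
                frozenset({"in_vehicle", "noise_suppression_on"})),
]


def select_model(conditions, database=MODEL_DATABASE):
    """Return the model trained under the largest subset of the observed conditions."""
    observed = frozenset(conditions)
    candidates = [m for m in database if m.trained_conditions <= observed]
    return max(candidates, key=lambda m: len(m.trained_conditions))
```

For example, `select_model({"in_vehicle"}).identifier` yields `"car_noise"`, while an unrecognized condition falls back to the `"generic"` entry because its empty condition set matches anything.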
Abstract:
One method of operation includes beamforming a plurality of microphone outputs to obtain a plurality of virtual microphone audio channels. Each virtual microphone audio channel corresponds to a beamform. The virtual microphone audio channels include at least one voice channel (135) and at least one noise channel (136). The method includes performing voice activity detection (151) on the at least one voice channel (135) and adjusting a corresponding voice beamform until voice activity detection (151) indicates that voice is present on the at least one voice channel (135). Another method beamforms the plurality of microphone outputs to obtain a plurality of virtual microphone audio channels, where each virtual microphone audio channel corresponds to a beamform, and with at least one voice channel (135) and at least one noise channel (136). The method performs voice recognition on the at least one voice channel (135) and adjusts the corresponding voice beamform to improve a voice recognition confidence metric (159).
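The first method, adjusting a voice beamform until voice activity detection reports voice, can be sketched with a toy two-microphone delay-and-sum beam. Everything here is an assumption made for illustration: the two-microphone geometry, integer-sample steering lags, the fixed energy threshold standing in for voice activity detection (151), and the function names. The noise channel (136) is omitted.

```python
def steer(mic_a, mic_b, lag):
    """Delay-and-sum beam: delay mic_a by `lag` samples, then average with mic_b."""
    beam = []
    for i in range(len(mic_b)):
        j = i - lag
        a = mic_a[j] if 0 <= j < len(mic_a) else 0.0  # zero-pad outside the frame
        beam.append(0.5 * (a + mic_b[i]))
    return beam


def voice_active(channel, threshold=0.1):
    """Stand-in voice activity detector: mean energy above an assumed threshold."""
    return sum(x * x for x in channel) / len(channel) > threshold


def adjust_voice_beam(mic_a, mic_b, max_lag=4):
    """Try steering lags 0..max_lag until the detector reports voice.

    Returns (lag, beam) for the first lag that triggers detection, or None.
    """
    for lag in range(max_lag + 1):
        beam = steer(mic_a, mic_b, lag)
        if voice_active(beam):
            return lag, beam
    return None
```

The second method in the abstract would replace the stop-at-first-detection loop with a search that keeps the lag giving the best voice recognition confidence metric (159); that scoring function is not specified in the abstract and is not sketched here.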