Abstract:
Provided are systems and methods for image enhancement based on combining multiple related images, such as images of the same object taken from different imaging angles. This approach can simulate images captured from longer distances with telephoto lenses. Initial images may be captured using simple cameras equipped with shorter focal length lenses, such as those typically used in camera phones, tablets, and laptops. The initial images may be taken using two different cameras positioned a certain distance from each other. An object or, more specifically, a center line of the object is identified in each image. The object is typically present in the foreground portion of the initial images. The initial images may be cross-faded along the object center line to yield a combined image. The foreground and background portions of each image may be separated and processed separately, for example by blurring the background portion and sharpening the foreground portion.
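The cross-fading step can be illustrated with a minimal sketch. It assumes two equally sized grayscale images stored as lists of rows, a straight vertical center line at column `center_col`, and a linear blend over `fade_width` columns. The function name, the parameters, and the linear weighting are all illustrative assumptions; the abstract does not specify how the blend is computed or how the center line is shaped.

```python
def cross_fade(left_img, right_img, center_col, fade_width):
    """Blend two equally sized grayscale images across a vertical center line.

    Assumption: pixels left of the fade zone come from left_img, pixels right
    of it come from right_img, and the weight ramps linearly in between.
    """
    if fade_width <= 0:
        raise ValueError("fade_width must be positive")
    start = center_col - fade_width / 2.0  # column where the fade begins
    combined = []
    for row_l, row_r in zip(left_img, right_img):
        row = []
        for x, (a, b) in enumerate(zip(row_l, row_r)):
            # Weight of the right image: 0 before the fade zone, 1 after it.
            w = min(1.0, max(0.0, (x - start) / fade_width))
            row.append((1.0 - w) * a + w * b)
        combined.append(row)
    return combined
```

For example, `cross_fade([[0.0] * 5], [[1.0] * 5], 2, 4)` ramps from the first image to the second across the row.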
Abstract:
Systems and methods for utilizing inter-microphone level differences to attenuate noise and enhance speech are provided. In exemplary embodiments, energy estimates of acoustic signals received by a primary microphone and a secondary microphone are determined in order to determine an inter-microphone level difference (ILD). This ILD, in combination with a noise estimate based only on the primary microphone acoustic signal, allows a filter estimate to be derived. In some embodiments, the derived filter estimate may be smoothed. The filter estimate is then applied to the acoustic signal from the primary microphone to generate a speech estimate.
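The processing chain described above (energy estimates, ILD, noise estimate from the primary signal, smoothed filter, speech estimate) can be sketched as follows. This is a simplified illustration, not the disclosed method. It assumes a single broadband gain per frame rather than per-frequency filtering, a noise estimate that is updated only when the ILD falls below a threshold, and a spectral-subtraction style gain. All names, thresholds, and smoothing constants are hypothetical.

```python
import math

def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)

def ild_db(primary, secondary, eps=1e-12):
    """Inter-microphone level difference in dB (primary over secondary)."""
    return 10.0 * math.log10((frame_energy(primary) + eps) /
                             (frame_energy(secondary) + eps))

def enhance(primary_frames, secondary_frames, ild_threshold_db=3.0,
            noise_alpha=0.9, gain_alpha=0.5, gain_floor=0.1):
    """Return speech-estimate frames derived from the primary microphone."""
    noise = None   # noise energy estimate, built from the primary signal only
    gain = 1.0     # smoothed filter estimate (one broadband gain per frame)
    out = []
    for p, s in zip(primary_frames, secondary_frames):
        e_p = frame_energy(p)
        # Assumption: a low ILD means the frame is dominated by far-field noise.
        if ild_db(p, s) < ild_threshold_db:
            noise = e_p if noise is None else (
                noise_alpha * noise + (1.0 - noise_alpha) * e_p)
        n = noise if noise is not None else 0.0
        raw = max(gain_floor, 1.0 - n / e_p) if e_p > 0 else gain_floor
        # First-order recursive smoothing of the filter estimate.
        gain = gain_alpha * gain + (1.0 - gain_alpha) * raw
        out.append([gain * x for x in p])
    return out
```

A frame with similar levels at both microphones is attenuated, while a frame that is much louder at the primary microphone passes with a gain close to one.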
Abstract:
The present technology measures distortion introduced by a noise suppression system. The distortion may be measured as the difference between a noise-reduced speech signal and an estimated idealized noise-reduced reference (EINRR). The EINRR may be determined from a speech component and a noise component that are pre-processed, and the EINRR may be used with masks associated with energies lost and added in the speech component and the noise component. The EINRR may be calculated on a time-varying basis.
Abstract:
Systems and methods for audio signal processing are provided. In exemplary embodiments, a filter cascade of complex-valued filters is used to decompose an input audio signal into a plurality of frequency components or sub-band signals. These sub-band signals may be processed for phase alignment, amplitude compensation, and time delay prior to summation of real portions of the sub-band signals to generate a reconstructed audio signal.
Abstract:
A system and method are disclosed for analyzing an input signal into a plurality of frequency components. In one embodiment, the input signal is processed with a first set of low pass filters to derive a first set of frequency components wherein the first set of low pass filters are arranged serially in a chain having a first low pass filter and a last low pass filter, the output of each low pass filter being fed to the next low pass filter in the chain until the last low pass filter. The output of the last low pass filter is downsampled to produce a downsampled signal. The downsampled signal is processed with a second set of low pass filters to derive a second set of frequency components.
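The two-stage structure described above can be sketched in a few lines. The sketch makes several assumptions that the abstract does not state: each low pass filter is a simple one-pole smoother, the frequency components are taken as differences between successive stage outputs, and the downsampling factor is 2. The function names and the coefficient lists are illustrative only.

```python
def one_pole_lowpass(x, alpha):
    """One-pole low pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    y, state = [], 0.0
    for s in x:
        state += alpha * (s - state)
        y.append(state)
    return y

def lowpass_chain(x, alphas):
    """Run the filters serially; each output feeds the next filter."""
    outputs, current = [], x
    for alpha in alphas:
        current = one_pole_lowpass(current, alpha)
        outputs.append(current)
    return outputs

def band_components(x, chain_outputs):
    """Assumption: each component is the input of a stage minus its output."""
    bands, prev = [], x
    for out in chain_outputs:
        bands.append([p - o for p, o in zip(prev, out)])
        prev = out
    return bands

def analyze(x, alphas_first, alphas_second, factor=2):
    """Return (first components, downsampled signal, second components, residual)."""
    first = lowpass_chain(x, alphas_first)
    down = first[-1][::factor]  # downsample the last low pass output
    second = lowpass_chain(down, alphas_second)
    return band_components(x, first), down, band_components(down, second), second[-1]
```

In this arrangement the last filter of the first chain also limits the bandwidth before downsampling. A one-pole filter is a weak anti-aliasing filter, so a practical design would likely use steeper filters.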
Abstract:
Background audio can be provided during telephonic communication. Telephonic communication can be established via a network, such as between a user of a telephony device and a communication partner having a second telephony device. A voice signal may be received from the user via a microphone integral with the telephony device. An audio track can be retrieved, for example, from memory integral with the telephony device or from a third-party service provider via the communications network. Noise reduction is performed on the voice signal to produce a clean voice signal. The clean voice signal may be combined with the audio track to produce a combined signal, such that the audio track provides background audio to the clean voice signal. The combined signal can then be transmitted from the telephony device to the second telephony device via the communications network.
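The combining step can be sketched as a simple sample-wise mix. The noise reduction stage here is a hypothetical amplitude gate standing in for whatever noise reduction the device actually performs. The track gain, the looping of a short track, and the clipping to the range [-1, 1] are likewise assumptions made for illustration.

```python
def noise_gate(voice, threshold=0.05):
    """Stand-in noise reduction: zero out samples below an amplitude threshold."""
    return [s if abs(s) >= threshold else 0.0 for s in voice]

def mix_with_background(clean_voice, track, track_gain=0.3):
    """Add an attenuated audio track behind the clean voice signal."""
    if not track:
        return list(clean_voice)
    mixed = []
    for i, v in enumerate(clean_voice):
        # Loop the track if it is shorter than the voice signal.
        m = v + track_gain * track[i % len(track)]
        # Clip to the normalized sample range to avoid overflow.
        mixed.append(max(-1.0, min(1.0, m)))
    return mixed
```

The combined signal returned by `mix_with_background(noise_gate(voice), track)` is what would then be transmitted to the second telephony device.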
Abstract:
A method of removing reverberation from audio signals is disclosed. The method comprises spectro-temporally analyzing a first audio signal and a second audio signal to derive an energy function of time for a plurality of frequency bands. The method further comprises determining a delay stability between the energy functions of time for the first audio signal and the second audio signal in each band, determining a gain function in each band based on the delay stability, adjusting the energy of the first audio signal and the second audio signal using the gain function within each band, and resynthesizing audio signals from the energy in each band of the first audio signal and the second audio signal.