Abstract:
A system processes an audio signal using spectrally orthogonal sound components. The system includes circuitry that generates a mid component and a side component from a left channel and a right channel of the audio signal. The circuitry generates a hyper mid component including spectral energy of the side component removed from spectral energy of the mid component, and generates a residual mid component including spectral energy of the hyper mid component removed from the spectral energy of the mid component. The circuitry filters subbands of the residual mid component, such as to apply subband spatial processing. The circuitry generates a left output channel and a right output channel using the filtered subbands of the residual mid component.
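The component derivation described above can be illustrated for a single frame. The following is a minimal sketch, not the claimed implementation: it assumes that "removing spectral energy" means per-bin magnitude subtraction floored at zero with the phase of the mid component kept, and it uses a naive DFT for self-containment. The function names (`dft`, `mid_side`, `remove_energy`, `hyper_and_residual_mid`) are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a real sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def mid_side(left, right):
    """Mid is the half-sum of the channels, side is the half-difference."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def remove_energy(target_spec, removed_spec):
    """Assumed operation: subtract |removed| from |target| per bin,
    floor at zero, and keep the phase of the target bin."""
    out = []
    for t, r in zip(target_spec, removed_spec):
        mag = max(abs(t) - abs(r), 0.0)
        out.append(cmath.rect(mag, cmath.phase(t)))
    return out


def hyper_and_residual_mid(left, right):
    """Return the spectra of the mid, hyper mid, and residual mid components."""
    mid, side = mid_side(left, right)
    mid_spec, side_spec = dft(mid), dft(side)
    hyper_mid = remove_energy(mid_spec, side_spec)    # mid minus side energy
    residual_mid = remove_energy(mid_spec, hyper_mid)  # mid minus hyper mid energy
    return mid_spec, hyper_mid, residual_mid
```

Under this assumption, the hyper mid and residual mid magnitudes sum to the mid magnitude in every bin, and a mono input (identical left and right channels) yields a hyper mid equal to the mid and an empty residual mid.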
Abstract:
An embodiment of the present invention provides an apparatus for noise canceling that includes: an input unit configured to receive an input voice signal that is a target of noise canceling; and a processor configured to generate a first voice signal by canceling noise from the input voice signal on the basis of a noise canceling model which is trained using a plurality of reference voice signals through a deep learning algorithm, generate a second voice signal by canceling residual noise from the first voice signal on the basis of statistical analysis, and generate an output voice signal corresponding to the second voice signal.
Abstract:
Methods, apparatus, systems, and articles of manufacture are disclosed to fingerprint audio via mean normalization. An example apparatus for audio fingerprinting includes: a frequency range separator to transform an audio signal into a frequency domain, the transformed audio signal including a plurality of time-frequency bins including a first time-frequency bin; an audio characteristic determiner to determine a first characteristic of a first group of time-frequency bins of the plurality of time-frequency bins, the first group of time-frequency bins surrounding the first time-frequency bin; and a signal normalizer to normalize the audio signal to thereby generate normalized energy values, the normalizing of the audio signal including normalizing the first time-frequency bin by the first characteristic. The example apparatus further includes a point selector to select one of the normalized energy values and a fingerprint generator to generate a fingerprint of the audio signal using the selected one of the normalized energy values.
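The normalization and point selection described above can be sketched as follows. This is an illustrative sketch only: it assumes the "first characteristic" is the mean energy of the surrounding bins, that points are selected by taking the largest normalized values, and that the fingerprint is simply the tuple of selected (time, frequency) coordinates. The names `normalize_by_local_mean`, `select_peaks`, `fingerprint`, `radius`, and `top_k` are hypothetical.

```python
def normalize_by_local_mean(spec, radius=1):
    """Divide each time-frequency bin by the mean energy of its neighbours.

    spec is a list of rows (time) of non-negative energies (frequency).
    The bin itself is excluded from its own neighbourhood.
    """
    rows, cols = len(spec), len(spec[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            neigh = [
                spec[a][b]
                for a in range(max(0, i - radius), min(rows, i + radius + 1))
                for b in range(max(0, j - radius), min(cols, j + radius + 1))
                if (a, b) != (i, j)
            ]
            mean = sum(neigh) / len(neigh) if neigh else 0.0
            out[i][j] = spec[i][j] / mean if mean > 0 else 0.0
    return out


def select_peaks(norm, top_k=3):
    """Pick the coordinates of the top_k largest normalized energy values."""
    points = [(v, i, j) for i, row in enumerate(norm) for j, v in enumerate(row)]
    points.sort(reverse=True)
    return sorted((i, j) for _, i, j in points[:top_k])


def fingerprint(spec, radius=1, top_k=3):
    """Assumed fingerprint: the tuple of selected (time, frequency) points."""
    return tuple(select_peaks(normalize_by_local_mean(spec, radius), top_k))
```

Because every bin is divided by a local mean, multiplying the whole spectrogram by a constant gain leaves the normalized values, and therefore the selected points, unchanged.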
Abstract:
Audio distortion by a speaker may be reduced by detecting onset audio events within an audio signal and modifying the audio to reduce the audio distortion perceived by a listener. The onsets may be detected using a psychoacoustic model by determining critical sub-band (CSB) powers and corresponding masking thresholds. When a loudness value calculated from the CSB powers and masking thresholds exceeds a threshold level, certain frequency bands may be attenuated and other frequency bands may be amplified. The audio modification may be performed on a frame-by-frame basis, and each frame may be processed multiple times until the onset is sufficiently masked or attenuated.
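The repeated per-frame processing described above can be sketched as a simple loop. This sketch makes strong simplifying assumptions that are not in the abstract: the loudness value is taken as the summed excess of band power over masking threshold, only attenuation is shown (amplification of other bands is omitted), and a fixed attenuation step is applied per pass. The names `excess_loudness`, `soften_onset`, `limit`, `step`, and `max_passes` are hypothetical.

```python
def excess_loudness(powers, thresholds):
    """Assumed loudness measure: total band power above the masking thresholds."""
    return sum(max(p - t, 0.0) for p, t in zip(powers, thresholds))


def soften_onset(powers, thresholds, limit, step=0.8, max_passes=20):
    """Reprocess one frame until its excess loudness is at or below limit.

    Each pass scales every band that exceeds its masking threshold by step.
    Returns the adjusted band powers and the number of passes used.
    """
    powers = list(powers)
    passes = 0
    while excess_loudness(powers, thresholds) > limit and passes < max_passes:
        powers = [p * step if p > t else p for p, t in zip(powers, thresholds)]
        passes += 1
    return powers, passes
```

The `max_passes` cap is a design choice of the sketch that guarantees termination even when the limit cannot be reached.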
Abstract:
A signal processing device, a signal processing method, a speaker, and an electronic apparatus are provided. The signal processing device comprises a multi-band dynamic range controller, wherein the multi-band dynamic range controller receives an audio signal (S1100) and includes a first band splitting unit and a resonant band adjustment unit; the first band splitting unit is configured to split the audio signal into multiple bands and obtain at least one resonant band therefrom (S1200), which has a resonant frequency band signal in a resonant frequency range of the audio signal; and the resonant band adjustment unit is configured to adjust the resonant frequency band signal based at least on a resonant band dynamic range control gain (S1300) and output an adjusted resonant frequency band signal (S1400) for combination with other band signals into a compression output signal.
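The steps S1100 to S1400 can be sketched with a two-band split. This is a minimal sketch under stated assumptions, not the claimed device: the resonant band is assumed to be the low band produced by a one-pole low-pass filter, the remaining band is the complementary residue, and the dynamic range control gain is a static compressor gain computed from the RMS level of the resonant band. The names `one_pole_lowpass`, `drc_gain`, `process`, `alpha`, `threshold`, and `ratio` are hypothetical.

```python
import math


def one_pole_lowpass(x, alpha):
    """Simple one-pole low-pass filter; alpha in (0, 1] sets the cutoff."""
    y, state = [], 0.0
    for s in x:
        state += alpha * (s - state)
        y.append(state)
    return y


def drc_gain(level, threshold, ratio):
    """Static compressor gain: levels above threshold are reduced by ratio (in dB)."""
    if level <= threshold or level == 0.0:
        return 1.0
    over_db = 20 * math.log10(level / threshold)
    return 10 ** (-(over_db * (1 - 1 / ratio)) / 20)


def process(x, alpha=0.2, threshold=0.1, ratio=4.0):
    """Split x (non-empty), adjust the assumed resonant band, and recombine."""
    resonant = one_pole_lowpass(x, alpha)            # S1200: resonant band
    rest = [s - r for s, r in zip(x, resonant)]      # complementary band
    level = math.sqrt(sum(v * v for v in resonant) / len(resonant))
    gain = drc_gain(level, threshold, ratio)         # S1300: DRC gain
    adjusted = [gain * v for v in resonant]          # S1400: adjusted band
    return [a + r for a, r in zip(adjusted, rest)], gain
```

Because the two bands are complementary, a signal whose resonant band stays below the threshold is reconstructed unchanged, and only the resonant band is scaled when compression engages.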
Abstract:
The present invention relates to a method for measuring behavioral change in human consciousness that is based on 12 different personality consciousness codes, wherein each code enables an instant change in the state of mind of an individual person. The method comprises: a) storing reference voice characteristics of different persons that represent acoustic information as expressed by human voice in the form of a time-to-frequency component relation; b) classifying the acoustic information into 12 different personality consciousness codes by using a support vector machine that analyzes said acoustic information; c) receiving data indicative of a sound energy generated by the voice of said individual; d) performing spectral analysis of said received sound energy in order to obtain voice characteristics from an electronic representation of said sound energy; and e) comparing said obtained voice characteristics with the reference voice characteristics and determining the personality consciousness code of said individual by using the support vector machine, and using the obtained voice characteristics to determine the level of consciousness.
Abstract:
A method and device for detecting errors when practicing fluency shaping exercises are presented. The method includes receiving a set of initial energy levels; setting a set of thresholds to their respective initial values; receiving a voice production of a user practicing a fluency shaping exercise; analyzing the received voice production to compute a set of energy levels composing the voice production; detecting, based on the computed set of energy levels, the set of initial energy levels, and the set of thresholds, at least one speech-related error, wherein the detection of the at least one speech-related error is respective of the fluency shaping exercise being practiced by the user; and upon detection of the at least one speech-related error, generating feedback indicating the at least one detected speech-related error.
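The detection flow described above can be sketched as follows. The abstract does not name specific error types, so this sketch assumes two illustrative ones: a "too_loud" error, when frame energy exceeds a multiple of the user's normal energy, and a "hard_onset" error, when energy jumps from silence to a high level in one frame. The dictionary keys, the function names (`frame_energies`, `detect_errors`, `feedback`), and the example values are all hypothetical.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each full, non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_errors(energies, initial, thresholds):
    """Compare computed energies with initial levels scaled by thresholds.

    initial holds assumed calibration energies ("silence", "normal");
    thresholds holds assumed multipliers ("too_loud", "hard_onset").
    Returns a list of (frame_index, error_name) pairs.
    """
    errors = []
    silence = initial["silence"]
    for i, e in enumerate(energies):
        if e > initial["normal"] * thresholds["too_loud"]:
            errors.append((i, "too_loud"))
        came_from_silence = i > 0 and energies[i - 1] <= silence
        if came_from_silence and e > silence * thresholds["hard_onset"]:
            errors.append((i, "hard_onset"))
    return errors


def feedback(errors):
    """Turn detected errors into short feedback messages."""
    return [f"frame {i}: {name}" for i, name in errors]
```

For example, with `initial = {"silence": 0.01, "normal": 0.1}` and `thresholds = {"too_loud": 3.0, "hard_onset": 20.0}`, the energy sequence `[0.001, 0.5, 0.2]` flags frame 1 for both error types, since 0.5 exceeds both 0.3 and 0.2 and follows a silent frame.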