Abstract:
A music classification technique computes histograms of Daubechies wavelet coefficients across frequency subbands at multiple resolutions. The resulting histogram features are then used as input to a machine learning technique to identify the genre and emotional content of the music.
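As one illustration of this approach, the sketch below computes per-subband histograms of Daubechies wavelet coefficients with PyWavelets and concatenates them into a feature vector for a classifier. The choice of the "db8" wavelet, seven decomposition levels, 32 histogram bins, and an SVM back end are assumptions for the example, not details taken from the abstract.

```python
import numpy as np
import pywt  # PyWavelets; the wavelet name and level count below are illustrative choices

def dwch_features(signal, wavelet="db8", levels=7, bins=32):
    """Histogram the Daubechies wavelet coefficients of each frequency subband
    and concatenate the normalized histograms into one feature vector."""
    coeffs = pywt.wavedec(signal, wavelet, level=levels)   # [cA_n, cD_n, ..., cD_1]
    features = []
    for band in coeffs:
        hist, _ = np.histogram(band, bins=bins, density=True)
        features.append(hist)
    return np.concatenate(features)

# Usage sketch: feed per-track feature vectors to any classifier, e.g.
# from sklearn.svm import SVC
# X = np.stack([dwch_features(track) for track in tracks])
# clf = SVC().fit(X, genre_labels)
```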
Abstract:
An electronic music system which imitates acoustic instruments addresses the problem wherein the audio spectrum of a recorded note is shifted entirely in pitch by transposition. The consequence is unnatural formant shifts, a phenomenon known in the industry as "munchkinization." The present invention eliminates munchkinization, allowing a substantially wider transposition range for a single recording. It also allows shorter recordings to be used, yielding further memory savings. An analysis stage separates and stores the formant and excitation components of sounds from an instrument. On playback, either the formant component or the excitation component may be manipulated.
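A minimal source-filter sketch of the playback idea, assuming LPC as a stand-in for the patent's analysis stage: the LPC polynomial models the formant component, the inverse-filtered residual is the excitation, and only the excitation is transposed before the original formant filter is reapplied. The function name, LPC order, and the crude resampling-based transposition (which also changes duration) are assumptions for illustration.

```python
import numpy as np
import scipy.signal as sig
import librosa  # librosa.lpc is used here as a stand-in for the formant analysis stage

def transpose_excitation_only(note, semitones, lpc_order=24):
    """Transpose a recorded note without shifting its formants."""
    a = librosa.lpc(note, order=lpc_order)        # all-pole formant model
    excitation = sig.lfilter(a, [1.0], note)      # inverse filter -> excitation component
    ratio = 2.0 ** (semitones / 12.0)
    shifted = sig.resample(excitation, int(round(len(excitation) / ratio)))
    return sig.lfilter([1.0], a, shifted)         # reapply the stored formant component
```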
Abstract:
Systems, methods, and apparatus for pitch trajectory analysis are described. Such techniques may be used to remove vocals and/or vibrato from an audio mixture signal. For example, such a technique may be used to pre-process the signal before an operation to decompose the mixture signal into individual instrument components.
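One plausible pre-processing sketch along these lines, assuming librosa: estimate a vocal-range pitch trajectory with pyin and zero the STFT bins around each of its harmonics before handing the residual to a decomposition stage. The pitch range, harmonic count, and mask width are illustrative parameters, not values from the abstract.

```python
import numpy as np
import librosa

def suppress_pitch_trajectory(mix, sr=22050, n_fft=2048, hop=512, n_harm=10, width=2):
    """Estimate a vocal-range pitch trajectory and zero the STFT bins near
    each of its harmonics before further decomposition."""
    f0, voiced, _ = librosa.pyin(mix, fmin=80.0, fmax=800.0, sr=sr,
                                 frame_length=n_fft, hop_length=hop)
    S = librosa.stft(mix, n_fft=n_fft, hop_length=hop)
    n_frames = min(len(f0), S.shape[1])
    for t in range(n_frames):
        if not voiced[t] or np.isnan(f0[t]):
            continue
        for h in range(1, n_harm + 1):
            k = int(round(h * f0[t] * n_fft / sr))        # bin of the h-th harmonic
            if k < S.shape[0]:
                S[max(0, k - width):k + width + 1, t] = 0.0
    return librosa.istft(S, hop_length=hop, length=len(mix))
```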
Abstract:
A system and method that quantifies a sound into dynamic pitch-based graphs that correlate to the pitch frequencies of the sound. The system records a sound, such as musical notes. A pitch detection algorithm identifies and quantifies the pitch frequencies of the notes. The algorithm analyzes the pitch frequencies, and graphically displays the pitch frequency and notes in real time as fluctuating circles, rectangular bars, and lines that represent variances in pitch. The algorithm comprises a modified Type 2 Normalized Square Difference Function that transforms the musical notes into the pitch frequencies. The Type 2 Normalized Square Difference Function analyzes the peaks of the pitch frequency to arrive at a precise pitch frequency, such as 440 Hertz. A Lagrangian interpolation enables comparative analysis and teaching of the pitches and notes. The algorithm also performs transformations and heuristic comparisons to generate the real time graphical representation of the pitch frequency.
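The abstract names a modified Type 2 Normalized Square Difference Function and Lagrangian interpolation; below is a minimal, unmodified NSDF sketch (the McLeod-style formulation n(tau) = 2 r(tau) / m(tau)) with a three-point Lagrange (parabolic) refinement of the chosen peak. The frame size, lag limits, and the 0.9 peak threshold are assumptions, not the patented algorithm.

```python
import numpy as np

def nsdf_pitch(frame, sr, fmin=60.0, fmax=1500.0):
    """Normalized square difference function (NSDF) pitch estimate with a
    three-point Lagrange (parabolic) refinement of the chosen peak."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:]      # autocorrelation r(tau)
    csum = np.concatenate(([0.0], np.cumsum(frame ** 2)))
    taus = np.arange(n)
    m = csum[n - taus] + (csum[n] - csum[taus])              # m(tau): overlapping energies
    nsdf = 2.0 * r / np.maximum(m, 1e-12)

    lo = int(sr / fmax)                                      # candidate lag range
    hi = min(int(sr / fmin), n - 1)
    peaks = [t for t in range(lo + 1, hi)
             if nsdf[t] > nsdf[t - 1] and nsdf[t] >= nsdf[t + 1]]
    if not peaks:
        return 0.0
    best = max(nsdf[t] for t in peaks)
    tau = next(t for t in peaks if nsdf[t] >= 0.9 * best)    # first strong peak
    y0, y1, y2 = nsdf[tau - 1], nsdf[tau], nsdf[tau + 1]
    delta = 0.5 * (y0 - y2) / (y0 - 2.0 * y1 + y2)           # Lagrange/parabolic offset
    return sr / (tau + delta)

# Usage sketch: a 440 Hz sine frame should come out close to 440.0.
# sr = 44100
# t = np.arange(2048) / sr
# print(nsdf_pitch(np.sin(2 * np.pi * 440.0 * t), sr))
```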
Abstract:
System, apparatus and method for determining semantic information from audio, where incoming audio is sampled and processed to extract audio features, including temporal, spectral, harmonic and rhythmic features. The extracted audio features are compared to stored audio templates that include ranges and/or values for certain features and are tagged for specific ranges and/or values. The semantic information may be associated with audio signature data. Extracted audio features that are most similar to one or more templates from the comparison are identified according to the tagged information. The tags are used to determine the semantic audio data, which includes genre, instrumentation, style, acoustical dynamics, and emotive descriptors for the audio signal.
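A minimal sketch of the described flow, assuming librosa for feature extraction: compute temporal, spectral, and rhythmic descriptors, score each stored template by how many of its feature ranges the audio falls inside, and return the tags of the best match. The template structure, feature names, and ranges below are invented for illustration.

```python
import numpy as np
import librosa

# Illustrative templates: per-feature ranges plus semantic tags (structure assumed).
TEMPLATES = [
    {"tags": {"genre": "rock", "instrumentation": "guitar/drums", "dynamics": "loud"},
     "ranges": {"tempo": (100, 180), "centroid": (2000, 5000), "zcr": (0.05, 0.30)}},
    {"tags": {"genre": "classical", "instrumentation": "strings", "dynamics": "soft"},
     "ranges": {"tempo": (40, 110), "centroid": (500, 2500), "zcr": (0.00, 0.10)}},
]

def extract_features(y, sr):
    """Temporal, spectral and rhythmic descriptors used for template matching."""
    tempo = float(np.atleast_1d(librosa.beat.beat_track(y=y, sr=sr)[0])[0])
    centroid = float(np.mean(librosa.feature.spectral_centroid(y=y, sr=sr)))
    zcr = float(np.mean(librosa.feature.zero_crossing_rate(y)))
    return {"tempo": tempo, "centroid": centroid, "zcr": zcr}

def semantic_tags(features):
    """Return the tags of the template whose feature ranges best cover the audio."""
    def score(template):
        return sum(lo <= features[name] <= hi
                   for name, (lo, hi) in template["ranges"].items())
    return max(TEMPLATES, key=score)["tags"]
```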
Abstract:
Described are methods and systems of identifying one or more fundamental frequency component(s) of an audio signal. The methods and systems may include any one or more of an audio event receiving step, a signal discretization step, a masking step, and/or a transcription step.
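The sketch below walks through one possible reading of those steps with NumPy only: discretize an audio event into a magnitude spectrum, mask bins well below the strongest component, take the lowest surviving spectral peak as the fundamental, and transcribe it to a note name. The -40 dB mask threshold and the peak-picking rule are assumptions.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def identify_fundamental(audio_event, sr, mask_db=-40.0):
    """Discretize an audio event, mask weak spectral content, pick the
    fundamental, and transcribe it to a note name (illustrative pipeline)."""
    # Discretization step: magnitude spectrum of the windowed event.
    spectrum = np.abs(np.fft.rfft(audio_event * np.hanning(len(audio_event))))
    freqs = np.fft.rfftfreq(len(audio_event), d=1.0 / sr)
    # Masking step: discard bins far below the strongest component.
    threshold = spectrum.max() * 10.0 ** (mask_db / 20.0)
    masked = np.where(spectrum >= threshold, spectrum, 0.0)
    # Fundamental: lowest surviving spectral peak.
    peaks = np.flatnonzero((masked[1:-1] > masked[:-2]) & (masked[1:-1] >= masked[2:])) + 1
    f0 = freqs[peaks[0]] if len(peaks) else 0.0
    if f0 <= 0:
        return f0, None
    # Transcription step: map frequency to the nearest equal-tempered note.
    midi = int(round(69 + 12 * np.log2(f0 / 440.0)))
    return f0, f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"
```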
Abstract:
A method for producing an electronically-simulated live musical performance, the method comprising providing morph-friendly solo tracks, morphing the morph-friendly solo tracks to produce a morphed track, and post-processing the morphed track. The method may also include combining the post-processed morphed track with one or more supporting tracks to produce an acoustic image for playback.
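As a toy illustration only (the abstract does not define "morphing"), the sketch below interpolates the STFT magnitudes of two time-aligned solo tracks, peak-normalizes the result as a stand-in for post-processing, and sums it with supporting tracks. All function names and the phase-handling shortcut are assumptions.

```python
import numpy as np
import librosa

def morph_tracks(track_a, track_b, alpha=0.5, n_fft=2048, hop=512):
    """Morph two aligned solo tracks by interpolating their STFT magnitudes."""
    n = min(len(track_a), len(track_b))
    A = librosa.stft(track_a[:n], n_fft=n_fft, hop_length=hop)
    B = librosa.stft(track_b[:n], n_fft=n_fft, hop_length=hop)
    mag = (1.0 - alpha) * np.abs(A) + alpha * np.abs(B)     # interpolated spectrum
    phase = np.angle(A)                                     # reuse one track's phase
    return librosa.istft(mag * np.exp(1j * phase), hop_length=hop, length=n)

def render_performance(solo_a, solo_b, supporting, alpha=0.5):
    """Morph, post-process (peak normalize), and mix with supporting tracks."""
    morphed = morph_tracks(solo_a, solo_b, alpha)
    morphed = morphed / max(np.max(np.abs(morphed)), 1e-9)  # post-processing step
    n = min(len(morphed), *(len(t) for t in supporting))
    return morphed[:n] + sum(t[:n] for t in supporting)     # acoustic image for playback
```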
Abstract:
A method and an electronic data processing apparatus for wave synthesis that retains the true qualities of naturally occurring sounds, such as those of musical instruments, speech, or other sounds. Transfer functions representative of recorded sound samples are pre-calculated and stored for use in an interpolative process to generate a transfer function representative of the sound to be synthesized. The preferred transfer functions are Chebyshev polynomial-based transfer functions, which assure a highly predictable harmonic content of synthesized sound. Output sound generation is driven by time domain signals produced by reconversion of a sequence of interpolated transfer functions. Non-harmonic sounds are synthesized using multiple frequency inputs to the reconverting (waveshaping) stage, or by parallel waveshaping stages. Speech sibilants and noise envelopes of instruments are synthesized by the input of noise into the waveshaping stage by modulation of a sinusoid with band-limited noise.
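The Chebyshev property behind the "highly predictable harmonic content" claim is that T_n(cos x) = cos(n x), so driving a weighted Chebyshev series with a cosine yields exactly the harmonic amplitudes used as weights. The sketch below builds two such transfer functions and interpolates between them over the course of a note; the amplitude values and the linear interpolation trajectory are illustrative assumptions.

```python
import numpy as np

def chebyshev_transfer(harmonic_amps):
    """Build a waveshaping transfer function as a weighted Chebyshev series.

    Driving the shaper with cos(w*t) produces harmonic n with amplitude
    harmonic_amps[n-1], since T_n(cos(x)) = cos(n*x).
    """
    coeffs = np.zeros(len(harmonic_amps) + 1)
    coeffs[1:] = harmonic_amps
    return np.polynomial.chebyshev.Chebyshev(coeffs)

def synthesize(f0, duration, sr, transfer_a, transfer_b):
    """Drive an interpolation of two transfer functions with a cosine at f0."""
    t = np.arange(int(duration * sr)) / sr
    x = np.cos(2 * np.pi * f0 * t)                      # driver confined to [-1, 1]
    mix = t / t[-1]                                     # linear interpolation trajectory
    return (1.0 - mix) * transfer_a(x) + mix * transfer_b(x)

# Usage sketch: morph from a bright to a dull spectrum over one second.
# bright = chebyshev_transfer([1.0, 0.5, 0.4, 0.3, 0.2])
# dull = chebyshev_transfer([1.0, 0.1, 0.05, 0.0, 0.0])
# y = synthesize(220.0, 1.0, 44100, bright, dull)
```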
Abstract:
Example articles of manufacture and apparatus disclosed herein for producing supplemental information for audio signature data obtain the audio signature data of a first time period including data relating to at least one of time or frequency components representing a first characteristic of media. Disclosed examples also obtain first semantic audio signature data, for the first time period, that is a measure of generalized information representing characteristics of the media. Disclosed examples further store the audio signature data of the first time period in association with a second time period when it is determined that second semantic audio signature data for the second time period substantially matches the first semantic audio signature data for the first time period.
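A loose illustration of that bookkeeping, with the matching rule invented for the example: keep detailed and semantic signatures per time period, and when a new period's semantic signature falls within a threshold of an earlier one, record an association between the two periods.

```python
import numpy as np

class SemanticSignatureStore:
    """Link audio signature data across time periods whose semantic
    signatures substantially match (illustrative reading of the abstract)."""

    def __init__(self, threshold=0.1):
        self.threshold = threshold
        self.entries = {}        # period -> (signature, semantic_signature)
        self.associations = []   # (earlier_period, later_period) links

    def add_period(self, period, signature, semantic_signature):
        semantic = np.asarray(semantic_signature, dtype=float)
        for other_period, (_, other_semantic) in self.entries.items():
            # "Substantially matches": mean absolute difference below a threshold.
            if np.mean(np.abs(semantic - other_semantic)) < self.threshold:
                self.associations.append((other_period, period))
        self.entries[period] = (np.asarray(signature, dtype=float), semantic)
```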