Abstract:
A system and method are disclosed for retrieving audio segments from a spoken document. The spoken document is preferably one with moderate word error rates, such as a telephone call or teleconference. The method comprises converting speech associated with a spoken document into a lattice representation and indexing the lattice representation of speech. These steps are typically performed off-line. Upon receiving a query from a user, the method further comprises searching the indexed lattice representation of speech and returning retrieved audio segments from the spoken document that match the user query.
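The off-line indexing step and the query-time search step can be sketched as follows. This is a minimal illustration only, assuming each lattice is a list of arcs of the form `(start_node, end_node, word, posterior)`. The arc format, the function names `build_index` and `search`, and the `threshold` parameter are hypothetical and are not taken from the disclosure. The query is treated as a bag of words rather than an ordered phrase.

```python
from collections import defaultdict

# Hypothetical lattice format: each arc is (start_node, end_node, word, posterior).

def build_index(lattices):
    """Off-line step: map each word to {segment_id: best posterior in that segment}."""
    index = defaultdict(dict)
    for segment_id, arcs in lattices.items():
        for _start, _end, word, posterior in arcs:
            # Keep the highest posterior seen for this word in this segment.
            best = index[word].get(segment_id, 0.0)
            index[word][segment_id] = max(best, posterior)
    return index

def search(index, query, threshold=0.1):
    """Query step: return segments containing every query word, best score first."""
    words = query.lower().split()
    if not words:
        return []
    candidates = None
    for word in words:
        # Segments where this word is hypothesised with enough confidence.
        hits = {seg: p for seg, p in index.get(word, {}).items() if p >= threshold}
        if candidates is None:
            candidates = hits
        else:
            # Intersect with earlier words and multiply the scores.
            candidates = {seg: candidates[seg] * p
                          for seg, p in hits.items() if seg in candidates}
    return sorted(candidates, key=lambda seg: -candidates[seg])
```

Because the index stores alternative hypotheses rather than only the single best transcript, a word that the recognizer ranked second can still be retrieved, which is the motivation for indexing lattices when word error rates are moderate.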
Abstract:
A method and apparatus for retrieving data from a database are disclosed. A plurality of entities are stored in a first memory, and information about each stored entity is stored in a second memory. Criteria in the form of at least one indefinite expression are received from a user for selecting entities from the stored entities. The received criteria are translated into terms used in the stored information. A sequence of entities is then selected based on the translated criteria.
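One way to picture the translation step is a lookup table that maps each indefinite expression to a range over an attribute in the stored information. The sketch below is an assumption made for illustration: the expressions, the attribute names `tempo_bpm` and `year`, the numeric ranges, and the function `select_entities` are all hypothetical and do not come from the disclosure.

```python
# Hypothetical translation table: indefinite expression -> (attribute, low, high).
TRANSLATIONS = {
    "fast": ("tempo_bpm", 120, 300),
    "slow": ("tempo_bpm", 0, 80),
    "recent": ("year", 2000, 9999),
}

def select_entities(info, expressions):
    """Return ids of entities whose stored information meets every translated criterion."""
    # Translate the user's vague terms into concrete attribute ranges.
    criteria = [TRANSLATIONS[expr] for expr in expressions]
    selected = []
    for entity_id, attrs in info.items():
        if all(attr in attrs and low <= attrs[attr] <= high
               for attr, low, high in criteria):
            selected.append(entity_id)
    return selected
```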
Abstract:
Examples described herein relate to music discovery. In one aspect, a method is provided that involves (a) receiving by a computing device an indication of a search tool from among a plurality of search tools, where each search tool of the plurality of search tools is associated with at least one respective media service, (b) receiving by the computing device an indication of a media characteristic, where the computing device receives the media characteristic via the indicated search tool, (c) selecting by the computing device one or more of the at least one respective media service that maintains media associated with the indicated media characteristic, and (d) sending by the computing device an indication of the selected one or more of the at least one respective media service.
Abstract:
Systems, devices, apparatuses, components, methods, and techniques for cadence and media content phase alignment are provided. An example media-playback device includes a content output device that operates to output media content, a cadence-acquiring device, a phase-delay calibration engine, a cadence-based media content selection engine, and a phase-aligned media playback engine. The cadence-acquiring device includes a movement-determining device and a cadence-determination engine configured to determine a cadence based on movement data captured by the movement-determining device. The phase-delay calibration engine is configured to determine phase delay values for at least one cadence value. The cadence-based media content selection engine is configured to identify a media content item based on the cadence determined by the cadence-acquiring device. The phase-aligned media playback engine is configured to align the identified media content item to the repetitive-motion activity and cause the media-output device to output the aligned media content item.
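The cadence, selection, and alignment stages can be sketched roughly as below. This is an illustrative sketch under stated assumptions: movement data is reduced to a list of step timestamps in seconds, each media item is described only by its tempo in beats per minute, and the phase delay is a fixed number of seconds by which playback is started early. The function names and the `phase_delay` parameter are hypothetical.

```python
def cadence_spm(step_times):
    """Estimate cadence in steps per minute from step timestamps (seconds)."""
    if len(step_times) < 2:
        raise ValueError("need at least two steps to estimate cadence")
    intervals = [b - a for a, b in zip(step_times, step_times[1:])]
    return 60.0 / (sum(intervals) / len(intervals))

def select_track(tracks, cadence):
    """Pick the track whose tempo (BPM) is closest to the cadence."""
    return min(tracks, key=lambda name: abs(tracks[name] - cadence))

def start_delay(last_step_time, cadence, now, phase_delay=0.0):
    """Seconds to wait from `now` so the first beat lands on a predicted step.

    `phase_delay` (assumed output latency, in seconds) moves the start earlier.
    """
    period = 60.0 / cadence
    target = last_step_time + period - phase_delay
    # Skip predicted steps that are already in the past.
    while target < now:
        target += period
    return target - now
```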
Abstract:
For each of a plurality of performance parts, a database (221) stores therein a plurality of part performance data. The part performance data for each of the parts includes a sound generation pattern and tone data corresponding to the sound generation pattern. A query pattern indicative of a sound generation pattern to be made an object of search is input by a user. A search is made through the database for part performance data including a sound generation pattern matching the query pattern. In response to a user's operation, one part performance data is identified from among searched-out results from the database, and the sound generation pattern of the identified part performance data is instructed as a new query pattern (Sa8b). Then, a further search is made through the database for part performance data including a sound generation pattern matching the new query pattern. In accordance with a user's operation, one part performance data is identified from among searched-out results, and the identified part performance data is edited. The thus-edited data can be registered into the database as new part performance data.
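The search-then-refine loop can be illustrated with a small sketch. It assumes a sound generation pattern is a list of onset positions within one measure, each in the range 0 to 1, and it uses a simple symmetric nearest-onset distance as the matching score. The distance measure, the function names, and the sample patterns are assumptions for illustration, not the matching method of the disclosure. Wrap-around at the measure boundary is ignored.

```python
def pattern_distance(query, pattern):
    """Symmetric mean nearest-onset distance between two onset lists."""
    def one_way(a, b):
        # Average distance from each onset in `a` to its nearest onset in `b`.
        return sum(min(abs(x - y) for y in b) for x in a) / len(a)
    return one_way(query, pattern) + one_way(pattern, query)

def search_patterns(database, query, top_k=3):
    """Return the names of the `top_k` stored patterns closest to the query."""
    ranked = sorted(database, key=lambda name: pattern_distance(query, database[name]))
    return ranked[:top_k]
```

To mimic the second search described above, the pattern of a result chosen by the user can be passed back in as the new query, for example `search_patterns(database, database[chosen_name])`.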
Abstract:
Methods and systems for music information management are provided. When audio data is generated in an electronic device, a control module is notified to launch a specific application to perform a music recognition procedure for the audio data, thus to obtain music information corresponding to the audio data,
Abstract:
Disclosed are an apparatus and method for retrieving multimedia contents represented in Moving Picture Experts Group 7 (MPEG-7) format by transforming a user query into an MPEG-7 query format. The method for retrieving multimedia contents includes: representing a user query by using an indicator indicating a specific region of an MPEG-7 document and a reference for referring to the indicator; analyzing the meaning of the user query represented by using the indicator and the reference to thereby produce an analysis result; and retrieving multimedia contents according to the analysis result. The present research can satisfy more than two retrieval conditions within the same structure in an MPEG-7 query format, and it can also clearly represent that two different MPEG-7 documents are referred to. Since the meaning of a user query is analyzed accurately during the retrieval process, it is possible to precisely retrieve multimedia contents.
Abstract:
The present invention generally relates to the field of content-based music information retrieval systems, in particular to a method and a query-by-humming (QbH) database system (100') for processing queries in the form of analog audio sequences which encompass recorded parts of sung, hummed or whistled tunes (102), recorded parts of a melody (300a) played on a musical instrument and/or a speaker's recorded voice (400) articulating at least one part of a song's lyrics to retrieve textual background information about a musical piece whose score is stored in an integrated database (103, 105) of said system after having analyzed and recognized said melody (300a). According to one embodiment of the present invention, said method is characterized by the steps of recording (S1) said analog audio sequences (102, 300a, 400), extracting (S4a) and analyzing (S4b) various acoustic-phonetic speech characteristics of the speaker's voice and pronunciation from spoken parts (400) of a recorded song's lyrics (102") and recognizing (S4c) syntax and semantics of said lyrics (102"). The method further comprises the steps of extracting (S2a), analyzing (S2b) and recognizing (S2c) musical key characteristics from the analog audio sequences (102, 300a, 400), which are given by the semitone numbers of the particular notes, the intervals and/or interval directions of the melody and the time values of the notes and pauses the rhythm of said melody is composed of, the key, beat, tempo, volume, agogics, dynamics, phrasing, articulation, timbre and instrumentation of said melody, the harmonies of accompaniment chords and/or electronic sound effects generated by said musical instrument. 
The invention is characterized by the step of calculating (S3a) a similarity measure indicating the similarity of melody and lyrics of the recorded audio sequence (102, 300a) compared to melody and lyrics of various music files stored in said database (103, 105) by performing a Viterbi search algorithm on a three-dimensional search space, said search space having a first dimension ( t ) for the time, a second dimension ( S ) for an appropriate coding of the acoustic-phonetic speech characteristics and a third dimension ( H ) for an appropriate coding of the musical key characteristics, and generating (S3b) a ranked list (107) of said music files.
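The interval and interval-direction features named among the musical key characteristics can be illustrated with a short sketch. It assumes the recognized melody is already available as a list of MIDI note numbers, and it covers only this feature coding. It does not implement the three-dimensional Viterbi search. The function names and the U/D/R letter coding are choices made for this illustration.

```python
def intervals(midi_notes):
    """Semitone intervals between consecutive notes (transposition-invariant)."""
    return [b - a for a, b in zip(midi_notes, midi_notes[1:])]

def directions(midi_notes):
    """Interval directions as a string: U = up, D = down, R = repeat."""
    return "".join("U" if step > 0 else "D" if step < 0 else "R"
                   for step in intervals(midi_notes))
```

Because both codings depend only on the differences between successive notes, a tune hummed in a different key from the stored score yields the same sequence.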
Abstract:
An indexing apparatus and method are described for use in identifying portions of data in a database for comparison with a query. In an embodiment, the index includes a key which comprises a sequence of phoneme classifications derived from the input query by classifying each of the phonemes in the input query with a number of phoneme classes, with the phonemes in each class being defined as those that are confusable with the other phonemes in the same class.
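A minimal sketch of such a key is shown below. It assumes a small hand-written table of confusable phoneme classes. The table contents, the phoneme symbols, and the function names `class_key` and `build_index` are hypothetical and serve only to show how confusable phonemes collapse to the same key.

```python
# Hypothetical classes: phonemes sharing a class id are treated as confusable.
PHONEME_CLASSES = {
    "p": 0, "b": 0,
    "t": 1, "d": 1,
    "k": 2, "g": 2,
    "m": 3, "n": 3,
    "ae": 4, "eh": 4, "ih": 4,
}

def class_key(phonemes):
    """Map a phoneme sequence to its sequence of class ids, used as the index key."""
    return tuple(PHONEME_CLASSES[p] for p in phonemes)

def build_index(entries):
    """Group database entries by the class key of their phoneme sequence."""
    index = {}
    for entry_id, phonemes in entries.items():
        index.setdefault(class_key(phonemes), []).append(entry_id)
    return index
```

A query whose phonemes were misrecognized within a class, such as "p ae d" for "b ae t", still produces the same key, so the lookup returns the relevant entries for a more detailed comparison.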