System for generation of a large-scale database of heterogeneous speech
Abstract:
A system for generating a large-scale database of heterogeneous speech is provided. The system comprises: a processor; a plurality of independent computation cores configured to generate signatures of a plurality of speech segments; a large-scale database configured to maintain a plurality of transcribed multimedia signals; and a memory containing instructions that, when executed by the processor, configure the system to: randomly select a plurality of speech segments from the plurality of multimedia signals, wherein each speech segment of the plurality of speech segments is of a random length; provide the plurality of speech segments to the plurality of independent computation cores for generation of the signatures; collect the signatures from the plurality of independent computation cores; and populate the large-scale database with the signatures respective of the plurality of multimedia signals.
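The four steps recited in the abstract (random selection, dispatch to the cores, collection, population) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the functions `select_random_segments`, `generate_signature`, and `build_database`, the length bounds, and the dictionary standing in for the large-scale database. The abstract does not say how a signature is computed, so a SHA-256 digest is used purely as a placeholder, and a thread pool stands in for the independent computation cores. The sketch also assumes every signal has at least `min_len` samples.

```python
import hashlib
import random
from concurrent.futures import ThreadPoolExecutor


def select_random_segments(samples, count, min_len, max_len, rng):
    """Pick `count` segments at random offsets, each of a random length."""
    segments = []
    for _ in range(count):
        # Random length, capped so the segment fits inside the signal.
        length = rng.randint(min_len, min(max_len, len(samples)))
        start = rng.randint(0, len(samples) - length)
        segments.append((start, samples[start:start + length]))
    return segments


def generate_signature(segment):
    """Placeholder signature (assumption): SHA-256 of the raw segment bytes."""
    return hashlib.sha256(bytes(segment)).hexdigest()


def build_database(signals, segments_per_signal=4, min_len=8, max_len=32,
                   cores=4, seed=0):
    """Map each signal id to its transcript and its segment signatures.

    `signals` maps an id to a (samples, transcript) pair, where `samples`
    is a bytes object. The returned dict stands in for the database.
    """
    rng = random.Random(seed)
    database = {}
    # The worker pool stands in for the independent computation cores.
    with ThreadPoolExecutor(max_workers=cores) as pool:
        for signal_id, (samples, transcript) in signals.items():
            segments = select_random_segments(
                samples, segments_per_signal, min_len, max_len, rng)
            # Provide the segments to the workers and collect the results.
            signatures = list(
                pool.map(generate_signature, [seg for _, seg in segments]))
            database[signal_id] = {
                "transcript": transcript,
                "entries": [
                    {"start": start, "length": len(seg), "signature": sig}
                    for (start, seg), sig in zip(segments, signatures)
                ],
            }
    return database


if __name__ == "__main__":
    demo = {"clip-1": (bytes(range(200)), "hello world")}
    for entry in build_database(demo)["clip-1"]["entries"]:
        print(entry["start"], entry["length"], entry["signature"][:12])
```

Seeding the random generator makes the selection reproducible, which is convenient when testing. A real deployment would presumably replace the placeholder hash with the signature generator the cores actually implement.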