Abstract:
A method for determining a user attention to at least one multimedia data element (MMDE) displayed in a web-page over a user computing device. The method comprises receiving a request to determine the user attention, wherein the request includes at least the web-page and an identification of the at least one MMDE in the web-page; receiving at least one sensory signal captured by at least one sensor connected to the user computing device; querying a deep-content-classification (DCC) system to find a match between at least one concept structure and the received sensory signal; receiving a first set of metadata related to the at least one matched concept structure; analyzing the returned set of metadata to determine the user attention with respect to the at least one MMDE; and associating the at least one MMDE with the determined user attention.
Abstract:
A method and apparatus for unsupervised clustering of a large-scale collection of multimedia data elements. The method comprises generating a first cluster from the large-scale collection by: matching each of the multimedia data elements to all other multimedia data elements in the large-scale collection, determining a clustering score for each match being performed, clustering multimedia data elements having a clustering score above a threshold to create the first cluster; and storing the first cluster in a storage unit.
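As an illustration only, the first-cluster step described above can be sketched as follows. The abstract does not say how a clustering score is computed, so the Jaccard overlap of feature sets used here, the names `jaccard` and `first_cluster`, and the sample data are all assumptions.

```python
from itertools import combinations

def jaccard(a: frozenset, b: frozenset) -> float:
    """Hypothetical clustering score: overlap between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def first_cluster(elements: dict, threshold: float) -> set:
    """Match every element against every other element and collect the
    names of elements taking part in a match scoring above the threshold."""
    clustered = set()
    for (name_a, feat_a), (name_b, feat_b) in combinations(elements.items(), 2):
        if jaccard(feat_a, feat_b) > threshold:
            clustered.update((name_a, name_b))
    return clustered

# Made-up collection: each element is reduced to a set of feature labels.
collection = {
    "img1": frozenset({"sky", "sea", "sand"}),
    "img2": frozenset({"sky", "sea", "boat"}),
    "img3": frozenset({"car", "road"}),
}
cluster = first_cluster(collection, threshold=0.4)
storage_unit = {"cluster-1": cluster}  # stand-in for the storage unit
```

The all-pairs comparison is quadratic in the collection size; a large-scale system would presumably need an index or approximate matching, which this sketch does not attempt.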
Abstract:
A method for translating natural language text. The method comprises receiving at least one multimedia element including a first natural language text; generating metadata representing the first natural language text; generating at least one signature for the at least one multimedia element; determining the context of the at least one multimedia element respective of the signature; and searching for a multimedia content element (MMCE) corresponding to the received at least one multimedia element that includes a second natural language text, wherein the search is performed using the at least one signature, the context, and the metadata generated for the at least one input text in the first natural language respective of the context, wherein the second natural language text is a translated text of the first natural language text.
Abstract:
A method and apparatus for symbol-space based compression of patterns are provided. The method comprises receiving an input sequence, the input sequence being of a first length and comprising a plurality of symbols; extracting all common patterns within the input sequence, wherein a common pattern includes at least two symbols; generating an output sequence responsive to the extraction of all common patterns, wherein the output sequence has a second length that is shorter than the first length; and storing in a memory the output sequence as a data layer, wherein the output sequence is provided as a new input sequence for a subsequent generation of a data layer.
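The layered compression described above can be illustrated with a simplified sketch. It swaps in a byte-pair-encoding style substitution: each layer replaces only the single most frequent adjacent pair of symbols, rather than extracting all common patterns as the abstract states. The function names and the `<xy>` naming of new symbols are assumptions.

```python
from collections import Counter

def compress_layer(seq: list) -> tuple:
    """Replace the most frequent adjacent pair with one new symbol.
    Returns (output sequence, substitution table); the table is empty
    when no pair occurs at least twice, i.e. nothing was compressed."""
    pairs = Counter(zip(seq, seq[1:]))
    if not pairs:
        return seq, {}
    pair, count = pairs.most_common(1)[0]
    if count < 2:
        return seq, {}
    new_symbol = f"<{pair[0]}{pair[1]}>"
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_symbol)  # two symbols collapse into one
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out, {new_symbol: pair}

def build_layers(seq: list) -> list:
    """Feed each output sequence back in as the next input sequence,
    keeping every output as one data layer."""
    layers = []
    while True:
        out, table = compress_layer(seq)
        if not table:
            return layers
        layers.append(out)
        seq = out

layers = build_layers(list("abababc"))
```

Each stored layer is strictly shorter than its input, because a layer is kept only when at least one pair was replaced.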
Abstract:
An assembler for generating a complex signature of an input multimedia data element comprises a first interface for receiving a plurality of signatures respective of a plurality of minimum size multimedia data elements, wherein each of the plurality of the minimum size multimedia data elements is a minimal partition of the input multimedia data element; an assembly unit for combining the plurality of signatures respective of the plurality of minimum size multimedia data elements to generate the complex signature; and a second interface for storing at least the complex signature in a storage unit connected thereto.
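A minimal sketch of the assembly step follows, under stated assumptions: the abstract specifies neither how the per-partition signatures are generated nor how they are combined. Here a truncated BLAKE2b digest stands in for the signature generator, concatenation stands in for the assembly unit, and a dictionary stands in for the storage unit. The 4-byte partition size is arbitrary.

```python
import hashlib

PARTITION_SIZE = 4  # hypothetical "minimum size" of a partition, in bytes

def partition_signature(chunk: bytes) -> bytes:
    """Stand-in signature generator: a 4-byte BLAKE2b digest of the chunk."""
    return hashlib.blake2b(chunk, digest_size=4).digest()

def assemble(signatures: list) -> bytes:
    """Stand-in assembly unit: concatenate the partition signatures in order."""
    return b"".join(signatures)

storage_unit = {}  # stand-in for the storage unit behind the second interface

element = b"abcdefghij"
chunks = [element[i:i + PARTITION_SIZE]
          for i in range(0, len(element), PARTITION_SIZE)]
signatures = [partition_signature(chunk) for chunk in chunks]
storage_unit["element-1"] = assemble(signatures)
```

A cryptographic digest changes completely on any small change to its input, so it is used here only to show the data flow, not as a content signature that tolerates small changes.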
Abstract:
A method for detection of common patterns within unstructured data elements. The method includes extracting a plurality of unstructured data elements retrieved from a plurality of big data sources; generating at least one signature for each of the plurality of unstructured data elements; identifying common patterns among the generated signatures; clustering the signatures identified to have common patterns; and correlating the generated clusters to identify associations between their respective identified common patterns.
Abstract:
A method for generating a concept database respective of a plurality of multimedia data elements (MMDEs) comprises generating a plurality of items from a received MMDE of the plurality of MMDEs; determining the items that are of interest for signature generation; generating at least one signature responsive to at least one item of interest of the received MMDE of the plurality of MMDEs; clustering a plurality of signatures received from the signature generator responsive of the plurality of MMDEs; reducing the number of signatures in each cluster to create a signature reduced cluster (SRC) of the cluster; associating metadata with the SRC to a concept structure comprised of a plurality of SRCs and their associated metadata; and generating at least one index for mapping the received MMDE to at least one concept structure, wherein the concept database includes concept structures and the generated indices for the plurality of MMDEs.
Abstract:
Content-based clustering, recognition, classification and search of high volumes of multimedia data in real time are provided. The embodiments disclosed herein are dedicated to fast, real-time generation of signatures for high volumes of multimedia content segments, based on relevant audio and visual signals, and to scalable matching of those signatures against a high-volume database of content-segment signatures. The embodiments disclosed herein can be implemented in any application that involves large-scale content-based clustering, recognition and classification of multimedia data, such as content tracking, video filtering, multimedia taxonomy generation, video fingerprinting, speech-to-text, audio classification, object recognition, video search, and any other application requiring content-based signature generation and matching for large content volumes, such as the web and other large-scale databases.
Abstract:
A method for reducing an amount of storage required for maintaining a large-scale collection of multimedia data elements by unsupervised clustering of multimedia data elements. The method comprises processing the multimedia data elements in the large-scale collection to generate a first cluster of multimedia data elements; storing the first cluster in a storage unit; repeating the generation of a new cluster from the first cluster and un-clustered multimedia elements in the large-scale collection until a single cluster is reached; and storing the new cluster generated at each iteration in the storage unit, wherein an N-th cluster generated at the N-th iteration is stored in the storage unit, wherein the amount of storage required to store the N-th cluster is less than an amount of storage of the large-scale collection, whereby the unsupervised clustering reduces the storage amount required to store the multimedia data elements in the large-scale collection.
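The repeat-until-single-cluster loop described above can be sketched as a simple agglomerative merge. This is an illustration under assumptions: a cluster is represented as the union of its members' feature sets, the score is Jaccard overlap, and each iteration merges only the best-scoring pair. None of these choices comes from the abstract, and the sketch does not model the storage saving.

```python
from itertools import combinations

def score(a: frozenset, b: frozenset) -> float:
    """Hypothetical clustering score: Jaccard overlap of two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_until_single(elements: list) -> list:
    """Merge the best-matching pair at each iteration until one cluster
    remains. Returns the cluster produced (and stored) at each iteration."""
    pool = list(elements)
    stored = []
    while len(pool) > 1:
        i, j = max(combinations(range(len(pool)), 2),
                   key=lambda ij: score(pool[ij[0]], pool[ij[1]]))
        merged = pool[i] | pool[j]
        # The merged cluster replaces its two parts in the working pool.
        pool = [p for k, p in enumerate(pool) if k not in (i, j)] + [merged]
        stored.append(merged)
    return stored

elements = [
    frozenset({"sky", "sea", "sand"}),
    frozenset({"sky", "sea", "boat"}),
    frozenset({"car", "road"}),
]
stored = cluster_until_single(elements)
```

With three elements the loop runs twice: the two overlapping elements merge first, and the second iteration folds in the remaining element to reach a single cluster.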
Abstract:
A method for conducting search-by-content is provided. The method includes, responsive to an input multimedia content item provided to a user device, checking whether the input multimedia content item matches at least one concept of a plurality of concepts cached in the user device; retrieving a characteristics set for a user of the user device; performing a search, using the at least one matching concept, for multimedia content items similar to the input multimedia content item; determining which of the search results are of interest to the user based on the characteristics set for the user; and saving results that are of interest to the user in the user device, wherein the saved results include multimedia content items.
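The flow described above (match against cached concepts, search, filter by user characteristics, save) can be sketched as follows. Everything here is hypothetical: concepts and user characteristics are reduced to plain string tags, and the search backend is a dictionary keyed by concept.

```python
def search_by_content(item_tags: set, cached_concepts: set,
                      user_interests: set, index: dict) -> list:
    """Return the search results that are of interest to the user."""
    # 1. Check which cached concepts the input item matches.
    matching = sorted(cached_concepts & item_tags)
    # 2. Search for similar items using the matching concepts.
    candidates = [r for concept in matching for r in index.get(concept, [])]
    # 3. Keep only results that overlap the user's characteristics set.
    return [r for r in candidates if r["tags"] & user_interests]

# Hypothetical search index: concept -> similar multimedia content items.
index = {
    "dog": [{"id": "v1", "tags": {"dog", "park"}},
            {"id": "v2", "tags": {"dog", "vet"}}],
    "cat": [{"id": "v3", "tags": {"cat"}}],
}

saved = search_by_content(
    item_tags={"dog", "grass"},      # concepts detected in the input item
    cached_concepts={"dog", "cat"},  # concepts cached in the user device
    user_interests={"park"},         # characteristics set for the user
    index=index,
)
```

In this made-up example only the item tagged `park` is saved: the input matches the cached concept `dog`, and one of the two `dog` results overlaps the user's interests.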