Abstract:
Provided are a system and a method for indexing high-dimensional data in parallel in a cluster environment. The system includes a Spill-tree creation means for creating a Spill-tree using a sampled N-dimensional feature vector, a feature vector division storage means for storing the N-dimensional feature vector in a distributed manner in a terminal node of the Spill-tree, and a local signature creation means for creating and managing a local signature for the N-dimensional feature vector distributed to each node of the Spill-tree.
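The first two steps (building the tree from a sample, then distributing the vectors over its terminal nodes) can be illustrated with a minimal single-process sketch. Everything here is an assumption made for illustration: the names `Node`, `build`, and `insert`, the median split on the widest dimension, and the overlap width `tau` are not taken from the abstract. The local signature step is left out because the abstract does not say how a signature is computed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence

Vector = Sequence[float]


@dataclass
class Node:
    dim: int = 0                      # split dimension (unused in leaves)
    split: float = 0.0                # split value (unused in leaves)
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    bucket: List[Vector] = field(default_factory=list)  # vectors stored in a leaf

    @property
    def is_leaf(self) -> bool:
        return self.left is None


def build(sample: List[Vector], leaf_size: int) -> Node:
    """Derive split planes from a sample: median split on the widest dimension."""
    if len(sample) <= leaf_size:
        return Node()
    dims = range(len(sample[0]))
    dim = max(dims, key=lambda d: max(v[d] for v in sample) - min(v[d] for v in sample))
    values = sorted(v[dim] for v in sample)
    split = values[len(values) // 2]
    left = [v for v in sample if v[dim] < split]
    right = [v for v in sample if v[dim] >= split]
    if not left or not right:         # no useful split on this dimension
        return Node()
    return Node(dim, split, build(left, leaf_size), build(right, leaf_size))


def insert(node: Node, v: Vector, tau: float) -> None:
    """Store v in every leaf whose region, widened by tau, contains it.

    Vectors within tau of a split plane are stored on both sides, which is
    the overlap ("spill") that gives the tree its name.
    """
    if node.is_leaf:
        node.bucket.append(v)
        return
    if v[node.dim] < node.split + tau:
        insert(node.left, v, tau)
    if v[node.dim] >= node.split - tau:
        insert(node.right, v, tau)
```

In a cluster setting each leaf bucket would presumably live on a different machine; the sketch keeps them in one process to show only the routing logic.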
Abstract:
Provided are a stream data processing system and method for avoiding duplicate data processing. The system includes: an evaluation result storing unit for updating and storing a query condition evaluation result; a window evaluating unit for performing window evaluation; a data separating unit for separating data into new data and duplicate input data; a reuse result extracting unit for receiving the duplicate input data from the data separating unit and extracting the stored query condition evaluation result; a query condition evaluating unit for receiving the new data from the data separating unit, performing query condition evaluation, and creating a query condition evaluation result; and a result organizing unit for receiving the query condition evaluation results, merging and outputting them, and transmitting them to the evaluation result storing unit.
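The flow through these units can be sketched as follows. This is a minimal illustration, assuming overlapping windows in which the same record can appear more than once and assuming each record carries a unique id; the function `process_window`, the `(id, value)` record layout, and the dictionary used as the evaluation result store are all hypothetical.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

Record = Tuple[Hashable, float]  # (record id, value); layout is an assumption


def process_window(
    window: Iterable[Record],
    predicate: Callable[[float], bool],
    result_cache: Dict[Hashable, bool],
) -> Tuple[List[Record], int]:
    """Evaluate `predicate` over one window, reusing stored results.

    Returns the records that satisfy the predicate and the number of
    predicate evaluations actually performed for this window.
    """
    window = list(window)

    # Data separation: records already evaluated vs. records seen for the first time.
    duplicates = [r for r in window if r[0] in result_cache]
    new = [r for r in window if r[0] not in result_cache]

    # Reuse: look up the stored evaluation results for duplicate input.
    results = {r[0]: result_cache[r[0]] for r in duplicates}

    # Evaluate the query condition only for new input.
    for rec_id, value in new:
        results[rec_id] = predicate(value)

    # Organize: merge both result sets and store them for the next window.
    # Results for records that have left the window are dropped.
    result_cache.clear()
    result_cache.update(results)

    matched = [r for r in window if results[r[0]]]
    return matched, len(new)
```

Calling `process_window` on two consecutive windows that share most of their records evaluates the predicate only for the records that entered with the second window.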
Abstract:
Provided are a content-based searching method and system for multimedia objects using high-dimensional feature vector data based on a 2-level signature. The method for searching the high-dimensional data using a signature file includes calculating a first-level query signature and a second-level query signature by using the query feature vector, performing a first filtering operation to obtain a primary candidate cell group by searching a second-level signature file, and performing a secondary filtering operation to obtain, from the primary candidate cell group, a secondary candidate cell group having a high similarity. Accordingly, the high-dimensional data searching method and system can process a query quickly and accurately and can increase the searching accuracy by using an enhanced signature of the query feature vector.
Abstract:
Provided are a system and a method for processing integrated queries against an input data stream and data stored in a database using a trigger. The system includes: a data stream manager for managing a continuously input data stream; a trigger result manager for registering a trigger in a database which interworks with the trigger result manager and forming a set of results that are obtained by executing the registered trigger, thereby providing the set of results in real time; and an executor for processing an integrated query against the data stream from the data stream manager and the data stored in the database, wherein the integrated query is processed by referring to the set of results from the trigger result manager for the data stored in the database.
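As a rough illustration of the idea, the sketch below uses Python's standard `sqlite3` module: triggers registered in the database call an application-defined function that keeps an in-memory result set current, and the stream side consults that set instead of querying the database for every tuple. The table `sensors`, the function `publish`, and the threshold query are assumptions chosen for the example and are not part of the abstract.

```python
import sqlite3

# Trigger result set: kept current by the triggers registered below.
trigger_results = {}


def publish(sensor_id, threshold):
    """Called from inside the database whenever a trigger fires."""
    trigger_results[sensor_id] = threshold
    return 0


conn = sqlite3.connect(":memory:")
conn.create_function("publish", 2, publish)
conn.executescript(
    """
    CREATE TABLE sensors (id INTEGER PRIMARY KEY, threshold REAL);

    CREATE TRIGGER sensors_ins AFTER INSERT ON sensors
    BEGIN
        SELECT publish(NEW.id, NEW.threshold);
    END;

    CREATE TRIGGER sensors_upd AFTER UPDATE ON sensors
    BEGIN
        SELECT publish(NEW.id, NEW.threshold);
    END;
    """
)


def integrated_query(stream):
    """Yield stream readings that exceed the stored threshold of their sensor.

    The stored data is read from the trigger result set, not from the
    database, so no SQL query is issued per stream tuple.
    """
    for sensor_id, value in stream:
        threshold = trigger_results.get(sensor_id)
        if threshold is not None and value > threshold:
            yield sensor_id, value
```

One limitation of this sketch is that the in-memory set is not rolled back if the database transaction that fired the trigger is rolled back.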
Abstract:
Provided are a system and method for processing continuous integrated queries on both a data stream and stored data using a user-defined shared trigger. The system includes a data stream manager for managing a data stream input from outside; a continuous integrated query manager for managing the continuous integrated queries input from an external application; a trigger manager for managing the user-defined shared trigger input from the external application and registering the user-defined shared trigger in an external database; a trigger result manager for forming and managing a trigger result set from the execution result of the user-defined shared trigger registered in the external database; and a continuous integrated query performer for processing the continuous integrated queries by referring to the transmitted data stream and the trigger result set.
Abstract:
Disclosed herein are an apparatus and method for managing a data stream distributed parallel processing service. The apparatus includes a service management unit, a Quality of Service (QoS) monitoring unit, and a scheduling unit. The service management unit registers a plurality of tasks constituting the data stream distributed parallel processing service. The QoS monitoring unit gathers information about the load of the plurality of tasks and information about the load of a plurality of nodes constituting a cluster which provides the data stream distributed parallel processing service. The scheduling unit arranges the plurality of tasks by distributing the plurality of tasks among the plurality of nodes based on the information about the load of the plurality of tasks and the information about the load of the plurality of nodes.
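The abstract does not state which placement policy the scheduling unit applies. As one possible illustration, the sketch below uses a simple greedy heuristic: tasks are taken in order of decreasing load and each is placed on the node that currently carries the least load. The function name `schedule` and the representation of load as a single number per task and per node are assumptions.

```python
import heapq
from typing import Dict, List


def schedule(task_load: Dict[str, float], node_load: Dict[str, float]) -> Dict[str, List[str]]:
    """Assign each task to the currently least-loaded node (greedy heuristic)."""
    if not node_load:
        raise ValueError("at least one node is required")

    # Min-heap of (current load, node name), seeded with the monitored node load.
    heap = [(load, node) for node, load in node_load.items()]
    heapq.heapify(heap)

    placement: Dict[str, List[str]] = {node: [] for node in node_load}

    # Heaviest tasks first; ties are broken by task name for determinism.
    for task, load in sorted(task_load.items(), key=lambda kv: (-kv[1], kv[0])):
        current, node = heapq.heappop(heap)
        placement[node].append(task)
        heapq.heappush(heap, (current + load, node))

    return placement
```

In a running service the monitored loads change over time, so a scheduler of this kind would be re-run, or tasks migrated, when the monitoring data shows an imbalance.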
Abstract:
Disclosed herein is a system for processing large-capacity data in a distributed parallel processing manner based on MapReduce using a plurality of computing nodes. The distributed parallel processing system is configured to provide an incremental MapReduce-based distributed parallel processing function for large-capacity stream data which is being continuously collected even during the performance of the distributed parallel processing, as well as for large-capacity stored data which has been previously collected.
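The abstract does not describe how the incremental function is realized. The sketch below shows one common way to get the effect, assuming the reduce operation is associative and commutative so that the result for a new batch can be merged into the existing result without reprocessing earlier data. Word counting, the class `IncrementalWordCount`, and the function `map_phase` are illustrative choices only.

```python
from collections import Counter
from typing import Iterable


def map_phase(records: Iterable[str]) -> Counter:
    """Map and locally combine: count the words in one batch of records."""
    counts: Counter = Counter()
    for line in records:
        counts.update(line.split())
    return counts


class IncrementalWordCount:
    """Keeps the reduce output as state and folds each new batch into it."""

    def __init__(self) -> None:
        self.state: Counter = Counter()

    def add_batch(self, records: Iterable[str]) -> Counter:
        """Process one batch (stored data or newly collected stream data).

        Only the new batch is mapped; its partial result is then merged
        into the running state. Returns the partial result of the batch.
        """
        delta = map_phase(records)
        self.state.update(delta)  # Counter.update adds counts key by key
        return delta
```

Previously collected data would be fed in as the first batch, and each later call to `add_batch` would handle data that arrived in the meantime.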