Abstract:
Disclosed are a service providing method and device. The method includes collecting execution state information about a plurality of tasks that constitute at least one service and are dynamically distributed and arranged over a plurality of nodes, and performing scheduling based on the collected execution state information about the plurality of tasks. Each of the plurality of tasks has at least one input source and output source, and a unit of data to be processed for each input source and a data processing operation are defined by a user. The scheduling either deletes at least a portion of the data input into at least one task or processes that portion of the input data in at least one duplicate task, by referring to the defined unit of data. In particular, the present invention may effectively provide a service of analyzing and processing large stream data in semi-real time.
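The scheduling decision described in this abstract can be illustrated with a minimal sketch. It assumes an overloaded task is detected by counting pending data units against a threshold; the function name `schedule`, the parameters `max_units` and `can_duplicate`, and the action labels are hypothetical and do not come from the disclosure.

```python
from typing import List, Tuple


def schedule(pending: List[int], unit_size: int, max_units: int,
             can_duplicate: bool) -> Tuple[str, List[List[int]], List[List[int]]]:
    """Decide what to do with the input backlog of one task.

    pending       -- records waiting at one input source (hypothetical format)
    unit_size     -- user-defined number of records per processing unit
    max_units     -- assumed overload threshold, in whole units
    can_duplicate -- whether a duplicate task may be started on another node
    """
    # Cut the backlog into the user-defined units so that data is only ever
    # dropped or moved along unit boundaries (the last unit may be partial).
    units = [pending[i:i + unit_size] for i in range(0, len(pending), unit_size)]
    if len(units) <= max_units:
        return "keep", units, []
    keep, excess = units[:max_units], units[max_units:]
    # Excess units are either handed to a duplicate task or deleted.
    action = "duplicate" if can_duplicate else "delete"
    return action, keep, excess


if __name__ == "__main__":
    print(schedule(list(range(10)), unit_size=2, max_units=3, can_duplicate=False))
```

Working on whole units rather than single records keeps each surviving unit complete, which matches the abstract's requirement that the scheduler refer to the defined unit of data.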
Abstract:
Disclosed herein are an apparatus and method for managing a data stream distributed parallel processing service. The apparatus includes a service management unit, a Quality of Service (QoS) monitoring unit, and a scheduling unit. The service management unit registers a plurality of tasks constituting the data stream distributed parallel processing service. The QoS monitoring unit gathers information about the load of the plurality of tasks and information about the load of a plurality of nodes constituting a cluster which provides the data stream distributed parallel processing service. The scheduling unit arranges the plurality of tasks by distributing the plurality of tasks among the plurality of nodes based on the information about the load of the plurality of tasks and the information about the load of the plurality of nodes.
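As an illustration of load-based task arrangement, the sketch below places each task on the currently least-loaded node. The greedy heuristic, the function name `place_tasks`, and the single-number load model are assumptions made for the example; the abstract does not specify the placement policy.

```python
import heapq
from typing import Dict


def place_tasks(task_load: Dict[str, float], node_load: Dict[str, float]) -> Dict[str, str]:
    """Assign every task to a node, heaviest task first, least-loaded node first."""
    # Min-heap of (current load, node name); ties are broken by node name.
    heap = [(load, name) for name, load in node_load.items()]
    heapq.heapify(heap)
    placement: Dict[str, str] = {}
    # Placing heavy tasks first tends to give a more even final distribution.
    for task, load in sorted(task_load.items(), key=lambda kv: (-kv[1], kv[0])):
        current, node = heapq.heappop(heap)
        placement[task] = node
        heapq.heappush(heap, (current + load, node))
    return placement


if __name__ == "__main__":
    tasks = {"parse": 5, "join": 3, "sink": 1}   # hypothetical task loads
    nodes = {"n1": 0, "n2": 4}                   # hypothetical existing node loads
    print(place_tasks(tasks, nodes))
```

In a real deployment the load figures would come from the QoS monitoring unit, and the placement would be recomputed as those figures change.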
Abstract:
Provided are a cluster data management system and a method for data restoration using a shared redo log in the cluster data management system. The data restoration method includes collecting service information of a partition served by a failed partition server, dividing redo log files written by the partition server by columns of a table including the partition, restoring data of the partition on the basis of the collected service information and log records of the divided redo log files, and selecting a new partition server that will serve the data-restored partition, and allocating the partition to the selected partition server.
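The restoration steps above can be sketched as follows. The log record layout (`row`, `column`, `value`), the row-range form of the service information, and the least-loaded rule for choosing the new partition server are all assumptions for illustration; the abstract does not define them.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def split_by_column(log_records: List[dict]) -> Dict[str, List[dict]]:
    """Divide the shared redo log by table column, preserving log order."""
    by_column: Dict[str, List[dict]] = defaultdict(list)
    for record in log_records:
        by_column[record["column"]].append(record)
    return dict(by_column)


def restore_partition(log_records: List[dict], row_range: Tuple[str, str],
                      columns: List[str]) -> Dict[Tuple[str, str], object]:
    """Replay the log records that fall inside the failed partition."""
    low, high = row_range          # assumed service information: [low, high)
    by_column = split_by_column(log_records)
    data: Dict[Tuple[str, str], object] = {}
    for column in columns:
        for record in by_column.get(column, []):
            if low <= record["row"] < high:
                # Later records overwrite earlier ones for the same cell.
                data[(record["row"], column)] = record["value"]
    return data


def pick_server(server_load: Dict[str, int]) -> str:
    """Select the new partition server; least-loaded is an assumed policy."""
    return min(server_load, key=lambda name: (server_load[name], name))


if __name__ == "__main__":
    log = [
        {"row": "a", "column": "c1", "value": 1},
        {"row": "m", "column": "c1", "value": 2},
        {"row": "a", "column": "c2", "value": 3},
        {"row": "a", "column": "c1", "value": 4},
    ]
    print(restore_partition(log, ("a", "k"), ["c1", "c2"]))
    print(pick_server({"s1": 3, "s2": 1}))
```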
Abstract:
Disclosed herein are an apparatus and method for managing the index information of high-dimensional data. The apparatus for managing the index information of high-dimensional data includes a plurality of data service devices and a control unit. Each of the plurality of data service devices is configured such that user data and index information used to search the user data are allocated thereto. The control unit is configured to extract high-dimensional index data from a large amount of input data and to allocate the extracted index data to the plurality of data service devices by mapping the extracted index data to the plurality of data service devices as the index information.
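One possible way to map extracted index data onto data service devices is sketched below. The abstract does not state the mapping rule, so the grid quantization, the SHA-256 hash, and the names `device_for` and `allocate` are assumptions chosen only to make the idea concrete.

```python
import hashlib
from typing import List, Sequence, Tuple


def device_for(vector: Sequence[float], num_devices: int, cell: float = 0.25) -> int:
    """Map a high-dimensional index vector to a device number."""
    # Quantize each dimension so that nearby vectors share a grid cell.
    cells = tuple(int(x // cell) for x in vector)
    # A stable hash keeps the mapping identical across processes and runs.
    digest = hashlib.sha256(repr(cells).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_devices


def allocate(index_entries: List[Tuple[str, Sequence[float]]],
             num_devices: int) -> List[List[Tuple[str, Sequence[float]]]]:
    """Distribute (key, vector) index entries across the devices."""
    devices: List[List[Tuple[str, Sequence[float]]]] = [[] for _ in range(num_devices)]
    for key, vector in index_entries:
        devices[device_for(vector, num_devices)].append((key, vector))
    return devices


if __name__ == "__main__":
    entries = [("img1", [0.1, 0.9, 0.4]), ("img2", [0.8, 0.2, 0.6])]
    for number, held in enumerate(allocate(entries, num_devices=3)):
        print(number, held)
```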