Abstract:
A method for controlling a memory from which data is transferred to a neural network processor, and an apparatus thereof, are provided, the method including: generating prefetch information for data by using a blob descriptor and a reference prediction table after history information is input; reading the data from the memory based on the prefetch information and temporarily storing the read data in a prefetch buffer; and, after the data is transferred from the prefetch buffer to the neural network processor, accessing the next data in the memory based on the prefetch information and temporarily storing the next data in the prefetch buffer.
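Read as an algorithm, this is a predict-fetch-stage loop. The Python sketch below is one plausible rendering of that flow; the stride-based prediction and every class, field, and method name are illustrative assumptions, not details taken from the disclosure.

```python
from collections import deque

class ReferencePredictionTable:
    """Tracks per-blob access history and predicts the next address (assumed stride-based)."""
    def __init__(self):
        self.last_addr = {}   # blob_id -> last accessed address (the history information)
        self.stride = {}      # blob_id -> stride learned between consecutive accesses

    def update(self, blob_id, address):
        if blob_id in self.last_addr:
            self.stride[blob_id] = address - self.last_addr[blob_id]
        self.last_addr[blob_id] = address

    def predict_next(self, blob_id):
        return self.last_addr[blob_id] + self.stride.get(blob_id, 0)

class PrefetchController:
    def __init__(self, memory, depth=4):
        self.memory = memory               # stand-in for the memory: address -> data
        self.table = ReferencePredictionTable()
        self.buffer = deque(maxlen=depth)  # the prefetch buffer

    def fetch(self, blob_id, address):
        # Generate prefetch information via the table, read the data from the
        # memory, and temporarily store it in the prefetch buffer.
        self.table.update(blob_id, address)
        self.buffer.append((address, self.memory.get(address)))

    def transfer(self, blob_id):
        # Hand one block to the processor, then immediately access the next
        # predicted data so it is staged before the processor asks for it.
        _, data = self.buffer.popleft()
        self.fetch(blob_id, self.table.predict_next(blob_id))
        return data
```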
Abstract:
An embodiment of the present invention provides a quantization method for weights of a plurality of batch normalization layers, including: receiving a plurality of previously learned first weights of the plurality of batch normalization layers; obtaining first distribution information of the plurality of first weights; performing a first quantization on the plurality of first weights using the first distribution information to obtain a plurality of second weights; obtaining second distribution information of the plurality of second weights; and performing a second quantization on the plurality of second weights using the second distribution information to obtain a plurality of final weights, thereby reducing an error that may occur when quantizing the weights of the batch normalization layers.
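The two-pass scheme lends itself to a compact sketch. Below is a minimal, hypothetical Python rendering, assuming the "distribution information" is the min/max range of the weights and the quantizer is uniform; both are assumptions, since the abstract fixes neither choice.

```python
import numpy as np

def quantize(weights, dist_min, dist_max, bits=8):
    """Uniform quantization of weights onto the range given by the distribution info."""
    levels = 2 ** bits - 1
    scale = (dist_max - dist_min) / levels or 1.0   # guard against a degenerate range
    q = np.clip(np.round((weights - dist_min) / scale), 0, levels)
    return q * scale + dist_min

def two_stage_quantize(first_weights, bits=8):
    # First quantization: use the distribution of the learned (first) weights.
    second = quantize(first_weights, first_weights.min(), first_weights.max(), bits)
    # Second quantization: re-measure the distribution of the quantized (second)
    # weights and quantize again, which can shrink the accumulated error.
    return quantize(second, second.min(), second.max(), bits)
```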
Abstract:
Disclosed is an operating method of a neural network processing device that communicates with an external memory device and executes a plurality of layers, the method including: obtaining layer information of the plurality of layers by analyzing a connection structure of the plurality of layers; generating an input address and an output address for a target layer based on the layer information; receiving expected input data and expected output data for the target layer; storing the expected input data in an input address area of the external memory device corresponding to the input address; storing output result data in an output address area of the external memory device corresponding to the output address by executing the target layer; comparing the output result data with the expected output data; and determining whether an error occurs for the target layer.
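A small test harness makes the verification loop concrete. This is a hypothetical sketch: the layer-info keys, the flat dict standing in for the external memory device, and the tolerance-based comparison are all assumptions.

```python
import numpy as np

def generate_addresses(layer_infos, base=0):
    # Assign input/output address areas by walking the analyzed connection
    # structure; the layer-info dict keys used here are assumptions.
    addresses, cursor = {}, base
    for info in layer_infos:
        addresses[info["name"]] = (cursor, cursor + info["in_size"])
        cursor += info["in_size"] + info["out_size"]
    return addresses

def verify_layer(layer_fn, expected_in, expected_out, memory, in_addr, out_addr):
    memory[in_addr] = expected_in                 # stage the expected input data
    memory[out_addr] = layer_fn(memory[in_addr])  # execute the target layer
    # An error is flagged for the target layer when the produced output result
    # deviates from the expected output.
    return not np.allclose(memory[out_addr], expected_out)
```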
Abstract:
Disclosed herein is a distributed in-memory database system that partitions a database and allocates the partitions to a plurality of distributed nodes, wherein at least one of the plurality of nodes includes: a plurality of central processing unit (CPU) sockets, in each of which a plurality of CPU cores are installed; a plurality of memories respectively connected to the plurality of CPU sockets; and a plurality of database server instances managing the allocated database partitions, wherein each database server instance is installed per CPU socket group, a group consisting of a single CPU socket or two or more CPU sockets.
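The socket-group deployment can be sketched as a simple allocation routine. Everything below (the dataclass fields, the round-robin partition assignment) is an illustrative assumption about one way such a system could be wired, not the patented design.

```python
from dataclasses import dataclass, field

@dataclass
class SocketGroup:
    sockets: list                                       # one or more CPU socket IDs
    local_memory: dict = field(default_factory=dict)    # memory attached to the sockets

@dataclass
class DBServerInstance:
    group: SocketGroup
    partitions: list = field(default_factory=list)      # allocated database partitions

def deploy_instances(socket_ids, sockets_per_group, partitions):
    # Install one database server instance per CPU socket group, so each
    # instance works against the memory local to its own sockets.
    groups = [SocketGroup(socket_ids[i:i + sockets_per_group])
              for i in range(0, len(socket_ids), sockets_per_group)]
    instances = [DBServerInstance(g) for g in groups]
    for i, partition in enumerate(partitions):          # round-robin allocation
        instances[i % len(instances)].partitions.append(partition)
    return instances
```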
Abstract:
Disclosed is an apparatus for in-memory data management that includes a hybrid memory, comprising a plurality of types of memories with different characteristics, and a storage engine. The storage engine rearranges data among the plurality of memories in units of pages by monitoring workloads for the data stored in the memories, and reconfigures the page layout, page by page, based on the data access characteristics of an application for each of the pages constituting each of the plurality of memories.
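A page-granularity engine of this kind might look like the toy below; the two-tier dict representation, the hotness heuristic, and the row/column layout choice are assumptions for illustration only.

```python
from collections import Counter

class HybridMemoryEngine:
    def __init__(self, fast_capacity):
        self.fast, self.slow = {}, {}       # page_id -> page (e.g. DRAM vs. NVM tier)
        self.fast_capacity = fast_capacity
        self.access_counts = Counter()      # workload monitoring, per page
        self.layout = {}                    # per-page layout chosen for the application

    def access(self, page_id):
        self.access_counts[page_id] += 1
        if page_id in self.fast:
            return self.fast[page_id]
        return self.slow.get(page_id)

    def rebalance(self):
        # Rearrange data among the memories in units of pages: the hottest pages
        # go to the fast tier, everything else is demoted to the slow tier.
        hot = {p for p, _ in self.access_counts.most_common(self.fast_capacity)}
        for pid in [p for p in self.slow if p in hot]:
            self.fast[pid] = self.slow.pop(pid)
        for pid in [p for p in self.fast if p not in hot]:
            self.slow[pid] = self.fast.pop(pid)

    def reconfigure_layout(self, page_id, access_pattern):
        # Reconfigure the page layout from the observed access characteristic,
        # e.g. row-wise pages for point lookups, column-wise pages for scans.
        self.layout[page_id] = "row" if access_pattern == "point" else "column"
```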
Abstract:
The present invention relates to a system and method for parallel query processing that applies just-in-time (JIT) compilation-based query optimization when a query is processed. The system for parallel query processing based on JIT compilation according to the present invention includes a parallel processing scheduler, configured to receive a database (DB) operation graph and operation dependency relations and to distribute execution tasks, and workers configured to execute query executable code, wherein the workers include workers that execute JIT-compiled executable code and workers that execute the query executable code in an interpreted manner.
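The split between compiled and interpreted workers is easy to mock up. The sketch below stands in for the real system with a thread pool; the task dict shape and the dispatch rule are assumptions, and dependencies are simplified to list order.

```python
import concurrent.futures as cf

def jit_worker(task):
    # Stands in for running natively compiled query code.
    return task["compiled_fn"](task["rows"])

def interpreter_worker(task):
    # Stands in for evaluating the query plan operator by operator.
    rows = task["rows"]
    for op in task["plan"]:
        rows = op(rows)
    return rows

def schedule(tasks, max_workers=4):
    # Parallel processing scheduler: dispatch each execution task to a JIT
    # worker or an interpreter worker, depending on whether compiled code exists.
    with cf.ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(
            jit_worker if t.get("compiled_fn") else interpreter_worker, t)
            for t in tasks]
        return [f.result() for f in futures]

if __name__ == "__main__":
    tasks = [
        {"compiled_fn": lambda rows: [r * 2 for r in rows], "rows": [1, 2, 3]},
        {"plan": [lambda rows: [r for r in rows if r > 1]], "rows": [1, 2, 3]},
    ]
    print(schedule(tasks))   # [[2, 4, 6], [2, 3]]
```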
Abstract:
Disclosed is a system for distributed processing of stream data, including: a service management device which selects the operation device best suited to performing an operation constituting a service and assigns the operation to a node including the selected operation device; and a task execution device which performs one or more tasks included in the operation through the selected operation device when the assigned operation is registered in a preregistered performance acceleration operation library.
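One way to read the selection step is as a device-affinity rule over the acceleration library, as in the hypothetical sketch below; the library contents, the dict shapes, and the least-loaded tiebreak are all assumptions.

```python
ACCELERATION_LIBRARY = {"matrix_multiply", "fft", "image_decode"}   # preregistered ops

def select_device(operation, nodes):
    # Prefer an accelerator node when the operation is registered in the
    # performance acceleration operation library; otherwise any node qualifies.
    if operation["name"] in ACCELERATION_LIBRARY:
        candidates = [n for n in nodes if n["device"] == "gpu"]
    else:
        candidates = nodes
    return min(candidates or nodes, key=lambda n: n["load"])   # least-loaded pick

def assign_operation(operation, nodes):
    node = select_device(operation, nodes)
    node["tasks"].extend(operation["tasks"])   # the node's task execution device
    node["load"] += len(operation["tasks"])    # runs the operation's tasks
    return node
```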
Abstract:
A system for virtual machine placement and management monitors state information of the physical machines and virtual machines operating in a subgroup, and relocates the virtual machines operating in the subgroup according to the state information of those physical machines and a virtual machine placement policy.
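One plausible reading of the relocation step is a load-threshold policy over a subgroup, sketched below; the dict shapes, the threshold value, and the greedy move-to-least-loaded rule are assumptions.

```python
def relocate(subgroup, load_threshold=0.8):
    # Placement policy sketch: while a physical machine in the subgroup is over
    # the load threshold, move one of its VMs to the least-loaded other machine.
    machines = subgroup["machines"]    # each: {"load": float, "vms": [{"load": ...}]}
    for m in machines:
        while m["load"] > load_threshold and m["vms"]:
            others = [x for x in machines if x is not m]
            if not others:
                break
            target = min(others, key=lambda x: x["load"])
            vm = m["vms"].pop()
            m["load"] -= vm["load"]
            target["vms"].append(vm)
            target["load"] += vm["load"]
    return machines
```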
Abstract:
A method and apparatus for multi-level stepwise quantization of a neural network are provided. The apparatus sets a reference level by selecting a value from among the values of the parameters of the neural network, proceeding from high values equal to or greater than a predetermined value toward lower values, and performs learning based on the reference level. The setting of a reference level and the performing of learning are iterated until the result of learning based on the reference level satisfies a predetermined criterion and there is no variable parameter among the parameters that is updated during learning.
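The iterate-freeze-retrain loop can be sketched as follows, assuming parameters are snapped to reference levels by magnitude from the highest level downward and that `train_step` is a caller-supplied routine returning the loss and whether any variable parameter changed; all of this is an interpretation, not the disclosed procedure.

```python
import numpy as np

def stepwise_quantize(params, train_step, levels, target_loss=1e-2, max_epochs=20):
    params = params.copy()
    frozen = np.zeros(params.shape, dtype=bool)
    for level in sorted(levels, reverse=True):        # from high values toward lower
        snap = ~frozen & (np.abs(params) >= level)    # select values at/above the level
        params[snap] = np.sign(params[snap]) * level  # set them to the reference level
        frozen |= snap
        for _ in range(max_epochs):
            # Retrain only the still-variable parameters, keeping frozen ones fixed.
            loss, updated = train_step(params, variable=~frozen)
            if loss <= target_loss and not updated:   # stop: criterion met, nothing moves
                break
    return params
```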
Abstract:
Provided is a method of operating a neural network computing device that is configured to communicate with an external memory device and execute a plurality of layers. The method includes: computing a first input address based on first layer information of a first layer among the plurality of layers and a first memory management table, and updating the first memory management table to generate a second memory management table; reading first input data to be input to the first layer from the external memory device based on the computed first input address; computing a first output address based on the first layer information and the second memory management table, and updating the second memory management table to generate a third memory management table; and storing first output data output from the first layer in the external memory device based on the first output address.
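The address computation plus table update can be mocked with a small region table; the table format (a list of region dicts), the first-fit allocation, and the free-on-consume rule are assumptions made for the sketch.

```python
def compute_input_address(layer_info, mm_table):
    # Look up the region that already holds this layer's input tensor, and mark
    # it free for reuse once consumed; returns the address and an updated table.
    table = [dict(e) for e in mm_table]
    for entry in table:
        if entry["tensor"] == layer_info["input"]:
            entry["free"] = True               # input can be overwritten afterwards
            return entry["addr"], table
    raise KeyError(layer_info["input"])

def compute_output_address(layer_info, mm_table):
    # First-fit allocation of a region for the output tensor, updating the table.
    size, table = layer_info["out_size"], [dict(e) for e in mm_table]
    for entry in table:
        if entry["free"] and entry["size"] >= size:
            entry.update(free=False, tensor=layer_info["output"])
            return entry["addr"], table
    addr = max((e["addr"] + e["size"] for e in table), default=0)
    table.append({"addr": addr, "size": size, "free": False,
                  "tensor": layer_info["output"]})
    return addr, table

def run_layer(layer_fn, layer_info, mm_table, memory):
    in_addr, table2 = compute_input_address(layer_info, mm_table)   # 2nd table
    out_addr, table3 = compute_output_address(layer_info, table2)   # 3rd table
    memory[out_addr] = layer_fn(memory[in_addr])  # read input, execute, store output
    return table3
```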