Abstract:
A system and method for vector memory array transposition. The system includes a vector memory, a block transposition accelerator, and an address controller. The vector memory stores a vector memory array. The block transposition accelerator reads a vector of a block of data within the vector memory array. The block transposition accelerator also writes a transposition of the vector of the block of data to the vector memory. The address controller determines a vector access order, and the block transposition accelerator accesses the vector of the block of data within the vector memory array according to the vector access order.
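As a rough behavioral illustration of this abstract (not the patented hardware), the following Python sketch models the vector memory as a list of rows and transposes one square block at a time, reading the block's vectors in an order supplied by a stand-in for the address controller. The names transpose_block and access_order are illustrative, not taken from the patent.

    def transpose_block(memory, row0, col0, size, access_order=None):
        # Transpose a size x size block of `memory` in place. `access_order`
        # stands in for the address controller: it fixes the order in which
        # the block's vectors (rows) are read. Defaults to sequential order.
        order = list(access_order) if access_order is not None else list(range(size))

        # Read the block's vectors in the chosen access order.
        block = [memory[row0 + r][col0:col0 + size] for r in order]

        # Write the transposition of each read vector back to the vector memory.
        for i, r in enumerate(order):
            for c in range(size):
                memory[row0 + c][col0 + r] = block[i][c]

    mem = [[r * 4 + c for c in range(4)] for r in range(4)]
    transpose_block(mem, 0, 0, 4, access_order=[2, 0, 3, 1])
    print(mem)   # the 4 x 4 block is now transposed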
Abstract:
The embodiments herein provide a system for hierarchical caching in big data processing. The system comprises a caching layer, a plurality of actors in communication with the caching layer, a machine hosting the plurality of actors, a plurality of replication channels in communication with the plurality of actors, and a predefined ring structure. The caching layer is a chain of memory and storage capacity elements configured to store data from the input stream. The plurality of actors is configured to replicate the input data stream and forward the replicated data to the caching layer. The replication channels are configured to forward the replicated data from one actor to another. The predefined ring structure maps the input data to the replica actors.
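The abstract does not spell out how the ring structure maps data to replica actors, so the sketch below assumes a hash-ring placement with a fixed replication factor of two; the Actor, Ring, and ingest names are invented for the example and the caching layer is reduced to a dictionary per actor.

    import hashlib
    from bisect import bisect_right

    class Actor:
        def __init__(self, name):
            self.name = name
            self.cache = {}                  # stands in for a caching-layer element

        def store(self, key, value):
            self.cache[key] = value

    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    class Ring:
        def __init__(self, actors, replicas=2):
            self.replicas = replicas
            self.points = sorted(((_hash(a.name), a) for a in actors),
                                 key=lambda p: p[0])

        def replica_actors(self, key):
            # Map a data key to `replicas` consecutive actors around the ring.
            hashes = [h for h, _ in self.points]
            idx = bisect_right(hashes, _hash(key))
            return [self.points[(idx + i) % len(self.points)][1]
                    for i in range(self.replicas)]

    def ingest(ring, key, value):
        # The replicated item is forwarded over the replication channels to
        # every replica actor, each of which stores it in its caching layer.
        for actor in ring.replica_actors(key):
            actor.store(key, value)

    actors = [Actor("actor-%d" % i) for i in range(4)]
    ring = Ring(actors, replicas=2)
    ingest(ring, "event:42", {"payload": "..."})
    print([a.name for a in actors if "event:42" in a.cache])   # the two replica actors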
Abstract:
A method for managing a memory stack includes mapping one part of the memory stack to a span of fast memory and another part of the memory stack to a span of slow memory, wherein the fast memory provides a substantially higher access speed than the slow memory.
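A toy sketch of the idea, under the assumption that the hot (top) part of the stack is kept in the fast span and older entries spill to the slow span; the spill policy, the SplitStack name, and the capacities are assumptions for illustration, not taken from the patent.

    from collections import deque

    class SplitStack:
        def __init__(self, fast_capacity=4):
            self.fast = deque()               # models the fast-memory span
            self.slow = deque()               # models the slow-memory span
            self.fast_capacity = fast_capacity

        def push(self, value):
            self.fast.append(value)
            if len(self.fast) > self.fast_capacity:
                # Spill the oldest fast entry to slow memory.
                self.slow.append(self.fast.popleft())

        def pop(self):
            if not self.fast and self.slow:
                # Refill the fast span from slow memory when the hot part runs dry.
                self.fast.append(self.slow.pop())
            return self.fast.pop()

    s = SplitStack(fast_capacity=2)
    for v in range(5):
        s.push(v)
    print([s.pop() for _ in range(5)])   # [4, 3, 2, 1, 0]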
Abstract:
Each array in a sequence of arrays is reordered. A first port receives, in a first serial order, a number of values in each array in the sequence, and a second port transmits the values in a different second serial order. For each value in each array in the sequence, an address generator generates an address within a range of zero through one less than the number of values in the array. For each address from the generator, a memory performs an access to a location corresponding to the address in the memory. The access for each address includes a read from the location before a write to the location. For each array in the sequence, the writes for the addresses serially write the values of the array in the first serial order, and the reads for the addresses serially read the values in the second serial order.
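One way to realize this single-memory, read-before-write reordering is to advance the address sequence by one application of the reordering permutation for each new array, so that each read returns the previous array in the second order while the incoming array is written in its arrival order. The construction below is an illustration of that scheme under this assumption, not necessarily the patented address generator.

    def compose(p, q):
        # Permutation composition: (p o q)[i] = p[q[i]].
        return [p[q[i]] for i in range(len(p))]

    def reorder_stream(arrays, second_order):
        n = len(second_order)
        memory = [None] * n              # the single shared memory
        addresses = list(range(n))       # address sequence for the first array
        outputs = []
        for array in arrays:
            out = []
            for i, addr in enumerate(addresses):
                out.append(memory[addr])   # read the previous array's value ...
                memory[addr] = array[i]    # ... then write the new value there
            outputs.append(out)
            # For the next array, advance the address sequence by one more
            # application of the reordering permutation.
            addresses = compose(addresses, second_order)
        return outputs

    arrays = [[10, 11, 12, 13], [20, 21, 22, 23]]
    print(reorder_stream(arrays, second_order=[2, 0, 3, 1]))
    # The first output is all None (the memory starts empty); the second is
    # [12, 10, 13, 11], i.e. the first array read out in the second order.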
Abstract:
An active memory device includes a command engine that receives high-level tasks from a host and generates corresponding sets of either DCU commands to a DRAM control unit or ACU commands to a processing array control unit. The DCU commands include memory addresses, which are also generated by the command engine, and the ACU commands include instruction memory addresses corresponding to an address in an array control unit where processing array instructions are stored. The active memory device includes a vector processing and re-ordering system coupled to the array control unit and the memory device. The vector processing and re-ordering system re-orders data received from the memory device into a vector of contiguous data, processes the data in accordance with an instruction received from the array control unit to provide results data, and passes the results data to the memory device.
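A behavioral model only (the abstract describes hardware): scattered reads are re-ordered into one contiguous vector, processed according to an instruction selected by an instruction-memory address, and the results are passed back to memory. The tiny instruction memory, the scale/offset operations, and the run_task task format are assumptions made for the example.

    def scale(vec, k=2):
        return [x * k for x in vec]

    def offset(vec, k=100):
        return [x + k for x in vec]

    INSTRUCTION_MEMORY = {0: scale, 1: offset}   # assumed ACU instruction store

    def run_task(memory, gather_addresses, instruction_address, result_addresses):
        # Re-order scattered reads into a vector of contiguous data.
        vector = [memory[a] for a in gather_addresses]
        # Process the vector per the instruction stored at the given address.
        results = INSTRUCTION_MEMORY[instruction_address](vector)
        # Pass the results data back to the memory device.
        for addr, value in zip(result_addresses, results):
            memory[addr] = value
        return results

    mem = {8: 3, 2: 5, 11: 7}
    print(run_task(mem, [11, 2, 8], instruction_address=0, result_addresses=[0, 1, 4]))
    # [14, 10, 6]; mem now also holds the results at addresses 0, 1, 4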
Abstract:
One embodiment of the present invention provides a system that facilitates performing operations on a lock-free double-ended queue (deque). The deque is implemented as a doubly-linked list of nodes formed into a ring, so that node pointers in one direction form an inner ring and node pointers in the other direction form an outer ring. The deque has an inner hat, which points to the node next to the last occupied node along the inner ring, and an outer hat, which points to the node next to the last occupied node along the outer ring. The system uses a double compare-and-swap (DCAS) operation while performing push and pop operations on either end of the deque, as well as growing and shrinking operations that change the number of nodes in the ring used by the deque.
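A heavily simplified, single-process sketch of the ring layout follows: Python has no hardware DCAS, so the double compare-and-swap is simulated with a lock, only the outer-end push and pop are shown, and the growing and shrinking operations are omitted. The class and method names are illustrative, not the patented algorithm.

    import threading

    _lock = threading.Lock()

    def dcas(obj1, field1, old1, new1, obj2, field2, old2, new2):
        # Simulated DCAS: atomically update two fields if both hold their
        # expected values.
        with _lock:
            if getattr(obj1, field1) is old1 and getattr(obj2, field2) is old2:
                setattr(obj1, field1, new1)
                setattr(obj2, field2, new2)
                return True
            return False

    class Node:
        def __init__(self):
            self.value = None      # None marks an unoccupied node
            self.inner = None      # next node along the inner ring
            self.outer = None      # next node along the outer ring

    class RingDeque:
        def __init__(self, size=8):
            nodes = [Node() for _ in range(size)]
            for i, n in enumerate(nodes):
                n.outer = nodes[(i + 1) % size]
                n.inner = nodes[(i - 1) % size]
            self.outer_hat = nodes[0]          # next free node on the outer end
            self.inner_hat = nodes[0].inner    # next free node on the inner end

        def push_outer(self, value):
            while True:
                hat = self.outer_hat
                if hat.outer is self.inner_hat:
                    raise IndexError("ring is full (growing not shown)")
                # Claim the free node and advance the hat in one DCAS.
                if dcas(hat, "value", None, value,
                        self, "outer_hat", hat, hat.outer):
                    return

        def pop_outer(self):
            while True:
                hat = self.outer_hat
                last = hat.inner               # last occupied node on the outer end
                value = last.value
                if value is None:
                    raise IndexError("deque is empty")
                # Clear the node and retreat the hat in one DCAS.
                if dcas(last, "value", value, None,
                        self, "outer_hat", hat, last):
                    return value

    d = RingDeque()
    d.push_outer("a"); d.push_outer("b")
    print(d.pop_outer(), d.pop_outer())   # b a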
Abstract:
A method for deleting expired items in a queue data structure, the queue data structure comprising a sequential list of ordered data items including a queue head at one end of the sequential list and a queue tail at another end of the sequential list, wherein each data item includes an expiry time, the method comprising: generating a maximum interval value corresponding to a maximum time interval between an expiry time of a first item in the queue and an expiry time of a second item in the queue, wherein the second item is nearer the queue head than the first item; sequentially scanning the list of ordered items from the queue head; responsive to a determination that a scanned item is expired, deleting the scanned item; responsive to a determination that a scanned item will not expire for a time interval greater than the maximum interval value, terminating scanning of the list of ordered items.
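A sketch of the scan described above, assuming the queue is a Python list with the head at index 0 and each item carrying an expires_at timestamp; the maximum interval value would normally be maintained as items are enqueued, but here it is passed in precomputed.

    def purge_expired(queue, max_interval, now):
        i = 0
        while i < len(queue):
            item = queue[i]
            if item["expires_at"] <= now:
                del queue[i]                 # expired: delete and keep scanning
            elif item["expires_at"] - now > max_interval:
                break                        # no item nearer the tail can be expired
            else:
                i += 1
        return queue

    q = [{"id": 1, "expires_at": 5},
         {"id": 2, "expires_at": 40},
         {"id": 3, "expires_at": 20}]
    print(purge_expired(q, max_interval=20, now=10))
    # Item 1 is deleted; the scan stops at item 2, since no item nearer the
    # tail can expire more than max_interval time units earlier than it.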
Abstract:
An orthogonal memory for transforming arrangements of system bus data and processing data is placed between a system bus interface and a memory cell mat storing the processing data. The orthogonal memory includes two-port memory cells and changes a data train transferred in a bit-parallel and word-serial fashion into a data train of word-parallel and bit-serial data. Data transfer efficiency in a signal processing device performing parallel operational processing can thus be increased without impairing the parallelism of the processing.
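A bit-level sketch of the reordering the orthogonal memory performs: words are written one per cycle with all their bits in parallel (bit-parallel, word-serial) and read back one bit position per cycle across all words (word-parallel, bit-serial). The two-port cells and the timing are not modeled, and the word width is an assumption.

    WIDTH = 8   # bits per word (assumed for the example)

    def write_word_serial(words):
        # Store each word as a row of bits, least-significant bit first.
        return [[(w >> b) & 1 for b in range(WIDTH)] for w in words]

    def read_bit_serial(cell_mat):
        # Read column-wise: one bit position of every word per step.
        return [[row[b] for row in cell_mat] for b in range(WIDTH)]

    words = [0x0F, 0xA5, 0x3C]
    mat = write_word_serial(words)
    for b, plane in enumerate(read_bit_serial(mat)):
        print("bit %d: %s" % (b, plane))   # bit 0 of all three words, then bit 1, ...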