Abstract:
Technologies are generally described for systems, devices and methods relating to multicore processors. The multicore processors may include first and second tiles with first and second caches, respectively. The first cache may include first magnetoresistive random access memory (MRAM) cells with first storage characteristics. The second cache may include second MRAM cells with second storage characteristics different from the first storage characteristics. In some examples, an interconnect structure may be coupled to the first and second tiles and may be configured to provide communication between the first tile and the second tile. Methods for handling migration between tiles and cores are also described.
Abstract:
Technologies are generally described herein for accelerating a cache state transfer in a multicore processor. The multicore processor may include first, second, and third tiles. The multicore processor may initiate migration of a thread executing on the first core at the first tile from the first tile to the second tile. The multicore processor may determine block addresses of blocks to be transferred from a first cache at the first tile to a second cache at the second tile, and identify that a directory at the third tile corresponds to the block addresses. The multicore processor may update the directory to reflect that the second cache shares the blocks. The multicore processor may transfer the blocks from the first cache in the first tile to the second cache in the second tile effective to complete the migration of the thread from the first tile to the second tile.
Abstract:
Technologies are generally described for methods and systems to assign threads in a multi-core processor. In an example, a method to assign threads in a multi-core processor may include determining data relating to memory controllers fetching data in response to cache misses experienced by a first core and a second core. Threads may be assigned to cores based on the number of cache misses processed by respective memory controllers. Methods may further include determining that a thread is latency-bound or bandwidth-bound. Threads may be assigned to cores based on the determination of the thread as latency-bound or bandwidth-bound. In response to the assignment of the threads to the cores, data for the thread may be stored in the assigned cores.
Abstract:
Technologies are generally described for methods and systems to assign threads in a multi-core processor. In an example, a method to assign threads in a multi-core processor may include determining data relating to memory controllers fetching data in response to cache misses experienced by a first core and a second core. Threads may be assigned to cores based on the number of cache misses processed by respective memory controllers. Methods may further include determining that a thread is latency-bound or bandwidth-bound. Threads may be assigned to cores based on the determination of the thread as latency-bound or bandwidth-bound. In response to the assignment of the threads to the cores, data for the thread may be stored in the assigned cores.
Abstract:
Technologies are generally described for methods and systems effective to implement a memory allocation accelerator. A processor may generate a request for allocation of a requested chunk of memory. The request may be received by a memory allocation accelerator configured to be in communication with the processor. The memory allocation accelerator may process the request to identify an address for a particular chunk of memory corresponding to the request and may return the address to the processor.
Abstract:
Technologies are generally described for a method, device and architecture effective to allocate resources. In an example, the method may include associating first and second resources with first and second resource identifiers and mapping the first and resource identifiers to first and second sets of addresses in a memory, respectively. The method may include identifying that the first resource is at least partially unavailable. The method may include mapping the second resource identifier to at least one address of the first set of addresses in the memory when the first resource is identified as at least partially unavailable. The method may include receiving a request for the first resource, wherein the request identifies a particular address of the addresses in the first set of addresses. The method may include analyzing the particular address to identify a particular resource and allocating the request to the particular resource.
Abstract:
Technologies are generally described for a multi-processor core and a method for transferring threads in a multi-processor core. In an example, a multi-core processor may include a first group including a first core and a second core. A first sum of the operating frequencies of the cores in the first group corresponds to a first total operating frequency. The multi-core processor may further include a second group including a third core. A second sum of the operating frequencies of the cores in the second group may correspond to a second total operating frequency that is substantially the same as the first total operating frequency. A hardware controller may be configured in communication with the first, second and third core. A memory may be configured in communication with the hardware controller and may include an indication of at least the first group and the second group.
Abstract:
Technologies are generally described for a cache coherence directory in multi-processor architectures. In an example, a directory in a die may receive a request for a particular block. The directory may determine a block aging threshold relating to a likelihood that data blocks, including the particular data block, are stored in one or more caches in the die. The directory may further analyze a memory to identify a particular cache indicated as storing the particular data block and identify a number of cache misses for the particular cache. The directory may identify a time when an event occurred for the particular data block and determine whether to send the request for the particular data block to the particular cache based on the aging threshold, the time of the event, and the number of cache misses.