Abstract:
Techniques described herein are generally related to managing cached memory addresses in a multi-core processor device that has a plurality of cores and a plurality of caches. Communication between the plurality of caches and a main memory may be monitored. One or more memory addresses cached by the plurality of cores may be identified based on the monitored communications. A probabilistic memory address distribution table of the locations of the one or more memory addresses cached by the plurality of cores may be generated, and the location of a given memory address can be predicted based upon the probabilistic memory address distribution table.
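As an illustration only, the idea of a probabilistic memory address distribution table might be sketched as follows. The class name `AddressDistributionTable`, the grouping of addresses into regions by dropping low-order bits, and the use of simple observation counts as probabilities are all assumptions made for this sketch; the abstract does not specify how the table is built.

```python
from collections import defaultdict


class AddressDistributionTable:
    """Hypothetical table: counts observed fills per (address region, cache)."""

    def __init__(self, region_bits=6):
        # Assumption: addresses are grouped into regions by dropping low bits.
        self.region_bits = region_bits
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, address, cache_id):
        """Record a monitored cache/memory transfer that placed `address` in `cache_id`."""
        self.counts[address >> self.region_bits][cache_id] += 1

    def probabilities(self, address):
        """Return {cache_id: estimated probability} for the region holding `address`."""
        region = self.counts.get(address >> self.region_bits, {})
        total = sum(region.values())
        return {cid: n / total for cid, n in region.items()} if total else {}

    def predict(self, address):
        """Return the most likely cache for `address`, or None if nothing was observed."""
        probs = self.probabilities(address)
        return max(probs, key=probs.get) if probs else None


# Monitored traffic (made-up values): three fills in one region, one in another.
table = AddressDistributionTable()
for addr, cache in [(0x1000, 0), (0x1008, 0), (0x1010, 1), (0x2000, 2)]:
    table.observe(addr, cache)
```

Under these assumptions, `table.predict(0x1020)` would point at cache 0, since two of the three observations in that region came from cache 0, while an address in a region that was never observed yields `None`.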
Abstract:
Techniques described herein are generally related to thread migration across processing cores of a multi-core processor. Execution of a thread may be migrated from a first processing core to a second processing core. Selective state data required for execution of the thread on the second processing core can be identified and can be dynamically acquired from the first processing core. The acquired state data can be utilized by the thread executed on the second processing core.
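A minimal sketch of selective, on-demand state acquisition after a migration is given below. The names `Core` and `MigratedThread`, and the modelling of state data as a dictionary keyed by register name, are hypothetical choices for illustration and are not taken from the abstract.

```python
class Core:
    """Hypothetical processing core holding per-thread state data."""

    def __init__(self, name):
        self.name = name
        self.state = {}


class MigratedThread:
    """A thread moved from `source` to `target`; state is pulled only when needed."""

    def __init__(self, source, target):
        self.source = source
        self.target = target
        self.fetched = []  # keys acquired from the source core so far

    def read(self, key):
        # Acquire the state item from the first core only on first use.
        if key not in self.target.state:
            self.target.state[key] = self.source.state[key]
            self.fetched.append(key)
        return self.target.state[key]


first_core = Core("core0")
first_core.state = {"r0": 1, "r1": 2, "r2": 3}
second_core = Core("core1")

thread = MigratedThread(first_core, second_core)
total = thread.read("r0") + thread.read("r2") + thread.read("r0")
```

In this sketch only `r0` and `r2` are copied to the second core; `r1` is never transferred because the migrated thread never reads it.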
Abstract:
Technologies are generally provided for dynamically managing execution of sequential programs in a multi-core processing environment by dynamically hosting the data for the different dynamic program phases in the local caches of different cores. This may be achieved through monitoring data access patterns of a sequential program initially executed on a single core. Based on such monitoring, data identified as being accessed by different program phases may be sent to be stored in the local caches of different cores. The computation may then be moved from core to core based on which data is being accessed, when the program changes phase. Program performance may thus be enhanced by reducing local cache miss rates, proactively reducing the possibility of thermal hotspots, as well as by utilizing otherwise idle hardware.
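One way to picture the phase-based placement is the sketch below. The fixed-size monitoring window, the working-set overlap threshold, and the round-robin assignment of phases to cores are assumptions introduced here for illustration; the abstract does not state how phases are detected or how cores are chosen.

```python
def detect_phases(trace, window=4, threshold=0.5):
    """Group an address trace into phases whose working sets are similar."""
    phases = []  # one working set (a set of addresses) per detected phase
    for start in range(0, len(trace), window):
        working_set = set(trace[start:start + window])
        if phases:
            current = phases[-1]
            overlap = len(working_set & current) / len(working_set | current)
            if overlap >= threshold:
                current |= working_set  # same phase: extend its working set
                continue
        phases.append(working_set)  # access pattern changed: start a new phase
    return phases


def place_phases(phases, num_cores):
    """Map each address to the core whose local cache would host it."""
    placement = {}
    for i, working_set in enumerate(phases):
        for addr in working_set:
            placement.setdefault(addr, i % num_cores)
    return placement


# Made-up trace of a sequential program with two distinct phases.
trace = [1, 2, 1, 2, 2, 1, 2, 1, 7, 8, 7, 8, 8, 9, 8, 7]
phases = detect_phases(trace)
placement = place_phases(phases, num_cores=2)
```

With this placement, the computation would run on core 0 while addresses 1 and 2 are being accessed and would move to core 1 when the program begins touching addresses 7 through 9.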
Abstract:
Techniques described herein are generally related to data transfer in multi-core processor devices. A core of a multi-core processor device may be configured to receive a request for a data block, which may be stored in a private cache of the core. The data block in the private cache may be evaluated by a coherence module of the core to determine whether the data block is in a ready state. A program slice associated with the data block may be identified by the coherence module when the data block is determined to be in an unavailable state, and the identified program slice may be executed by the core to update the data block from the unavailable state to the ready state. The data block may be sent to an interconnect network in response to the received request when the stored data block is determined to be in the ready state.
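The request flow described above could be sketched roughly as follows. The `Block` and `CoherenceModule` names, the representation of a program slice as a Python callable, and the return value standing in for a send to the interconnect network are all assumptions for this sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

READY, UNAVAILABLE = "ready", "unavailable"


@dataclass
class Block:
    state: str
    value: Any = None
    program_slice: Optional[Callable[[], Any]] = None  # code that produces the value


class CoherenceModule:
    """Hypothetical coherence module in front of a core's private cache."""

    def __init__(self):
        self.private_cache = {}
        self.slice_runs = 0

    def store_pending(self, address, program_slice):
        """Cache a block whose value has not been computed yet."""
        self.private_cache[address] = Block(UNAVAILABLE, None, program_slice)

    def handle_request(self, address):
        """Return the block's data, running its program slice first if needed."""
        block = self.private_cache[address]
        if block.state == UNAVAILABLE:
            block.value = block.program_slice()  # executed by the owning core
            block.state = READY
            self.slice_runs += 1
        return block.value  # stands in for sending the block to the interconnect


module = CoherenceModule()
module.store_pending(0x40, lambda: 6 * 7)
first = module.handle_request(0x40)
second = module.handle_request(0x40)
```

Here the slice runs once, on the first request; the second request finds the block already in the ready state and is answered directly.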
Abstract:
Technologies are generally described for methods and systems effective to access data in a cache. In an example, a method to access data in a cache may include processing a first request for data at a first memory address related to first data in a memory. The method may further include retrieving the first data from the memory. The method may further include storing the first data in a first cache line in the cache. The method may further include processing a second request for data at a second memory address related to second data in the memory. The method may further include retrieving the second data from the memory. The method may further include selecting a second cache line in the cache to store the second data based on the storage of the first data. The method may further include storing the second data in the second cache line.
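The sequence of steps above can be sketched as below. The abstract does not say how the second cache line is chosen "based on the storage of the first data"; placing it in the line immediately after the previous fill is purely an assumed policy for illustration, as are the `Cache` class and its method names.

```python
class Cache:
    """Hypothetical cache whose fill location depends on the previous fill."""

    def __init__(self, num_lines, memory):
        self.lines = [None] * num_lines  # each entry: (address, data) or None
        self.memory = memory             # backing store: address -> data
        self.last_line = None            # line used by the most recent fill

    def select_line(self, address):
        # Assumed policy: first fill is address-indexed; later fills go to the
        # line right after the previous fill.
        if self.last_line is None:
            return address % len(self.lines)
        return (self.last_line + 1) % len(self.lines)

    def access(self, address):
        for entry in self.lines:
            if entry is not None and entry[0] == address:
                return entry[1]          # hit: data already cached
        data = self.memory[address]      # miss: retrieve from memory
        line = self.select_line(address)
        self.lines[line] = (address, data)
        self.last_line = line
        return data


memory = {10: "a", 99: "b"}
cache = Cache(num_lines=4, memory=memory)
first_data = cache.access(10)   # stored in line 10 % 4 == 2
second_data = cache.access(99)  # stored in line 3, chosen from the first fill
```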
Abstract:
Techniques described herein are generally related to storing and retrieving data from a content-addressable memory (CAM). A data value to be stored in the CAM may be received, where the data value has two or more bits. The CAM may include a plurality of memory sets. An index corresponding to the data value may be determined. The index may be determined based on a subset of bits of the data value that correspond to an index bit set. A memory set of the CAM may be identified based on the determined index and the data value may be stored in a storage unit of the identified memory set.
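A small sketch of set selection by index bits follows. The class name `SetIndexedCAM`, the use of Python sets as the storage units, and the particular bit positions in the example are assumptions for illustration; a hardware CAM would compare all entries of the selected memory set in parallel.

```python
class SetIndexedCAM:
    """Hypothetical CAM split into memory sets selected by a subset of value bits."""

    def __init__(self, index_bits):
        # index_bits: bit positions (0 = least significant) forming the index bit set.
        self.index_bits = tuple(index_bits)
        self.sets = [set() for _ in range(1 << len(self.index_bits))]

    def index_of(self, value):
        """Build the set index from the value's bits at the index bit positions."""
        index = 0
        for i, bit in enumerate(self.index_bits):
            index |= ((value >> bit) & 1) << i
        return index

    def store(self, value):
        self.sets[self.index_of(value)].add(value)

    def contains(self, value):
        # Only the identified memory set has to be searched.
        return value in self.sets[self.index_of(value)]


cam = SetIndexedCAM(index_bits=(0, 3))
cam.store(0b1001)  # bits 0 and 3 are both 1 -> index 3
cam.store(0b0100)  # bits 0 and 3 are both 0 -> index 0
```

A lookup for `0b1000` computes index 2, finds that memory set empty, and never touches the sets that hold the stored values.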
Abstract:
Technologies are generally described for methods and systems effective to execute a program in a multi-core processor. In an example, methods to execute a program in a multi-core processor may include executing a first procedure on a first core of a multi-core processor. The methods may further include, while executing the first procedure, sending a first instruction and a second instruction from the first core to a second core and a third core, respectively. The instructions may command the second and third cores to execute second and third procedures, respectively. The methods may further include executing the first procedure on the first core while executing the second procedure on the second core and executing the third procedure on the third core.
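The control flow can be approximated in software as in the sketch below. Worker threads stand in for the second and third cores, which is an assumption of this sketch: Python threads are not pinned to particular cores, and the procedure bodies are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def second_procedure(x):
    return x * 2  # placeholder work for the "second core"


def third_procedure(x):
    return x + 1  # placeholder work for the "third core"


def first_procedure(pool):
    # While the first procedure runs, dispatch work to the other two "cores".
    second = pool.submit(second_procedure, 10)
    third = pool.submit(third_procedure, 20)
    own = sum(range(5))  # the first procedure keeps executing in the meantime
    return own, second.result(), third.result()


with ThreadPoolExecutor(max_workers=2) as pool:
    results = first_procedure(pool)
```

The two `submit` calls play the role of the first and second instructions; the first procedure only waits on the other two when it needs their results.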