摘要:
The present invention discloses a method comprising: sending request for data to a memory controller; arranging the request for data by order of importance or priority; identifying a source of the request for data; if the source is an input/output device, masking off P ways in a cache; and allocating ways in filling the cache. The method further includes extending cache allocation logic to control a tag comparison operation by using a bit to provide a hint from IO devices that certain ways will not have requested data.
摘要:
Methods and apparatus for efficient peer-to-peer communication support in interconnect fabrics. Network interfaces associated with agents are implemented to facilitate peer-to-peer transactions between agents in a manner that ensures data accesses correspond to the most recent update for each agent. This is implemented, in part, via use of non-posted “dummy writes” that are sent from an agent when the destination between write transactions originating from the agent changes. The dummy writes ensure that data corresponding to previous writes reach their destination prior to subsequent write and read transactions, thus ordering the peer-to-peer transactions without requiring the use of a centralized transaction ordering entity.
摘要:
In one embodiment, a processor includes a first cache and a second cache, a first core associated with the first cache and a second core associated with the second cache. The caches are of asymmetric sizes, and a scheduler can intelligently schedule threads to the cores based at least in part on awareness of this asymmetry and resulting cache performance information obtained during a training phase of at least one of the threads.
摘要:
Configurable snoop filters. A memory system is coupled with one or more processing cores. A coherent system fabric couples the memory system with the one or more processing cores. The coherent system fabric comprising at least a configurable snoop filter that is configured based on workload. The configurable snoop filter having a configurable snoop filter directory and a bloom filter. The configurable snoop filter and the bloom filter include runtime configuration parameters that are used to selectively limit snoop traffic.
摘要:
An embodiment may include circuitry to select, at least in part, from a plurality of memories, at least one memory to store data. The memories may be associated with respective processor cores. The circuitry may select, at least in part, the at least one memory based at least in part upon whether the data is included in at least one page that spans multiple memory lines that is to be processed by at least one of the processor cores. If the data is included in the at least one page, the circuitry may select, at least in part, the at least one memory, such that the at least one memory is proximate to the at least one of the processor cores. Many alternatives, variations, and modifications are possible.
摘要:
An apparatus, method, and system are disclosed. In one embodiment the apparatus includes a cache memory, which a number of sets. Each of the sets in the cache memory have several cache lines. The apparatus also includes at least one process resource table. The process resource table maintains a cache line occupancy count of a number of cache lines. Specifically, the cache line occupancy count for each cache line describes the number of cache lines in the cache storing information utilized by a process running on a computer system. Additionally, the process resource table stores the occupancy count of less cache lines than the total number of cache lines in the cache memory.
摘要:
Methods and apparatus to schedule applications in heterogeneous multiprocessor computing platforms are described. In one embodiment, information regarding performance (e.g., execution performance and/or power consumption performance) of a plurality of processor cores of a processor is stored (and tracked) in counters and/or tables. Logic in the processor determines which processor core should execute an application based on the stored information. Other embodiments are also claimed and disclosed.
摘要:
Methods and apparatus to provide for power consumption reduction in memories (such as cache memories) are described. In one embodiment, a virtual tag is used to determine whether to access a cache way. The virtual tag access and comparison may be performed earlier in the read pipeline than the actual tag access or comparison. In another embodiment, a speculative way hit may be used based on pre-ECC partial tag match to wake up a subset of data arrays. Other embodiments are also described.
摘要:
An apparatus, method, and system are disclosed. In one embodiment the apparatus includes a cache memory, which a number of sets. Each of the sets in the cache memory have several cache lines. The apparatus also includes at least one process resource table. The process resource table maintains a cache line occupancy count of a number of cache lines. Specifically, the cache line occupancy count for each cache line describes the number of cache lines in the cache storing information utilized by a process running on a computer system. Additionally, the process resource table stores the occupancy count of less cache lines than the total number of cache lines in the cache memory.
摘要:
Methods and apparatus to provide for power consumption reduction in memories (such as cache memories) are described. In one embodiment, a virtual tag is used to determine whether to access a cache way. The virtual tag access and comparison may be performed earlier in the read pipeline than the actual tag access or comparison. In another embodiment, a speculative way hit may be used based on pre-ECC partial tag match to wake up a subset of data arrays. Other embodiments are also described.