摘要:
A system and method for obtaining coherence permission for speculative prefetched data. A memory controller stores an address of a prefetch memory line in a prefetch buffer. Upon allocation of an entry in the prefetch buffer a snoop of all the caches in the system occurs. Coherency permission information is stored in the prefetch buffer. The corresponding prefetch data may be stored elsewhere. During a subsequent memory access request for a memory address stored in the prefetch buffer, both the coherency information and prefetched data may be already available and the memory access latency is reduced.
摘要:
In one embodiment, a processing node includes a plurality of processor cores and a reconfigurable interconnect. The processing node also includes a controller configured to schedule transactions received from each processor core. The interconnect may be coupled to convey between a first processor core and the controller, transactions that each include a first corresponding indicator that indicates the source of the transaction. The interconnect may also be coupled to convey transactions between a second processor core and the controller, transactions that each include a second corresponding indicator that indicates the source of the transaction. When operating in a first mode, the interconnect is configurable to cause the first indicator to indicate that the corresponding transactions were conveyed from the second processor core and to cause the second indicator to indicate that the corresponding transactions were conveyed from the first processor core.
摘要:
A data processing device is disclosed that includes multiple processing cores, where each core is associated with a corresponding cache. When a processing core is placed into a first sleep mode, the data processing device initiates a first phase. If any cache probes are received at the processing core during the first phase, the cache probes are serviced. At the end of the first phase, the cache corresponding to the processing core is flushed, and subsequent cache probes are not serviced at the cache. Because it does not service the subsequent cache probes, the processing core can therefore enter another sleep mode, allowing the data processing device to conserve additional power.
摘要:
In one embodiment, a method comprises assigning a unique node number to each of a first plurality of nodes in a first partition of a system and a second plurality of nodes in a second partition of the system. A first memory address space spans first memory included in the first partition and a second memory address space spans second memory included in the second partition. The first memory address space and the second memory address space are generally logically distinct. The method further comprises programming a first address map in the first partition to map the first memory address space to node numbers, wherein the programming comprises mapping a first memory address range within the first memory address space to a first node number assigned to a first node of the second plurality of nodes in the second partition, whereby the first memory address range is mapped to the second partition.
摘要:
A method and apparatus for accelerated shared data migration between cores. Using an Always Migrate protocol, when a migratory probe hits a directory entry in either modified or owned state, the entry is transitioned to an owned state, and a source done command is sent without sending cache block ownership or state information to the directory.
摘要:
In one embodiment, a node comprises a plurality of processor cores and a node controller configured to receive a first read operation addressing a first register. The node controller is configured to return a first value in response to the first read operation, dependent on which processor core transmitted the first read operation. In another embodiment, the node comprises the processor cores and the node controller. The node controller comprises a queue shared by the processor cores. The processor cores are configured to transmit communications at a maximum rate of one every N clock cycles, where N is an integer equal to a number of the processor cores. In still another embodiment, a node comprises the processor cores and a plurality of fuses shared by the processor cores. In some embodiments, the node components are integrated onto a single integrated circuit chip (e.g. a chip multiprocessor).
摘要:
In one embodiment, a method comprises assigning a unique node number to each of a first plurality of nodes in a first partition of a system and a second plurality of nodes in a second partition of the system. A first memory address space spans first memory included in the first partition and a second memory address space spans second memory included in the second partition. The first memory address space and the second memory address space are generally logically distinct. The method further comprises programming a first address map in the first partition to map the first memory address space to node numbers, wherein the programming comprises mapping a first memory address range within the first memory address space to a first node number assigned to a first node of the second plurality of nodes in the second partition, whereby the first memory address range is mapped to the second partition.
摘要:
A method and apparatus for selectively bypassing a cache in a processor of a computing device are disclosed. A mechanism to provide visibility to transactions on the core to a cache interface (e.g., an L3 cache interface) in a trace controller buffer (TCB) for debugging purposes, by causing selected transactions, which would otherwise be satisfied by the cache, to bypass the cache and be presented to the memory system where they may be logged in the TCB is described. In an embodiment of the invention, there is provided a method for providing processing core request visibility comprising bypassing a higher level cache in response to a processing core request, capturing the processing core request in a TCB, providing a mask to filter the processing core request, and returning a transaction response to a requesting processing core.