Abstract:
An apparatus and method for multi-level cache request tracking. For example, one embodiment of a processor comprises: one or more cores to execute instructions and process data; a memory subsystem comprising a system memory and a multi-level cache hierarchy; a primary tracker to store a first entry associated with a memory request to transfer a cache line from the system memory or a first cache within the cache hierarchy to a second cache; primary tracker allocation circuitry to allocate and deallocate entries within the primary tracker; a secondary tracker to store a second entry associated with the memory request; secondary tracker allocation circuitry to allocate and deallocate entries within the secondary tracker; the primary tracker allocation circuitry to deallocate the first entry in response to a first indication that one or more cache coherence requirements associated with the cache line have been resolved, the secondary tracker allocation circuitry to deallocate the second entry in response to a second indication related to transmission of the cache line to the second cache.
Abstract:
An apparatus and method are described for a sharing aware snoop filter. For example, one embodiment of a processor comprises: a plurality of caches, each of the caches comprising a plurality of cache lines, at least some of which are to be shared by two or more of the caches; a snoop filter to monitor accesses to the plurality of cache lines shared by the two or more caches, the snoop filter comprising: a primary snoop filter comprising a first plurality of entries, each entry associated with one of the plurality of cache lines and comprising a N unique identifiers to uniquely identify up to N of the plurality of caches currently storing the cache line; an auxiliary snoop filter comprising a second plurality of entries, each entry associated with one of the plurality of cache lines, wherein once a particular cache line has been shared by more than N caches, an entry for that cache line is allocated in the auxiliary snoop filter to uniquely identify one or more additional caches storing the cache line.
Abstract:
Systems, methods, and apparatuses for range protection. In some embodiments, an apparatus comprises at least one monitoring circuit to monitor for memory accesses to an address space and take action upon a violation to the address space, wherein the action is one of generating a notification to a node that requested the monitor, generating the wrong request, generate a notification in a specific context of the home node, and generating a notification in a node that has ownership of the address space; at least one a protection table to store an identifier of the address space; and at least one hardware core to execute an instruction to enable the monitoring circuit.
Abstract:
First data is received on a plurality of data lanes of a physical link and a stream signal corresponding to the first data is received on a stream lane identifying a type of the first data. A first instance of an error detection code of a particular type is identified in the first data. Second data is received on at least a portion of the plurality of data lanes and a stream signal corresponding to the second data is received on the stream lane identifying a type of the second data. A second instance of the error detection code of the particular type is identified in the second data. The stream lane is another one of the lanes of the physical link and, in some instance, the type of the second data is different from the type of the first data.
Abstract:
A plurality of completed writes to memory are identified corresponding to a plurality of write requests from a host device received over a buffered memory interface. A completion packet is sent to the host device that includes a plurality of write completions to correspond to the plurality of completed writes.
Abstract:
Systems, methods, and apparatuses are directed to requesting access to a memory address; storing an identification of the memory address in a data structure; receiving a first request for access to the memory address, the request comprising a reference to a second processor core; storing the reference to the second processor in the data structure; receiving a second request for access to the memory address, the second request comprising a reference to a third processor core; determining, based on the data structure, that the third processor core is different from the second processor core; and responding to the second request without buffering the second request.
Abstract:
Systems, methods, and apparatuses for range protection. In some embodiments, an apparatus comprises at least one monitoring circuit to monitor for memory accesses to an address space and take action upon a violation to the address space, wherein the action is one of generating a notification to a node that requested the monitor, generating the wrong request, generate a notification in a specific context of the home node, and generating a notification in a node that has ownership of the address space; at least one a protection table to store an identifier of the address space; and at least one hardware core to execute an instruction to enable the monitoring circuit.
Abstract:
Embodiments of systems, method, and apparatuses for remote monitoring are described. In some embodiments, an apparatus includes at least one monitoring circuit to monitor for memory accesses to an address space; at least one a monitoring table to store an identifier of the address space; and a tag directory per core used by the core to track entities that have access to the address space.
Abstract:
In one embodiment, the present invention includes a multicore processor having a plurality of cores, a shared cache memory, an ntegrated input/output (IIO) module to interface between the multicore processor and at least one IO device coupled to the multicore processor, and a caching agent to perform cache coherency operations for the plurality of cores and the IIO module. Other embodiments are described and claimed.
Abstract:
Methods, systems, and apparatus for implementing low latency interconnect switches between CPU's and associated protocols. CPU's are configured to be installed on a main board including multiple CPU sockets linked in communication via CPU socket-to-socket interconnect links forming a CPU socket-to-socket ring interconnect. The CPU's are also configured to transfer data between one another by sending data via the CPU socket-tosocket interconnects. Data may be transferred using a packetized protocol, such as QPI, and the CPU's may also be configured to support coherent memory transactions across CPU's.