摘要:
In one embodiment, the present invention includes a method for associating a first priority indicator with data stored in a first entry of a shared cache memory by a core to indicate a priority level of a first thread, and associating a second priority indicator with data stored in a second entry of the shared cache memory by a graphics engine to indicate a priority level of a second thread. Other embodiments are described and claimed.
摘要:
An apparatus to resolve cache coherency is presented. In one embodiment, the apparatus includes a microprocessor comprising one or more processing cores. The apparatus also includes a probe speculative address file unit, coupled to a cache memory, comprising a plurality of entries. Each entry includes a timer and a tag associated with a memory line. The apparatus further includes control logic to determine whether to service an incoming probe based at least in part on a timer value.
摘要:
An apparatus and method is described herein for intelligently spilling cache lines. Usefulness of cache lines previously spilled from a source cache is learned, such that later evictions of useful cache lines from a source cache are intelligently selected for spill. Furthermore, another learning mechanism—cache spill prediction—may be implemented separately or in conjunction with usefulness prediction. The cache spill prediction is capable of learning the effectiveness of remote caches at holding spilled cache lines for the source cache. As a result, cache lines are capable of being intelligently selected for spill and intelligently distributed among remote caches based on the effectiveness of each remote cache in holding spilled cache lines for the source cache.
摘要:
Multi-processor systems and methods are provided. One embodiment relates to a multi-processor system that may comprise a processor having a processor pipeline that executes program instructions across at least one memory barrier with data from speculative data fills that are provided in response to source requests, and a log that retains executed load instruction entries associated with executed program instruction. The executed load instruction entries may be retired if a cache line associated with data of the speculative data fill has not been invalidated in an epoch that is different from the epoch in which the executed load instruction is executed.
摘要:
A technique selectively imposes inter-reference ordering between memory reference operations issued by a processor of a multiprocessor system to addresses within a page pertaining to a page table entry (PTE) that is affected by a translation buffer (TB) miss flow routine. The TB miss flow is used to retrieve information contained in the PTE for mapping a virtual address to a physical address and, subsequently, to allow retrieval of data at the mapped physical address. The PTE that is retrieved in response to a memory reference (read) operation is not loaded into the TB until a commit-signal associated with that read operation is returned to the processor. Once the PTE and associated commit-signal are returned, the processor loads the PTE into the TB so that it can be used for a subsequent read operation directed to the data at the physical address.
摘要:
A multi-node computer network includes a plurality of nodes coupled together via a data link. Each of the nodes includes a local memory, which further comprises a shared memory. Certain items of data that are to be shared by the nodes are stored in the shared portion of memory. Associated with each of the shared data items is a data structure. When a node sharing data with other nodes in the system seeks to modify the data, it transmits the modifications over the data link to the other nodes in the network. Each update is received in order by each node in the cluster. As part of the last transmission by the modifying node, an acknowledgement request is sent to the receiving nodes in the cluster. Each node that receives the acknowledgment request returns an acknowledgement to the sending node. The returned acknowledgement is written to the data structure associated with the shared data item. If there is an error during the transmission of the message, the receiving node does not transmit an acknowledgement, and the sending node is thereby notified that an error has occurred.
摘要:
A plurality of indexes are provided for a multi-way set-associate cache of a computer system. The cache is organized as a plurality of blocks for storing data which are a copies of main memory data. Each block has an associated tag for uniquely identifying the block. The blocks and the tags are addressed by indexes. The indexes are generated by a Boolean hashing function which converts a memory address to cache indexes by combining the bits of the memory address using an exclusive OR function. Different combination of bits are used to generate a plurality of different indexes to address the tags and the associated blocks to transfer data between the cache and the central processing unit of the computer system.
摘要:
The set-prediction cache memory system comprises an extension of a set-associative cache memory system which operates in parallel to the set-associative structure to increase the overall speed of the cache memory while maintaining its performance. The set prediction cache memory system includes a plurality of data RAMs and a plurality of tag RAMs to store data and data tags, respectively. Also included in the system are tag store comparators to compare the tag data contained in a specific tag RAM location with a second index comprising a predetermined second portion of a main memory address. The elements of the set prediction cache memory system which operate in parallel to the set-associative cache memory include: a set-prediction RAM which receives at least one third index comprising a predetermined third portion of the main memory address, and stores such third index to essentially predict the data cache RAM holding the data indexed by the third index; a data-select multiplexer which receives the prediction index and selects a data output from the data cache RAM indexed by the prediction index; and a mispredict logic device to determine if the set prediction RAM predicted the correct data cache RAM and if not, issue a mispredict signal which may comprise a write data signal, the write data signal containing information intended to correct the prediction index contained in the set prediction RAM.
摘要:
One disclosed embodiment may comprise a system that includes a home node that provides a transaction reference to a requester in response to a request from the requester. The requester provides an acknowledgement message to the home node in response to the transaction reference, the transaction reference enabling the requester to determine an order of requests at the home node relative to the request from the requester.
摘要:
Multiprocessor systems and methods are disclosed. One embodiment may comprise a plurality of processor cores. A given processor core may be operative to generate a request for desired data in response to a cache miss at a local cache. A shared cache structure may provide at least one speculative data fill and a coherent data fill of the desired data to at least one of the plurality of processor cores in response to a request from the at least one processor core. A processor scoreboard arbitrates the requests for the desired data. A speculative data fill of the desired data is provided to the at least one processor core. The coherent data fill of the desired data may be provided to the at least one processor core in a determined order.