Abstract:
Multi-processor systems and methods are disclosed that employ speculative source requests to obtain speculative data fills in response to a cache miss. In one embodiment, a source processor generates a speculative source request and a system source request in response to a cache miss. At least one processor provides a speculative data fill to the source processor in response to the speculative source request. The processor system provides a coherent data fill to the source processor in response to the system source request.
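A minimal C++ sketch of the dual-fill flow described above; all names (`speculative_fill`, `coherent_fill`, `rollback_and_replay`) are illustrative assumptions, not terms from the patent, and the stubs stand in for real coherence machinery:

```cpp
#include <cstdint>
#include <vector>

using CacheLine = std::vector<uint8_t>;

// Stubbed fill sources: a nearby processor answers quickly with a possibly
// stale copy; the coherence protocol answers later with the true copy.
CacheLine speculative_fill(uint64_t addr) { return CacheLine(64, 0xAA); }
CacheLine coherent_fill(uint64_t addr)    { return CacheLine(64, 0xAA); }

void execute_speculatively(const CacheLine&) {}  // run ahead on the fill
void rollback_and_replay(const CacheLine&)   {}  // undo and re-execute

void on_cache_miss(uint64_t addr) {
    // Both requests are issued at miss time.
    CacheLine spec = speculative_fill(addr);  // fast path: arrives first
    execute_speculatively(spec);              // pipeline does not stall

    CacheLine truth = coherent_fill(addr);    // slow path: coherent copy
    if (truth != spec)
        rollback_and_replay(truth);           // speculation was wrong
    // else: the speculative work matches the coherent data and can commit.
}

int main() { on_cache_miss(0x1000); }
```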
Abstract:
Systems and methods are disclosed for blocking data responses. One system includes a target node that, in response to a source broadcast request for requested data, provides a response that includes a copy of the requested data. The target node also provides a blocking message to a home node associated with the requested data. The blocking message is operative to cause the home node to provide a non-data response to the source broadcast request if the blocking message is matched with the source broadcast request at the home node.
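A rough C++ model of the home-node side of this scheme; matching on the address alone is a simplification (a real protocol would match on the transaction), and all identifiers here are hypothetical:

```cpp
#include <cstdint>
#include <unordered_set>

// If a target node has already promised the data (signalled by a blocking
// message), the home node answers the broadcast without sending data.
struct HomeNode {
    std::unordered_set<uint64_t> blocked;  // addresses covered by blocking msgs

    void on_blocking_message(uint64_t addr) { blocked.insert(addr); }

    enum class Response { Data, NonData };

    Response on_source_broadcast(uint64_t addr) {
        if (blocked.erase(addr))        // blocking message matched the request
            return Response::NonData;   // target already supplied the data
        return Response::Data;          // otherwise home supplies it
    }
};

int main() {
    HomeNode home;
    home.on_blocking_message(0x40);            // target promised the data
    auto r = home.on_source_broadcast(0x40);   // -> NonData
    return r == HomeNode::Response::NonData ? 0 : 1;
}
```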
Abstract:
A performance-enhancing change-to-dirty operation (CTD) is disclosed wherein contention among several processors trying to gain ownership of a block of data is obviated by arranging the CTD to always succeed. A method and a system are disclosed where a processor in a multiprocessor system having a copy of data gains assured ownership of data that the processor may then write. The method accounts for the various conditions that may exist, including scenarios in which the requesting processor may have to wait for ownership. Conditions are handled where the memory is the "owner" of the data, where other processors are requesting ownership, and where copies of the data exist at other processors. The method provides for messages to other processors having copies of the data, informing them that the data is now invalid.
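A simplified C++ sketch of an always-succeeding CTD, assuming a directory-based scheme; the waiting and serialization of competing requesters is abstracted away, and the structure is illustrative rather than the patented design:

```cpp
#include <cstdint>
#include <vector>

enum class Owner { Memory, Processor };

struct Directory {
    Owner owner = Owner::Memory;   // memory may start as the data's owner
    int owner_id = -1;
    std::vector<int> sharers;      // processors holding copies

    void send_invalidate(int) {}   // stub: tell a sharer its copy is stale

    // CTD never fails: it returns only once ownership has been granted
    // (in the real design the requester may first have to wait its turn).
    void change_to_dirty(int requester) {
        for (int p : sharers)
            if (p != requester)
                send_invalidate(p);        // other copies are now invalid
        sharers.assign(1, requester);
        owner = Owner::Processor;          // memory is no longer the owner
        owner_id = requester;              // requester may now write
    }
};

int main() {
    Directory dir;
    dir.sharers = {0, 1, 2};
    dir.change_to_dirty(1);                // cannot be NACKed and retried
}
```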
Abstract:
Apparatus and methods implementing a hardware queue management device for reducing inter-core data transfer overhead by offloading request management and data coherency tasks from the CPU cores. The apparatus includes multi-core processors, a shared L3 or last-level cache ("LLC"), and a hardware queue management device to receive, store, and process inter-core data transfer requests. The hardware queue management device further comprises a resource management system to control the rate at which the cores may submit requests, reducing core stalls and dropped requests. Additionally, software instructions are introduced to optimize communication between the cores and the queue management device.
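A small C++ model of the queue management device with a credit-style resource manager; the credit mechanism, names, and sizes are assumptions chosen to illustrate rate control, not details taken from the patent:

```cpp
#include <cstdint>
#include <deque>
#include <vector>

struct TransferRequest { int src_core, dst_core; uint64_t addr; };

class QueueManager {
    std::deque<TransferRequest> queue_;
    std::vector<int> credits_;             // per-core submission credits
public:
    QueueManager(int cores, int credits_per_core)
        : credits_(cores, credits_per_core) {}

    // Returns false (the core should back off) when its credits are
    // exhausted, so requests are throttled instead of dropped.
    bool enqueue(const TransferRequest& r) {
        if (credits_[r.src_core] == 0) return false;
        --credits_[r.src_core];
        queue_.push_back(r);
        return true;
    }

    // Draining a request returns its credit to the submitting core.
    bool dequeue(TransferRequest& out) {
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop_front();
        ++credits_[out.src_core];
        return true;
    }
};

int main() {
    QueueManager qm(4, 2);
    qm.enqueue({0, 1, 0x1000});
    TransferRequest r;
    return qm.dequeue(r) ? 0 : 1;
}
```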
Abstract:
Methods and apparatus relating to directory cache allocation that is based on snoop response information are described. In one embodiment, an entry in a directory cache may be allocated for an address in response to a determination that another caching agent has a copy of the data corresponding to the address. Other embodiments are also disclosed.
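A minimal C++ sketch of the allocation policy: an entry is allocated only when a snoop response shows that another caching agent actually holds the line. The response taxonomy and names are hypothetical:

```cpp
#include <cstdint>
#include <unordered_map>

enum class SnoopResponse { Miss, HitShared, HitModified };

struct DirectoryCache {
    std::unordered_map<uint64_t, SnoopResponse> entries;

    void on_snoop_response(uint64_t addr, SnoopResponse resp) {
        if (resp == SnoopResponse::Miss)
            return;                // no remote copy: don't waste an entry
        entries[addr] = resp;      // remote copy exists: track the address
    }
};

int main() {
    DirectoryCache dc;
    dc.on_snoop_response(0x80, SnoopResponse::Miss);       // not allocated
    dc.on_snoop_response(0xC0, SnoopResponse::HitShared);  // allocated
    return dc.entries.size() == 1 ? 0 : 1;
}
```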
Abstract:
A system and method avoid deadlock, such as circular routing deadlock, in a computer system by providing a virtual buffer at main memory. The computer system has an interconnection network that couples a plurality of processors having access to main memory. The interconnection network includes one or more routing agents each having at least one buffer for storing packets that are to be forwarded. When the routing agent's buffer becomes full, thereby preventing it from accepting any additional packets, the routing agent transfers at least one packet into the virtual buffer. By transferring a packet out of the buffer, the routing agent frees up space allowing it to accept a new packet. If the newly accepted packet also results in the buffer becoming full, another packet is transferred into the virtual buffer. This process is repeated until the deadlock condition is resolved. Packets are then retrieved from the virtual buffer.
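A toy C++ model of the spill-to-virtual-buffer idea, assuming a bounded hardware buffer and a large memory-backed overflow area; the FIFO spill policy is an illustrative choice, not necessarily the patented one:

```cpp
#include <cstdint>
#include <deque>

struct Packet { uint64_t id; };

class RoutingAgent {
    static constexpr size_t kCapacity = 4;
    std::deque<Packet> buffer_;       // hardware buffer (bounded)
    std::deque<Packet> virtual_buf_;  // spill area backed by main memory
public:
    void accept(Packet p) {
        if (buffer_.size() == kCapacity) {
            // Full: move the oldest packet into the virtual buffer so the
            // new packet can be accepted instead of stalling the ring.
            virtual_buf_.push_back(buffer_.front());
            buffer_.pop_front();
        }
        buffer_.push_back(p);
    }
    // Once forward progress resumes, spilled packets are drained back.
    bool retrieve_spilled(Packet& out) {
        if (virtual_buf_.empty()) return false;
        out = virtual_buf_.front();
        virtual_buf_.pop_front();
        return true;
    }
};

int main() {
    RoutingAgent r;
    for (uint64_t i = 0; i < 6; ++i) r.accept({i});  // forces two spills
    Packet p;
    while (r.retrieve_spilled(p)) {}
}
```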
Abstract:
A channel-based mechanism resolves race conditions in a computer system between a first processor writing modified data back to memory and a second processor trying to obtain a copy of the modified data. In addition to a Q0 channel for carrying requests for data, a Q1 channel for carrying probes in response to Q0 requests, and a Q2 channel for carrying responses to Q0 requests, a new channel, the QWB channel, which has a higher priority than Q1 but lower than Q2, is also defined. When a forwarded Read command from the second processor results in a miss at the first processor's cache, because the requested memory block was written back to memory, a Loop command is issued to memory by the first processor on the QWB virtual channel. In response to the Loop command, memory sends the written back version of the memory block to the second processor.
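A compact C++ sketch of the channel ordering and the write-back race resolution; the numeric priority encoding and the stubbed memory reaction are assumptions for illustration:

```cpp
#include <cstdint>
#include <cstdio>

// Virtual-channel priorities: Q2 (responses) > QWB > Q1 (probes) > Q0
// (requests). A higher value wins arbitration in this toy encoding.
enum Channel { Q0 = 0, Q1 = 1, QWB = 2, Q2 = 3 };

// Stub: memory reacts to a Loop command by sending the written-back
// version of the block to the original requester on Q2.
void memory_on_loop(uint64_t addr, int requester) {
    std::printf("memory -> P%d: block 0x%llx on Q2\n",
                requester, (unsigned long long)addr);
}

// First processor: a forwarded Read (a Q1 probe) misses in its cache
// because the block was already written back to memory.
void on_forwarded_read_miss(uint64_t addr, int requester) {
    // Issue Loop on QWB (above Q1, below Q2) so it can make progress past
    // stalled probes without blocking outstanding responses.
    memory_on_loop(addr, requester);
}

int main() { on_forwarded_read_miss(0x2000, 1); }
```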
Abstract:
A processor is described that includes one or more processing cores. The processing core includes a memory controller to interface with a system memory having a near memory and a far memory. The processing core includes a plurality of caching levels above the memory controller. The processor includes logic circuitry to track state information of a cache line that is cached in one of the caching levels. The state information includes a selected one of an inclusive state and a non-inclusive state. The inclusive state indicates that a copy or version of the cache line exists in near memory. The non-inclusive state indicates that a copy or version of the cache line does not exist in the near memory. When a system memory write request generated within the processor targets the cache line while the cache line is in the inclusive state, the logic circuitry causes the memory controller to handle the request as a direct write into the near memory, without a read of the near memory beforehand.
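A minimal C++ sketch of the write-path decision driven by the tracked state; the per-line map and the read-then-write fallback for non-inclusive lines are illustrative assumptions:

```cpp
#include <cstdint>
#include <unordered_map>

enum class NearMemState { Inclusive, NonInclusive };

struct MemController {
    std::unordered_map<uint64_t, NearMemState> state;  // per-line tracking

    void read_near_memory(uint64_t)  {}  // stub
    void write_near_memory(uint64_t) {}  // stub

    void handle_write(uint64_t addr) {
        auto it = state.find(addr);
        if (it != state.end() && it->second == NearMemState::Inclusive) {
            write_near_memory(addr);   // direct write: no read beforehand
        } else {
            read_near_memory(addr);    // must inspect near memory first
            write_near_memory(addr);
        }
    }
};

int main() {
    MemController mc;
    mc.state[0x100] = NearMemState::Inclusive;
    mc.handle_write(0x100);  // skips the near-memory read
    mc.handle_write(0x200);  // unknown/non-inclusive: read-then-write
}
```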
Abstract:
Early race conditions caused by multiple computer system entities issuing memory reference operations for a given memory block are resolved by creating linked lists identifying the entities. The lists are preferably formed by storing information and state in miss address file (MAF) entries maintained by the entities. The MAF entries cooperate to form one or more read chains, each of which links the entities requesting read access to a particular version of the given memory block. The MAF entries also cooperate to form a single write chain that links the entities requesting write access to the given memory block. When the desired memory block becomes available, the information and state stored at the MAF entries are then utilized by each entity in satisfying its obligations as part of the read and write chains, thereby ensuring that each entity receives the version of the given memory block that it desires.
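A simplified C++ sketch of the MAF chaining idea: each entity that misses on the same block appends itself to either a read chain or the single write chain via a next-pointer stored in its MAF entry. The index-based linking is an illustrative simplification:

```cpp
#include <vector>

struct MafEntry {
    int entity;
    bool wants_write;
    int next = -1;              // index of the next entity in the chain
};

struct Chains {
    std::vector<MafEntry> maf;
    int read_head = -1, read_tail = -1;    // readers of one version
    int write_head = -1, write_tail = -1;  // single ordered write chain

    void link(int entity, bool wants_write) {
        int idx = static_cast<int>(maf.size());
        maf.push_back({entity, wants_write});
        int& head = wants_write ? write_head : read_head;
        int& tail = wants_write ? write_tail : read_tail;
        if (tail != -1) maf[tail].next = idx;  // append behind prior entry
        else head = idx;
        tail = idx;
    }
};

int main() {
    Chains c;
    c.link(0, false);   // read chain:  P0 -> P2
    c.link(2, false);
    c.link(1, true);    // write chain: P1 -> P3
    c.link(3, true);
}
```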
Abstract:
Bus interfaces for nodes coupled to a system bus in a computer system, the system bus including an address bus and a separate data bus. System bus operations include address and command transactions and data transactions. Data transactions occur on the data bus separately and independently of the occurrence of address and command transactions on the address bus. A bus interface may include any of: a commander address bus interface means for providing address and command transactions to the address bus; a responder address bus interface means for acknowledging receipt of address and command transactions via the address bus; a commander data bus interface means for controlling submission of data transactions to the data bus as a result of the occurrence of address and command transactions on the address bus; and a responder data bus interface means for transferring data on the data bus during a data transaction. In particular, the timing of data transactions and the rate at which they occur on the data bus are independent of the timing of address and command transactions and the rate at which they occur on the address bus.
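A toy C++ model of the split-transaction behavior, assuming that each address-bus command schedules a later data-bus phase drained at the data bus's own pace; the queue-based decoupling is an illustrative abstraction of the interface means described above:

```cpp
#include <cstdint>
#include <deque>

struct AddressTxn { uint64_t addr; bool is_write; };
struct DataTxn    { uint64_t addr; };

class SplitBus {
    std::deque<DataTxn> pending_data_;  // ordered by address-bus issue
public:
    // Commander address interface: issue a command; the responder side
    // acknowledges receipt (modeled here as the boolean return).
    bool issue_address(const AddressTxn& t) {
        pending_data_.push_back({t.addr});  // schedules a later data phase
        return true;
    }
    // Data interfaces: run whenever the data bus is free, at a timing and
    // rate independent of what the address bus is doing.
    bool data_cycle(DataTxn& out) {
        if (pending_data_.empty()) return false;
        out = pending_data_.front();
        pending_data_.pop_front();
        return true;
    }
};

int main() {
    SplitBus bus;
    bus.issue_address({0x100, false});
    bus.issue_address({0x140, true});   // address bus keeps issuing...
    DataTxn d;
    bus.data_cycle(d);                  // ...while the data bus drains later
}
```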