Abstract:
A method, system, and computer program product are provided for prioritizing transactions. A processor in a computing environment initiates the execution of a transaction. The processor includes a transactional core, and the execution of the transaction is performed by the transactional core. The processor obtains, concurrent with the execution of the transaction by the transactional core, an indication of a conflict between the transaction and at least one other transaction being executed by an additional core in the computing environment. The processor determines whether the transactional core includes an indicator and, based on determining that the transactional core includes an indicator, the processor ignores the conflict and utilizes the transactional core to complete executing the transaction.
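By way of a non-limiting illustration, the conflict handling described above may be sketched as follows. The class name `TransactionalCore` and the field `priority_indicator` are hypothetical. The abort-on-conflict behavior of a core without the indicator is an assumption, since the abstract states only what happens when the indicator is present.

```python
from dataclasses import dataclass


@dataclass
class TransactionalCore:
    """Hypothetical model of a core that executes one transaction at a time."""
    name: str
    priority_indicator: bool = False  # assumed name for the abstract's "indicator"

    def run(self, conflict_reported: bool) -> str:
        """Return 'committed' or 'aborted' for a single transaction."""
        if conflict_reported and not self.priority_indicator:
            # Assumption: without the indicator, the core aborts on conflict.
            return "aborted"
        # Either no conflict arrived, or the indicator lets the core ignore it
        # and complete the transaction.
        return "committed"


prioritized = TransactionalCore("core0", priority_indicator=True)
ordinary = TransactionalCore("core1")
print(prioritized.run(conflict_reported=True))
print(ordinary.run(conflict_reported=True))
```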
Abstract:
Disclosed herein is a processing network element (NE) comprising: at least one receiver configured to receive a plurality of memory request messages from a plurality of memory nodes, wherein each memory request designates a source node, a destination node, and a memory location, and to receive a plurality of response messages to the memory requests from the plurality of memory nodes, wherein each memory response designates a source node, a destination node, and a memory location; at least one transmitter configured to transmit the memory requests and memory responses to the plurality of memory nodes; and a controller coupled to the receiver and the transmitter and configured to enforce ordering such that memory requests and memory responses designating the same memory location and the same source node/destination node pair are transmitted by the transmitter in the same order received by the receiver.
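In one illustrative sketch, the ordering rule may be modeled by keeping one FIFO queue per (source node, destination node, memory location) key, so that messages sharing a key leave in arrival order while messages with different keys may be transmitted in any order. The names `Message`, `OrderingController`, and `transmit_next` are hypothetical, and keying on the literal source/destination fields of each message is an assumption.

```python
from collections import deque
from typing import NamedTuple, Optional


class Message(NamedTuple):
    kind: str       # "request" or "response"
    source: int
    dest: int
    location: int
    payload: str


class OrderingController:
    """Hypothetical controller: FIFO per (source, dest, location) key."""

    def __init__(self) -> None:
        self._queues = {}  # key -> deque of messages in arrival order

    def receive(self, msg: Message) -> None:
        key = (msg.source, msg.dest, msg.location)
        self._queues.setdefault(key, deque()).append(msg)

    def transmit_next(self, key) -> Optional[Message]:
        """Transmit the oldest pending message for one key, if any."""
        queue = self._queues.get(key)
        if not queue:
            return None
        msg = queue.popleft()
        if not queue:
            del self._queues[key]  # drop empty queues to bound memory use
        return msg
```

A scheduler built on this sketch is free to choose which key to service next. Only the order within a key is constrained.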
Abstract:
Aspects of the subject matter described herein relate to error detection for files. In aspects, before allowing updates to a clean file, a flag marking the file as dirty is written to non-volatile storage. Thereafter, the file may be updated as long as desired. Periodically or at some other time, the file may be marked as clean after all outstanding updates to the file and error codes associated with the file are written to storage. While waiting for outstanding updates and error codes to be written to storage, if additional requests to update the file are received, the file may be marked as dirty again prior to allowing the additional requests to update the file. The request to write a clean flag regarding the file may be done lazily.
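As a minimal sketch of the dirty/clean protocol described above, the following model keeps the persisted flag, the outstanding updates, and a pending (lazy) clean request as in-memory fields. The class `TrackedFile` and its method names are hypothetical, and `zlib.crc32` merely stands in for whatever error code an implementation might use.

```python
import zlib


class TrackedFile:
    """Hypothetical in-memory model of a file with a persisted dirty flag."""

    def __init__(self) -> None:
        self.durable_flag = "clean"   # flag as it exists on non-volatile storage
        self.pending = []             # (data, error_code) pairs not yet written
        self.durable_data = []        # (data, error_code) pairs written to storage
        self.clean_requested = False  # a lazy request to mark the file clean

    def update(self, data: bytes) -> None:
        if self.durable_flag == "clean" or self.clean_requested:
            # The dirty flag is written before the update is allowed; an
            # in-flight clean request is superseded by marking dirty again.
            self.durable_flag = "dirty"
            self.clean_requested = False
        self.pending.append((data, zlib.crc32(data)))

    def request_clean(self) -> None:
        # Lazy: nothing is written until outstanding updates reach storage.
        self.clean_requested = True

    def flush(self) -> None:
        self.durable_data.extend(self.pending)
        self.pending.clear()
        if self.clean_requested:
            self.durable_flag = "clean"
            self.clean_requested = False
```

In this sketch, a file whose persisted flag reads "dirty" after a crash is the one whose error codes cannot be trusted without rechecking.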
Abstract:
An example embodiment of a computer system utilizing a central snoop filter includes several nodes coupled together via a switching device. Each of the nodes may include several processors and caches as well as a block of system memory. All traffic from one node to another takes place through the switching device. The switching device includes a snoop filter that tracks cache line coherency information for all caches in the computer system. The snoop filter has enough entries to track the tags and state information for all entries in all caches in all of the system's nodes. In addition to the tag and state information, the snoop filter stores information indicating which of the nodes has a copy of each cache line. The snoop filter serves in part to keep snoop transactions from being performed at nodes that do not contain a copy of the subject cache line, thereby reducing system overhead, reducing traffic across the system interconnect buses, and reducing the amount of time required to perform snoop transactions.
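For illustration only, the filtering role of such a snoop filter may be sketched as a table from cache line tag to a coherency state and the set of nodes holding a copy. The class `SnoopFilter`, its method names, and the single-letter state labels are assumptions made for this example.

```python
class SnoopFilter:
    """Hypothetical central snoop filter: tag -> (state, nodes holding a copy)."""

    def __init__(self) -> None:
        self._entries = {}

    def record_fill(self, tag: int, node: int, state: str = "S") -> None:
        """Note that `node` now caches the line identified by `tag`."""
        _, holders = self._entries.get(tag, ("I", frozenset()))
        self._entries[tag] = (state, holders | {node})

    def record_evict(self, tag: int, node: int) -> None:
        """Note that `node` no longer caches the line."""
        state, holders = self._entries.get(tag, ("I", frozenset()))
        holders = holders - {node}
        if holders:
            self._entries[tag] = (state, holders)
        else:
            self._entries.pop(tag, None)

    def snoop_targets(self, tag: int, requester: int) -> list:
        """Nodes that must be snooped; nodes without a copy are skipped."""
        _, holders = self._entries.get(tag, ("I", frozenset()))
        return sorted(holders - {requester})
```

A lookup that returns an empty list means no snoop transaction needs to cross the switching device for that line.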
Abstract:
Aggregating cache maintenance instructions in processor-based devices is disclosed. In this regard, a processor-based device comprises one or more processing elements (PEs), each providing an aggregation circuit configured to detect a first cache maintenance instruction in an instruction stream. The aggregation circuit then aggregates one or more subsequent, consecutive cache maintenance instructions in the instruction stream with the first cache maintenance instruction until an end condition is detected (e.g., detection of a data synchronization barrier instruction or a cache maintenance instruction targeting a non-consecutive memory address or a different memory page than a previous cache maintenance instruction, and/or detection that an aggregation limit has been exceeded). After detecting the end condition, the aggregation circuit generates a single cache maintenance request representing the aggregated cache maintenance instructions. In this manner, multiple cache maintenance instructions may be represented by and processed as a single request, thus minimizing the impact on system performance.
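The aggregation behavior described above may be sketched, under stated assumptions, as a single pass over an instruction stream. The 64-byte line size, 4096-byte page size, limit of 8, the `("cmo", address)` tuple encoding, and the rule that any non-maintenance instruction ends aggregation are all assumptions chosen for this example.

```python
LINE = 64    # assumed cache line size in bytes
PAGE = 4096  # assumed memory page size in bytes
LIMIT = 8    # assumed aggregation limit (lines per request)


def aggregate(stream, limit=LIMIT):
    """Collapse runs of cache maintenance ops into (first_address, line_count)."""
    requests = []
    start, count = None, 0

    def flush():
        nonlocal start, count
        if count:
            requests.append((start, count))
        start, count = None, 0

    for op, addr in stream:
        if op != "cmo":
            flush()  # e.g. a data synchronization barrier ends aggregation
            continue
        if count:
            expected = start + count * LINE
            if addr != expected or addr // PAGE != start // PAGE or count >= limit:
                flush()  # non-consecutive address, new page, or limit reached
        if count == 0:
            start = addr
        count += 1
    flush()
    return requests


stream = [("cmo", 0), ("cmo", 64), ("cmo", 128), ("cmo", 4096),
          ("dsb", None), ("cmo", 4160)]
print(aggregate(stream))  # [(0, 3), (4096, 1), (4160, 1)]
```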
Abstract:
A technique to enable information sharing among agents within different cache coherency domains. In one embodiment, a graphics device may use one or more caches used by one or more processing cores to store or read information, which may be accessed by one or more processing cores in a manner that does not affect programming and coherency rules pertaining to the graphics device.
Abstract:
In one embodiment, a node comprises at least one processor core and a plurality of coherence units. The processor core is configured to generate an address to access a memory location. The address maps to a first coherence plane of a plurality of coherence planes. Coherence activity is performed within each coherence plane independent of other coherence planes, and a mapping of the address space to the coherence planes is independent of a physical location of the addressed memory in a distributed system memory. Each coherence unit corresponds to a respective coherence plane and is configured to manage coherency for the node and for the respective coherence plane. The coherence units operate independent of each other, and a first coherence unit corresponding to the first coherence plane is coupled to receive the address if external coherency activity is needed to complete the access to the memory location.
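As one possible illustration, the independence between the plane mapping and the physical memory location may be sketched by deriving the coherence plane from low-order line-index bits while the home node is derived from high-order address bits. The constants, the modulo mapping, and the names `coherence_plane`, `home_node`, `CoherenceUnit`, and `Node` are assumptions for this example and are not taken from the abstract.

```python
LINE_BYTES = 64              # assumed coherence granule
NUM_PLANES = 4               # assumed number of coherence planes
NODE_MEMORY_BYTES = 1 << 30  # assumed memory per node in the distributed system


def coherence_plane(address: int) -> int:
    """Plane chosen from the line index, not from where the memory lives."""
    return (address // LINE_BYTES) % NUM_PLANES


def home_node(address: int) -> int:
    """Physical location of the addressed memory (assumed linear layout)."""
    return address // NODE_MEMORY_BYTES


class CoherenceUnit:
    """Manages coherency for one plane, independent of the other units."""

    def __init__(self, plane: int) -> None:
        self.plane = plane
        self.handled = []

    def handle(self, address: int) -> None:
        self.handled.append(address)


class Node:
    def __init__(self) -> None:
        self.units = [CoherenceUnit(p) for p in range(NUM_PLANES)]

    def access(self, address: int, needs_external: bool):
        """Route to the plane's unit only when external coherency is needed."""
        if not needs_external:
            return None
        unit = self.units[coherence_plane(address)]
        unit.handle(address)
        return unit.plane
```

In this sketch, two addresses homed on different nodes can share a plane, and adjacent lines on the same node fall into different planes.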
Abstract:
A system and method for performing speculative writestream transactions in a computing system. A computing system including a plurality of subsystems has a requesting subsystem configured to initiate a writestream ordered (WSO) transaction to perform a write operation to an entire coherency unit by conveying a WSO request to a home subsystem of the coherency unit. The requester is configured to perform the write operation without first receiving a copy of the coherency unit and to complete the WSO transactions it initiates in the order in which they are initiated. The home subsystem is configured to process multiple WSO transactions directed to a given coherency unit in the order in which they are received. When the requester initiates a WSO transaction to a given coherency unit, the coherency unit is locked. Responsive to receiving the WSO request, the home subsystem conveys a pull request for the write data to the requester. If the requester detects a timeout condition, the requester may cancel the WSO transaction and unlock the coherency unit in the requesting node. The requester may further convey an acknowledgment to the home subsystem indicating no data will be returned. The home subsystem may then treat the WSO transaction as being complete.
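A simplified, non-limiting sketch of the request, pull, and timeout sequence follows. The classes `Requester` and `Home`, the `timed_out` argument used to simulate a timeout, and the use of `None` as the "no data will be returned" acknowledgment are assumptions for illustration. Real timing and message transport are not modeled.

```python
class Home:
    """Hypothetical home subsystem for a set of coherency units."""

    def __init__(self) -> None:
        self.memory = {}
        self.completed = []  # WSO transactions in the order they were processed

    def wso_request(self, requester, unit) -> None:
        # Responsive to the WSO request, pull the write data from the requester.
        reply = requester.pull(unit)
        if reply is not None:
            self.memory[unit] = reply
        # A "no data" acknowledgment also completes the transaction.
        self.completed.append(unit)


class Requester:
    """Hypothetical requesting subsystem."""

    def __init__(self) -> None:
        self.locked = set()
        self.write_data = {}
        self.timed_out = set()

    def start_wso(self, home, unit, data, timed_out=False) -> None:
        self.locked.add(unit)  # the coherency unit is locked on initiation
        self.write_data[unit] = data
        if timed_out:
            self.timed_out.add(unit)  # simulate a detected timeout condition
        home.wso_request(self, unit)

    def pull(self, unit):
        data = self.write_data.pop(unit)
        self.locked.discard(unit)  # unlocked on completion or cancellation
        if unit in self.timed_out:
            self.timed_out.discard(unit)
            return None  # acknowledgment: no data will be returned
        return data
```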