Abstract:
In a computer system implementing state transitions that change logically and atomically at an address packet independently of a response, the coherence domain is extended among distributed memory. As such, memory line ownership transfers upon request, and not upon requestor receipt of data. Requestor receipt of data is rapidly implemented by providing a ReadToShareFork transaction that simultaneously causes a write-type operation that updates invalid data at a requested memory address, and provides the updated data to the requesting device. More specifically, when writing valid data to memory, the ReadToShareFork transaction simultaneously causes reissuance of the originally requested transaction using the same memory address and ID information. The requesting device, upon recognizing its transaction ID on the bus system, will pull the now-valid data from the desired memory location.
Abstract:
A probabilistic queue lock divides requesters for a lock into at least three sets. In one embodiment, the requesters are divided into the owner of the lock, the first waiting contender, and the other waiting contenders. The first waiting contender is made probabilistically more likely to obtain the lock by having it spin faster than the other waiting contenders. Because the other waiting contenders spin more slowly, the first waiting contender is more likely to be able to observe the free lock and acquire it before the other waiting contenders notice that it is free. The first of the other waiting contenders that determines that the previous first waiting contender has acquired the lock is promoted to be the new first waiting contender and begins spinning fast. Because only the first waiting contender is spinning fast on the lock, it is probable that only the first waiting contender will attempt to acquire the lock when it becomes available.
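The three sets described above (owner, first waiting contender, other waiting contenders) can be sketched with ordinary threads. This is a minimal illustration, not the patented implementation: the class name `ProbabilisticQueueLock`, the `fast_delay` and `slow_delay` parameters, and the use of an internal `threading.Lock` as a stand-in for an atomic compare-and-swap are all assumptions made for the example.

```python
import threading
import time


class ProbabilisticQueueLock:
    """Sketch: one fast-spinning first contender, slow-spinning others."""

    def __init__(self, fast_delay=0.0001, slow_delay=0.005):
        # The guard stands in for an atomic compare-and-swap (assumption).
        self._guard = threading.Lock()
        self._held = False          # the lock word
        self._first_taken = False   # is there a first waiting contender?
        self.fast_delay = fast_delay
        self.slow_delay = slow_delay

    def _try_acquire(self):
        with self._guard:
            if not self._held:
                self._held = True
                return True
            return False

    def _try_promote(self):
        # Become the first waiting contender if that role is vacant.
        with self._guard:
            if not self._first_taken:
                self._first_taken = True
                return True
            return False

    def acquire(self):
        is_first = False
        while True:
            if self._try_acquire():
                if is_first:
                    # Vacate the role so another waiter can be promoted.
                    with self._guard:
                        self._first_taken = False
                return
            if not is_first:
                is_first = self._try_promote()
            # Only the first contender polls quickly; the rest poll slowly.
            time.sleep(self.fast_delay if is_first else self.slow_delay)

    def release(self):
        with self._guard:
            self._held = False
```

Because a slow waiter can still observe the free lock on one of its infrequent polls, ordering is only probable, not guaranteed, which matches the probabilistic behavior the abstract describes. Mutual exclusion itself does not depend on the spin rates.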
Abstract:
Low-latency distributed round-robin arbitration is used to grant requests for access to a shared resource such as a computer system bus. A plurality of circuit board cards, each including two devices (such as CPUs, I/O units, and RAM) and an address controller, plug into an Address Bus in the bus system. Each address controller contains logic implementing the arbitration mechanism with a two-level hierarchy: a single top arbitrator and preferably four leaf arbitrators. Each address controller is coupled to two devices, and the logical "OR" of their arbitration requests is coupled via an Arbitration Bus to other address controllers on other boards. Each leaf arbitrator has four prioritized request-in lines, each such line being coupled to a single address controller serviced by that leaf arbitrator. By default, each leaf arbitrator and the top arbitrator implement a prioritized algorithm. However, a last winner ("LW") state is maintained at every arbitrator that overrides the default, to provide round-robin selection. Each leaf arbitrator arbitrates among the zero to four requests it sees, selects a winner, and signals the top arbitrator that it has a device wishing access. At the top arbitrator, if the first leaf arbitrator last won a grant, it now has lowest grant priority, and a grant will go to the next highest leaf arbitrator having a device seeking access.
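The two-level scheme with last-winner state can be sketched as follows. The class names `Arbiter` and `TwoLevelArbiter` are hypothetical, and folding the fixed-priority default and the LW override into a single rotating scan is a simplifying assumption; the abstract does not specify how the hardware combines them.

```python
class Arbiter:
    """Priority arbiter whose last-winner (LW) state rotates the priority."""

    def __init__(self, n_inputs):
        self.n = n_inputs
        # Start as if the last input won, so input 0 has top priority first.
        self.last_winner = n_inputs - 1

    def grant(self, requests):
        """Return the winning input index, or None if nobody is requesting."""
        for offset in range(1, self.n + 1):
            idx = (self.last_winner + offset) % self.n
            if requests[idx]:
                self.last_winner = idx   # the winner drops to lowest priority
                return idx
        return None


class TwoLevelArbiter:
    """One top arbiter over several leaf arbiters (four by four by default)."""

    def __init__(self, n_leaves=4, per_leaf=4):
        self.per_leaf = per_leaf
        self.leaves = [Arbiter(per_leaf) for _ in range(n_leaves)]
        self.top = Arbiter(n_leaves)

    def grant(self, requests):
        """requests: flat list of booleans, one per address controller."""
        k = self.per_leaf
        groups = [requests[i * k:(i + 1) * k] for i in range(len(self.leaves))]
        # Each leaf tells the top arbiter whether it has any requester.
        leaf = self.top.grant([any(g) for g in groups])
        if leaf is None:
            return None
        # Only the winning leaf updates its own LW state.
        return leaf * k + self.leaves[leaf].grant(groups[leaf])
```

With controllers 0, 1, and 4 requesting continuously, successive grants alternate between the two leaves while each leaf rotates among its own requesters.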
Abstract:
A system may include a plurality of nodes coupled by an inter-node network. Each of the nodes includes several active devices, an interface to the inter-node network, and an address network coupling the active devices to the interface. An active device included in one of the nodes initiates a transaction by sending either a first type of address packet or a second type of address packet on the address network dependent on whether the active device is included in a multi-node system. The first type of address packet is sent if the active device is included in a multi-node system and is not snooped by other active devices in the same node as the active device. The second type of address packet, sent if the active device is included in a single node system, is snooped by other active devices in the same node as the active device.
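The choice between the two packet types can be illustrated with a small sketch. The names `AddressPacket`, `make_packet`, and `snoopers`, and the string labels `"first"` and `"second"`, are hypothetical; the abstract names no concrete packet formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressPacket:
    kind: str      # "first" (multi-node system) or "second" (single-node system)
    address: int
    sender: str


def make_packet(sender, address, in_multi_node_system):
    """Pick the packet type from the system configuration."""
    kind = "first" if in_multi_node_system else "second"
    return AddressPacket(kind, address, sender)


def snoopers(packet, node_devices):
    """Devices in the sender's node that snoop this packet."""
    if packet.kind == "first":
        return []   # not snooped by other devices in the same node
    return [d for d in node_devices if d != packet.sender]
```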
Abstract:
Snooping is implemented on a split transaction snooping bus for a computer system having one or many such buses. Circuit boards including CPUs or other devices and/or distributed memory, data input/output buffers, queues including request tag queues and coherent input queues ("CIQ"), and an address controller implementing address bus arbitration plug into one or more split transaction snooping bus systems. All devices snoop on the address bus to learn whether an identified line is owned or shared, and an appropriate owned/shared signal is issued. Receipt of an ignore signal blocks CIQ loading of a transaction until the transaction is reloaded and ignore is deasserted. Ownership of a requested memory line transfers immediately at time of request. Asserted requests are queued such that state transactions on the address bus occur atomically and logically without dependence upon the request. Subsequent requests for the same data are tagged to become the responsibility of the owner-requestor. A subsequent requestor's activities are not halted awaiting grant and completion of an earlier request transaction. Processor-level cache changes state upon receipt of transaction data. A single multiplexed arbitration bus carries address bus and data bus request transactions, each of which is two cycles in length.
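The rule that ownership transfers at request time, with later requests becoming the responsibility of the new owner, can be sketched as a toy model. `SnoopBus`, `request_to_own`, and `deliver_next` are hypothetical names, and the model ignores the CIQ, the ignore signal, and shared states entirely.

```python
from collections import deque


class SnoopBus:
    """Toy model: ownership changes when the address packet is seen."""

    def __init__(self):
        self.owner = {}          # line address -> current owner
        self.pending = deque()   # (line, supplier, requester) data still owed

    def request_to_own(self, requester, line):
        # The previous owner (or memory) must supply the data later.
        supplier = self.owner.get(line, "memory")
        # Ownership transfers immediately, before any data has moved.
        self.owner[line] = requester
        self.pending.append((line, supplier, requester))
        return supplier

    def deliver_next(self):
        """Complete the oldest outstanding data transfer, if any."""
        return self.pending.popleft() if self.pending else None
```

If a second device requests the same line before the first requester has received its data, the second transfer is owed by the first requester, and neither request blocks the other.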
Abstract:
A split transaction snooping bus protocol and architecture is provided for use in a system having one or many such buses. Circuit boards including CPUs or other devices and/or distributed memory, data input/output buffers, queues including request tag queues and coherent input queues ("CIQ"), and an address controller implementing address bus arbitration plug into one or more split transaction snooping bus systems. All devices snoop on the address bus to learn whether an identified line is owned or shared, and an appropriate owned/shared signal is issued. Receipt of an ignore signal blocks CIQ loading of a transaction until the transaction is reloaded and ignore is deasserted. Ownership of a requested memory line transfers immediately at time of request. Asserted requests are queued such that state transactions on the address bus occur atomically and logically without dependence upon the request. Subsequent requests for the same data are tagged to become the responsibility of the owner-requestor. A subsequent requestor's activities are not halted awaiting grant and completion of an earlier request transaction. Processor-level cache changes state upon receipt of transaction data. A single multiplexed arbitration bus carries address bus and data bus request transactions, each of which is two cycles in length.
Abstract:
The present invention provides a cache manager (CM) for use with an address translation table (ATT) that takes advantage of way information, available when a cache line is first cached, for efficiently accessing a multi-way cache of a computer system having a main memory and one or more processors. The main memory and the ATT are page-oriented while the cache is organized using cache lines. The cache includes a plurality of cache lines divided into a number of segments corresponding to the number of "ways". Each cache line includes an address tag (AT) field and a data field. The way information is stored in the ATT for later cache access. In this implementation, "waylets" provide an efficient mechanism for storing the way information whenever a cache line is cached. Accordingly, each table entry of the ATT includes a virtual address (VA) field, a physical address (PA) field, and a plurality of waylets associated with each pair of VA and PA fields. Subsequently, the waylets can be used to quickly index directly into a single segment of the cache as follows. Upon receiving a virtual address of a target cache line, the CM attempts to match a virtual address field of one of the ATT entries with a page index portion of the virtual address. If there is a match, a waylet of the ATT entry is retrieved using a page offset portion of the virtual address. If the waylet value is valid, the CM indexes directly into a single cache line using the waylet value, the physical address field of the ATT entry, and the page offset portion of the virtual address. If the AT field of the retrieved cache line matches a portion of the physical address field of the ATT entry, the processor retrieves the data field of the cache line using the page offset portion of the VA. If the AT field does not match, the target cache line is retrieved from the main memory, and the waylet value in both the ATT and the main memory is updated.
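The lookup sequence above can be sketched in a few lines. Everything concrete here is an assumption chosen for illustration: the page, line, way, and set sizes, the names `AttEntry` and `CacheManager`, and the trivial rotating replacement policy. The sketch updates only the ATT copy of the waylet and does not handle a missing translation.

```python
PAGE_SIZE = 256                      # bytes per page (assumed)
LINE_SIZE = 32                       # bytes per cache line (assumed)
NUM_WAYS = 2                         # cache segments, one per way (assumed)
NUM_SETS = 4                         # cache lines per way (assumed)
LINES_PER_PAGE = PAGE_SIZE // LINE_SIZE


class AttEntry:
    """One ATT entry: VA page, PA page, and one waylet per line in the page."""

    def __init__(self, vpn, ppn):
        self.vpn = vpn
        self.ppn = ppn
        self.waylets = [None] * LINES_PER_PAGE   # None marks an invalid waylet


class CacheManager:
    def __init__(self, memory):
        self.memory = memory                     # bytes-like main memory
        self.att = {}                            # virtual page number -> AttEntry
        self.cache = [[None] * NUM_SETS for _ in range(NUM_WAYS)]  # (tag, data)
        self.next_way = 0                        # trivial replacement policy
        self.direct_hits = 0

    def map_page(self, vpn, ppn):
        self.att[vpn] = AttEntry(vpn, ppn)

    def read_byte(self, va):
        vpn, offset = divmod(va, PAGE_SIZE)      # page index and page offset
        entry = self.att[vpn]                    # KeyError means no translation
        pa = entry.ppn * PAGE_SIZE + offset
        line_addr = pa // LINE_SIZE
        set_idx = line_addr % NUM_SETS
        tag = line_addr // NUM_SETS
        slot = offset // LINE_SIZE               # which waylet of this page
        way = entry.waylets[slot]
        if way is not None:
            line = self.cache[way][set_idx]      # probe a single way only
            if line is not None and line[0] == tag:
                self.direct_hits += 1
                return line[1][pa % LINE_SIZE]
        # Invalid or stale waylet: fetch the line and record the way used.
        base = line_addr * LINE_SIZE
        data = bytes(self.memory[base:base + LINE_SIZE])
        way = self.next_way
        self.next_way = (self.next_way + 1) % NUM_WAYS
        self.cache[way][set_idx] = (tag, data)
        entry.waylets[slot] = way
        return data[pa % LINE_SIZE]
```

The point of the waylet is visible in `read_byte`: on a valid waylet only one way is probed and one tag compared, instead of comparing tags across all ways.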
Abstract:
A multiprocessor computer system is configured to selectively transmit address transactions through an address network using either a broadcast mode or a point-to-point mode transparent to the active devices that initiate the transactions. Depending on the mode of transmission selected, either a directory-based coherency protocol or a broadcast snooping coherency protocol is implemented to maintain coherency within the system. A computing node is formed by a group of clients which share a common address and data network. The address network is configured to determine whether a particular transaction is to be conveyed in broadcast mode or point-to-point mode. In one embodiment, the address network includes a mode table with entries which are configurable to indicate transmission modes corresponding to different regions of the address space within the node. Upon receiving a coherence request transaction, the address network may then access the table in order to determine the transmission mode, broadcast or point-to-point, which corresponds to the received transaction.
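A mode table of this kind can be sketched as a per-region lookup. The names `ModeTable` and `route`, the fixed region size, and the `home_of` callback that maps an address to its home client are assumptions for illustration; the abstract does not say how regions are sized or how the point-to-point target is chosen.

```python
BROADCAST = "broadcast"
POINT_TO_POINT = "point-to-point"


class ModeTable:
    """Maps fixed-size address regions to a transmission mode."""

    def __init__(self, region_size, num_regions, default=BROADCAST):
        self.region_size = region_size
        self.modes = [default] * num_regions

    def set_mode(self, region, mode):
        if mode not in (BROADCAST, POINT_TO_POINT):
            raise ValueError("unknown mode: %r" % (mode,))
        self.modes[region] = mode

    def lookup(self, address):
        return self.modes[address // self.region_size]


def route(table, address, clients, home_of):
    """Return the clients that receive an address transaction."""
    if table.lookup(address) == BROADCAST:
        return list(clients)          # snooping protocol: everyone sees it
    return [home_of(address)]         # directory protocol: home client only
```

The initiating device calls nothing different in either case, which reflects the abstract's statement that the mode is transparent to the active devices.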
Abstract:
A method is provided for a data storage system to change the RAID type, the layout characteristics, and the performance characteristics of a virtual volume mapped to logical disk regions in one or more logical disks while the data storage system remains online to a host. Another method is provided for a data storage system to consolidate space in one or more logical disks mapped to a virtual volume while the data storage system remains online to a host. The one or more logical disks can be consolidated to free unused chunklet regions for use in other logical disks.