Abstract:
A system and method for data allocation in a shared cache memory of a computing system are contemplated. Each cache way of a shared set-associative cache is accessible to multiple sources, such as one or more processor cores, a graphics processing unit (GPU), an input/output (I/O) device, or multiple different software threads. A shared cache controller enables or disables access separately to each of the cache ways based upon the corresponding source of a received memory request. One or more configuration and status registers (CSRs) store encoded values used to alter accessibility to each of the shared cache ways. The control of the accessibility of the shared cache ways via altering stored values in the CSRs may be used to create a pseudo-RAM structure within the shared cache and to progressively reduce the size of the shared cache during a power-down sequence while the shared cache continues operation.
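A minimal behavioral sketch of the per-source way-enable idea is given below, assuming a C model in which one configuration register per source holds a bitmask of allocatable ways. The names (way_enable_csr, pick_victim_way, the source identifiers, and the 16-way geometry) are illustrative assumptions, not taken from the abstract.

    /* Sketch: per-source way-enable masks stored in CSR-like registers.
     * Clearing bits progressively shrinks the usable cache, as in the
     * power-down sequence described above. */
    #include <stdint.h>
    #include <stdio.h>

    #define NUM_WAYS 16

    /* Request sources that may share the cache. */
    enum source { SRC_CORE0, SRC_CORE1, SRC_GPU, SRC_IO, NUM_SOURCES };

    /* One CSR per source: bit i == 1 means way i may be allocated by that source. */
    static uint16_t way_enable_csr[NUM_SOURCES] = {
        [SRC_CORE0] = 0x00FF,  /* cores restricted to ways 0-7 */
        [SRC_CORE1] = 0x00FF,
        [SRC_GPU]   = 0x3F00,  /* GPU restricted to ways 8-13  */
        [SRC_IO]    = 0xC000,  /* I/O restricted to ways 14-15 */
    };

    /* Choose a victim way for an allocation by 'src', honoring its CSR mask.
     * Returns -1 when every way is disabled for that source. */
    static int pick_victim_way(enum source src, unsigned lru_hint)
    {
        uint16_t mask = way_enable_csr[src];
        if (mask == 0)
            return -1;
        for (unsigned i = 0; i < NUM_WAYS; i++) {
            unsigned way = (lru_hint + i) % NUM_WAYS;   /* start at LRU candidate */
            if (mask & (1u << way))
                return (int)way;
        }
        return -1;
    }

    int main(void)
    {
        /* Power-down sketch: clearing a bit removes one way while the
         * remaining ways keep servicing requests. */
        way_enable_csr[SRC_GPU] &= ~(1u << 13);
        printf("GPU victim way: %d\n", pick_victim_way(SRC_GPU, 12));
        return 0;
    }
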
Abstract:
A multiprocessing system utilizes a bus protocol having two response windows. The first response window is at a fixed latency from the transmission of a bus request and/or address, while the second response window, utilized for coherency reporting, is placed a configurable number of clock cycles after the bus request and address to allow for longer access, or snoop, times to perform a cache directory look-up within other bus devices. The first response window reports flow control and error status. Furthermore, a method is described that implements the reporting of response information in a flexible and high-performance manner.
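The two-window timing can be illustrated with a small C sketch, assuming the first window sits at a fixed two-cycle offset and the coherency window at a programmable offset. The constants and names (FIXED_RESP_LATENCY, snoop_window_delay, struct bus_request) are illustrative assumptions rather than values from the abstract.

    /* Sketch: one fixed-latency response window and one configurable
     * coherency (snoop) response window per bus transaction. */
    #include <stdint.h>
    #include <stdio.h>

    #define FIXED_RESP_LATENCY 2   /* cycles from address to first response window */

    struct bus_request {
        uint64_t address;
        unsigned issue_cycle;      /* cycle the request/address was driven */
    };

    /* First window: flow control and error status, always at a fixed offset. */
    static unsigned first_response_cycle(const struct bus_request *req)
    {
        return req->issue_cycle + FIXED_RESP_LATENCY;
    }

    /* Second window: coherency reporting, placed a configurable number of
     * cycles after the request so slower devices can finish their cache
     * directory look-ups. */
    static unsigned coherency_response_cycle(const struct bus_request *req,
                                             unsigned snoop_window_delay)
    {
        return req->issue_cycle + snoop_window_delay;
    }

    int main(void)
    {
        struct bus_request req = { .address = 0x1000, .issue_cycle = 10 };
        unsigned snoop_delay = 6;  /* programmed to suit the slowest snooper */

        printf("first response at cycle %u, coherency response at cycle %u\n",
               first_response_cycle(&req),
               coherency_response_cycle(&req, snoop_delay));
        return 0;
    }
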
Abstract:
A system and method for using a toggle command for setting and releasing a lock, i.e., a locktoggle. In an exemplary computer system, one or more processors are each coupled to a bus bridge through separate high-speed connections, such as a pair of uni-directional address buses with respective source-synchronous clock lines and a bi-directional data bus with attendant source-synchronous clock lines. The locktoggle command is used to transmit both a lock request and an unlock request from a processor to a system coherency point, e.g., the bus bridge. The system coherency point acknowledges when the lock has been established or released. While the lock is active, other processors are inhibited from accessing at least the memory locations for which the lock was initiated. Locks are thus established at the system coherency point, which may advantageously allow for locking functionality in a non-shared bus system. The use of the locktoggle command may advantageously allow for the use of a single command code point, leaving other code points available for other uses.
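A minimal C sketch of the single locktoggle code point follows, assuming the coherency point simply flips its lock state on each locktoggle from the owner and refuses access to the locked location from other processors. The names (struct coherency_point, locktoggle, mem_access_allowed) are illustrative, not from the abstract.

    /* Sketch: one command code point both acquires and releases a lock at
     * the system coherency point (e.g., the bus bridge). */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    struct coherency_point {
        bool     locked;
        int      owner;        /* processor that established the lock */
        uint64_t lock_addr;    /* memory location guarded by the lock */
    };

    /* Handle a locktoggle command; return true as the acknowledgement that
     * the lock was established or released, false if the toggle is refused. */
    static bool locktoggle(struct coherency_point *cp, int cpu, uint64_t addr)
    {
        if (!cp->locked) {                      /* first toggle: establish */
            cp->locked = true;
            cp->owner = cpu;
            cp->lock_addr = addr;
            return true;
        }
        if (cp->owner == cpu) {                 /* second toggle: release */
            cp->locked = false;
            return true;
        }
        return false;                           /* another CPU holds the lock */
    }

    /* While the lock is active, other processors cannot access the locked location. */
    static bool mem_access_allowed(const struct coherency_point *cp,
                                   int cpu, uint64_t addr)
    {
        return !cp->locked || cp->owner == cpu || addr != cp->lock_addr;
    }

    int main(void)
    {
        struct coherency_point cp = { 0 };
        locktoggle(&cp, 0, 0x2000);                         /* CPU0 locks   */
        printf("CPU1 access: %d\n", mem_access_allowed(&cp, 1, 0x2000));
        locktoggle(&cp, 0, 0x2000);                         /* CPU0 unlocks */
        printf("CPU1 access: %d\n", mem_access_allowed(&cp, 1, 0x2000));
        return 0;
    }
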
Abstract:
A processor is presented including a cache unit coupled to a bus interface unit (BIU). Address signal selection and masking functions are performed by circuitry within the BIU rather than within the cache unit, and the physical addresses produced by the BIU are stored within a translation lookaside buffer (TLB) in the cache unit. As a result, address signal selection and masking circuitry (e.g., a multiplexer and gating logic) is eliminated from a critical speed path within the cache unit, allowing the operational speed of the cache unit to be increased. The cache unit stores data items and produces a data item corresponding to a received linear address. The TLB within the cache unit stores multiple linear addresses and corresponding physical addresses. When a physical address corresponding to the received linear address is not found within the TLB, the cache unit passes the linear address to the BIU. The BIU includes address translation circuitry, a multiplexer, and gating logic, and returns the physical address corresponding to the linear address to the cache unit. The cache unit then stores the physical address and the linear address within the TLB. The processor may also include a programmable control register and a microexecution unit. Upon detecting a change in state of an external masking signal, the microexecution unit may flush the contents of the TLB and modify a masking bit within the control register to reflect the new state of the masking signal.
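The miss path can be sketched in C as below, assuming an A20-style masking bit, an identity page translation, and a one-entry TLB purely to keep the example small. The names (tlb_lookup, biu_translate, A20_MASK, masking_signal_changed) are illustrative assumptions.

    /* Sketch: masking is applied in the BIU, so the TLB caches already-masked
     * physical addresses; the hit path needs no mux or gating logic. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define A20_MASK 0xFFFFFFFFFFEFFFFFull  /* example: mask out address bit 20 */

    struct tlb_entry { bool valid; uint64_t linear, physical; };
    static struct tlb_entry tlb[1];          /* one-entry TLB keeps the sketch small */
    static bool mask_enabled = true;         /* mirrors the control-register masking bit */

    /* BIU: translate and apply the masking here, so the masked *physical*
     * address is what gets cached in the TLB. */
    static uint64_t biu_translate(uint64_t linear)
    {
        uint64_t physical = linear;          /* identity mapping stands in for translation */
        return mask_enabled ? (physical & A20_MASK) : physical;
    }

    /* Cache unit: hit returns the stored physical address; a miss goes to the
     * BIU and the result is filled back into the TLB. */
    static uint64_t tlb_lookup(uint64_t linear)
    {
        if (tlb[0].valid && tlb[0].linear == linear)
            return tlb[0].physical;          /* hit: no selection/masking on this path */
        uint64_t physical = biu_translate(linear);
        tlb[0] = (struct tlb_entry){ true, linear, physical };
        return physical;
    }

    /* On a change of the external masking signal, flush the TLB and update the
     * masking bit, since cached physical addresses were formed under the old mask. */
    static void masking_signal_changed(bool new_state)
    {
        tlb[0].valid = false;
        mask_enabled = new_state;
    }

    int main(void)
    {
        printf("PA = 0x%llx\n", (unsigned long long)tlb_lookup(0x100000ull));
        masking_signal_changed(false);
        printf("PA = 0x%llx\n", (unsigned long long)tlb_lookup(0x100000ull));
        return 0;
    }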