Abstract:
An electronic device that has an integrated central processing unit (CPU) including a pre-fetch stride analyzer and an out-of-order engine is provided. The electronic device also has a graphics engine, having graphics memory, that is coupled to the integrated CPU. A main memory that is coupled to a memory controller is provided. The memory controller is also coupled to the CPU and the graphics engine. The device has a host address decoder coupled to the integrated CPU. A front side bus (FSB) is provided that is coupled to the integrated CPU and the host address decoder. Also provided is a plurality of memory components. Accordingly, either the plurality of memory components or the graphics memory can be shared to perform alternate memory functions. Additionally, a method is provided that determines allocation availability between memory components in an integrated computer processing unit. The method also shares an available memory component as a pre-fetch buffer and another available memory component as a victim cache.
Abstract:
A method and apparatus for cache replacement in a multiple variable-way associative cache is disclosed. The method according to the present techniques partitions a cache array dynamically based upon requests for memory from an integrated device having a plurality of processors.
Abstract:
A system and method for flushing a cache line associated with a linear memory address from all caches in the coherency domain. A cache controller receives a memory address, and determines whether the memory address is stored within the closest cache memory in the coherency domain. If a cache line stores the memory address, it is flushed from the cache. The flush instruction is allocated to a write-combining buffer within the cache controller. The write-combining buffer transmits the information to the bus controller. The bus controller locates instances of the memory address stored within external and internal cache memories within the coherency domain; these instances are flushed. The flush instruction can then be evicted from the write-combining buffer. Control bits may be used to indicate whether a write-combining buffer is allocated to the flush instruction, whether the memory address is stored within the closest cache memory, and whether the flush instruction should be evicted from the write-combining buffer.
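The flush sequence in the abstract above can be sketched in software as follows. This is a simplified model under stated assumptions, not the patented circuit: caches are modeled as Python sets of line base addresses, the 64-byte line size is an assumption, and the names `flush_line`, `closest_cache`, `other_caches`, and `wc_buffer` are hypothetical. Write-back of modified data is not modeled.

```python
LINE_SIZE = 64  # assumption: 64-byte cache lines (not stated in the abstract)


def flush_line(address, closest_cache, other_caches, wc_buffer):
    """Flush the line holding `address` from every cache in the modeled domain.

    Returns (hit_closest, flushed_elsewhere): whether the closest cache held
    the line, and how many other caches held it.
    """
    line = address - (address % LINE_SIZE)  # base address of the cache line

    # Step 1: check the closest cache and flush the line on a hit.
    hit_closest = line in closest_cache
    if hit_closest:
        closest_cache.discard(line)

    # Step 2: allocate the flush to a write-combining buffer entry.
    wc_buffer.append(line)

    # Step 3: the bus controller flushes every other copy in the domain.
    flushed_elsewhere = 0
    for cache in other_caches:
        if line in cache:
            cache.discard(line)
            flushed_elsewhere += 1

    # Step 4: the flush can now be evicted from the write-combining buffer.
    wc_buffer.remove(line)
    return hit_closest, flushed_elsewhere


l1 = {0x1000}
l2 = {0x1000, 0x2000}
external = {0x1000}
wc = []
result = flush_line(0x1010, l1, [l2, external], wc)
```

In this model the three control bits mentioned in the abstract correspond loosely to whether `wc_buffer` holds the line, the value of `hit_closest`, and the point at which the entry is removed.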
Abstract:
The present invention discloses a method and apparatus for implementing a senior load instruction type. An instruction requesting a memory reference is decoded. The decoded instruction is then dispatched to a memory ordering unit. The instruction is retired from a load buffer and is executed after retiring.
Abstract:
A processor is disclosed. The processor includes a decoder to decode instructions and a circuit that, in response to a decoded instruction, detects an incoming write back or write through streaming store instruction that misses a cache and allocates a buffer in write combining mode. The circuit, in response to a second decoded instruction, detects either an uncacheable speculative write combining store instruction or a second write back or write through streaming store instruction that hits the buffer, and merges the second decoded instruction with the buffer.
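The allocate-then-merge behavior described above can be illustrated with a small software model. It is only a sketch: the class name `WriteCombiningBuffers`, the 64-byte line size, and the dictionary layout are assumptions, and memory types (write back, write through, uncacheable speculative write combining) are not distinguished.

```python
LINE_SIZE = 64  # assumption: 64-byte lines, one buffer per line


class WriteCombiningBuffers:
    """Hypothetical model of buffers that combine streaming stores."""

    def __init__(self):
        # line base address -> {byte offset within line: byte value}
        self.buffers = {}

    def streaming_store(self, address, data, cached_lines):
        base = address - (address % LINE_SIZE)
        offset = address - base
        if offset + len(data) > LINE_SIZE:
            raise ValueError("store crosses a line boundary")

        if base in self.buffers:
            action = "merged"      # store hits an already allocated buffer
        elif base not in cached_lines:
            self.buffers[base] = {}
            action = "allocated"   # store misses the cache: allocate a buffer
        else:
            return "cache_hit"     # cache-hit handling is outside this sketch

        for i, byte in enumerate(data):
            self.buffers[base][offset + i] = byte
        return action


wcb = WriteCombiningBuffers()
first = wcb.streaming_store(0x100, b"\x01\x02", cached_lines=set())
second = wcb.streaming_store(0x102, b"\x03", cached_lines=set())
```

Here the first store misses the (empty) cache and allocates a buffer, and the second store to the same line is merged into it.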
Abstract:
A processor is described. The processor includes a decoder to decode instructions and a circuit, in response to a decoded instruction, to detect an incoming load instruction that misses a cache, allocate a buffer to service the incoming load instruction, and issue a bus request to load the data in the buffer without accessing said cache.
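A minimal sketch of the load path described above, assuming a dictionary-based model in which `cache`, `memory`, and `fill_buffers` map addresses to data. The function name `streaming_load` is hypothetical, and the bus request is reduced to a dictionary lookup.

```python
def streaming_load(address, cache, memory, fill_buffers):
    """Service a load; on a cache miss, fill a buffer and leave the cache untouched."""
    if address in cache:
        return cache[address]            # hit: served from the cache as usual

    # Miss: allocate a buffer and model the bus request that loads the data
    # into it. The cache itself is deliberately not updated.
    fill_buffers[address] = memory[address]
    return fill_buffers[address]


cache = {0x10: "cached"}
memory = {0x10: "cached", 0x20: "from memory"}
fill_buffers = {}
value = streaming_load(0x20, cache, memory, fill_buffers)
```

The point of the sketch is that after the miss the data lives only in `fill_buffers`, so the load does not displace anything in `cache`.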
Abstract:
In a processor cache, cache circuits are mapped into one or more logical modules. Each module may be powered down independently of other modules in response to microinstructions processed by the cache. Power control may be applied on a microinstruction-by-microinstruction basis. Because the microinstructions determine which modules are used, power savings may be achieved by powering down those modules that are not used. A cache layout organization may be modified to distribute a limited number of ways across addressable cache banks. By associating fewer than a total number of ways to a bank (for example, one or two ways), the size of memory clusters within the bank may be reduced. The reduction in the size of the memory clusters reduces the power needed for an address decoder to address sets within the bank.
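The per-microinstruction power control described above might be modeled as a lookup from each microinstruction to the modules it uses. The module names and microinstruction names below are purely hypothetical placeholders; the abstract does not name any.

```python
# Hypothetical logical modules of the cache (names are illustrative only).
MODULES = {"tag_array", "data_array", "victim_buffer"}

# Hypothetical mapping from microinstruction to the modules it needs.
USES = {
    "read": {"tag_array", "data_array"},
    "tag_probe": {"tag_array"},
    "evict": {"tag_array", "data_array", "victim_buffer"},
}


def powered_down(microinstruction):
    """Return the modules that can be powered down for this microinstruction."""
    return MODULES - USES[microinstruction]


idle_for_probe = powered_down("tag_probe")
```

Under these assumptions, a tag probe leaves the data array and victim buffer unpowered, while an eviction keeps every module active.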
Abstract:
A cache controller is presented having at least one register. The cache controller is connected to a cache memory, which is connected to the register. The cache controller dynamically selects between a cache management scheme based on a maximum number of programmable writeback entries and a cache management scheme allowing both writeback entries and incoming core requests to be allocated based on priority. Also presented is a device having a single request queue and a corresponding single set of buffers. The device dynamically selects between a cache management scheme based on a maximum number of programmable writeback entries and a cache management scheme allowing both writeback entries and incoming core requests to be allocated based on priority.
Abstract:
A processor comprising a decoder, an execution core and a bus controller. The decoder is operative to decode instructions received by the processor including a move instruction comprising a first operand identifying a plurality of bytes of packed data and a second operand identifying a corresponding plurality of byte masks. The execution core, coupled to the decoder, is operative to receive the decoded move instruction and analyze each individual byte mask of the plurality of byte masks to identify corresponding bytes within the plurality of bytes of packed data that are write-enabled. The bus controller, coupled to the execution core, is operative to write select bytes of the plurality of bytes of packed data to an implicitly defined location based, at least in part, on the write-enabled byte masks identified by the execution core.
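The byte-masked move described above can be illustrated in software. This sketch assumes that the most significant bit of each mask byte is the write-enable bit, which the abstract does not state. The function name `masked_move` is hypothetical, a `bytearray` stands in for memory, and the destination is passed explicitly here although the abstract describes it as implicitly defined.

```python
def masked_move(memory, dest, data, masks):
    """Write only the bytes of `data` whose corresponding mask byte is write-enabled."""
    if len(data) != len(masks):
        raise ValueError("data and masks must have the same length")
    for i, (value, mask) in enumerate(zip(data, masks)):
        # Assumption: bit 7 of each mask byte is the write-enable bit.
        if mask & 0x80:
            memory[dest + i] = value


memory = bytearray(8)
packed = bytes([1, 2, 3, 4, 5, 6, 7, 8])
masks = bytes([0x80, 0x00, 0x80, 0x00, 0xFF, 0x7F, 0x00, 0x80])
masked_move(memory, 0, packed, masks)
```

Bytes 0, 2, 4, and 7 are written; the others keep their previous contents because their mask bytes have bit 7 clear.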
Abstract:
A method is provided that includes a step for setting a maximum number of concurrently allocated queue entries to service writeback evictions. The method also includes a step of setting a register bit based on cache requests. The method also includes a step for dynamically selecting, based on the register bit set, one of a cache management scheme based on a maximum number of programmable writeback entries and a cache management scheme allowing both writeback entries and incoming core requests to be allocated in any free entry based on priority. According to another embodiment of the invention, a computer system is provided that includes at least one computer processor. The computer processor provided has at least one cache memory and a cache controller. Further included is a register coupled to the computer processor. Also, a memory bus is provided that is coupled to the computer processor. A memory is included that is coupled to the memory bus. A controller for dynamically selecting between a cache management scheme based on a maximum number of programmable writeback entries and a cache management scheme allowing both writeback entries and incoming core requests to be allocated based on priority is also included. This controller includes a register bit within the register that is capable of being set and cleared. The computer processor queries the register to determine whether the register bit is set or cleared.
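The selection between the two cache management schemes can be sketched as a single allocation routine controlled by one bit. This is an illustrative model only: the function name `allocate`, the `(kind, priority)` request tuples, and the numeric priorities are assumptions, and the step that sets the register bit based on cache requests is not modeled.

```python
def allocate(requests, free_entries, max_writebacks, priority_bit):
    """Grant queue entries to requests under the scheme chosen by `priority_bit`.

    requests: list of (kind, priority), kind is "writeback" or "core".
    priority_bit False: arrival order, writebacks capped at `max_writebacks`.
    priority_bit True:  any free entry may go to any request, highest
                        priority first, with no writeback cap.
    """
    if priority_bit:
        ordered = sorted(requests, key=lambda r: r[1], reverse=True)
    else:
        ordered = list(requests)

    granted = []
    writebacks = 0
    for kind, priority in ordered:
        if len(granted) == free_entries:
            break  # queue is full
        if not priority_bit and kind == "writeback":
            if writebacks == max_writebacks:
                continue  # cap reached: this writeback must wait
            writebacks += 1
        granted.append((kind, priority))
    return granted


pending = [("writeback", 3), ("writeback", 2), ("core", 1), ("writeback", 5)]
capped = allocate(pending, free_entries=3, max_writebacks=1, priority_bit=False)
by_priority = allocate(pending, free_entries=3, max_writebacks=1, priority_bit=True)
```

With the bit clear, only one writeback is admitted and the core request still gets an entry; with the bit set, the three highest-priority requests take the free entries regardless of kind.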