摘要:
A processing system may include a memory configured to store data in a plurality of pages, a TLB, and a memory cache including a plurality of cache lines. Each page in the memory may include a plurality of lines of memory. The memory cache may permit, when a virtual address is presented to the cache, a matching cache line to be identified from the plurality of cache lines, the matching cache line having a matching address that matches the virtual address. The memory cache may be configured to permit one or more page attributes of a page located at the matching address to be retrieved from the memory cache and not from the TLB, by further storing in each one of the cache lines a page attribute of the line of data stored in the cache line.
摘要:
A fetch section of a processor comprises an instruction cache and a pipeline of several stages for obtaining instructions. Instructions may cross cache line boundaries. The pipeline stages process two addresses to recover a complete boundary crossing instruction. During such processing, if the second piece of the instruction is not in the cache, the fetch with regard to the first line is invalidated and recycled. On this first pass, processing of the address for the second part of the instruction is treated as a pre-fetch request to load instruction data to the cache from higher level memory, without passing any of that data to the later stages of the processor. When the first line address passes through the fetch stages again, the second line address follows in the normal order, and both pieces of the instruction are can be fetched from the cache and combined in the normal manner.
摘要:
A processor includes a conditional branch instruction prediction mechanism that generates weighted branch prediction values. For weakly weighted predictions, which tend to be less accurate than strongly weighted predictions, the power associating with speculatively filling and subsequently flushing the cache is saved by halting instruction prefetching. Instruction fetching continues when the branch condition is evaluated in the pipeline and the actual next address is known. Alternatively, prefetching may continue out of a cache. To avoid displacing good cache data with instructions prefetched based on a mispredicted branch, prefetching may be halted in response to a weakly weighted prediction in the event of a cache miss.
摘要:
A CAM bank is functionally divided into two or more sub-banks, without replicating CAM driver circuits, by disabling all match line discharge circuits in the bank, and selectively enabling the discharge circuits in entries comprising sub-banks. At least one selectively actuated switching circuit is interposed between the virtual ground node of each discharging comparator in the discharge circuit of a sub-bank and circuit ground. When the switching circuit is in a non-conductive state, the virtual ground node is maintained at a voltage level sufficiently above circuit ground to preclude discharging a connected match line within the CAM access time. When the switching circuit is placed in a conductive state, the virtual ground node is pulled to circuit ground and the connected match line may be discharged by a miscompare. Control signals, which may be decoded from address bits, are distributed to the switching circuits to define the CAM sub-banks.
摘要:
A processing system may include a memory configured to store data in a plurality of pages, a TLB, and a memory cache including a plurality of cache lines. Each page in the memory may include a plurality of lines of memory. The memory cache may permit, when a virtual address is presented to the cache, a matching cache line to be identified from the plurality of cache lines, the matching cache line having a matching address that matches the virtual address. The memory cache may be configured to permit one or more page attributes of a page located at the matching address to be retrieved from the memory cache and not from the TLB, by further storing in each one of the cache lines a page attribute of the line of data stored in the cache line.
摘要:
A pipelined processor comprises an instruction cache (iCache), a branch target address cache (BTAC), and processing stages, including a stage to fetch from the iCache and the BTAC. To compensate for the number of cycles needed to fetch a branch target address from the BTAC, the fetch from the BTAC leads the fetch of a branch instruction from the iCache by an amount related to the cycles needed to fetch from the BTAC. Disclosed examples either decrement a write address of the BTAC or increment a fetch address of the BTAC, by an amount essentially corresponding to one less than the cycles needed for a BTAC fetch.
摘要:
A processor includes a return stack circuit used for predicting procedure return addresses for instruction pre-fetching, wherein a return stack controller determines the number of return levels associated with a given return instruction, and pops that number of return addresses from the return stack. Popping multiple return addresses from the return stack permits the processor to pre-fetch the return address of the original calling procedure in a chain of successive procedure calls. In one embodiment, the return stack controller reads the number of return levels from a value embedded in the return instruction. A complementary compiler calculates the return level values for given return instructions and embeds those values in them at compile-time. In another embodiment, the return stack circuit dynamically tracks the number of return levels by counting the procedure calls (branches) in a chain of successive procedure calls.