摘要:
Techniques for controllably allocating a portion of a plurality of memory banks as cache memory are disclosed. To this end, a configuration tracker and a bank selector are employed. The configuration tracker configures whether each memory bank is to operate in a cache or not. The bank selector has a plurality of bank distributing functions. Upon receiving an incoming address, the bank selector determines the configuration of memory banks currently operating as the cache and applies an appropriate bank distributing function based on the configuration of memory banks. The applied bank distributing function utilizes bits in the incoming address to access one of the banks configured as being in the cache.
摘要:
Method and apparatus for reformatting instructions in a pipelined processor. An instruction register holds a plurality of instructions received from a cache memory external to the processor. A predecoder predecodes each of the instructions and determines from an instruction operation field where the instruction fields should be placed. A multiplexer reformats architecturally aligned instructions into hardware implementation aligned instructions prior to storing into L1 cache, so that the instructions are ready for dispatch to the pipeline execution units.
摘要:
A method and system for avoiding various hazards for instructions which are propagating through a microprocessor pipeline. When a plurality of instructions exist within the pipeline which read and write the same value, a vector is established to distinguish the older from the newer instructions. Further, before instructions are dispatched for execution, pointers are generated which identify the particular instruction which had the operand or parameter value needed. Accordingly, by monitoring both the recent vector and pointers, dated dependency hazards can be avoided.
摘要:
In response to a property of a conditional branch instruction associated with a loop, such as a property indicating that the branch is a loop-ending branch, a count of the number of iterations of the loop is maintained, and a multi-bit value indicative of the loop iteration count is stored in a Branch History Register (BHR). In one embodiment, the multi-bit value may comprise the actual loop count, in which case the number of bits is variable. In another embodiment, the number of bits is fixed (e.g., two) and loop iteration counts are mapped to one of a fixed number of multi-bit values (e.g., four) by comparison to thresholds. Separate iteration counts may be maintained for nested loops, and a multi-bit value stored in the BHR may indicate a loop iteration count of only an inner loop, only the outer loop, or both.
摘要:
A method and apparatus are provided for determining power events on an I/C chip for undissipated power to the chip, and wherein the chip includes a plurality of separately regulatable power consumers. A structure is provided for monitoring the occurrence of each power event to each power consumer, and determining the dissipation of power from each power event, and controlling power used by the chip responsive to the amount of undissipated power.
摘要:
A method, an apparatus, and a computer program are provided for a retry cancellation mechanism to enhance system performance when a cache is missed or during direct memory access in a multi-processor system. In a multi-processor system with a number of independent nodes, the nodes must be able to request data that resides in memory locations on other nodes. The nodes search their memory caches for the requested data and provide a reply. The dedicated node arbitrates these replies and informs the nodes how to proceed. This invention enhances system performance by enabling the transfer of the requested data if an intervention reply is received by the dedicated node, while ignoring any retry replies. An intervention reply signifies that the modified data is within the node's memory cache and therefore, any retries by other nodes can be ignored.
摘要:
A method, computer system and set of signals are disclosed allowing for communication of a data transfer, via a bus, between a master and a slave using a single transfer request regardless of transfer size and alignment. The invention provides three transfer qualifier signals including: a first signal including a starting byte address of the data transfer; a second signal including a size of the data transfer in data beats; and a third signal including a byte enable for each byte required during a last data beat of the data transfer. The invention is usable with single or multiple beat, aligned or unaligned data transfers. Usage of the three transfer qualifier signals provides the slave with how many data beats it will transfer at the start of the transfer, and the alignment of both the starting and ending data beats. As a result, the slave need not calculate the number of bytes it will transfer. In terms of multiple beat transfers, the number of data transfer requests are reduced, which reduces the amount of switching, bus arbitration and power consumption required. In addition, the invention allows byte enable signals to be used for subsequent data transfer requests prior to the completion of the initial data transfer, which reduces power consumption and allows for pipelining of data transfer requests.
摘要:
A method and apparatus for executing instructions in a pipeline processor. The method decreases the latency between an instruction cache and a pipeline processor when bubbles occur in the processing stream due to an execution of a branch correction, or when an interrupt changes the sequence of an instruction stream. The latency is reduced when a decode stage for detecting branch prediction and a related instruction queue location have invalid data representing a bubble in the processing stream. Instructions for execution are inserted in parallel into the decode stage and instruction queue, thereby reducing by one cycle time the length of the pipeline stage.
摘要:
Embodiments include systems and methods for selectively inclusive multi-level cache. When data for which memory coherency is designated is received from a process and stored into a lower level cache the data is copied into a higher level of cache. When the data is snooped it is snooped from the higher level cache and not the lower level of cache. When data is invalidated in the higher level cache, the data is invalidated in the lower level cache also. Lines of higher level cache are inclusive of lower level cache lines for data for which memory coherency is designated, but need not be inclusive of data for which coherency is not designated.
摘要:
A method and system for monitoring the real-time of software running on a microprocessor system. Debug hardware is used to select a range of instructions or events to be monitored by a performance monitor interval with the microprocessor system. A comparison is made between each event and start and stop events are identified in the debug hardware. The performance monitor is enabled by the debug hardware, when events occur within the range defined by the debug hardware. Use of the debug hardware for enabling performance monitoring avoids any overhead associated with generating interrupts, or additional code in the application program.