Abstract:
A memory controller provides programmable flexibility, via one or more configuration registers, for the configuration of the memory. The memory may be optimized for a given application by programming the configuration registers. For example, in one embodiment, the portion of the address of a memory transaction used to select a storage location for access in response to the memory transaction may be programmable. In an implementation designed for DRAM, a first portion may be programmably selected to form the row address and a second portion may be programmably selected to form the column address. Additional embodiments may further include programmable selection of the portion of the address used to select a bank. Still further, interleave modes among memory sections assigned to different chip selects and among two or more channels to memory may be programmable, in some implementations. Furthermore, the portion of the address used to select between interleaved memory sections or interleaved channels may be programmable. One particular implementation may include all of the above programmable features, which may provide a high degree of flexibility in optimizing the memory system.
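The programmable row, column, and bank selection described above can be pictured with a minimal sketch. It assumes each configuration register simply names a contiguous bit field (a shift and a width) of the transaction address; the names `FieldSelect`, `AddressMapConfig`, and `decode`, and the two example programmings, are hypothetical and not taken from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSelect:
    """Hypothetical configuration register: which address bits feed one field."""
    shift: int  # bit position of the field's least significant bit
    width: int  # number of address bits in the field

    def extract(self, address: int) -> int:
        # Shift the field down to bit 0, then mask off everything above it.
        return (address >> self.shift) & ((1 << self.width) - 1)


@dataclass(frozen=True)
class AddressMapConfig:
    """Assumed register set: one field selector each for row, bank, and column."""
    row: FieldSelect
    bank: FieldSelect
    column: FieldSelect


def decode(address: int, cfg: AddressMapConfig) -> dict:
    """Split a transaction address according to the programmed selectors."""
    return {
        "row": cfg.row.extract(address),
        "bank": cfg.bank.extract(address),
        "column": cfg.column.extract(address),
    }


# Two illustrative programmings of the same registers (bit positions are made up).
ROW_BANK_COL = AddressMapConfig(
    row=FieldSelect(shift=15, width=13),
    bank=FieldSelect(shift=13, width=2),
    column=FieldSelect(shift=3, width=10),
)
BANK_ROW_COL = AddressMapConfig(
    row=FieldSelect(shift=13, width=13),
    bank=FieldSelect(shift=26, width=2),
    column=FieldSelect(shift=3, width=10),
)
```

Decoding the same address under `ROW_BANK_COL` and `BANK_ROW_COL` yields different row and bank values, which is the kind of per-application tuning the configuration registers are meant to allow.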
Abstract:
A processor supports an operating mode in which the default address size is greater than 32 bits and the default operand size is 32 bits. The default address size may be nominally indicated as 64 bits, although various embodiments of the processor may implement any address size which exceeds 32 bits, up to and including 64 bits, in the operating mode. The operating mode may be established by placing an enable indication in a control register into an enabled state and by setting a first operating mode indication and a second operating mode indication in a segment descriptor to predefined states. Additionally, a first instruction prefix may be coded into an instruction to override the default operand size to a first non-default operand size (e.g. 64 bits). Furthermore, a second instruction prefix may be coded into an instruction in addition to the first instruction prefix to override the default operand size to a second non-default operand size (e.g. 16 bits). Thus operand sizes of 64, 32, and 16 bits may be used when desired.
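The operand-size selection above can be summarized in a small sketch. It models only which override is in effect (none, the first prefix, or the second prefix) and uses the example sizes from the abstract; how an instruction carrying both prefixes is resolved is not modeled here. The names `SizeOverride` and `effective_operand_bits` are hypothetical.

```python
from enum import Enum
from typing import Optional

DEFAULT_OPERAND_BITS = 32  # default operand size in the operating mode


class SizeOverride(Enum):
    """Hypothetical labels for the two overrides, valued with the example sizes."""
    FIRST_PREFIX = 64   # first non-default operand size (e.g. 64 bits)
    SECOND_PREFIX = 16  # second non-default operand size (e.g. 16 bits)


def effective_operand_bits(override: Optional[SizeOverride] = None) -> int:
    """Return the operand size for an instruction, given the override in effect."""
    if override is None:
        return DEFAULT_OPERAND_BITS
    return override.value
```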
Abstract:
A line predictor caches alignment information for instructions. In response to each fetch address, the line predictor provides alignment information for the instruction beginning at the fetch address, as well as one or more additional instructions subsequent to that instruction. The alignment information may be, for example, instruction pointers, each of which directly locates a corresponding instruction within a plurality of instruction bytes fetched in response to the fetch address. The line predictor may include a memory having multiple entries, each entry storing up to a predefined maximum number of instruction pointers and a fetch address corresponding to the instruction identified by a first one of the instruction pointers. Furthermore, each entry may store additional information regarding the terminating instruction within the entry. In one embodiment, the additional information includes an indication of the branch displacement when the terminating instruction is a branch instruction. In another embodiment, the additional information includes the entry point for a microcode instruction when the terminating instruction is a microcode instruction. Furthermore, the microcode instruction may be identified by an instruction pointer corresponding to a particular decode unit which is coupled to the microcode unit.
Abstract:
A processor includes an instruction cache and a predecode cache which is not actively maintained coherent with the instruction cache. The processor fetches instruction bytes from the instruction cache and predecode information from the predecode cache. Instructions are provided to a plurality of decode units based on the predecode information, and the decode units decode the instructions and verify that the predecode information corresponds to the instructions. More particularly, each decode unit may verify that a valid instruction was decoded, and that the instruction succeeds a preceding instruction decoded by another decode unit. Additionally, other units involved in the instruction processing pipeline stages prior to decode may verify portions of the predecode information. If the predecode information does not correspond to the fetched instructions, the predecode information may be corrected (either by predecoding the instruction bytes or by updating the predecode information, if the update may be determined without predecoding the instruction bytes). In one particular embodiment, the predecode cache may be a line predictor which stores instruction pointers indexed by a portion of the fetch address. The line predictor may thus experience address aliasing, and predecode information may therefore not correspond to the instruction bytes. However, power may be conserved by not storing and comparing the entire fetch address.
Abstract:
A circuit and method are disclosed for preserving the order of memory requests originating from I/O devices coupled to a multiprocessor computer system. The multiprocessor computer system includes a plurality of circuit nodes and a plurality of memories. Each circuit node includes at least one microprocessor coupled to a memory controller which in turn is coupled to one of the plurality of memories. The circuit nodes are in data communication with each other, each circuit node being uniquely identified by a node number. At least one of the circuit nodes is coupled to an I/O bridge which in turn is coupled directly or indirectly to one or more I/O devices. The I/O bridge generates non-coherent memory access transactions in response to memory access requests originating with one of the I/O devices. The circuit node coupled to the I/O bridge receives the non-coherent memory access transactions. For example, the circuit node coupled to the I/O bridge receives first and second non-coherent memory access transactions. The first and second non-coherent memory access transactions include first and second memory addresses, respectively. The first and second non-coherent memory access transactions further include first and second pipe identifications, respectively. The circuit node maps the first and second memory addresses to first and second node numbers, respectively. The first and second pipe identifications are compared. If the first and second pipe identifications compare equally, then the first and second node numbers are compared. First and second coherent memory access transactions are generated by the node coupled to the I/O bridge, wherein the first and second coherent memory access transactions correspond to the first and second non-coherent memory access transactions, respectively. The first coherent memory access transaction is transmitted to one of the nodes of the multiprocessor computer system. However, the second coherent memory access transaction is not transmitted unless the first and second pipe identifications do not compare equally or the first and second node numbers compare equally.
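The transmit condition for the second transaction reduces to a two-step comparison, sketched below. The names `NonCoherentTxn` and `may_transmit_second` and the caller-supplied `address_to_node` mapping are hypothetical; the abstract does not say how addresses map to node numbers or how long a held transaction waits.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class NonCoherentTxn:
    """Hypothetical view of a non-coherent transaction from the I/O bridge."""
    address: int
    pipe_id: int


def may_transmit_second(
    first: NonCoherentTxn,
    second: NonCoherentTxn,
    address_to_node: Callable[[int], int],
) -> bool:
    """Decide whether the second coherent transaction may be transmitted.

    It may go out if the pipe identifications differ, or if both addresses
    map to the same node number; otherwise it is held back.
    """
    if first.pipe_id != second.pipe_id:
        return True  # different pipes: node numbers need not be compared
    return address_to_node(first.address) == address_to_node(second.address)
```

As one possible mapping for experimentation, `lambda addr: addr >> 30` would assign each 1 GiB region of the address space to its own node number; that layout is an assumption made only for illustration.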
Abstract:
A technique handles load instructions within a data processor that includes a cache circuit having a data cache and a tag memory indicating valid entries within the data cache. The technique involves writing data to the data cache during a series of four processor cycles in response to a first load instruction. Additionally, the technique involves updating the tag memory and preventing reading of the tag memory in response to the first load instruction during a first processor cycle in the series of processor cycles. Furthermore, the technique involves reading tag information from the tag memory during a processor cycle of the series of four processor cycles following the first processor cycle in response to a second load instruction.
Abstract:
A computing apparatus connectable to a cache and a memory includes a system port configured to receive an atomic probe command or a system data control response command having an address part identifying data stored in the cache which is associated with data stored in the memory and a next coherence state part indicating a next state of the data in the cache. The computing apparatus further includes an execution unit configured to execute the command to change the state of the data stored in the cache according to the next coherence state part of the command.
Abstract:
An arrangement and method for decoding coded instructions and playing and replaying decoded instructions to a machine. The arrangement has a source of coded instructions. Connected to this coded instruction source is a decoder for receiving and decoding the coded instructions and for outputting the decoded instructions to a machine. A silo is connected to the output of the decoder and silos and outputs the decoded instructions to the machine. The outputting of the decoded instructions to the machine is switched between the silo and the decoder, so that the machine receives the siloed decoded instructions. By siloing and then replaying already decoded instructions at the time of a trap occurrence, a speed increase is achieved, since the instructions which are in the trap shadow do not have to be decoded again.
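The silo-and-replay idea can be sketched as a bounded queue sitting beside the decoder. This is a minimal sketch under stated assumptions: the silo depth is taken to equal the length of the trap shadow, and the class name `DecodeSilo`, its methods, and the `decode` callable are hypothetical.

```python
from collections import deque
from typing import Callable, Iterator


class DecodeSilo:
    """Feeds a machine from either the decoder or a silo of decoded instructions."""

    def __init__(self, decode: Callable[[str], str], depth: int) -> None:
        self._decode = decode
        # Assumption: the silo keeps the most recent `depth` decoded instructions.
        self._silo = deque(maxlen=depth)
        self._replay = deque()  # instructions still to be replayed after a trap

    def next_for_machine(self, coded_source: Iterator[str]) -> str:
        """Return the next decoded instruction, preferring pending replays."""
        if self._replay:
            return self._replay.popleft()  # output switched to the silo
        decoded = self._decode(next(coded_source))  # output taken from the decoder
        self._silo.append(decoded)
        return decoded

    def trap(self) -> None:
        """On a trap, queue the siloed instructions for replay without re-decoding."""
        self._replay = deque(self._silo)
```

After `trap()` is called, the next calls to `next_for_machine` drain the siloed instructions before the decoder is consulted again, so the `decode` callable is not invoked a second time for them.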
Abstract:
A system and method for efficiently reducing the power consumption of register file accesses. A processor is operable to execute instructions with two or more data types, each with an associated size and alignment. Data operands for a first data type use operand sizes equal to an entire width of a physical register within a physical register file. Data operands for a second data type use operand sizes less than an entire width of a physical register. Accesses of the physical register file for operands associated with a non-full-width data type do not access a full width of the physical registers. A given numerical value may be bypassed for the portion of the physical register that is not accessed.
Abstract:
A system and method for efficiently reducing the latency of initializing registers. A register rename unit within a processor determines whether it is known, prior to an execution pipeline stage, that a given decoded instruction writes a particular numerical value to its destination operand. An example is a move immediate instruction that writes a value of 0 to its destination operand. Other examples may also qualify. If the determination is made, a given physical register identifier is assigned to the destination operand, wherein the given physical register identifier is associated with the particular numerical value but is not associated with an actual physical register in a physical register file. The given instruction is marked to prevent it from proceeding to an execution pipeline stage. When the given physical register identifier is used to read the physical register file, no actual physical register is accessed.
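The rename-time handling of a known value can be illustrated with a minimal sketch. It assumes a single reserved identifier for the value 0 and a `movi` opcode for move immediate; `ZERO_PREG`, `Instruction`, `RenameUnit`, and `RegisterFile` are hypothetical names, and freeing of previously mapped registers is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

ZERO_PREG = -1  # hypothetical identifier tied to the value 0, backed by no register


@dataclass(frozen=True)
class Instruction:
    opcode: str
    dest: str
    immediate: Optional[int] = None


class RenameUnit:
    def __init__(self, num_physical: int) -> None:
        self.free = list(range(num_physical))  # identifiers of real registers
        self.mapping = {}                      # architectural name -> identifier

    def rename(self, inst: Instruction) -> Tuple[int, bool]:
        """Return (destination identifier, whether the instruction must execute)."""
        if inst.opcode == "movi" and inst.immediate == 0:
            # The written value is known before execution: map the destination
            # to the reserved identifier and keep the instruction out of execute.
            self.mapping[inst.dest] = ZERO_PREG
            return ZERO_PREG, False
        preg = self.free.pop(0)
        self.mapping[inst.dest] = preg
        return preg, True


class RegisterFile:
    def __init__(self, num_physical: int) -> None:
        self.regs = [0] * num_physical
        self.accesses = 0  # counts reads that touch a real physical register

    def read(self, preg: int) -> int:
        if preg == ZERO_PREG:
            return 0  # value supplied without accessing any physical register
        self.accesses += 1
        return self.regs[preg]
```

Reading through `ZERO_PREG` leaves `accesses` unchanged, mirroring the statement that no actual physical register is accessed for the reserved identifier.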