Abstract:
A CAM/SRAM structure (44) performs address translations that are compatible with a segmentation/paging addressing scheme yet require only a single look-up step. Each entry in the effective-to-real-address-translator (ERAT) has two CAM fields (ESID, EPI) that independently compare an input segment identifier and an input page identifier to a stored segment identifier and a stored page identifier, respectively. The ERAT outputs a stored real address field (DATA) associated with a stored segment and stored page pair if both comparisons indicate a match. The ERAT can invalidate stored translations on the basis of segment or page granularity by requiring either a segment or a page CAM field match, respectively, during an invalidate operation.
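The single-step lookup and the two invalidation granularities described above can be modelled in software. The following is a minimal sketch, not the disclosed circuit: the class names `Erat` and `EratEntry`, the per-entry `valid` flag, and the sequential loop (standing in for the parallel CAM compare) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EratEntry:
    esid: int          # stored segment identifier (first CAM field)
    epi: int           # stored page identifier (second CAM field)
    data: int          # stored real address field (DATA)
    valid: bool = True # assumed per-entry valid flag


class Erat:
    """Hypothetical software model of the two-field CAM lookup."""

    def __init__(self) -> None:
        self.entries: List[EratEntry] = []

    def store(self, esid: int, epi: int, data: int) -> None:
        self.entries.append(EratEntry(esid, epi, data))

    def translate(self, esid: int, epi: int) -> Optional[int]:
        # A hit requires BOTH CAM fields to match in the same entry.
        for e in self.entries:
            if e.valid and e.esid == esid and e.epi == epi:
                return e.data
        return None  # miss

    def invalidate_segment(self, esid: int) -> None:
        # Segment granularity: only the segment field has to match.
        for e in self.entries:
            if e.esid == esid:
                e.valid = False

    def invalidate_page(self, epi: int) -> None:
        # Page granularity: only the page field has to match.
        for e in self.entries:
            if e.epi == epi:
                e.valid = False
```

In this model a segment invalidate removes every page of that segment in one operation, while a page invalidate touches only entries whose page field matches.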
Abstract:
An address translator (126) translates addresses, acting like a register file or a table, as necessary. The address translator contains a number of entries for matching an input address to a stored tag. An entry outputs a stored translated address if its stored tag matches the input address. A decoder (138) selects a particular entry in which to store an input translated address when the address translator operates as a register file. In these cases, a register number is also stored in the particular entry as the entry's tag. Later, when it is necessary to read the particular entry, the register number is compared to each entry's tag to find a match. The disclosed address translator is compatible with both hardware and software refill algorithms ("tablewalks") without impacting its critical read speed path.
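The dual table/register-file behaviour can be sketched as follows. This is an illustrative model only: the class name `TagMatchTranslator`, its method names, and the choice to let the decoder select the entry whose index equals the register number are assumptions, not details taken from the abstract.

```python
from typing import List, Optional


class TagMatchTranslator:
    """Hypothetical model: every read goes through the same tag-match path."""

    def __init__(self, num_entries: int) -> None:
        self.tags: List[Optional[int]] = [None] * num_entries
        self.data: List[Optional[int]] = [None] * num_entries

    def write_entry(self, index: int, tag: int, translated: int) -> None:
        # Table mode: a refill algorithm picks the entry and supplies the tag.
        self.tags[index] = tag
        self.data[index] = translated

    def write_register(self, reg_num: int, translated: int) -> None:
        # Register-file mode: the decoder selects the entry (assumed here to
        # be the entry at index reg_num) and the register number is stored
        # as that entry's tag.
        self.write_entry(reg_num, reg_num, translated)

    def lookup(self, tag: int) -> Optional[int]:
        # Reads never use the decoder: the input is compared to every tag.
        for stored_tag, value in zip(self.tags, self.data):
            if stored_tag is not None and stored_tag == tag:
                return value
        return None
```

Because `lookup` is the only read path in this sketch, adding the register-file write mode leaves the read logic unchanged, which mirrors the abstract's point about the critical read speed path.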
Abstract:
A method for low latency access to the control space. A pipeline processor executes instructions in multiple stages including a decode stage, one or more execution stages, and a writeback stage. A control space access instruction includes a first field containing a control register specifier and a second field containing a general purpose register specifier. The decode stage is configured to decode the first and second fields and place the decoded contents on a global operand bus. The specified control register is addressed from the global operand bus while the access instruction is in decode. In the case of a read instruction, the addressed control register places its contents on the global operand bus while the instruction remains in decode. In the case of a write instruction, the general purpose register is addressed during the execution stage and its contents placed on the global operand bus during the writeback stage such that the contents of the addressed general purpose register are moved to the addressed control register during the writeback stage.
Abstract:
An address translator (42) with a by-pass circuit (106) translates a received effective address into a real address in a first mode of operation by matching a portion of the effective address and a stored translation tag. The address translator outputs a real address corresponding to the matching translation tag on a plurality of bit lines (BIT LINE). The by-pass circuit connects the input effective address to the bit lines in a second mode of operation. The address translator thereby eliminates the need for a subsequent two-to-one multiplexer.
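The two operating modes can be illustrated with a short behavioural sketch. The function name `translate`, the dictionary standing in for the stored translation tags, and the 4 KB page size (`PAGE_SHIFT = 12`) are assumptions made for the example; the abstract does not specify them.

```python
from typing import Dict, Optional

PAGE_SHIFT = 12                      # assumption: 4 KB pages
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def translate(effective_address: int,
              tags: Dict[int, int],
              bypass: bool) -> Optional[int]:
    """Return the address driven out of the translator, or None on a miss."""
    page = effective_address >> PAGE_SHIFT
    offset = effective_address & OFFSET_MASK
    if bypass:
        # Second mode: the input effective address is connected straight to
        # the bit lines, so no separate output multiplexer is modelled.
        bit_lines = page
    else:
        # First mode: the matching tag drives its real page onto the bit lines.
        if page not in tags:
            return None
        bit_lines = tags[page]
    return (bit_lines << PAGE_SHIFT) | offset
```

For example, with `tags = {0x12: 0x7F}`, the call `translate(0x12345, tags, False)` yields `0x7F345`, while the same call with `bypass=True` returns the input `0x12345` unchanged.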
Abstract:
A method for forwarding data within a pipeline of a pipelined data processor having a plurality of execution pipeline stages where each stage accepts a plurality of operand inputs and generates a result. The result generated by each execution pipeline stage is selectively coupled to an operand input of one of the execution pipeline stages.
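One way to picture the selective coupling of stage results to operand inputs is the operand-read sketch below. It is a simplified model under stated assumptions: the function name `read_operand`, the `(dest_reg, value)` tuple layout, and the rule that the youngest matching stage wins are illustrative choices and are not stated in the abstract.

```python
from typing import Dict, List, Optional, Tuple

StageResult = Optional[Tuple[int, int]]  # (destination register, value)


def read_operand(reg: int,
                 regfile: Dict[int, int],
                 stage_results: List[StageResult]) -> int:
    """Pick an operand, preferring results still in flight in the pipeline.

    stage_results is ordered youngest stage first (an assumption), so the
    most recently produced value for a register takes priority.
    """
    for result in stage_results:
        if result is not None and result[0] == reg:
            return result[1]       # forwarded from an execution stage
    return regfile[reg]            # no stage holds it: use the register file
```

With `regfile = {1: 10, 2: 20}` and `stage_results = [(1, 99), None, (1, 50)]`, reading register 1 returns the forwarded value 99, and reading register 2 falls back to the register file value 20.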
Abstract:
A fully associative address translator which includes a number of entries, each of said number of entries translating a received effective address into a real address, each received effective address including a segment identifier and a page identifier. Each of the entries within the fully associative address translator includes a first translation from an effective address segment identifier into a virtual address segment identifier and a second translation from a virtual address page identifier to a real address page identifier. A first valid bit cell is provided for storing a validity bit which indicates the validity of the first translation from the effective address segment identifier to the virtual address segment identifier and a second valid bit cell is also provided for storing a validity bit indicating the validity of the second translation from the virtual address page identifier to the real address page identifier wherein a process context switch will invalidate only a portion of each of the entries, thereby reducing the miss penalty associated with a context switch.
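The effect of keeping two validity bits per entry can be sketched as below. Several details are assumptions made only for illustration: that a context switch clears the segment-level bit (the abstract says only that "a portion" of each entry is invalidated), the `revalidate_segment` refill step, and all class and field names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TwoLevelEntry:
    esid: int                # effective segment identifier
    vsid: int                # virtual segment identifier
    vpi: int                 # virtual page identifier
    rpi: int                 # real page identifier
    seg_valid: bool = True   # validity of the ESID -> VSID translation
    page_valid: bool = True  # validity of the VPI -> RPI translation


class TwoLevelTranslator:
    def __init__(self, entries: List[TwoLevelEntry]) -> None:
        self.entries = entries

    def lookup(self, esid: int, page_id: int) -> Optional[int]:
        # A hit needs both halves of the entry to be valid and to match.
        for e in self.entries:
            if (e.seg_valid and e.page_valid
                    and e.esid == esid and e.vpi == page_id):
                return e.rpi
        return None

    def context_switch(self) -> None:
        # Assumed behaviour: only the segment-level half is invalidated.
        for e in self.entries:
            e.seg_valid = False

    def revalidate_segment(self, esid: int, vsid: int) -> None:
        # Hypothetical cheap refill: entries whose page-level half survived
        # only need their segment-level half rewritten.
        for e in self.entries:
            if e.page_valid and e.vsid == vsid:
                e.esid = esid
                e.seg_valid = True
```

In this model the page-level translations survive a context switch, so a later miss can be repaired by rewriting only the segment half of an entry rather than refilling the whole entry.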
Abstract:
In a computer system having a central processing unit (CPU) execution pipeline and a floating point unit (FPU) execution pipeline, the CPU execution pipeline including a CPU decoder pipestage and the FPU execution pipeline including an FPU decoder pipestage, a method including the steps of (a) sending a first instruction to the CPU decoder pipestage, (b) sending the first instruction to the FPU decoder pipestage, (c) generating a signal indicating that the first instruction has been accepted by the CPU decoder pipestage, (d) generating a signal indicating that the first instruction has been accepted by the FPU decoder pipestage, (e) sending a second instruction to the CPU decoder pipestage in response to step (d), and (f) sending a second instruction to the FPU decoder pipestage in response to step (c). A corresponding apparatus is also provided.
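The cross-coupled handshake in steps (c) through (f) can be explored with a small timing model. This is a sketch under assumptions: the function name `lockstep_issue_times`, the per-instruction acceptance latencies, and the rule that a decoder must also have accepted its own current instruction before receiving the next one are not taken from the abstract.

```python
from typing import List, Tuple


def lockstep_issue_times(cpu_latency: List[int],
                         fpu_latency: List[int]) -> Tuple[List[int], List[int]]:
    """Return the cycle at which each instruction is sent to each decoder.

    cpu_latency[i] / fpu_latency[i] are the (assumed) cycles each decoder
    needs before it signals that instruction i has been accepted.
    """
    if not cpu_latency:
        return [], []
    send_cpu, send_fpu = [0], [0]
    for i in range(len(cpu_latency) - 1):
        cpu_accepted = send_cpu[i] + cpu_latency[i]   # step (c)
        fpu_accepted = send_fpu[i] + fpu_latency[i]   # step (d)
        # Step (e): the CPU decoder gets the next instruction in response to
        # the FPU's accept signal (and once it is free itself, an assumption).
        send_cpu.append(max(fpu_accepted, cpu_accepted))
        # Step (f): the FPU decoder waits on the CPU's accept signal.
        send_fpu.append(max(cpu_accepted, fpu_accepted))
    return send_cpu, send_fpu
```

Under these assumptions both decoders always receive a given instruction in the same cycle, so the slower decoder sets the pace for both pipelines.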
Abstract:
A processor element, structured to execute a 32-bit fixed length instruction set architecture, is backward compatible for executing a 16-bit fixed length instruction set architecture by translating each of the 16-bit instructions into a sequence of one or more 32-bit instructions. The 32-bit instruction set architecture includes “prepare to branch” instructions that allow target addresses for branch instructions to be set up in advance of the branch. The 32-bit prepare to branch and branch instructions are combined to execute a 16-bit branch instruction coupled with a 16-bit Delay Slot instruction.
Abstract:
In a computer system having a central processing unit (CPU) execution pipeline and a floating point unit (FPU) execution pipeline, the CPU pipeline including a plurality of pipestages and the FPU pipeline including a plurality of pipestages, wherein each CPU pipestage in the CPU pipeline has a corresponding pipestage in the FPU pipeline, a method of synchronizing operation of the CPU pipeline and the FPU pipeline, the method including the steps of (a) receiving an instruction in a first CPU pipestage, (b) receiving the instruction in a corresponding first FPU pipestage, (c) processing the instruction in the first CPU pipestage, (d) processing the instruction in the first FPU pipestage, (e) generating, by the first CPU pipestage, a first signal indicating that the instruction has been processed by the first CPU pipestage and is ready to proceed to a second pipestage in the CPU pipeline, (f) generating, by the first FPU pipestage, a second signal indicating that the instruction has been processed by the first FPU pipestage and is ready to proceed to a second pipestage in the FPU pipeline, (g) sending the instruction from the first CPU pipestage to the second pipestage in the CPU pipeline, (h) sending the instruction from the first FPU pipestage to the second pipestage in the FPU pipeline, (i) wherein the second pipestage in the CPU pipeline responds to the second signal to send the instruction to a third pipestage in the CPU pipeline, and (j) wherein the second pipestage in the FPU pipeline responds to the first signal to send the instruction to a third pipestage in the FPU pipeline. A corresponding apparatus is also provided.
Abstract:
According to the present invention, techniques for setting selected operand fields in pipelined architectures are provided. Methods and systems for efficiently selecting operand fields according to the present invention can be operative on a variety of computer architectures, including RISC architectures.