Abstract:
A method for supporting speculative execution includes designating operations as speculative or non-speculative, then deferring exceptions generated by speculative operations while immediately reporting exceptions generated by non-speculative operations. If a speculative operation uses the result of a speculative operation that has generated an exception, the exception is propagated. Deferred exceptions are detected and reported using a check operation that is either incorporated into a non-speculative operation or inserted as a separate check operation. A system for supporting speculative execution includes a functional unit for recognizing a speculative operation and deferring any exceptions generated by such an operation. The functional unit may defer an exception by storing, in the register file, information indicating that an error has occurred. To check for deferred exceptions, the functional unit then reads the register file. If an exception is detected, the exception is processed and one or more of the speculative operations are re-executed (in a non-speculative mode) where necessary to process the exception.
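To make the defer-and-propagate scheme concrete, the following C sketch models it in software: a speculative operation that faults sets a deferred-exception flag alongside its result, consumers propagate the flag, and a non-speculative check reports it. The register layout, flag, and all function names are illustrative assumptions, not the patented hardware.

```c
/* Minimal software model of deferred speculative exceptions (hypothetical names;
 * a sketch of the idea in the abstract, not the patented implementation). */
#include <stdio.h>
#include <stdbool.h>

typedef struct {
    int  value;
    bool deferred;   /* set when a speculative op faulted; propagated to consumers */
} reg_t;

static reg_t regs[8];

/* Speculative load: on a fault, record a deferred exception instead of trapping. */
static void spec_load(int dst, const int *addr) {
    if (addr == NULL) {                 /* stand-in for any faulting condition */
        regs[dst].deferred = true;
        return;
    }
    regs[dst].value = *addr;
    regs[dst].deferred = false;
}

/* Speculative use: propagate a deferred exception from sources to destination. */
static void spec_add(int dst, int a, int b) {
    regs[dst].deferred = regs[a].deferred || regs[b].deferred;
    regs[dst].value = regs[a].value + regs[b].value;
}

/* Non-speculative check: report the exception only when the result is really needed. */
static bool check_and_use(int src) {
    if (regs[src].deferred) {
        fprintf(stderr, "deferred exception detected; re-execute non-speculatively\n");
        return false;
    }
    printf("result = %d\n", regs[src].value);
    return true;
}

int main(void) {
    int x = 41;
    spec_load(0, &x);        /* succeeds */
    spec_load(1, NULL);      /* faults; exception is deferred, not reported */
    spec_add(2, 0, 0);       /* no deferred bit involved */
    spec_add(3, 0, 1);       /* deferred bit propagates into register 3 */
    check_and_use(2);        /* prints 82 */
    check_and_use(3);        /* reports the deferred exception */
    return 0;
}
```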
Abstract:
A memory processor which prevents errors when the compiler advances long-latency load instructions in the instruction sequence to reduce the loss of efficiency resulting from the latency time. The memory processor intercepts all load and store instructions before they enter the memory pipeline. It holds each load instruction for a period of time sufficient to determine whether any subsequent store instruction that would have executed prior to the load instruction, had the load instruction not been moved, references the same address as that specified in the load instruction. If such a store instruction references the load instruction's address, the invention returns the same data that the load instruction would have returned had it not been moved by the compiler.
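The aliasing check can be sketched in software as follows; the single-entry tracking structure and every name in it are hypothetical, intended only to illustrate how a conflicting store supplies the data the hoisted load should have returned.

```c
/* Sketch of the store/load conflict check described above (hypothetical names):
 * an advanced load records its address, and a later store to the same address
 * corrects the value the load delivered. */
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uintptr_t addr;
    int       data;     /* value delivered to the register that the load targeted */
    bool      valid;
} advanced_load_t;

static advanced_load_t alt;   /* one tracked load, for simplicity */

static int advanced_load(const int *addr) {
    alt.addr = (uintptr_t)addr;
    alt.data = *addr;
    alt.valid = true;
    return alt.data;
}

static void intercepted_store(int *addr, int value) {
    *addr = value;
    /* If the store hits the tracked load's address, update the load's result so the
       program sees the value it would have seen without the compiler's reordering. */
    if (alt.valid && alt.addr == (uintptr_t)addr)
        alt.data = value;
}

int main(void) {
    int mem = 10;
    int early = advanced_load(&mem);  /* load hoisted above the store */
    intercepted_store(&mem, 99);      /* store to the same address */
    printf("hoisted load saw %d, corrected value is %d\n", early, alt.data);
    return 0;
}
```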
摘要:
An improved data processing system for executing branch instructions which has lower latency times and which only rarely requires the instruction pipeline to be flushed is disclosed. The data processing system utilizes a register file to hold the information needed to execute a branch instruction. The information is loaded into the register file in advance of the branch instruction. This allows the system to prepare more than one branch instruction at any given time. The present invention may be used to cause the cache line containing the target address of the branch instruction to be loaded soon as the target address is available for the branch instruction. Since the outcome of the branch instruction is almost always known when the branch instruction enters the instruction pipeline, the instruction pipeline only rarely needs to be flushed.
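A short C model of preparing branches ahead of time is sketched below; the branch-register fields and function names are assumptions used for illustration, not the patented circuitry.

```c
/* Sketch of preparing a branch in advance via a branch register file
 * (hypothetical structure; a software model of the idea, not the hardware). */
#include <stdio.h>
#include <stdbool.h>

typedef struct {
    unsigned target;     /* branch target address */
    bool     taken;      /* resolved outcome, written ahead of the branch */
    bool     prefetched; /* cache line at the target has been requested */
} branch_reg_t;

static branch_reg_t br[4];   /* several branches can be "in preparation" at once */

/* Load target and outcome into a branch register well before the branch executes;
   request the target's cache line as soon as the address is known. */
static void prepare_branch(int b, unsigned target, bool taken) {
    br[b].target = target;
    br[b].taken = taken;
    br[b].prefetched = true;   /* stand-in for issuing the instruction-cache fetch */
}

/* When the branch itself enters the pipeline, its outcome is already known,
   so the pipeline is redirected without a flush. */
static unsigned execute_branch(int b, unsigned fallthrough) {
    return br[b].taken ? br[b].target : fallthrough;
}

int main(void) {
    prepare_branch(0, 0x4000, true);
    printf("next pc = 0x%x\n", execute_branch(0, 0x1004));
    return 0;
}
```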
Abstract:
In a digital computer, a circular queue of registers in a register file is allocated as temporary local storage for procedures, rather than using the known caller/callee save convention, in order to minimize main-memory references. A called procedure dynamically allocates local registers as needed, without regard to registers used by the caller of the procedure or by any callee of the procedure, whereby register allocation is not restricted by any predetermined window size. Local registers, including parameter-passing registers, are allocated in the called procedure, rather than a priori at compile time, by adjusting register stack pointer values. Only the number of registers actually required by the procedure need be allocated. Optionally, rotating registers may be allocated among the local registers. Stack pointer values are stored in one of the parameter-passing registers when a procedure is called. Hardware register file access circuitry maps the virtual register numbers used by the procedures into the hardware register file. Upon return from a procedure, registers are deallocated by adjusting the register stack pointers to the values stored when the procedure was called.
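The following C sketch models the circular register stack in software under simplifying assumptions (a single stack pointer, a tiny register file, invented names); it shows how a call allocates fresh registers by moving the pointer and how a return deallocates them by restoring the saved value.

```c
/* Software model of the register-stack idea above (hypothetical names). */
#include <stdio.h>

#define NREGS 16                     /* physical circular register file */
static int phys[NREGS];
static int rsp = 0;                  /* register stack pointer (current frame base) */

/* Map a virtual register number used by the current procedure onto the circular file. */
static int *vreg(int n) { return &phys[(rsp + n) % NREGS]; }

/* Call: advance the stack pointer past the caller's frame so the callee gets fresh
   registers, and save the caller's frame base in a parameter-passing register. */
static void call(int caller_frame_size) {
    int saved = rsp;
    rsp = (rsp + caller_frame_size) % NREGS;
    *vreg(0) = saved;
}

/* Return: deallocate the callee's registers by restoring the saved stack pointer. */
static void ret(void) { rsp = *vreg(0); }

int main(void) {
    *vreg(1) = 7;                    /* a caller local */
    call(2);                         /* caller used registers 0..1 */
    *vreg(1) = 99;                   /* callee local lands in a different physical register */
    ret();
    printf("caller local is still %d\n", *vreg(1));   /* prints 7: no save/restore copies */
    return 0;
}
```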
Abstract:
The op-code bandwidth limitation of computer systems is alleviated by providing one or more vector buffers. Data is transferred between memory and processor registers in a two-part process using the vector buffers. In the first part, a vector request instruction initiates buffering by storing, in control registers, information identifying a set of data elements (a vector) in the memory. When the identifying information is loaded in the control registers, a vector prefetch controller transfers elements of the vector between the memory and a vector buffer. In the second part, vector element operation instructions transfer the next element of the vector between the vector buffer and a specified processor register for use in arithmetic or logic operations.
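A minimal sketch of the two-part transfer is given below in C; the control-register fields, buffer size, and function names are assumptions chosen to illustrate how one vector request lets many element operations run without re-encoding addresses.

```c
/* Sketch of the two-part vector transfer described above (hypothetical names). */
#include <stdio.h>

#define VBUF_LEN 8

typedef struct {
    const int *base;     /* control registers: base address, stride, length */
    int stride, length;
    int buf[VBUF_LEN];   /* vector buffer filled by the prefetch controller */
    int head;            /* next element to hand to the processor */
} vector_buffer_t;

/* Part 1: vector request instruction -- load the control registers and prefetch. */
static void vector_request(vector_buffer_t *vb, const int *base, int stride, int length) {
    vb->base = base; vb->stride = stride; vb->length = length; vb->head = 0;
    for (int i = 0; i < vb->length && i < VBUF_LEN; i++)
        vb->buf[i] = vb->base[i * vb->stride];
}

/* Part 2: vector element operation -- move the next element into a "register". */
static int vector_next(vector_buffer_t *vb) {
    return vb->buf[vb->head++];
}

int main(void) {
    int mem[16];
    for (int i = 0; i < 16; i++) mem[i] = i * i;

    vector_buffer_t vb;
    vector_request(&vb, mem, 2, 8);          /* elements mem[0], mem[2], mem[4], ... */

    int sum = 0;
    for (int i = 0; i < 8; i++)
        sum += vector_next(&vb);             /* element ops draw from the buffer */
    printf("sum of strided vector = %d\n", sum);
    return 0;
}
```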
Abstract:
To improve the operation of a circuit that prefetches data accessed by a processor, a prefetch unit incorporates a circuit for issuing a request to read out a group of data to be prefetched, and registers for holding the group of data read out in response to the read request. The group of data is read out from a cache memory or a main memory under the control of a cache request unit, and a plurality of groups of data can be prefetched. When data to be prefetched is designated, the processor requests the cache memory to read the block to which that data belongs. The prefetch unit also includes a circuit which invalidates prefetched data when the corresponding data is subsequently updated by the processor. Elements of a vector that is complex in structure, such as an indexed vector, can also be read out, and it is also possible to cope with an interrupt generated within the processor.
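The prefetch-and-invalidate behavior can be modeled in a few lines of C, shown below; the holding-register structure and all names are hypothetical, and the point is only that a processor store to a prefetched address marks the held copy invalid so stale data is never returned.

```c
/* Minimal model of the prefetch-unit behavior above (hypothetical structure). */
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

#define GROUP 4

typedef struct {
    uintptr_t addr[GROUP];
    int       data[GROUP];
    bool      valid[GROUP];
} prefetch_regs_t;

static prefetch_regs_t pf;

/* Issue a read request for one group of data and hold it in the prefetch registers. */
static void prefetch_group(const int *base) {
    for (int i = 0; i < GROUP; i++) {
        pf.addr[i] = (uintptr_t)&base[i];
        pf.data[i] = base[i];
        pf.valid[i] = true;
    }
}

/* Processor store: if the location was prefetched, invalidate the held copy. */
static void store(int *p, int v) {
    *p = v;
    for (int i = 0; i < GROUP; i++)
        if (pf.valid[i] && pf.addr[i] == (uintptr_t)p)
            pf.valid[i] = false;
}

/* Processor read: use the prefetched copy only while it is still valid. */
static int load(const int *p) {
    for (int i = 0; i < GROUP; i++)
        if (pf.valid[i] && pf.addr[i] == (uintptr_t)p)
            return pf.data[i];
    return *p;                        /* fall back to cache/memory */
}

int main(void) {
    int a[GROUP] = {1, 2, 3, 4};
    prefetch_group(a);
    store(&a[2], 30);                 /* update invalidates the prefetched element */
    printf("a[1]=%d (prefetched), a[2]=%d (from memory)\n", load(&a[1]), load(&a[2]));
    return 0;
}
```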
Abstract:
A programmable logic device (PLD) including a plurality of programmable tiles organized in blocks. Each block comprises a unique subset of the plurality of programmable tiles. A data bus extends to each of the blocks. An independent address circuit is provided within each block. A block select line is coupled to each block such that when the block select line is asserted, the address circuit of the selected block is capable of transferring data from the data bus to the plurality of programmable tiles, and when the block select line is deasserted, the address circuit is substantially electrically isolated from the data bus.
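The block-select addressing scheme is sketched below as a small C simulation; the block count, tile count, local address counter, and function names are illustrative assumptions showing how only the selected block accepts data from the shared bus.

```c
/* Sketch of the block-select addressing scheme above (hypothetical names). */
#include <stdio.h>
#include <stdbool.h>

#define BLOCKS 4
#define TILES_PER_BLOCK 8

typedef struct {
    bool select;                      /* block select line */
    int  addr;                        /* independent address circuit (local counter) */
    int  tiles[TILES_PER_BLOCK];      /* configuration storage of the tiles */
} block_t;

static block_t blocks[BLOCKS];

/* One configuration-bus cycle: every block sees the data, but only a selected
   block transfers it from the bus into its tiles; the rest remain isolated. */
static void bus_cycle(int data) {
    for (int b = 0; b < BLOCKS; b++) {
        if (!blocks[b].select)
            continue;                 /* deasserted: the bus is ignored */
        blocks[b].tiles[blocks[b].addr] = data;
        blocks[b].addr = (blocks[b].addr + 1) % TILES_PER_BLOCK;
    }
}

int main(void) {
    blocks[2].select = true;          /* configure only block 2 */
    bus_cycle(0xA5);
    bus_cycle(0x5A);
    printf("block2 tile0=0x%X tile1=0x%X, block0 tile0=0x%X\n",
           blocks[2].tiles[0], blocks[2].tiles[1], blocks[0].tiles[0]);
    return 0;
}
```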