Abstract:
One aspect of the invention relates to a method for operating a processor. In one version of the invention, the method includes the steps of dispatching an instruction; determining a presently architected RMAP entry for the architectural register targeted by the dispatched instruction; selecting the RMAP entries which are associated with physical registers that contain operands for the dispatched instruction; updating a use indicator in the selected RMAP entries; determining whether the dispatched instruction is interruptible; and updating an architectural indicator and a historical indicator in the presently architected RMAP entry if the dispatched instruction is uninterruptible.
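The dispatch-time steps listed above can be sketched in a few lines. This is only an illustrative model, not the claimed implementation: the `RmapEntry` fields, the `dispatch` function, and the use of a counter for the use indicator are all hypothetical. The abstract does not say which values the indicators take, so the sketch assumes that the architectural indicator is cleared and the historical indicator is set.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class RmapEntry:
    arch_reg: int               # architectural register mapped by this entry
    architected: bool = False   # architectural indicator (assumed to be one bit)
    historical: bool = False    # historical indicator (assumed to be one bit)
    use_count: int = 0          # use indicator, modelled here as a counter


def dispatch(rmap: List[RmapEntry], target_reg: int,
             source_regs: Iterable[int], interruptible: bool) -> Optional[RmapEntry]:
    """Apply the abstract's steps to one dispatched instruction."""
    sources = set(source_regs)
    # Determine the presently architected entry for the target register.
    current = next(
        (e for e in rmap if e.architected and e.arch_reg == target_reg), None
    )
    # Select the entries whose physical registers hold the operands,
    # and update their use indicator.
    for entry in rmap:
        if entry.architected and entry.arch_reg in sources:
            entry.use_count += 1
    # Assumed policy: for an uninterruptible instruction, clear the
    # architectural indicator and set the historical indicator.
    if current is not None and not interruptible:
        current.architected = False
        current.historical = True
    return current
```

With three architected entries for registers 1, 2, and 3, dispatching an uninterruptible instruction that targets register 3 and reads registers 1 and 2 bumps the use counters of the first two entries and retires the third.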
Abstract:
Apparatuses and methods of an executable-in-place solid-state device are disclosed. In one embodiment, a solid-state device includes a flash memory coupled to a dynamic random access memory, the dynamic random access memory to store at least as much data as the flash memory; and a logic circuit coupled to the flash memory and the dynamic random access memory to copy data from the flash memory to the dynamic random access memory on power up of a data processing system coupled to the solid-state device. The logic circuit is to minimize writes to the flash memory by using the dynamic random access memory as a working memory during operation of the data processing system, and/or to block at least some sectors of at least one of the flash memory and the dynamic random access memory when the data processing system uses the working memory, to conserve power usage of the solid-state device.
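The copy-on-power-up and working-memory behavior can be modelled roughly as follows. The class `XipDevice` and its methods are hypothetical names chosen for illustration. The `flush` step that writes changes back to flash is an assumption that the abstract does not describe, and sector blocking is left out.

```python
class XipDevice:
    """Toy model: DRAM mirrors flash and absorbs all runtime reads and writes."""

    def __init__(self, flash_image: bytes):
        self.flash = bytearray(flash_image)
        # The DRAM stores at least as much data as the flash.
        self.dram = bytearray(len(self.flash))
        self.flash_writes = 0  # counts write-backs to flash

    def power_up(self) -> None:
        # Copy the flash contents into DRAM when the host powers up.
        self.dram[:] = self.flash

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self.dram[offset:offset + length])

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > len(self.dram):
            raise ValueError("write outside device capacity")
        # Runtime writes touch only the working memory, never the flash.
        self.dram[offset:offset + len(data)] = data

    def flush(self) -> None:
        # Assumed write-back step: touch flash only if something changed.
        if self.dram != self.flash:
            self.flash[:] = self.dram
            self.flash_writes += 1
```

In this model any number of `write` calls costs zero flash writes until `flush` is called, which is one way to read "minimize writes to the flash memory".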
Abstract:
A multiscalar processor and method of executing a multiscalar program within a multiscalar processor having a plurality of processing elements and a thread scheduler are provided. The multiscalar program includes a plurality of threads that are each composed of one or more instructions of a selected instruction set architecture. Each of the plurality of threads has a single entry point and a plurality of possible exit points. The multiscalar program further comprises thread code including a plurality of data structures that are each associated with a respective one of the plurality of threads. According to the method, a third data structure among the plurality of data structures is supplied to the thread scheduler. The third data structure, which is associated with a third thread among the plurality of threads, specifies a first data structure associated with a first possible exit point of the third thread and a second data structure associated with a second possible exit point of the third thread. The third thread is assigned to a selected one of the plurality of processing elements for execution. Prior to completing execution of the third thread, the thread scheduler selects from among the first and the second possible exit points of the third thread. In response to the selection, a corresponding one of the first and second data structures is loaded into the thread scheduler for processing.
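A minimal sketch of how a thread scheduler might walk such per-thread data structures is given below. `ThreadDescriptor`, `schedule`, and the `choose_exit` callback (standing in for the scheduler's exit-point selection) are hypothetical names, and execution on a processing element is reduced to recording the thread's name.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ThreadDescriptor:
    name: str
    # One successor descriptor per possible exit point of the thread.
    exits: List[Optional["ThreadDescriptor"]] = field(default_factory=list)


def schedule(start: ThreadDescriptor,
             choose_exit: Callable[[ThreadDescriptor], int],
             max_threads: int = 16) -> List[str]:
    """Return the names of the threads in the order they are assigned."""
    assigned: List[str] = []
    current: Optional[ThreadDescriptor] = start
    while current is not None and len(assigned) < max_threads:
        # Assign the thread to a processing element (only recorded here).
        assigned.append(current.name)
        if not current.exits:
            break
        # Select an exit point before the thread completes, then load the
        # descriptor associated with that exit point.
        current = current.exits[choose_exit(current)]
    return assigned
```

For a third thread whose two exit points lead to a first and a second descriptor, selecting exit index 1 loads the second descriptor next.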
Abstract:
A method and system for constructing a program are provided. According to the method, each of a plurality of instructions is assigned to at least one of a plurality of threads. The plurality of threads includes first, second, and third threads, where the third thread follows the first thread and precedes the second thread in a logical program order. A data structure associated with the first thread is then constructed. The data structure includes an indication that execution of the second thread is to be initiated prior to initiation of execution of the third thread. According to one embodiment, the indication within the data structure is a pointer that specifies a second data structure associated with the second thread.
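The pointer-based embodiment can be illustrated with a small sketch. The `Descriptor` type and its `fork` and `next_in_order` fields are hypothetical names. The sketch shows only how a pointer in the first thread's data structure could cause the second thread to be initiated ahead of the third, despite the logical program order.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Descriptor:
    name: str
    # Pointer to a descriptor whose thread should be initiated early.
    fork: Optional["Descriptor"] = None
    # Successor in logical program order.
    next_in_order: Optional["Descriptor"] = None


def initiation_order(first: Descriptor) -> List[str]:
    """Order in which thread execution is initiated, starting at `first`."""
    order = [first.name]
    if first.fork is not None:
        order.append(first.fork.name)  # initiated out of logical order
    node = first.next_in_order
    while node is not None:
        if node.name not in order:     # skip threads already initiated
            order.append(node.name)
        node = node.next_in_order
    return order


# Logical program order: first -> third -> second.
second = Descriptor("second")
third = Descriptor("third", next_in_order=second)
first = Descriptor("first", fork=second, next_in_order=third)
```

Here `initiation_order(first)` yields the first thread, then the second, then the third.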
Abstract:
A method and system are provided for constructing a program executable by a processor including one or more processing elements for executing threads and a thread scheduler for assigning threads to the processing elements for execution. According to the method, a plurality of threads are provided that each include at least one control flow instruction. From one or more control flow instructions within the plurality of threads, a condition upon which execution of a particular thread depends is determined. In response to the determination, at least one navigation instruction executable by the thread scheduler is created that indicates that the particular thread is to be assigned to one of the processing elements for execution in response to the condition.
Abstract:
The present invention relates to a multiple stage execution unit for executing instructions in a microprocessor having a plurality of rename registers for storing execution results, an instruction cache for storing instructions, each instruction being associated with a rename register, a sequencer unit for providing an instruction to the execution unit, and a data cache for providing data to the execution unit. In one version, the execution unit includes a first stage which generates an intermediate result from the data according to an instruction; a means for providing a first portion of the intermediate result to an intermediate register; a means for providing a second portion of the intermediate result to a rename register associated with the instruction; a means for passing the first portion from the intermediate register to a second stage of the execution unit; a means for passing the second portion from the rename register to the second stage of the execution unit; wherein the second stage of the execution unit operates on the first and second portions according to the instruction.
Abstract:
A processor and method of executing instructions within a processor are disclosed, which permit both a branch instruction and a target instruction of the branch instruction to be executed in response to a single instruction fetch. In accordance with an illustrative embodiment, the processor, which has an associated memory, simultaneously fetches a plurality of instructions from the memory. Branch instructions among the plurality of instructions are then detected. In response to a detection of a branch instruction among the plurality of instructions, a determination is made whether a target instruction to be executed in response to execution of the branch instruction is one of the plurality of instructions. In response to a determination that the target instruction is one of the plurality of instructions, the processor executes the target instruction without making an additional instruction fetch.
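The target-detection step can be sketched as a simple range check over the fetched group. The function name, the `(mnemonic, target)` tuple encoding, and the fixed 4-byte instruction size are assumptions made for illustration and do not come from the abstract.

```python
from typing import List, Optional, Tuple

Instruction = Tuple[str, Optional[int]]  # (mnemonic, branch target address or None)


def targets_in_fetch_group(fetch_base: int, group: List[Instruction],
                           word_size: int = 4) -> List[Tuple[int, int]]:
    """Return (branch_index, target_index) pairs for branches whose target
    instruction is already part of the same fetched group."""
    end = fetch_base + len(group) * word_size
    hits = []
    for index, (mnemonic, target) in enumerate(group):
        if mnemonic != "branch" or target is None:
            continue
        # The target is in the group if it falls inside the fetched range
        # and is aligned to an instruction boundary.
        if fetch_base <= target < end and (target - fetch_base) % word_size == 0:
            hits.append((index, (target - fetch_base) // word_size))
    return hits
```

When a pair is returned, the target instruction can be taken from the already fetched group instead of issuing another fetch.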
Abstract:
A method and device are provided for generating address aliases corresponding to memory locations, in order to avoid false load/store collisions during memory disambiguation. The alias generator takes advantage of the fact that the entire address range will most likely not be active in the registers at any one time. The subset of the address range that is active can be represented with a smaller number of bits and, hence, the computation of true dependencies is greatly reduced. The address alias generator includes an array for receiving the memory addresses, comparators having inputs connected to each array entry and having outputs connected to an alias encoder, and a control logic unit for writing the given memory address in one of the entries. The output of a given comparator is turned on if a memory address is the same as the contents of the entry corresponding to that output, and the control logic unit is activated if the outputs of all of the comparators are turned off. In the preferred embodiment, the memory addresses are 32-bit values, the array has 64 entries, and the encoder generates 6-bit values for the address aliases. The processor includes a memory disambiguation buffer, for identifying load/store collisions, that uses the 6-bit address aliases.
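A software model of the alias generator might look like the following. The class name and the round-robin choice of which entry to overwrite are assumptions, since the abstract does not say how the control logic picks an entry.

```python
from typing import List, Optional


class AliasGenerator:
    """Maps 32-bit addresses to small aliases via a fully compared array."""

    def __init__(self, entries: int = 64):
        self.entries: List[Optional[int]] = [None] * entries
        self.next_slot = 0  # assumed round-robin write pointer

    def alias(self, address: int) -> int:
        address &= 0xFFFFFFFF  # addresses are 32-bit values
        # One comparator per entry: its output is on when the entry matches.
        outputs = [entry == address for entry in self.entries]
        if any(outputs):
            # The encoder turns the active comparator output into an index.
            return outputs.index(True)
        # All comparator outputs are off: the control logic writes the
        # address into an entry, whose index becomes the alias.
        slot = self.next_slot
        self.entries[slot] = address
        self.next_slot = (slot + 1) % len(self.entries)
        return slot
```

With 64 entries every alias lies in the range 0 to 63 and therefore fits in 6 bits, so a disambiguation buffer can compare 6-bit aliases instead of full 32-bit addresses.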
Abstract:
A method and apparatus in a superscalar microprocessor for early completion of floating-point instructions prior to a previous load/store multiple instruction are provided. The microprocessor's load/store execution unit loads or stores data to or from the general purpose registers, and the microprocessor's dispatch unit dispatches instructions to a plurality of execution units, including the load/store execution unit and the floating-point execution unit. The method comprises the dispatch unit dispatching a multi-register instruction to the load/store unit to begin execution of the multi-register instruction, wherein the multi-register instruction, such as a store multiple or a load multiple, transfers data between memory and more than one of the plurality of general purpose registers. Further, prior to the multi-register instruction finishing execution in the load/store unit, the dispatch unit dispatches a floating-point instruction, which is dependent upon source operand data stored in one or more floating-point registers of the plurality of floating-point registers, to the floating-point execution unit, wherein the dispatched floating-point instruction completes execution prior to the multi-register instruction finishing execution.
Abstract:
A system and method are disclosed for improving the performance of a processor that emulates a guest instruction, where the guest instruction includes a first operand and a second operand. The first operand is stored in a general purpose register, and the second operand is stored in a special-purpose register. The method and system provide a host instruction that performs an operation using the first operand and the second operand without moving the second operand from the special-purpose register into the general purpose register. This reduces the number of instructions in the semantic routines necessary to operate on immediate data from guest instructions and increases emulation performance.