摘要:
A method and system for executing a context-altering instruction within a processor are disclosed. The processor has a superscalar architecture that includes multiple pipelines, buffers, registers, and execution units. The processor also includes a machine state register for identifying a context of the processor, and a shadow machine state register in conjunction with the machine state register. During operation, a first state of the machine state register is copied to the shadow machine state register. Instructions are executed in accordance with a context identified by the first state of the machine state register. The first state of the shadow machine state register is subsequently altered to a second state in response to decoding a context-altering instruction. The context-altering instruction and subsequent instructions are then executed in accordance with the second state of the shadow machine state register. Finally, the first state of the machine state register is altered to the second state in response to a completion of the context-altering instruction. As a result context synchronization operations are avoided.
摘要:
A method and system are disclosed for processing instructions within a data processing system including a processor having a plurality of execution units. According to the method of the present invention, a number of instructions stored within a memory within the data processing system are retrieved from memory. A selected instruction among the number of instructions is decoded to determine if the selected instruction would be noneffective if executed by the processor. In a preferred embodiment of the present invention, noneffective instructions include instructions with invalid opcodes and instructions that would not change the value of any data register within the processor. In response to determining that the selected instruction would be noneffective if executed by the processor, the selected instruction is recoded into a specified instruction format prior to dispatching the selected instruction to one of the number of execution units. Detecting noneffective instructions prior to dispatch reduces the decode logic required within the dispatcher and enhances processor performance.
摘要:
A method for selectively executing speculative load instructions in a high-performance processor is disclosed. In accordance with the present disclosure, when a speculative load instruction for which the data is not stored in a data cache is encountered, a bit within an enable speculative load table which is associated with that particular speculative load instruction is read in order to determine a state of the bit. If the associated bit is in a first state, data for the speculative load instruction is requested from a system bus and further execution of the speculative load instruction is then suspended to wait for control signals from a branch processing unit. If the associated bit is in a second state, the execution of the speculative load instruction is immediately suspended to wait for control signals from the branch processing unit. If the speculative load instruction is executed in response to the control signals, then the associated bit in the enable speculative load table will be set to the first state. However, if the speculative load instruction is not executed in response to the control signals, then the associated bit in the enable speculative load table is set to the second state. In this manner, the displacement of useful data in the data cache due to wrongful execution of the speculative load instruction is avoided.
摘要:
A method and device of executing a load multiple instruction in a superscaler microprocessor is provided. The method comprises the steps of dispatching a load multiple instruction to a load/store unit, wherein the load/store unit begins execution of a dispatched load multiple instruction, and wherein the load multiple instruction loads data from memory into a plurality of registers. The method further includes the step of maintaining a table that lists each register of the plurality of registers and that indicates when data has been loaded into each register by the executing load multiple instruction. The method concludes by executing an instruction that is dependent upon source operand data loaded by the load multiple instruction into a register of the plurality of registers indicated by the instruction as a source register, prior to the load multiple instruction completing its execution, when the table indicates the source operand data has been loaded into the source register. Also, according to the present invention, a method of executing a store multiple instruction in a superscaler microprocessor is provided. This method comprises the steps of dispatching a store multiple instruction to a load/store unit, whereupon the load/store unit begins executing the store multiple instruction, wherein the load store instruction stores data from a plurality of registers to memory; and executing a fixed point instruction that is dependent upon data being stored by the store multiple instruction from a register of the plurality of registers indicated by the fixed point instruction as a source register, prior to the store multiple instruction completing its execution, but prohibiting the executing fixed point instruction from writing to a register of the plurality of registers prior to the store multiple instruction completing.
摘要:
A method and device of executing a load multiple instruction in a superscaler microprocessor is provided. The method comprises the steps of dispatching a load multiple instruction to a load/store unit, wherein the load/store unit begins execution of a dispatched load multiple instruction, and wherein the load multiple instruction loads data from memory into a plurality of registers. The method further includes the step of maintaining a table that lists each register of the plurality of registers and that indicates when data has been loaded into each register by the executing load multiple instruction. The method concludes by executing an instruction that is dependent upon source operand data loaded by the load multiple instruction into a register of the plurality of registers indicated by the instruction as a source register, prior to the load multiple instruction completing its execution, when the table indicates the source operand data has been loaded into the source register. Also, according to the present invention, a method of executing a store multiple instruction in a superscaler microprocessor is provided. This method comprises the steps of dispatching a store multiple instruction to a load/store unit, whereupon the load/store unit begins executing the store multiple instruction, wherein the load store instruction stores data from a plurality of registers to memory; and executing a fixed point instruction that is dependent upon data being stored by the store multiple instruction from a register of the plurality of registers indicated by the fixed point instruction as a source register, prior to the store multiple instruction completing its execution, but prohibiting the executing fixed point instruction from writing to a register of the plurality of registers prior to the store multiple instruction completing.
摘要:
Apparatuses and methods of an executable-in-place solid-state device are disclosed. In one embodiment, a solid-state device includes a flash memory coupled to a dynamic random access memory, the dynamic random access memory to store at least as much data as the flash memory; and a logic circuit coupled to the flash memory and the dynamic access memory to copy data from the flash memory to the dynamic random access memory on power up of a data processing system coupled to the solid-state device. The logic circuit is to minimize writes to the flash memory by using the dynamic access memory as a working memory during operation of the data processing system, and/or to block at least some sectors of at least one of the flash memory and the dynamic random access memory when the data processing system uses the working memory to conserve power usage of the solid-state device.
摘要:
A multiscalar processor and method of executing a multiscalar program within a multiscalar processor having a plurality of processing elements and a thread scheduler are provided. The multiscalar program includes a plurality of threads that are each composed of one or more instructions of a selected instruction set architecture. Each of the plurality of threads has a single entry point and a plurality of possible exit points. The multiscalar program further comprises thread code including a plurality of data structures that are each associated with a respective one of the plurality of threads. According to the method, a third data structure among the plurality of data structures is supplied to the thread scheduler. The third data structure, which is associated with a third thread among the plurality of threads, specifies a first data structure associated with a first possible exit point of the third thread and a second data structure associated with a second possible exit point of the third thread. The third thread is assigned to a selected one of the plurality of processing elements for execution. Prior to completing execution of the third thread, the thread scheduler selects from among the first and the second possible exit points of the third thread. In response to the selection, a corresponding one of the first and second data structures is loaded into the thread scheduler for processing.
摘要:
A method and system for constructing a program are provided. According to the method, each of a plurality of instructions are assigned to at least one of a plurality of threads. The plurality of threads include first, second, and third threads, where the third thread follows the first thread and precedes the second thread in a logical program order. A data structure associated with the first thread is then constructed. The data structure includes an indication that execution of the second thread is to be initiated prior to initiation of execution of the third thread. According to one embodiment, the indication within the data structure is a pointer that specifies a second data structure associated with the second thread.
摘要:
A method and system are provided for constructing a program executable by a processor including one or more processing elements for executing threads and a thread scheduler for assigning threads to the processing elements for execution. According to the method, a plurality of threads are provided that each include at least one control flow instruction. From one or more control flow instructions within the plurality of threads, a condition upon which execution of a particular thread depends is determined. In response to the determination, at least one navigation instruction executable by the thread scheduler is created that indicates that the particular thread is to be assigned to one of the processing elements for execution in response to the condition.
摘要:
The present invention relates to a multiple stage execution unit for executing instructions in a microprocessor having a plurality of rename registers for storing execution results, an instruction cache for storing instructions, each instruction being associated with a rename register, a sequencer unit for providing an instruction to the execution unit, and a data cache for providing data to the execution unit. In one version, the execution unit includes a first stage which generates an intermediate result from the data according to an instruction; a means for providing a first portion of the intermediate result to an intermediate register; a means for providing a second portion of the intermediate result to a rename register associated with the instruction; a means for passing the first portion from the intermediate register to a second stage of the execution unit; a means for passing the second portion from the rename register to the second stage of the execution unit; wherein the second stage of the execution unit operates on the first and second portions according to the instruction.