摘要:
A method and system for permitting single cycle instruction dispatch in a superscalar processor system which dispatches multiple instructions simultaneously to a group of execution units for execution and placement of results thereof within specified general purpose registers. Each instruction generally includes at least one source operand and one destination operand. A plurality of intermediate storage buffers are provided and each time an instruction is dispatched to an available execution unit, a particular one of the intermediate storage buffers is assigned to any destination operand within the dispatched instruction, permitting the instruction to be dispatched within a single cycle by eliminating any requirement for determining and selecting the specified general purpose register or a designated alternate general purpose register.
摘要:
The method and system of the present invention permits enhanced instruction dispatch efficiency in a superscalar processor system capable of fetching an application specified ordered sequence of scalar instructions and simultaneously dispatching a group of the scalar instructions to a plurality of execution units on a nonsequential opportunistic basis. A group of scalar instructions fetched in an application specified ordered sequence on a nonsequential opportunistic basis is processed in the present invention. The present invention detects conditions requiring serialization during the processing. In response to a detection of a condition requiring serialization, processing of particular scalar instructions from the group of scalar instructions are selectively controlled, wherein at least a portion of the scalar instructions within the group of scalar instructions are thereafter processed in a serial fashion.
摘要:
A method and system for increased instruction dispatch efficiency in a superscalar processor system having an instruction queue for receiving a group of instructions in an application specified sequential order and an instruction dispatch unit for dispatching instructions from an associated instruction buffer to multiple execution units on an opportunistic basis. The dispatch status of instructions within the associated instruction buffer is periodically determined and, in response to a dispatch of the instructions at the beginning of the instruction buffer, the remaining instructions are shifted within the instruction buffer in the application specified sequential order and a partial group of instructions are loaded into the instruction buffer from the instruction queue utilizing a selectively controlled multiplex circuit. In this manner additional instructions may be dispatched to available execution units without requiring a previous group of instructions to be dispatched completely.
摘要:
A method and system for increased instruction synchronization efficiency in a superscalar processor system which includes instructions having multiple source and destination operands. Simultaneous dispatching of multiple instructions creates a source-to-destination data dependency problem in that the results of one instruction may be necessary to accomplish execution of a second instruction. Data dependency hazards may be eliminated by prohibiting each instruction from dispatching until all possible data dependencies have been eliminated by the completion of preceding instructions; however, instruction dispatch efficiency is substantially decreased utilizing this technique. Data dependency interlock circuitry may be utilized to clear possible data dependency hazards; however, the complexity of such circuitry increases dramatically as the number of interlocked sources and destinations increases. The method and system of the present invention utilizes data dependency interlock circuitry capable of interlocking two source operands by two destination operands for each instruction. Instructions having three or more source operands are interlocked at the dispatch stage for the first two source operands utilizing existing data dependency interlock circuitry. Thereafter, the instruction is dispatched only after data dependency hazards are cleared for the first two source operands, utilizing the data dependency interlock circuitry, and all instructions preceding the instruction have been completed, eliminating possible data dependency hazards for the third source operand. In this manner, instructions which include three source operands may be synchronized without requiring a substantial increase in data dependency interlock circuitry and with only a slight degradation in system efficiency.
摘要:
An improved method of addressing within a pipelined processor having an address bit width of m+n bits is disclosed, which includes storing m high order bits corresponding to a first range of addresses, which encompasses a selected plurality of data executing within the pipelined processor. The n low order bits of addresses associated with each of the selected plurality of data are also stored. After determining the address of a subsequent datum to be executed within the processor, the subsequent datum is fetched. In response to fetching a subsequent datum having an address outside of the first range of addresses, a status register is set to a first of two states to indicate that an update to the first address register is required. In response to the status register being set to the second of the two states, the subsequent datum is dispatched for execution within the pipelined processor. The n low order bits of the subsequent datum are then stored, such that memory required to store addresses of instructions executing within the pipelined processor is thereby decreased.
摘要:
A method and circuit generates a sampling clock signal that digitizes an analog video signal. The sampling clock signal is generated by a clock divider coupled to the horizontal synchronization signal of the analog video signal. A divisor calculator calculates a divisor for the clock divider to control the frequency of the sampling clock signal. Specifically, the divisor calculator selects an initial divisor for the clock divider. Then the divisor calculator calculates a new divisor based on the target pixel value provided by a mode detector and the measured pixel value from a counter. Some embodiments of the present invention provides fine tuning of the frequency by testing other possible divisors with a plurality of different phases. In addition, some embodiments of the present invention calibrate the phase of the sampling clock signal to generate a phase shifted sampling clock signal.
摘要:
A method and circuit generates a sampling clock signal that digitizes an analog video signal. The sampling clock signal is generated by a clock divider coupled to the horizontal synchronization signal of the analog video signal. A divisor calculator calculates a divisor for the clock divider to control the frequency of the sampling clock signal. Specifically, the divisor calculator selects an initial divisor for the clock divider. Then the divisor calculator calculates a new divisor based on the target pixel value provided by a mode detector and the measured pixel value from a counter. Some embodiments of the present invention fine tune the frequency by testing other possible divisors with a plurality of different phases. In addition, some embodiments of the present invention calibrate the phase of the sampling clock signal to generate a phase shifted sampling clock signal.
摘要:
A method and system for enhanced instruction dispatch efficiency in a superscalar processor system having a plurality of intermediate storage buffers, a plurality of general purpose registers, and a storage buffer index. Multiple scalar instructions may be simultaneously dispatched from a dispatch buffer to a plurality of execution units. Each of the multiple scalar instructions generally include at least one source operand and one destination operand. A particular one of the plurality of intermediate storage buffers is assigned to a destination operand within a selected one of the multiple scalar instructions. A relationship between the particular one of the plurality of intermediate storage buffers and a designated one of the plurality of general purpose registers is stored in the storage buffer index at that time when the instruction which has been dispatched is replaced in the dispatcher by another instruction in the application program sequence. Results of execution from the selected one of the multiple scalar instructions are stored in the particular one of the intermediate storage buffers when the selected instruction is executed. The storage buffer index is used to determine which storage buffers to use as source operands for those instructions which are dispatched between the time that a storage buffer has been assigned for a specific general purpose register and the results of execution are moved from the storage buffer into the general purpose register.
摘要:
A processing system allows for out of order instruction execution and includes at least one load/store unit for loading instructions to a register for processing by a fixed point unit, floating point unit, or the like, and store the results to memory. A load queue maintains the addresses and program numbers of the load instructions. During execution the address of the store instruction is compared to the address in the load queue of previously executed load instructions. A program counter compares the program number of the store instruction with the program number of the load instruction in the load queue. If the addresses are different, then no impermissible out of order situation exists between the load and store instructions being compared, because the data is not at the same address. If the address is the same, and the store program number is greater than the load program number, then the instructions have been executed in order (the load correctly preceded the store) and no problem exists. However, if the addresses are the same and the load instruction has been incorrectly reordered to precede the store instruction, then a reordering conflict exists and the load instructions must be re-executed.
摘要:
A method and system for permitting single cycle instruction dispatch in a superscalar processor system which dispatches multiple instructions simultaneously to a group of execution units for execution and placement of results thereof within specified general purpose registers. Each instruction generally includes at least one source operand and one destination operand. A plurality of intermediate storage buffers are provided and each time an instruction is dispatched to an available execution unit, a particular one of the intermediate storage buffers is assigned to any destination operand within the dispatched instruction, permitting the instruction to be dispatched within a single cycle by eliminating any requirement for determining and selecting the specified general purpose register or a designated alternate general purpose register.