Superscalar microprocessor having symmetrical, fixed issue positions
each configured to execute a particular subset of instructions
    71.
    Granted invention patent
    Status: lapsed

    Publication No.: US5901302A

    Publication date: 1999-05-04

    Application No.: US690384

    Filing date: 1996-07-26

    IPC classification: G06F9/38 G06F9/46 G06F12/12

    Abstract: A microprocessor employing a reorder buffer is configured with fixed, symmetrical issue positions. The symmetrical nature of the issue positions may increase the average number of instructions to be concurrently dispatched and executed by the microprocessor. In one embodiment, the reorder buffer allocates a line of storage sufficient to store instruction results corresponding to a maximum number of concurrently dispatchable instructions regardless of the number actually dispatched. The average number of unused locations within the line decreases as the average number of concurrently dispatched instructions increases.
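The line allocation described in the abstract can be illustrated with a toy model. This is only a sketch: the line width of 3 and the names `ReorderBuffer`, `dispatch`, and `unused_slots` are assumptions made for illustration and do not come from the patent.

```python
from dataclasses import dataclass, field

LINE_WIDTH = 3  # assumed maximum number of concurrently dispatchable instructions


@dataclass
class ReorderBuffer:
    """Toy line-oriented reorder buffer: one full line per dispatch group."""
    lines: list = field(default_factory=list)

    def dispatch(self, instructions):
        """Allocate one full line, even if fewer than LINE_WIDTH slots are used."""
        if not 0 < len(instructions) <= LINE_WIDTH:
            raise ValueError("dispatch group must hold 1..LINE_WIDTH instructions")
        # Pad with None to model allocated-but-unused result locations.
        line = list(instructions) + [None] * (LINE_WIDTH - len(instructions))
        self.lines.append(line)

    def unused_slots(self):
        """Count allocated result locations that hold no instruction."""
        return sum(slot is None for line in self.lines for slot in line)


rob = ReorderBuffer()
rob.dispatch(["add", "sub", "mov"])  # full line, nothing wasted
rob.dispatch(["load"])               # partial line, two locations unused
```

In this model the second dispatch group leaves two locations unused, which is the waste the abstract says shrinks as more instructions are dispatched together.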

    Method and apparatus for five bit predecoding variable length
instructions for scanning of a number of RISC operations
    72.
    Granted invention patent
    Status: lapsed

    Publication No.: US5898851A

    Publication date: 1999-04-27

    Application No.: US873115

    Filing date: 1997-06-11

    Abstract: A superscalar microprocessor is provided that includes a predecode unit configured to predecode variable byte-length instructions prior to their storage within an instruction cache. The predecode unit is configured to generate a plurality of predecode bits for each instruction byte. The plurality of predecode bits associated with each instruction byte include an end bit and two ROP bits. The ROP bits indicate a number of microinstructions required to implement the instruction. The plurality of predecode bits are collectively referred to as a predecode tag. An instruction alignment unit then uses the predecode tags to identify microinstructions. The instruction alignment unit dispatches the microinstructions simultaneously to a plurality of decode units which form fixed issue positions within the superscalar microprocessor. Because the instruction alignment unit identifies microinstructions, the multiplexing of instructions from the instruction alignment unit to the decoders is simplified. Accordingly, relatively fast multiplexing may be attained, and high performance may be accommodated.
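A rough sketch of how an alignment unit might use the end bit and ROP bits to identify microinstructions follows. The encoding is an assumption: the abstract does not say which byte carries the count or how the two ROP bits are encoded, so this example reads the count directly from the instruction's last byte. `PredecodeTag` and `identify_rops` are hypothetical names.

```python
from typing import NamedTuple


class PredecodeTag(NamedTuple):
    end: bool  # set on the last byte of an instruction
    rops: int  # 2-bit field; assumed here to hold the ROP count directly


def identify_rops(tags):
    """Return (start_byte, end_byte, rop_index) for each ROP, in program order."""
    found = []
    start = 0
    for pos, tag in enumerate(tags):
        if tag.end:
            # Assumption: the count is read from the instruction's final byte.
            for rop_index in range(tag.rops):
                found.append((start, pos, rop_index))
            start = pos + 1
    return found


# Three instructions: 2 bytes / 1 ROP, 1 byte / 1 ROP, 3 bytes / 2 ROPs.
tags = [
    PredecodeTag(False, 0), PredecodeTag(True, 1),
    PredecodeTag(True, 1),
    PredecodeTag(False, 0), PredecodeTag(False, 0), PredecodeTag(True, 2),
]
rops = identify_rops(tags)
```

Each entry of `rops` corresponds to one microinstruction that could be routed to a fixed issue position without rescanning the instruction bytes.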

    Load/store unit with multiple oldest outstanding instruction pointers
for completing store and load/store miss instructions
    73.
    Granted invention patent
    Status: lapsed

    Publication No.: US5887152A

    Publication date: 1999-03-23

    Application No.: US420737

    Filing date: 1995-04-12

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    IPC classification: G06F9/38

    Abstract: A superscalar microprocessor is provided having a load/store unit which receives a pair of pointers identifying the oldest outstanding instructions which are not in condition for retirement. The load/store unit compares these pointers with the reorder buffer tags of store instructions and of load instructions that miss the data cache. A match must be found before the associated instruction accesses the data cache and the main memory system. The pointer-compare mechanism provides an ordering mechanism for store instructions and for load instructions that miss the data cache.
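The pointer-compare check can be sketched as a simple tag match. The record layout (`tag`, `kind`) and the function name `select_ready` are assumptions for illustration; only the idea of comparing reorder buffer tags against a pair of oldest-outstanding pointers comes from the abstract.

```python
def select_ready(pending, oldest_pointers):
    """Return pending stores and load misses whose reorder buffer tag
    matches one of the oldest-outstanding-instruction pointers."""
    return [op for op in pending if op["tag"] in oldest_pointers]


# Hypothetical load/store buffer contents.
pending = [
    {"tag": 7, "kind": "store"},
    {"tag": 5, "kind": "load_miss"},
    {"tag": 9, "kind": "store"},
]
oldest_pointers = (5, 7)  # the pair of pointers received from the reorder buffer

ready = select_ready(pending, oldest_pointers)
```

Here the operations tagged 7 and 5 may proceed to the data cache and main memory, while the store tagged 9 waits until one of the pointers advances to it.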

    Instruction alignment using a dispatch list and a latch list
    75.
    Granted invention patent
    Status: lapsed

    Publication No.: US5859992A

    Publication date: 1999-01-12

    Application No.: US815566

    Filing date: 1997-03-12

    IPC classification: G06F9/30 G06F9/38 G06F9/00

    Abstract: An instruction alignment unit includes a byte queue configured to store instruction blocks. Each instruction block includes a fixed number of instruction bytes and identifies up to a maximum number of instructions within the fixed number of instruction bytes. Additionally, the instruction alignment unit is configured to form a pair of instruction lists: a dispatch list and a latch list. The dispatch list includes instruction locators corresponding to instructions within the instruction blocks stored in the byte queue. Additionally, the first three instructions from instruction blocks being received from the instruction cache during a particular clock cycle are appended to the dispatch list. The dispatch list is used to select instructions from the byte queue for dispatch to the decode units. The latch list is used for receiving instruction locators for the remaining instructions from the instruction blocks received from the instruction cache during the particular clock cycle. Furthermore, the latch list receives instruction locators from the dispatch list which correspond to instructions not selected for dispatch to the decode units. The latch list is stored until a succeeding clock cycle, in which the stored program-ordered list is used as a basis for forming the dispatch list during that succeeding clock cycle. The instruction identification information and instruction bytes corresponding to the instruction can be located by selecting the instructions corresponding to the instruction locators at the front of the dispatch list.
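One clock cycle of the dispatch-list and latch-list scheme might look like the sketch below. The limit of three appended instructions comes from the abstract; the number of decode units, the function name `align_cycle`, and the use of plain strings as instruction locators are assumptions.

```python
APPEND_LIMIT = 3  # from the abstract: first three newly fetched instructions
DECODE_UNITS = 3  # assumed number of decode units


def align_cycle(latch_list, fetched, can_dispatch=DECODE_UNITS):
    """Model one clock cycle; return (dispatched, next_latch_list)."""
    # The dispatch list starts from the stored latch list, then takes the
    # first few locators fetched this cycle.
    dispatch_list = latch_list + fetched[:APPEND_LIMIT]
    dispatched = dispatch_list[:can_dispatch]
    # Unselected locators come first so the latch list stays in program order,
    # followed by the remaining newly fetched locators.
    next_latch = dispatch_list[can_dispatch:] + fetched[APPEND_LIMIT:]
    return dispatched, next_latch


d1, latch1 = align_cycle([], ["i0", "i1", "i2", "i3", "i4"])
d2, latch2 = align_cycle(latch1, ["i5", "i6"], can_dispatch=2)
```

In the second cycle only two instructions are dispatched, so `i5` and `i6` are carried over in the latch list for the following cycle.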

    Apparatus for providing memory and register operands concurrently to
functional units
    76.
    Granted invention patent
    Status: lapsed

    Publication No.: US5835968A

    Publication date: 1998-11-10

    Application No.: US633302

    Filing date: 1996-04-17

    IPC classification: G06F9/38 G06F12/08 G06F12/10

    Abstract: An apparatus including address generation units, corresponding reservation stations, and a speculative register file is provided. Decode units provide memory operation information to the corresponding reservation stations while the associated instructions are being decoded. The speculative register file stores speculative register values corresponding to previously decoded instructions. The speculative register values are generated prior to execution of the previously decoded instructions. If the register operands included in the address operands of an instruction are stored in the speculative register file, then the memory operation may be passed through the corresponding reservation station to an address generation unit. The address generation unit generates the data address from the address operands and accesses a data cache while register operands corresponding to the instruction are requested from a register file and reorder buffer.

    Method and apparatus for predecoding variable byte-length instructions
within a superscalar microprocessor
    77.
    Granted invention patent
    Status: lapsed

    Publication No.: US5822558A

    Publication date: 1998-10-13

    Application No.: US790394

    Filing date: 1997-01-29

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    IPC classification: G06F9/30 G06F9/38

    Abstract: A superscalar microprocessor is provided that includes a predecode unit configured to predecode variable byte-length instructions prior to their storage within an instruction cache. The predecode unit is configured to generate a plurality of predecode bits for each instruction byte. The plurality of predecode bits associated with each instruction byte are collectively referred to as a predecode tag. An instruction alignment unit then uses the predecode tags to dispatch the variable byte-length instructions simultaneously to a plurality of decode units which form fixed issue positions within the superscalar microprocessor. With the information conveyed by the functional bits, the decode units can detect the exact locations of the opcode, displacement, immediate, register, and scale-index bytes. Accordingly, no serial scan by the decode units through the instruction bytes is needed. In addition, the functional bits allow the decode units to calculate linear addresses (via adder circuits) expeditiously for use by other subunits within the superscalar microprocessor. Accordingly, relatively fast decoding may be attained, and high performance may be accommodated.

    Predecode unit adapted for variable byte-length instruction set
processors and method of operating the same
    78.
    Granted invention patent
    Status: lapsed

    Publication No.: US5819059A

    Publication date: 1998-10-06

    Application No.: US421663

    Filing date: 1995-04-12

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    IPC classification: G06F9/30 G06F9/38

    Abstract: A superscalar microprocessor is provided that includes a predecode unit adapted for predecoding variable byte-length instructions. The predecode unit predecodes the instructions prior to their storage within an instruction cache. In one system, a predecode unit is configured to generate a plurality of predecode bits for each instruction byte. The plurality of predecode bits associated with each instruction byte are collectively referred to as a predecode tag. An instruction alignment unit then uses the predecode tags to dispatch the variable byte-length instructions simultaneously to a plurality of decode units which form fixed issue positions within the superscalar microprocessor. By utilizing the predecode information from the predecode unit, the instruction alignment unit may be implemented with a relatively small number of cascaded levels of logic gates, thus accommodating very high frequencies of operation. Instruction alignment to decode units may further be accomplished with relatively few pipeline stages. Finally, since the predecode unit is configured such that the meaning of the functional bit of a particular predecode tag is dependent upon the status of the start bit, a relatively large amount of predecode information may be conveyed with a relatively small number of predecode bits. This allows a reduction in the size of the instruction cache without compromising performance.

    Recorder buffer capable of detecting dependencies between accesses to a
pair of caches
    79.
    Granted invention patent
    Status: lapsed

    Publication No.: US5765035A

    Publication date: 1998-06-09

    Application No.: US561075

    Filing date: 1995-11-20

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    Abstract: A dependency checking structure is provided which compares memory accesses performed from the execution stage of the instruction processing pipeline to memory accesses performed from the decode stage. The decode stage performs memory accesses to a stack cache, while the execution stage performs its accesses (addresses for which are formed via indirect addressing) to the stack cache and to a data cache. If a read memory access performed by the execution stage is dependent upon a write memory access performed by the decode stage, the read memory access is stalled until the write memory access completes. If a read memory access performed by the decode stage is dependent upon a write memory access performed by the execution stage, then the instruction associated with the read memory access and subsequent instructions are flushed. Data coherency is maintained between the pair of caches while allowing stack-relative accesses to be performed from the decode stage. The comparator circuits used to perform the comparison are configured to compare a field of address bits instead of the entire address, reducing their size while still maintaining accurate dependency checking by qualifying the resulting comparison signals with an indication that both addresses hit in the same storage location within the stack cache.
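The qualified partial-address comparison in the last sentence of the abstract can be sketched as follows. The 6-bit field width, the `FIELD_MASK` constant, and the representation of a stack cache hit as a location index (or `None` on a miss) are assumptions made for illustration.

```python
FIELD_MASK = 0x3F  # assumed: compare only the low 6 address bits


def depends(addr_a, addr_b, loc_a, loc_b):
    """Report a dependency only when the compared address fields match AND
    both accesses hit the same stack cache storage location.

    loc_a / loc_b: storage location hit in the stack cache, or None on a miss.
    """
    fields_match = (addr_a & FIELD_MASK) == (addr_b & FIELD_MASK)
    same_location = loc_a is not None and loc_a == loc_b
    return fields_match and same_location


# Low bits match, but the accesses hit different locations: no dependency.
false_alias = depends(0x1004, 0x2004, loc_a=2, loc_b=5)
# Same address, same location: a real dependency.
true_dep = depends(0x1004, 0x1004, loc_a=2, loc_b=2)
```

The location qualifier is what keeps the narrow comparison accurate: two addresses that differ only in their uncompared upper bits cannot both hit the same storage location.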

    Byte queue divided into multiple subqueues for optimizing instruction
selection logic
    80.
    Granted invention patent
    Status: lapsed

    Publication No.: US5748978A

    Publication date: 1998-05-05

    Application No.: US650940

    Filing date: 1996-05-17

    Abstract: An apparatus for aligning variable byte length instructions to a plurality of issue positions is provided. The apparatus includes a byte queue divided into several subqueues. Each subqueue is maintained such that a first instruction in program order within the subqueue is identified by information stored in a first position within the subqueue, a second instruction in program order within the subqueue is identified by information stored in a second position within the subqueue, etc. When instructions from a subqueue are dispatched, remaining instructions within the subqueue are shifted such that the first of the remaining instructions (in program order) occupies the first position, etc. Instructions are shifted from subqueue to subqueue when all of the instructions within a particular subqueue have been dispatched. The information stored in one subqueue is shifted as a unit to another subqueue independent of the internal shifting of subqueue information. The subqueues are additionally configured to handle instructions which overflow from a first subqueue into a second subqueue. Information pertaining to the overflowing instructions is maintained in the last position within the first subqueue. The information is not shifted when other positions within the subqueue are shifted. In this manner, information regarding an overflowing instruction is again located in a limited number of positions.
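The two kinds of shifting described in the abstract (within a subqueue, and of whole subqueues) can be sketched with nested lists. This is a simplified model under stated assumptions: it dispatches only from the first subqueue in each call, expects at least one subqueue, and omits the overflow handling entirely. The function name `dispatch` is hypothetical.

```python
def dispatch(subqueues, count):
    """Dispatch up to `count` instructions from the first subqueue."""
    head = subqueues[0]
    dispatched = head[:count]
    del head[:count]  # remaining entries shift toward position 0
    if not head and len(subqueues) > 1:
        subqueues.pop(0)  # later subqueues shift up as whole units
    return dispatched


subqueues = [["a", "b", "c"], ["d", "e"]]
first = dispatch(subqueues, 2)   # "c" shifts into the first position
second = dispatch(subqueues, 2)  # first subqueue drains; next one moves up
```

After the second call the subqueue holding `d` and `e` has moved to the front as a unit, without any change to the order of its own entries.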
