Apparatus for aligning instructions using predecoded shift amounts
    71.
    Granted invention patent
    Apparatus for aligning instructions using predecoded shift amounts (expired)

    Publication number: US5872943A

    Publication date: 1999-02-16

    Application number: US690382

    Filing date: 1996-07-26

    CPC classification number: G06F9/382 G06F9/30152 G06F9/3816 G06F9/3822

    Abstract: A predecode unit within a microprocessor predecodes a cache line of instruction bytes for storage within the instruction cache of the microprocessor. The predecode unit produces multiple shift amounts, each of which identifies the beginning of a particular instruction within the instruction cache line. The shift amounts are stored in the instruction cache with the instruction bytes, and are conveyed when the instruction bytes are fetched for execution by the microprocessor. An instruction alignment unit decodes the shift amounts to locate instructions within the fetched instruction bytes. Each shift amount directly identifies a corresponding instruction for dispatch, and therefore decoding the shift amount directly results in controls for shifting the instruction bytes such that the identified instruction is conveyed to a corresponding issue position. The number of shift amounts stored may be equal to the number of issue positions within the microprocessor. The instruction alignment unit scans the start and end byte predecode data (which is also provided by the predecode unit and stored in the instruction cache) to detect any additional instructions within the cache line (e.g. instructions not identified by the shift amounts). Additional shift amounts are generated and used by the instruction alignment unit to dispatch instructions during subsequent clock cycles.
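
    The flow in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented circuit: the function names, the 8-byte line, and the choice of three issue positions are hypothetical, and the start and end bits are modeled as plain lists with one entry per byte.

```python
from typing import List

def predecode(start_bits: List[int], issue_positions: int = 3) -> List[int]:
    """Shift amounts: byte offsets of the first few instruction starts in a line."""
    starts = [i for i, bit in enumerate(start_bits) if bit]
    return starts[:issue_positions]

def extract(line: bytes, end_bits: List[int], shift: int) -> bytes:
    """Shift the line so the instruction starts at byte 0, then cut at its end bit."""
    end = end_bits.index(1, shift)  # first end bit at or after the start byte
    return line[shift:end + 1]

def remaining_shifts(start_bits: List[int], stored: List[int]) -> List[int]:
    """Scan start bits for instructions not covered by the stored shift amounts."""
    last = stored[-1] if stored else -1
    return [i for i, bit in enumerate(start_bits) if bit and i > last]

# Hypothetical 8-byte line holding instructions of length 2, 1, 3 and 2 bytes.
line = bytes(range(8))
start_bits = [1, 0, 1, 1, 0, 0, 1, 0]
end_bits = [0, 1, 1, 0, 0, 1, 0, 1]

stored = predecode(start_bits)                         # one shift per issue position
issued = [extract(line, end_bits, s) for s in stored]  # dispatched this cycle
later = remaining_shifts(start_bits, stored)           # dispatched in a later cycle
```

    Here the fourth instruction is not covered by a stored shift amount, so its offset is found by scanning the start bits, which mirrors the "additional shift amounts" step in the abstract.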

    Instruction alignment using a dispatch list and a latch list
    72.
    Granted invention patent
    Instruction alignment using a dispatch list and a latch list (expired)

    Publication number: US5859992A

    Publication date: 1999-01-12

    Application number: US815566

    Filing date: 1997-03-12

    Abstract: An instruction alignment unit includes a byte queue configured to store instruction blocks. Each instruction block includes a fixed number of instruction bytes and identifies up to a maximum number of instructions within the fixed number of instruction bytes. Additionally, the instruction alignment unit is configured to form a pair of instruction lists: a dispatch list and a latch list. The dispatch list includes instruction locators corresponding to instructions within the instruction blocks stored in the byte queue. Additionally, the first three instructions from instruction blocks being received from the instruction cache during a particular clock cycle are appended to the dispatch list. The dispatch list is used to select instructions from the byte queue for dispatch to the decode units. The latch list is used for receiving instruction locators for the remaining instructions from the instruction blocks received from the instruction cache during the particular clock cycle. Furthermore, the latch list receives instruction locators from the dispatch list which correspond to instructions not selected for dispatch to the decode units. The latch list is stored until a succeeding clock cycle, in which the stored program-ordered list is used as a basis for forming the dispatch list during that succeeding clock cycle. The instruction identification information and instruction bytes corresponding to the instruction can be located by selecting the instructions corresponding to the instruction locators at the front of the dispatch list.
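
    The interplay of the two lists can be sketched as a per-cycle function. This is a minimal model, assuming three decode units and treating instruction locators as opaque strings; `run_cycle` and its parameters are hypothetical names, not taken from the patent.

```python
def run_cycle(latch_list, incoming, decode_units=3, appended=3):
    """One clock cycle: return (dispatched locators, latch list for next cycle)."""
    # Dispatch list: last cycle's latch list plus the first few incoming locators.
    dispatch_list = list(latch_list) + list(incoming[:appended])
    # The remaining incoming locators go straight to the latch list.
    overflow = list(incoming[appended:])
    selected = dispatch_list[:decode_units]
    # Locators not selected return to the latch list, ahead of the overflow,
    # so the stored list stays in program order.
    next_latch = dispatch_list[decode_units:] + overflow
    return selected, next_latch

first, latch = run_cycle([], ["i0", "i1", "i2", "i3", "i4"])
second, latch2 = run_cycle(latch, ["i5", "i6", "i7", "i8"])
```

    In the second cycle the two latched locators are dispatched first, followed by the oldest newly received one, so program order is preserved across cycles.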

    Way prediction unit and a method for operating the same
    73.
    Granted invention patent
    Way prediction unit and a method for operating the same (expired)

    Publication number: US5848433A

    Publication date: 1998-12-08

    Application number: US838680

    Filing date: 1997-04-09

    CPC classification number: G06F9/3806 G06F12/0864 G06F9/3832 G06F2212/6082

    Abstract: A way prediction unit for a superscalar microprocessor is provided which predicts the next fetch address as well as the way of the instruction cache that the current fetch address hits in while the instructions associated with the current fetch are being read from the instruction cache. The way prediction unit is intended for high frequency microprocessors in which associative caches tend to be clock cycle limiting, causing the instruction fetch mechanism to require more than one clock cycle between fetch requests. Therefore, an instruction fetch can be made every clock cycle using the predicted fetch address until an incorrect next fetch address or an incorrect way is predicted. The instructions from the predicted way are provided to the instruction processing pipelines of the superscalar microprocessor each clock cycle.
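
    A rough software analogy of such a predictor is a table keyed by the current fetch address. The sketch below is an assumption-heavy illustration: the `WayPredictor` class, the 16-byte line size, the default guess of the sequential line in way 0, and the choice to store the predicted way alongside the predicted next address are all hypothetical.

```python
class WayPredictor:
    def __init__(self, line_size=16):
        self.line_size = line_size
        self.table = {}  # fetch address -> (predicted next address, predicted way)

    def predict(self, fetch_addr):
        # Default guess (an assumption of this sketch): next sequential line, way 0.
        return self.table.get(fetch_addr, (fetch_addr + self.line_size, 0))

    def update(self, fetch_addr, actual_next, actual_way):
        """Record the real outcome and report whether the prediction was correct."""
        correct = self.predict(fetch_addr) == (actual_next, actual_way)
        self.table[fetch_addr] = (actual_next, actual_way)
        return correct

p = WayPredictor()
cold = p.predict(0x100)               # nothing learned yet: sequential guess
miss = p.update(0x100, 0x200, 2)      # a taken branch to 0x200 found in way 2
warm = p.predict(0x100)               # prediction now follows the branch
hit = p.update(0x100, 0x200, 2)
```

    Fetching can continue every cycle on the predicted pair until `update` reports a wrong address or a wrong way, which corresponds to the misprediction case in the abstract.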

Apparatus for providing memory and register operands concurrently to functional units
    74.
    Granted invention patent
    Apparatus for providing memory and register operands concurrently to functional units (expired)

    Publication number: US5835968A

    Publication date: 1998-11-10

    Application number: US633302

    Filing date: 1996-04-17

    CPC classification number: G06F12/0875 G06F9/3826 G06F9/3832

    Abstract: An apparatus including address generation units, corresponding reservation stations, and a speculative register file is provided. Decode units provide memory operation information to the corresponding reservation stations while the associated instructions are being decoded. The speculative register file stores speculative register values corresponding to previously decoded instructions. The speculative register values are generated prior to execution of the previously decoded instructions. If the register operands included in the address operands of an instruction are stored in the speculative register file, then the memory operation may be passed through the corresponding reservation station to an address generation unit. The address generation unit generates the data address from the address operands and accesses a data cache while register operands corresponding to the instruction are requested from a register file and reorder buffer.
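
    The condition for early address generation can be sketched as follows. The base + index * scale + displacement form is an assumption borrowed from x86-style addressing, and `try_early_address` and the register names are hypothetical; the speculative register file is modeled as a dictionary.

```python
def try_early_address(spec_regs, base=None, index=None, scale=1, disp=0):
    """Return the data address if all address registers are available
    speculatively, otherwise None (the operation waits in its reservation station)."""
    needed = [r for r in (base, index) if r is not None]
    if not all(r in spec_regs for r in needed):
        return None
    addr = disp
    if base is not None:
        addr += spec_regs[base]
    if index is not None:
        addr += spec_regs[index] * scale
    return addr

# Speculative values produced ahead of execution by earlier decoded instructions.
spec_regs = {"ebp": 0x1000, "esi": 4}

ready = try_early_address(spec_regs, base="ebp", index="esi", scale=4, disp=8)
waiting = try_early_address(spec_regs, base="ebx")
```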

Method and apparatus for predecoding variable byte-length instructions within a superscalar microprocessor
    75.
    Granted invention patent
    Method and apparatus for predecoding variable byte-length instructions within a superscalar microprocessor (expired)

    Publication number: US5822558A

    Publication date: 1998-10-13

    Application number: US790394

    Filing date: 1997-01-29

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    Abstract: A superscalar microprocessor is provided that includes a predecode unit configured to predecode variable byte-length instructions prior to their storage within an instruction cache. The predecode unit is configured to generate a plurality of predecode bits for each instruction byte. The plurality of predecode bits associated with each instruction byte are collectively referred to as a predecode tag. An instruction alignment unit then uses the predecode tags to dispatch the variable byte-length instructions simultaneously to a plurality of decode units which form fixed issue positions within the superscalar microprocessor. With the information conveyed by the functional bits, the decode units can detect the exact locations of the opcode, displacement, immediate, register, and scale-index bytes. Accordingly, no serial scan by the decode units through the instruction bytes is needed. In addition, the functional bits allow the decode units to calculate linear addresses (via adder circuits) expeditiously for use by other subunits within the superscalar microprocessor. Accordingly, relatively fast decoding may be attained, and high performance may be accommodated.

Predecode unit adapted for variable byte-length instruction set processors and method of operating the same
    76.
    Granted invention patent
    Predecode unit adapted for variable byte-length instruction set processors and method of operating the same (expired)

    Publication number: US5819059A

    Publication date: 1998-10-06

    Application number: US421663

    Filing date: 1995-04-12

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    CPC classification number: G06F9/382 G06F9/30152 G06F9/3816 G06F9/3885

    Abstract: A superscalar microprocessor is provided that includes a predecode unit adapted for predecoding variable byte-length instructions. The predecode unit predecodes the instructions prior to their storage within an instruction cache. In one system, a predecode unit is configured to generate a plurality of predecode bits for each instruction byte. The plurality of predecode bits associated with each instruction byte are collectively referred to as a predecode tag. An instruction alignment unit then uses the predecode tags to dispatch the variable byte-length instructions simultaneously to a plurality of decode units which form fixed issue positions within the superscalar microprocessor. By utilizing the predecode information from the predecode unit, the instruction alignment unit may be implemented with a relatively small number of cascaded levels of logic gates, thus accommodating very high frequencies of operation. Instruction alignment to decode units may further be accomplished with relatively few pipeline stages. Finally, since the predecode unit is configured such that the meaning of the functional bit of a particular predecode tag is dependent upon the status of the start bit, a relatively large amount of predecode information may be conveyed with a relatively small number of predecode bits. This thereby allows a reduction in the size of the instruction cache without compromising performance.

Recorder buffer capable of detecting dependencies between accesses to a pair of caches
    77.
    Granted invention patent
    Recorder buffer capable of detecting dependencies between accesses to a pair of caches (expired)

    Publication number: US5765035A

    Publication date: 1998-06-09

    Application number: US561075

    Filing date: 1995-11-20

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    CPC classification number: G06F12/0848 G06F9/3834 G06F9/3861

    Abstract: A dependency checking structure is provided which compares memory accesses performed from the execution stage of the instruction processing pipeline to memory accesses performed from the decode stage. The decode stage performs memory accesses to a stack cache, while the execution stage performs its accesses (addresses for which are formed via indirect addressing) to the stack cache and to a data cache. If a read memory access performed by the execution stage is dependent upon a write memory access performed by the decode stage, the read memory access is stalled until the write memory access completes. If a read memory access performed by the decode stage is dependent upon a write memory access performed by the execution stage, then the instruction associated with the read memory access and subsequent instructions are flushed. Data coherency is maintained between the pair of caches while allowing stack-relative accesses to be performed from the decode stage. The comparator circuits used to perform the comparison are configured to compare a field of address bits instead of the entire address, reducing the size while still maintaining accurate dependency checking by qualifying the resulting comparison signals with an indication that both addresses hit in the same storage location within the stack cache.
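
    The qualified partial-address comparison and the stall/flush policy can be sketched as two small functions. This is an illustration only: the function names, the `0xFC` field mask, and the representation of a stack cache storage location as an integer (or `None` on a miss) are assumptions.

```python
def depends(addr_a, addr_b, loc_a, loc_b, field_mask=0xFC):
    """Partial-address dependency check, qualified by the stack cache location."""
    # Compare only a field of address bits rather than the full address.
    field_match = (addr_a & field_mask) == (addr_b & field_mask)
    # Qualify the result: both accesses must hit the same storage location.
    same_location = loc_a is not None and loc_a == loc_b
    return field_match and same_location

def resolve(reader_stage, writer_stage, dependent):
    """Action for a read that may depend on a write from the other stage."""
    if not dependent:
        return "proceed"
    if reader_stage == "execute" and writer_stage == "decode":
        return "stall"   # wait until the decode-stage write completes
    if reader_stage == "decode" and writer_stage == "execute":
        return "flush"   # discard the reading instruction and those after it
    return "unspecified"  # combinations the abstract does not describe

alias = depends(0x1234, 0x5634, loc_a=3, loc_b=3)      # same field, same location
no_alias = depends(0x1234, 0x5634, loc_a=3, loc_b=4)   # same field, other location
```

    The two example addresses differ in their upper bits, so the field comparison alone would report a false match; the location qualifier is what keeps the check accurate.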

Byte queue divided into multiple subqueues for optimizing instruction selection logic
    78.
    Granted invention patent
    Byte queue divided into multiple subqueues for optimizing instruction selection logic (expired)

    Publication number: US5748978A

    Publication date: 1998-05-05

    Application number: US650940

    Filing date: 1996-05-17

    CPC classification number: G06F9/30152 G06F9/3816 G06F9/382

    Abstract: An apparatus for aligning variable byte length instructions to a plurality of issue positions is provided. The apparatus includes a byte queue divided into several subqueues. Each subqueue is maintained such that a first instruction in program order within the subqueue is identified by information stored in a first position within the subqueue, a second instruction in program order within the subqueue is identified by information stored in a second position within the subqueue, etc. When instructions from a subqueue are dispatched, remaining instructions within the subqueue are shifted such that the first of the remaining instructions (in program order) occupies the first position, etc. Instructions are shifted from subqueue to subqueue when each of the instructions within a particular subqueue has been dispatched. The information stored in one subqueue is shifted as a unit to another subqueue independent of the internal shifting of subqueue information. The subqueues are additionally configured to handle instructions which overflow from a first subqueue into a second subqueue. Information pertaining to the overflowing instructions is maintained in the last position within the first subqueue. The information is not shifted when other positions within the subqueue are shifted. In this manner, information regarding an overflowing instruction is again located in a limited number of positions.
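
    The two kinds of shifting, within a subqueue and between subqueues, can be sketched with a list of deques. This is a simplified model under assumptions: the `ByteQueue` class and its three subqueues are hypothetical, dispatch is allowed to continue into later subqueues in program order, and the special last position for overflowing instructions is left out.

```python
from collections import deque

class ByteQueue:
    def __init__(self, num_subqueues=3):
        self.subqueues = [deque() for _ in range(num_subqueues)]

    def dispatch(self, count):
        """Dispatch up to `count` instructions in program order."""
        issued = []
        for sq in self.subqueues:
            # Removing from the front shifts the remaining entries so the
            # first remaining instruction occupies the first position.
            while sq and len(issued) < count:
                issued.append(sq.popleft())
        # Once a subqueue is fully dispatched, later subqueues move forward
        # as whole units, independent of their internal shifting.
        live = [sq for sq in self.subqueues if sq]
        empties = [deque() for _ in range(len(self.subqueues) - len(live))]
        self.subqueues = live + empties
        return issued

q = ByteQueue()
q.subqueues[0].extend(["i0", "i1"])
q.subqueues[1].extend(["i2", "i3", "i4"])
issued = q.dispatch(3)
```

    After the call, the first subqueue has been emptied and the second subqueue's remaining entries have moved into its place as a unit.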

Apparatus and method for resolving dependencies among a plurality of instructions within a storage device
    79.
    Granted invention patent
    Apparatus and method for resolving dependencies among a plurality of instructions within a storage device (expired)

    Publication number: US5345569A

    Publication date: 1994-09-06

    Application number: US764155

    Filing date: 1991-09-20

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    Abstract: An apparatus and method for resolving data dependencies among a plurality of instructions within a storage device, such as a reorder buffer in a superscalar computing apparatus employing pipeline instruction processing. The storage device has a read pointer, indicating a most recently-stored instruction and has a write pointer, indicating a first-stored instruction of the plurality of instructions within the storage device. A compare-hit circuit generates a compare-hit signal upon each concurrence of the respective source indicator in a next-to-be-dispatched instruction with the destination indicator of an earlier-stored instruction within the storage device; a first enable circuit generates a first enable signal for a first packet of instructions defined by the read pointer and the write pointer; a first comparing circuit generates a hit-enable signal for each concurrence of the compare-hit signal and the first enable signal; a second enable circuit generates a second enable signal for a second packet of instructions defined by the read pointer and the hit-enable signal; and a second comparing circuit generates the output signal for each concurrence of the second enable signal and the hit-enable signal.
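
    In software terms, the circuit's job resembles finding the most recently stored valid entry whose destination matches a source operand. The sketch below collapses the enable and compare stages into a simple loop, which is a simplification and not the patented circuit; `latest_producer` is a hypothetical name, and the pointers are called `oldest` and `newest` here rather than mapped onto the abstract's read and write pointers. It assumes the buffer holds at least one valid entry.

```python
def latest_producer(dest_tags, oldest, newest, source):
    """Index of the most recently stored valid entry whose destination
    matches `source`, or None if no valid entry matches."""
    n = len(dest_tags)
    i = newest
    while True:
        if dest_tags[i] == source:  # compare-hit on a valid entry
            return i
        if i == oldest:             # reached the end of the valid window
            return None
        i = (i - 1) % n             # step to the next-older entry, wrapping around

# Valid entries occupy slots 0..3; slots 4 and 5 hold stale data.
buf = ["r1", "r2", "r1", "r3", "r9", "r9"]
newest_r1 = latest_producer(buf, oldest=0, newest=3, source="r1")
stale_r9 = latest_producer(buf, oldest=0, newest=3, source="r9")

# Wrapped window: valid entries are slots 4, 5, 0, 1 in program order.
wrapped = ["r1", "r2", "x", "x", "r1", "r5"]
wrapped_r1 = latest_producer(wrapped, oldest=4, newest=1, source="r1")
```

    Restricting the search to the window between the two pointers plays the role of the first enable signal, and stopping at the first match from the newest end plays the role of the second.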

    Data processing system with latency tolerance execution
    80.
    Granted invention patent
    Data processing system with latency tolerance execution (active)

    Publication number: US09141391B2

    Publication date: 2015-09-22

    Application number: US13419531

    Filing date: 2012-03-14

    Abstract: In a processor having an instruction unit, a decode/issue unit, and execution queues configured to provide instructions to correspondingly different types of execution units, a method comprises maintaining a duplicate free list for the execution queues. The duplicate free list includes a plurality of duplicate dependent instruction indicators that indicate when a duplicate instruction for a dependent instruction is stored in at least one of the execution queues. One of the duplicate dependent instruction indicators is assigned to an execution queue for a dependent instruction. The dependent instruction is executed only when the one of the duplicate dependent instruction indicators is reset.
