Microprocessor having address generation units for efficient generation of memory operation addresses
    11.
    Granted invention patent
    Status: Expired

    Publication number: US6085302A

    Publication date: 2000-07-04

    Application number: US633351

    Filing date: 1996-04-17

    CPC classification number: G06F9/355 G06F9/3885

    Abstract: A microprocessor including address generation units configured to perform address generation for memory operations is provided. A reservation station associated with one of the address generation units receives the displacement from an instruction and an indication of the selected segment register upon decode of the instruction in a corresponding decode unit within the microprocessor. The displacement and segment base address from the selected segment register are added in the reservation station while the register operands for the instruction are requested. If the register operands are provided upon request (as opposed to a reorder buffer tag), the displacement/base sum and register operands are passed to the address generation unit. The address generation unit adds the displacement/base sum to the register operands, thereby forming the linear address. If register operands are not provided upon request (i.e. one or more reorder buffer tags are received instead of the corresponding register operand), then the reservation station stores the displacement/base sum and register operands/tags. Once each register operand has been provided, the displacement/base sum and register operands are conveyed to the address generation unit. Data address generation responsibilities are thereby fulfilled by the address generation units. Since the functional units of the microprocessor are relieved of address generation responsibilities, the functional units may be simplified.
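
    The two-stage addition described in this abstract can be illustrated with a minimal Python sketch. The names `ReservationEntry`, `decode`, and `generate_address`, the 32-bit address mask, and the operand encoding are assumptions made for illustration; they are not taken from the patent, and index scaling is omitted.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF  # assumption: 32-bit linear addresses


@dataclass
class ReservationEntry:
    """Hypothetical reservation-station entry for one memory operation."""
    disp_base_sum: int  # displacement + segment base, added at decode time
    operands: list      # each item is ("value", int) or ("tag", rob_tag)

    def ready(self) -> bool:
        # Ready once every register operand is a value rather than a tag.
        return all(kind == "value" for kind, _ in self.operands)

    def forward(self, tag: int, value: int) -> None:
        # Replace a reorder buffer tag with the result it stands for.
        self.operands = [("value", value) if (k, v) == ("tag", tag) else (k, v)
                         for k, v in self.operands]


def decode(displacement, segment_base, operands):
    # First addition: done in the reservation station while operands are requested.
    return ReservationEntry((displacement + segment_base) & MASK32, list(operands))


def generate_address(entry):
    # Second addition: the address generation unit adds the register operands.
    if not entry.ready():
        return None  # still waiting on at least one reorder buffer tag
    return (entry.disp_base_sum + sum(v for _, v in entry.operands)) & MASK32
```

    In this sketch an entry holding a tag simply reports `None` until `forward` supplies the missing operand, which mirrors the store-and-wait case in the abstract.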

    Reverse TLB for providing branch target address in a microprocessor having a physically-tagged cache
    12.
    Granted invention patent
    Status: Expired

    Publication number: US6079003A

    Publication date: 2000-06-20

    Application number: US974972

    Filing date: 1997-11-20

    Abstract: A microprocessor employs a branch prediction unit including a branch prediction storage which stores the index portion of branch target addresses and an instruction cache which is virtually indexed and physically tagged. The branch target index (if predicted-taken, or the sequential index if predicted not-taken) is provided as the index to the instruction cache. The selected physical tag is provided to a reverse translation lookaside buffer (TLB) which translates the physical tag to a virtual page number. Concatenating the virtual page number to the virtual index from the instruction cache (and the offset portion, generated from the branch prediction) results in the branch target address being generated. In one embodiment, the process of reading an index from the branch prediction storage, accessing the instruction cache, selecting the physical tag, and reverse translating the physical tag to achieve a virtual page number may require more than a clock cycle to complete. Such an embodiment may employ a current page register which stores the most recently translated virtual page number and the corresponding real page number. The branch prediction unit predicts that each fetch address will continue to reside in the current page and uses the virtual page number from the current page to form the branch target address. The physical tag from the fetched cache line is compared to the corresponding real page number to verify that the fetch address is actually still within the current page. When a mismatch is detected between the corresponding real page number and the physical tag from the fetched cache line, the branch target address is corrected with the linear page number provided by the reverse TLB and the current page register is updated.
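
    The current-page prediction and reverse-TLB correction can be sketched as follows. This is a simplified illustration only: 4 KiB pages, a physical tag that equals the real page number, and the names `ReverseTLB`, `CurrentPage`, and `form_target` are all assumptions, not details from the patent.

```python
PAGE_BITS = 12  # assumption: 4 KiB pages


class ReverseTLB:
    """Hypothetical reverse TLB: maps a physical page to its virtual page."""

    def __init__(self):
        self._phys_to_virt = {}

    def insert(self, virtual_page, physical_page):
        self._phys_to_virt[physical_page] = virtual_page

    def lookup(self, physical_page):
        return self._phys_to_virt[physical_page]


class CurrentPage:
    """Most recently translated virtual page and its real page number."""

    def __init__(self, virtual_page, physical_page):
        self.virtual_page = virtual_page
        self.physical_page = physical_page


def form_target(current, rtlb, in_page_bits, physical_tag):
    """Return (target_address, mispredicted).

    in_page_bits stands for the index and offset portions concatenated.
    """
    if physical_tag == current.physical_page:
        # Prediction held: the fetch is still within the current page.
        return (current.virtual_page << PAGE_BITS) | in_page_bits, False
    # Mismatch: correct the page number via the reverse TLB and update the register.
    virtual_page = rtlb.lookup(physical_tag)
    current.virtual_page, current.physical_page = virtual_page, physical_tag
    return (virtual_page << PAGE_BITS) | in_page_bits, True
```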

    Instruction fetch unit configured to provide sequential way prediction for sequential instruction fetches
    13.
    Granted invention patent
    Status: Expired

    Publication number: US6073230A

    Publication date: 2000-06-06

    Application number: US873113

    Filing date: 1997-06-11

    CPC classification number: G06F9/3806 G06F12/0864 G06F9/3814 G06F2212/6082

    Abstract: An instruction fetch unit that employs sequential way prediction. The instruction fetch unit comprises a control unit configured to convey a first index and a first way to an instruction cache in a first clock cycle. The first index and first way select a first group of contiguous instruction bytes within the instruction cache, as well as a corresponding branch prediction block. The branch prediction block is stored in a branch prediction storage, and includes a predicted sequential way value. The control unit is further configured to convey a second index and a second way to the instruction cache in a second clock cycle succeeding the first clock cycle. This second index and second way select a second group of contiguous instruction bytes from the instruction cache. The second way is selected to be the predicted sequential way value stored in the branch prediction block corresponding to the first group of contiguous instruction bytes in response to a branch prediction algorithm employed by the control unit predicting a sequential execution path. Advantageously, a set associative instruction cache utilizing this method of way prediction may operate at higher frequencies (i.e., lower clock cycles) than if tag comparison were used to select the correct way.
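
    As a rough illustration of sequential way prediction, the sketch below stores a predicted sequential way next to each cache block and uses it to pick the way for the next sequential fetch without a tag comparison. The class `WayPredictedCache`, the function `fetch_sequential`, and the wrap-around indexing are hypothetical simplifications.

```python
class WayPredictedCache:
    """Hypothetical set-associative instruction cache with per-block way prediction."""

    def __init__(self, num_sets, num_ways):
        self.lines = [[None] * num_ways for _ in range(num_sets)]
        # Predicted sequential way, standing in for the branch prediction block.
        self.seq_way = [[0] * num_ways for _ in range(num_sets)]

    def fill(self, index, way, data, predicted_sequential_way):
        self.lines[index][way] = data
        self.seq_way[index][way] = predicted_sequential_way

    def fetch(self, index, way):
        # The way is supplied by the caller, so no tag comparison selects it.
        return self.lines[index][way], self.seq_way[index][way]


def fetch_sequential(cache, index, way, count):
    """Fetch `count` sequential groups, following the stored way predictions."""
    fetched = []
    for _ in range(count):
        data, next_way = cache.fetch(index, way)
        fetched.append(data)
        index = (index + 1) % len(cache.lines)  # next sequential index
        way = next_way                          # predicted sequential way
    return fetched
```

    A real design would still verify the predicted way against the tags afterwards; that check is left out of this sketch.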

    Speculative store buffer
    14.
    Granted invention patent
    Status: Expired

    Publication number: US6065103A

    Publication date: 2000-05-16

    Application number: US991915

    Filing date: 1997-12-16

    CPC classification number: G06F9/3842 G06F9/3834

    Abstract: A speculative store buffer is speculatively updated in response to speculative store memory operations buffered by a load/store unit in a microprocessor. Instead of performing dependency checking for load memory operations among the store memory operations buffered by the load/store unit, the load/store unit may perform a lookup in the speculative store buffer. If a hit is detected in the speculative store buffer, the speculative state of the memory location is forwarded from the speculative store buffer. The speculative state corresponds to the most recent speculative store memory operation, even if multiple speculative store memory operations are buffered by the load/store unit. Since dependency checking against the memory operation buffers is not performed, the dependency checking limitations as to the size of these buffers may be eliminated. The speed at which dependency checking can be performed may in large part be determined by the number of storage locations within the speculative store buffer (as opposed to the number of memory operations which may be buffered in the memory operation buffers or buffers).
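
    The lookup behaviour can be sketched in a few lines of Python. The class below is an illustrative assumption (unbounded capacity, word-granular addresses, a dictionary standing in for committed memory), not the structure claimed in the patent.

```python
class SpeculativeStoreBuffer:
    """Hypothetical model: one speculative value per memory location."""

    def __init__(self, memory):
        self.memory = memory    # committed state: address -> value
        self.speculative = {}   # address -> most recent speculative store value

    def speculative_store(self, address, value):
        # A later store to the same address overwrites the earlier one, so the
        # buffer always reflects the most recent speculative store.
        self.speculative[address] = value

    def load(self, address):
        # A load performs one lookup here instead of checking every buffered store.
        if address in self.speculative:
            return self.speculative[address]
        return self.memory.get(address, 0)
```

    Because the lookup cost depends on the size of the speculative buffer rather than on how many stores are queued, the sketch shows why the dependency-checking limit on the memory operation buffers goes away.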

    Method and apparatus for predecoding variable byte length instructions for fast scanning of instructions
    15.
    Granted invention patent
    Status: Expired

    Publication number: US5987235A

    Publication date: 1999-11-16

    Application number: US835082

    Filing date: 1997-04-04

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    CPC classification number: G06F9/382 G06F9/30152 G06F9/3816

    Abstract: A superscalar microprocessor is provided that includes a predecode unit configured to predecode variable byte-length instructions prior to their storage within an instruction cache. The predecode unit is configured to generate a plurality of predecode bits for each instruction byte. The plurality of predecode bits, called a predecode tag, associated with each instruction byte include a number of bits that indicates a number of byte positions to shift each instruction byte in order to align the instruction byte with a decode unit. Each decode unit includes a fixed number of instruction byte positions for storing bytes of instructions. A start byte of an instruction is conveyed to a first instruction byte position. The predecode tags are used by a multiplex and shift unit of an instruction alignment unit to shift the instruction bytes such that the start byte of an instruction is stored in a first instruction byte position of a decode unit. The subsequent instruction bytes of an instruction are stored in the remaining instruction bytes of the decode unit. Accordingly, relatively fast multiplexing of instructions may be obtained. The instruction alignment unit is not required to scan the instruction bytes for start bytes and end bytes. The predecode tag for each instruction byte indicates a number of byte positions to shift that byte. Accordingly, the instruction alignment unit may be a simple multiplexing and shift unit.

    Superscalar microprocessor employing a future file for storing results into multiportion registers
    16.
    Granted invention patent
    Status: Expired

    Publication number: US5983342A

    Publication date: 1999-11-09

    Application number: US711880

    Filing date: 1996-09-12

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    Abstract: A superscalar microprocessor includes a reorder buffer configured into multiple lines of storage, wherein a line of storage includes sufficient storage for instruction results regarding a predefined maximum number of concurrently dispatchable instructions. A line of storage is allocated whenever one or more instructions are dispatched. A microprocessor employing the reorder buffer is also configured with fixed, symmetrical issue positions. The symmetrical nature of the issue positions may increase the average number of instructions to be concurrently dispatched and executed by the microprocessor. The average number of unused locations within the line decreases as the average number of concurrently dispatched instructions increases. One particular implementation of the reorder buffer includes a future file. The future file comprises a storage location corresponding to each register within the microprocessor. The reorder buffer tag (or instruction result, if the instruction has executed) of the last instruction in program order to update the register is stored in the future file. The reorder buffer provides the value (either reorder buffer tag or instruction result) stored in the storage location corresponding to a register when the register is used as a source operand for another instruction. Another advantage of the future file for microprocessors which allow access and update to portions of registers is that narrow-to-wide dependencies are resolved upon completion of the instruction which updates the narrower register.

    Cache holding register for receiving instruction packets and for providing the instruction packets to a predecode unit and instruction cache
    17.
    Granted invention patent
    Status: Expired

    Publication number: US5983321A

    Publication date: 1999-11-09

    Application number: US815567

    Filing date: 1997-03-12

    Abstract: An instruction cache employing a cache holding register is provided. When a cache line of instruction bytes is fetched from main memory, the instruction bytes are temporarily stored into the cache holding register as they are received from main memory. The instruction bytes are predecoded as they are received from the main memory. If a predicted-taken branch instruction is encountered, the instruction fetch mechanism within the instruction cache begins fetching instructions from the target instruction path. This fetching may be initiated prior to receiving the complete cache line containing the predicted-taken branch instruction. As long as instruction fetches from the target instruction path continue to hit in the instruction cache, these instructions may be fetched and dispatched into a microprocessor employing the instruction cache. The remaining portion of the cache line of instruction bytes containing the predicted-taken branch instruction is received by the cache holding register. In order to reduce the number of ports employed upon the instruction bytes storage used to store cache lines of instructions, the cache holding register retains the cache line until an idle cycle occurs in the instruction bytes storage. The same port ordinarily used for fetching instructions is then used to store the cache line into the instruction bytes storage. In one embodiment, the instruction cache prefetches a succeeding cache line to the cache line which misses. A second cache holding register is employed for storing the prefetched cache line.

    Reorder buffer having an improved future file for storing speculative instruction execution results
    18.
    Granted invention patent
    Status: Expired

    Publication number: US5946468A

    Publication date: 1999-08-31

    Application number: US974967

    Filing date: 1997-11-20

    Abstract: A reorder buffer for a microprocessor comprising a control unit, an instruction storage, and future file. The future file has storage locations associated with each register implemented in the microprocessor. The future file is configured to store a reorder buffer tag that corresponds to the last instruction, in program order, stored within the instruction storage that has a destination operand corresponding to the register associated with said storage location. The future file is further configured to store instruction results. The control unit is configured to read a particular reorder buffer tag from the future file that corresponds to a completed instruction and to compare the particular reorder buffer tag with the completed instruction's result tag. If the two tags compare equal, the control unit is configured to write any result data corresponding to the completed instruction into the future file. This advantageously reduces the number of comparators needed to maintain the future file. The future file is also configured to improve branch misprediction recovery speed by examining each entry in said instruction storage for a valid destination starting with the mispredicted branch instruction. This configuration advantageously allows older instructions in the instruction storage to be retired while the future file is being recovered, thereby reducing the number of instructions the control unit must process to recover the future file.
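
    The tag-compare update rule lends itself to a short sketch. The `FutureFile` class, its method names, and the tuple encoding of entries are hypothetical; the point is only that a completed result is written when its tag matches the one tag held for the destination register.

```python
class FutureFile:
    """Hypothetical future file: one entry per architectural register."""

    def __init__(self, registers):
        # Each entry is ("value", result) or ("tag", reorder_buffer_tag).
        self.entries = {reg: ("value", 0) for reg in registers}

    def dispatch(self, dest, rob_tag):
        # The last instruction in program order to write `dest` owns the entry.
        self.entries[dest] = ("tag", rob_tag)

    def read(self, reg):
        # Source operand lookup: returns either a result or a tag to wait on.
        return self.entries[reg]

    def complete(self, dest, result_tag, result):
        # One comparison against the stored tag decides whether the result is
        # still the newest value for this register.
        if self.entries[dest] == ("tag", result_tag):
            self.entries[dest] = ("value", result)
            return True
        return False
```

    Since only the destination's entry is compared, a completing instruction needs one comparison instead of one per future file entry, which is the comparator saving the abstract mentions.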

    System for using a data history table to select among multiple data prefetch algorithms
    19.
    Granted invention patent
    Status: Expired

    Publication number: US5941981A

    Publication date: 1999-08-24

    Application number: US963276

    Filing date: 1997-11-03

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    CPC classification number: G06F9/383 G06F9/3455 G06F9/3832

    Abstract: A prefetch unit stores a plurality of prefetch control fields in a data history table. Each prefetch control field selects one of multiple prefetch algorithms for use in prefetching data. As an instruction stream is fetched, the fetch address is provided to the data history table for selecting a prefetch control field. Since multiple prefetch algorithms are supported, many different data reference patterns may be prefetched. The prefetch unit is configured to gauge the effectiveness of the selected prefetch algorithm, and to select a different prefetch algorithm if the selected prefetch algorithm is found to be ineffective. The prefetch unit monitors the load/store memory operations performed in response to the instruction stream (i.e. the non-prefetch memory operations) to determine the effectiveness. Alternatively, the prefetch unit may evaluate each of the prefetch algorithms with respect to the observed set of memory references and select the algorithm which is most accurate.
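
    A minimal sketch of a table that selects among prefetch algorithms is given below. The two algorithms (`next_line`, `stride`), the modulo indexing, the miss threshold, and the round-robin switching policy are all assumptions chosen for illustration; the abstract does not specify them.

```python
LINE_SIZE = 64  # assumption: cache line size in bytes


def next_line(history):
    # Prefetch the line following the most recent data address.
    return history[-1] + LINE_SIZE


def stride(history):
    # Prefetch by repeating the distance between the last two addresses.
    return history[-1] + (history[-1] - history[-2])


ALGORITHMS = [next_line, stride]


class DataHistoryTable:
    """Hypothetical table of prefetch control fields indexed by fetch address."""

    def __init__(self, entries=4, threshold=2):
        self.control = [0] * entries  # control field: index into ALGORITHMS
        self.misses = [0] * entries   # consecutive wrong predictions per entry
        self.threshold = threshold

    def _slot(self, fetch_address):
        return fetch_address % len(self.control)

    def predict(self, fetch_address, history):
        algorithm = ALGORITHMS[self.control[self._slot(fetch_address)]]
        return algorithm(history)

    def observe(self, fetch_address, predicted, actual):
        # Gauge effectiveness against the real (non-prefetch) memory operation.
        slot = self._slot(fetch_address)
        if predicted == actual:
            self.misses[slot] = 0
            return
        self.misses[slot] += 1
        if self.misses[slot] >= self.threshold:
            # Selected algorithm judged ineffective: switch to another one.
            self.control[slot] = (self.control[slot] + 1) % len(ALGORITHMS)
            self.misses[slot] = 0
```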

    Branch misprediction recovery in a reorder buffer having a future file
    20.
    Granted invention patent
    Status: Expired

    Publication number: US5915110A

    Publication date: 1999-06-22

    Application number: US975011

    Filing date: 1997-11-20

    Abstract: A reorder buffer for a microprocessor comprising a control unit, an instruction storage, and future file. The future file has storage locations associated with each register implemented in the microprocessor. The future file is configured to store a reorder buffer tag that corresponds to the last instruction, in program order, stored within the instruction storage that has a destination operand corresponding to the register associated with said storage location. The future file is further configured to store instruction results. The control unit is configured to read a particular reorder buffer tag from the future file that corresponds to a completed instruction and to compare the particular reorder buffer tag with the completed instruction's result tag. If the two tags compare equal, the control unit is configured to write any result data corresponding to the completed instruction into the future file. This advantageously reduces the number of comparators needed to maintain the future file. The future file is also configured to improve branch misprediction recovery speed by examining each entry in said instruction storage for a valid destination starting with the mispredicted branch instruction. This configuration advantageously allows older instructions in the instruction storage to be retired while the future file is being recovered, thereby reducing the number of instructions the control unit must process to recover the future file.
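
    One possible reading of the recovery scan is sketched below: starting at the mispredicted branch and walking toward older entries, the first valid destination found for each register becomes that register's future file entry. The function name, the entry layout, and the fallback to architectural register values are assumptions; the abstract does not spell out these details.

```python
def recover_future_file(instruction_storage, branch_position, architectural):
    """Rebuild future file contents after a mispredicted branch.

    instruction_storage: oldest-first list of dicts with keys
        "tag", "dest" (register name or None), "result" (value or None).
    branch_position: index of the mispredicted branch in that list.
    architectural: committed register values, used when no in-flight
        instruction at or before the branch writes a register.
    """
    recovered = {reg: ("value", value) for reg, value in architectural.items()}
    restored = set()
    # Scan from the branch toward older instructions; entries after the
    # branch are on the wrong path and are ignored.
    for entry in reversed(instruction_storage[:branch_position + 1]):
        dest = entry["dest"]
        if dest is None or dest in restored:
            continue  # no valid destination, or a newer writer was already found
        restored.add(dest)
        if entry["result"] is not None:
            recovered[dest] = ("value", entry["result"])
        else:
            recovered[dest] = ("tag", entry["tag"])
    return recovered
```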
