101. Branch selector prediction
    Type: Granted invention patent
    Status: Expired

    Publication number: US5954816A

    Publication date: 1999-09-21

    Application number: US972988

    Filing date: 1997-11-19

    CPC classification number: G06F9/3844 G06F9/30054 G06F9/3806

    Abstract: A branch prediction unit includes a branch prediction entry corresponding to a group of contiguous instruction bytes. The branch prediction entry stores branch predictions corresponding to branch instructions within the group of contiguous instruction bytes. Additionally, the branch prediction entry stores a set of branch selectors corresponding to the group of contiguous instruction bytes. The branch selectors identify which branch prediction is to be selected if the corresponding byte (or bytes) is selected by the offset portion of the fetch address. Still further, a predicted branch selector is stored. The predicted branch selector is used to select a branch prediction for forming the fetch address. In parallel, a selected branch selector is selected from the set of branch selectors. The predicted branch selector is verified using the selected branch selector. If the selected branch selector and the predicted branch selector mismatch, the correct branch prediction is generated and the predicted branch selector is updated to indicate the selected branch selector.
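
The lookup and verification steps in this abstract can be sketched in Python as follows. This is a minimal illustration, not the patented circuit: the 16-byte group size, the one-selector-per-byte layout, the convention that selector 0 means "sequential fetch", and all names (`BranchPredictionEntry`, `resolve`, `predict`) are assumptions made for the example.

```python
from dataclasses import dataclass

GROUP_BYTES = 16  # assumed group size; the abstract does not give one


@dataclass
class BranchPredictionEntry:
    targets: list            # stored branch predictions (target addresses)
    selectors: list          # one selector per byte; 0 = sequential, k = targets[k - 1]
    predicted_selector: int  # stored guess, used first to form the fetch address


def resolve(entry, selector, group_base):
    """Turn a selector value into a next fetch address."""
    if selector == 0:
        return group_base + GROUP_BYTES
    return entry.targets[selector - 1]


def predict(entry, fetch_address):
    """Return (next_fetch_address, mismatch) for one fetch."""
    offset = fetch_address % GROUP_BYTES
    group_base = fetch_address - offset
    # Fast path: form the fetch address from the predicted selector.
    fast = resolve(entry, entry.predicted_selector, group_base)
    # Hardware would do this lookup in parallel with the fast path.
    selected = entry.selectors[offset]
    if selected == entry.predicted_selector:
        return fast, False
    # Mismatch: correct the prediction and update the stored guess.
    entry.predicted_selector = selected
    return resolve(entry, selected, group_base), True
```

In this model a mismatch costs a corrected prediction once; the next fetch with the same offset then takes the fast path because the stored selector has been updated.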

102. Speculative register storage for storing speculative results corresponding to register updated by a plurality of concurrently recorded instruction
    Type: Granted invention patent
    Status: Expired

    Publication number: US5933618A

    Publication date: 1999-08-03

    Application number: US550218

    Filing date: 1995-10-30

    Abstract: A microprocessor including a reorder buffer configured to store speculative register values regarding a particular register is provided. One value is stored for each set of concurrently decoded instructions which are outstanding within the microprocessor, reflecting the updates of each instruction within the set which updates the register. Additionally, the reorder buffer stores a set of constants indicative of the modification of the register by each instruction within the set of concurrently decoded instructions. Recovery from a mispredicted branch instruction (or from an instruction which causes an exception, a TRAP instruction, or an interrupt) may be achieved by utilizing the constants to adjust the result generated for the set of concurrently decoded instructions including the mispredicted branch instruction. The constants generated to indicate the modifications of the particular register may additionally allow multiple instructions having a dependency for the particular register to execute in parallel.
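
The role of the constants can be illustrated with a small Python sketch. It assumes the register is something like a stack pointer that each instruction changes by a known amount, and that the stored constants are cumulative (the change up to and including each instruction); the abstract does not say which form the constants take, and the function names are hypothetical.

```python
from itertools import accumulate


def decode_line(start_value, deltas):
    """Model one set of concurrently decoded instructions.

    deltas[i] is the constant by which instruction i modifies the register
    (for example -4 for a push, +4 for a pop, 0 for no update).
    Returns the single speculative value kept for the set, plus the
    cumulative constants kept alongside it.
    """
    constants = list(accumulate(deltas))
    line_value = start_value + (constants[-1] if constants else 0)
    return line_value, constants


def recover(line_value, constants, index):
    """Register value after instruction `index` when later ones are discarded."""
    return line_value - (constants[-1] - constants[index])


def operand_values(line_value, constants, deltas):
    """Register value seen by each instruction, computed without waiting on the others."""
    start = line_value - constants[-1]
    return [start + c - d for c, d in zip(constants, deltas)]
```

`recover` corresponds to the misprediction case: only one value is stored per set, and the constants are enough to back out the updates of the discarded instructions. `operand_values` corresponds to the last sentence of the abstract, where dependent instructions can proceed in parallel because each one's input is the set's starting value plus a known constant.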

103. Method for transferring data between a pair of caches configured to be accessed from different stages of an instruction processing pipeline
    Type: Granted invention patent
    Status: Expired

    Publication number: US5903910A

    Publication date: 1999-05-11

    Application number: US561073

    Filing date: 1995-11-20

    Abstract: A microprocessor including a pair of caches is provided. One of the pair of caches is accessed by stack-relative memory accesses from the decode stage of the instruction processing pipeline. The second of the pair of caches is accessed by memory accesses from the execute stage of the instruction processing pipeline. When a miss is detected in the first of the pair of caches, the stack-relative memory access which misses is conveyed to the execute stage of the instruction processing pipeline. When the stack-relative memory access accesses the second of the pair of caches, the cache line containing the access is transmitted to the first of the pair of caches for storage. The first of the pair of caches selects a victim line for replacement when the data is transferred from the second of the pair of caches. If the victim line has been modified while stored in the first cache, then the victim line is stored in a copyback buffer. A signal is asserted by the first cache to inform the second cache of the need to perform a victim line copyback. Requests from the execute stage of the instruction processing pipeline are stalled to allow the copyback to occur.
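
The miss, line transfer, and victim copyback sequence can be sketched as below. This is a simplified Python model under stated assumptions: the second cache is represented by a plain dict that always holds the requested line, the replacement policy (evict the oldest-inserted line) is invented for the example, and the class and method names are hypothetical.

```python
class StackCache:
    """First cache (decode stage); the second cache is modelled as a dict."""

    def __init__(self, capacity, second_cache):
        self.capacity = capacity
        self.lines = {}                # line address -> (data, modified flag)
        self.second = second_cache     # line address -> data
        self.copyback_buffer = None    # holds one modified victim line
        self.copyback_request = False  # signal asserted to the second cache

    def read(self, line_addr):
        if line_addr not in self.lines:        # miss: access goes to the execute stage
            data = self.second[line_addr]      # second cache supplies the line
            if len(self.lines) >= self.capacity:
                self._evict()
            self.lines[line_addr] = (data, False)
        return self.lines[line_addr][0]

    def write(self, line_addr, data):
        self.read(line_addr)                   # allocate the line on a miss
        self.lines[line_addr] = (data, True)

    def _evict(self):
        self.drain_copyback()                  # a pending copyback must finish first
        victim = next(iter(self.lines))        # assumed policy: oldest-inserted line
        data, modified = self.lines.pop(victim)
        if modified:
            self.copyback_buffer = (victim, data)
            self.copyback_request = True       # execute-stage requests would stall here

    def drain_copyback(self):
        """Second cache accepts the victim line while execute-stage requests are stalled."""
        if self.copyback_request:
            addr, data = self.copyback_buffer
            self.second[addr] = data
            self.copyback_buffer, self.copyback_request = None, False
```

An unmodified victim is simply dropped; only a modified victim occupies the copyback buffer and raises the request signal.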

104. Functional unit with a pointer for mispredicted resolution, and a superscalar microprocessor employing the same
    Type: Granted invention patent
    Status: Expired

    Publication number: US5822574A

    Publication date: 1998-10-13

    Application number: US819109

    Filing date: 1997-03-17

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    Abstract: A superscalar microprocessor is provided having functional units which receive a pointer (a reorder buffer tag) which is compared to the reorder buffer tags of the instructions currently being executed. The pointer identifies the oldest outstanding branch instruction. If a functional unit's reorder buffer tag matches the pointer, then that functional unit conveys its corrected fetch address to the instruction fetching mechanism of the superscalar microprocessor (i.e. the branch prediction unit). The superscalar microprocessor also includes a load/store unit which receives a pair of pointers identifying the oldest outstanding instructions which are not in condition for retirement. The load/store unit compares these pointers with the reorder buffer tags of load instructions that miss the data cache and store instructions. A match must be found before the associated instruction is presented to the data cache and the main memory system. The pointer-compare mechanism provides an ordering mechanism for load instructions that miss the data cache and store instructions.
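
The two pointer comparisons can be sketched as simple tag matches. In this hypothetical Python model each functional unit is a tuple of (reorder buffer tag, mispredicted flag, corrected fetch address); the mispredicted flag and both function names are assumptions added for the example.

```python
def corrected_fetch_address(oldest_branch_tag, functional_units):
    """Return the redirect address from the unit holding the oldest branch, if any.

    functional_units: list of (rob_tag, mispredicted, corrected_address).
    """
    for tag, mispredicted, address in functional_units:
        if tag == oldest_branch_tag and mispredicted:
            return address  # only the oldest outstanding branch may redirect fetch
    return None


def may_access_memory(rob_tag, oldest_pointers):
    """A store, or a load that missed the data cache, proceeds only when its
    tag matches one of the pointers to the oldest outstanding instructions."""
    return rob_tag in oldest_pointers
```

Because younger mispredicted branches never match the pointer, they cannot redirect fetch ahead of an older unresolved branch; the same match rule orders stores and cache-missing loads.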

105. Instruction cache configured to provide instructions to a microprocessor having a clock cycle time less than a cache access time of said instruction cache
    Type: Granted invention patent
    Status: Expired

    Publication number: US5752259A

    Publication date: 1998-05-12

    Application number: US621960

    Filing date: 1996-03-26

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    Abstract: An apparatus including a banked instruction cache and a branch prediction unit is provided. The banked instruction cache allows multiple instruction fetch addresses (comprising consecutive instruction blocks from the predicted instruction stream being executed by the microprocessor) to be fetched concurrently. The instruction cache provides an instruction block corresponding to one of the multiple fetch addresses to the instruction processing pipeline of the microprocessor during each consecutive clock cycle, while additional instruction fetch addresses from the predicted instruction stream are fetched. Preferably, the instruction cache includes at least a number of banks equal to the number of clock cycles consumed by an instruction cache access. In this manner, instructions may be provided during each consecutive clock cycle even though instruction cache access time is greater than the clock cycle time of the microprocessor. Because consecutive instruction blocks from the instruction stream are fetched concurrently, the branch prediction unit stores a prediction for a non-consecutive instruction block with each instruction block. For example, for an instruction cache having a cache access time which is twice the clock cycle time, a prediction for the second consecutive instruction block following a particular instruction block within the predicted instruction stream is stored. When a pair of consecutive instruction blocks are fetched, predictions for a second pair of consecutive instruction blocks within the instruction stream subsequent to the pair of consecutive instruction blocks are formed from the branch prediction information stored with respect to the pair of consecutive instruction blocks.
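
Two ideas from this abstract can be sketched in Python: overlapping accesses across banks so that one block arrives per clock cycle, and extending the fetch stream with predictions stored two blocks ahead. The rotation of blocks across banks, the default two-cycle access time, and the names `delivery_schedule`, `predicted_stream`, and `two_ahead` are assumptions for illustration.

```python
def delivery_schedule(block_addresses, access_cycles=2):
    """Return (cycle, bank, address) for each block delivered to the pipeline.

    Assumes one bank per access cycle, blocks assigned to banks in rotation,
    and a new access started every clock cycle.
    """
    banks = access_cycles
    schedule = []
    for start_cycle, address in enumerate(block_addresses):
        bank = start_cycle % banks  # this bank finished its previous access by now
        done_cycle = start_cycle + access_cycles
        schedule.append((done_cycle, bank, address))
    return schedule


def predicted_stream(first_pair, two_ahead, length):
    """Extend a pair of fetched blocks using predictions stored with each block.

    two_ahead[b] is the block predicted to follow block b by two positions.
    """
    stream = list(first_pair)
    while len(stream) < length:
        stream.append(two_ahead[stream[-2]])
    return stream[:length]
```

With a two-cycle access and two banks, the schedule shows blocks completing on consecutive cycles after the initial two-cycle latency. The prediction for the block two positions ahead is what lets the next access start before the block in between has returned.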

106. Array having an update circuit for updating a storage location with a value stored in another storage location
    Type: Granted invention patent
    Status: Expired

    Publication number: US5687110A

    Publication date: 1997-11-11

    Application number: US603802

    Filing date: 1996-02-20

    CPC classification number: G06F9/3844 G06F9/3857 G06F9/3863

    Abstract: A memory including first storage circuits for storing first values and second storage circuits for storing second values is provided. The first value may be retired branch prediction information, while the second value may be speculative branch prediction information. The speculative branch prediction information is updated when the corresponding instructions are fetched, and the retired branch prediction value is updated when the corresponding branch instruction is retired. The speculative branch prediction information is used to form branch predictions. Therefore, the speculatively fetched and executed branches influence subsequent branch predictions. Upon detection of a mispredicted branch or an instruction which causes an exception, the speculative branch prediction information is updated to the corresponding retired branch prediction information. An update circuit is coupled between the first and second storage circuits for transmitting the updated information upon assertion of a control signal. The control signal may be asserted to cause the update of each speculative branch prediction by the corresponding retired branch prediction. The updates occur substantially simultaneously, repairing any corruption of the speculative branch predictions caused by speculatively fetched branch instructions which were flushed from the instruction processing pipeline. Although discussed herein in terms of a branch prediction array, the memory may be adapted to many other applications.
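
The paired storage and the bulk restore can be sketched in Python as two parallel lists. This is an illustrative model only: the class name `DualArray` and its methods are hypothetical, and the stored values are treated as opaque rather than as any particular prediction encoding.

```python
class DualArray:
    """Each entry has a retired value and a speculative value."""

    def __init__(self, size, initial=0):
        self.retired = [initial] * size      # first storage: updated at retirement
        self.speculative = [initial] * size  # second storage: updated at fetch

    def update_on_fetch(self, index, value):
        self.speculative[index] = value

    def update_on_retire(self, index, value):
        self.retired[index] = value

    def predict(self, index):
        # Predictions are formed from the speculative copy.
        return self.speculative[index]

    def restore(self):
        """Control signal asserted: every speculative entry takes its retired value at once."""
        self.speculative = list(self.retired)
```

A single call to `restore` stands in for the update circuit: after a misprediction or exception, no per-entry bookkeeping is needed to undo the effects of flushed instructions.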

107. High performance RAM array circuit employing self-time clock generator for enabling array accessess
    Type: Granted invention patent
    Status: Expired

    Publication number: US5619464A

    Publication date: 1997-04-08

    Application number: US473103

    Filing date: 1995-06-07

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    CPC classification number: G11C7/22 G11C7/12

    Abstract: A RAM array circuit is provided which includes a memory array formed by several RAM cell columns. A particular cell within each column and row may be selected for access (either read or write) by an address decode circuit. The RAM array circuit employs a self-time column having a delay characteristic which is approximately equal to that of each of the RAM cell columns. The rising edge of a single-phase clock is used to precharge each RAM cell column as well as the self-time column. As the self-time column is precharged to a high level, the self-time control circuit disables the precharge and enables the array access for read or write. When a particular row is selected by the address decoding mechanism, the self-time column is discharged. Once the self-time column has discharged, a sense amplifier is enabled to read data from the array. Access is then disabled and precharge is again enabled upon the next rising edge of the clock.

108. Systems and methods for handling instructions of in-order and out-of-order execution queues
    Type: Granted invention patent
    Status: In force

    Publication number: US09110656B2

    Publication date: 2015-08-18

    Application number: US13210566

    Filing date: 2011-08-16

    Abstract: A processor includes a first execution queue configured to provide instructions of a first instruction type to a first execution unit, and a second execution queue configured to provide instructions of a second instruction type to a second execution unit. A first instruction of the second instruction type is received. The first instruction is decoded by a decode/issue unit to determine operands of the first instruction. The operands of the first instruction are determined to include a dependency on a second instruction of the first instruction type stored in a first entry of the first execution queue. The first instruction is stored in a first entry of the second execution queue. A synchronization indicator corresponding to the first instruction is set in a second entry of the first execution queue, immediately adjacent the first entry of the first execution queue, which indicates that the first instruction is stored in another execution queue.
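
The placement of the synchronization indicator can be sketched with two Python deques. The placement step follows the abstract; the readiness rule in `ready_in_second` (the dependent instruction may issue once its marker reaches the head of the first queue) is an assumption about how the indicator is consumed, which the abstract does not state. All names are hypothetical.

```python
from collections import deque


def issue_dependent(first_queue, second_queue, instruction):
    """Place `instruction` in the second queue and a sync marker in the first.

    Assumes the producer it depends on is the newest entry of first_queue,
    so appending puts the marker immediately adjacent to it.
    """
    second_queue.append(instruction)
    first_queue.append(("SYNC", instruction))


def ready_in_second(first_queue, second_queue):
    """Assumed rule: the head of the second queue may issue once its marker
    has reached the head of the first queue (the producer has left)."""
    return (
        bool(first_queue)
        and bool(second_queue)
        and first_queue[0] == ("SYNC", second_queue[0])
    )
```

Under this assumed rule the marker acts as a placeholder: it travels through the first queue behind the producer, so the cross-queue dependency is resolved by queue position rather than by a separate scoreboard.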

109. Systems and methods for reducing branch misprediction penalty
    Type: Granted invention patent
    Status: In force

    Publication number: US09092225B2

    Publication date: 2015-07-28

    Application number: US13362720

    Filing date: 2012-01-31

    CPC classification number: G06F9/3851 G06F9/3804 G06F9/381

    Abstract: In a processing system capable of single and multi-thread execution, a branch prediction unit can be configured to detect hard-to-predict branches and loop instructions. In a dual-threading (simultaneous multi-threading) configuration, one instruction queue (IQ) is used for each thread and instructions are alternately sent from each IQ to decode units. In single thread mode, the second IQ can be used to store the “not predicted path” of the hard-to-predict branch or the “fall-through” path of the loop. On mis-prediction, the mis-prediction penalty is reduced by getting the instructions from the second IQ instead of the instruction cache.
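
The single-thread use of the second instruction queue can be sketched as follows. This Python model is an illustration under assumptions: the instruction cache is a dict from address to a list of instructions, a counter stands in for the cost of a cache access, and the class and method names are hypothetical.

```python
class FrontEnd:
    def __init__(self, icache):
        self.icache = icache     # address -> list of instructions (hypothetical model)
        self.iq = [[], []]       # IQ0: predicted path, IQ1: alternate path (single-thread mode)
        self.icache_fetches = 0  # counts instruction cache accesses

    def _fetch(self, address):
        self.icache_fetches += 1
        return list(self.icache[address])

    def on_hard_branch(self, predicted_address, alternate_address):
        """On a hard-to-predict branch, fill both queues ahead of time."""
        self.iq[0] = self._fetch(predicted_address)
        self.iq[1] = self._fetch(alternate_address)

    def on_mispredict(self):
        """Switch to the stored alternate path without touching the instruction cache."""
        self.iq[0], self.iq[1] = self.iq[1], []
        return self.iq[0]
```

The fetch counter does not change on a misprediction, which is the point of the scheme: the alternate path was fetched while the second queue would otherwise have been idle.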

110. Microprocessor systems and methods for a combined register file and checkpoint repair register
    Type: Granted invention patent
    Status: In force

    Publication number: US09063747B2

    Publication date: 2015-06-23

    Application number: US13096282

    Filing date: 2011-04-28

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    CPC classification number: G06F9/3863 G06F9/30116 G06F9/3851

    Abstract: In a processor, a decode unit identifies instructions needing a checkpoint and enables selected checkpoints. A register file unit includes a plurality of architectural registers. A first set of checkpoint registers correspond to a first checkpoint. Each checkpoint register corresponds to a corresponding architectural register. A first set of indicators correspond to the first set of checkpoint registers to indicate whether the corresponding architectural register has been modified or is intended to be modified prior to enabling of the first checkpoint. A second set of indicators correspond to the first set of checkpoint registers and indicate whether the corresponding architectural register has been modified or is intended to be modified after enabling the first checkpoint.
