Thread-specific branch prediction by logically splitting branch history tables and predicted target address cache in a simultaneous multithreading processing environment
    1.
    Granted patent
    Status: lapsed

    Publication No.: US07120784B2

    Publication date: 2006-10-10

    Application No.: US10425064

    Filing date: 2003-04-28

    IPC classification: G06F9/40 G06F9/00

    Abstract: Branch prediction logic is enhanced to monitor for conditions indicating that separate BHTs and a separate predicted target address cache would yield better branch prediction. When a monitored condition occurs, the branch prediction logic logically splits the BHTs and the count cache so that one half of the address space is allocated to a first thread and the other half to a second thread. Prediction-generated addresses belonging to the first thread are then directed to the half of the array allocated to that thread, and those belonging to the second thread are directed to the other half. To split the array, the highest-order bit in the array is used to uniquely identify the addresses of the first and second threads.
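
The splitting scheme in this abstract can be illustrated with a minimal sketch. The table size, the 4-byte instruction width, and the name `bht_index` are assumptions made for illustration; the abstract gives no dimensions or indexing details beyond the use of the highest-order bit.

```python
# Hypothetical parameters: the abstract does not state a table size.
BHT_ENTRIES = 1024                         # assumed power-of-two table size
INDEX_BITS = BHT_ENTRIES.bit_length() - 1  # 10 index bits for 1024 entries


def bht_index(branch_addr: int, thread_id: int, split: bool) -> int:
    """Return the branch history table index for a branch address.

    Shared mode: every index bit comes from the branch address.
    Split mode: the highest-order index bit is replaced by the thread id
    (0 or 1), so each thread only ever touches its own half of the array.
    """
    word_addr = branch_addr >> 2           # assumes 4-byte instructions
    if not split:
        return word_addr & (BHT_ENTRIES - 1)
    low_bits = word_addr & ((BHT_ENTRIES >> 1) - 1)
    return (thread_id << (INDEX_BITS - 1)) | low_bits
```

In split mode, thread 0 always lands in entries 0 to 511 and thread 1 in entries 512 to 1023, so two threads running at similar addresses no longer overwrite each other's history.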


    Cache predictor for simultaneous multi-threaded processor system supporting multiple transactions
    2.
    Granted patent
    Status: in force

    Publication No.: US07039768B2

    Publication date: 2006-05-02

    Application No.: US10424487

    Filing date: 2003-04-25

    IPC classification: G06F12/00

    Abstract: A set-associative I-cache enables early cache hit prediction and correct way selection when the processor is executing instructions of multiple threads having similar effective addresses (EAs). Each way of the I-cache comprises an EA Directory (EA Dir), which includes a series of thread valid bits, each assigned to one of the multiple threads. A thread's valid bit is set in an EA Dir to indicate that an instruction block of that thread is cached within the way with which the EA Dir is associated. When a cache line request for a particular thread is received, a cache hit is predicted when the EA of the request matches the EA in the EA Dir, and the cache line is selected from the way whose EA Dir has the thread valid bit for that thread set. Early way selection is thus achieved, since way selection requires only a check of the thread valid bits.
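
A minimal sketch of the way-selection check described in this abstract follows. It models a single cache set only; the names `EADirEntry` and `select_way`, the two-thread limit, and the use of a plain integer tag are illustrative assumptions, not details from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional

NUM_THREADS = 2  # assumed thread count


@dataclass
class EADirEntry:
    """EA directory entry for one way of one set (hypothetical layout)."""
    ea_tag: Optional[int] = None
    thread_valid: List[bool] = field(
        default_factory=lambda: [False] * NUM_THREADS
    )


def select_way(ea_dirs: List[EADirEntry], ea_tag: int,
               thread_id: int) -> Optional[int]:
    """Return the predicted way for a request, or None on a predicted miss.

    A hit is predicted only when the EA tag matches and the requesting
    thread's valid bit is set in that way's EA directory entry.
    """
    for way, entry in enumerate(ea_dirs):
        if entry.ea_tag == ea_tag and entry.thread_valid[thread_id]:
            return way
    return None
```

When two threads cache different blocks under the same EA tag, both ways match on the tag, and the thread valid bit alone decides which way is selected.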


    Simultaneous multithread processor with result data delay path to adjust pipeline length for input to respective thread
    3.
    Granted patent
    Status: lapsed

    Publication No.: US07000233B2

    Publication date: 2006-02-14

    Application No.: US10422653

    Filing date: 2003-04-21

    IPC classification: G06F9/46

    Abstract: An SMT system has a single-thread mode and an SMT mode. Instructions are alternately selected from two threads every clock cycle and loaded into the IFAR in a three-cycle pipeline of the IFU. If a predicted-taken branch instruction is detected in the branch prediction circuit in stage three of the pipeline, then in the single-thread mode a calculated address from the branch prediction circuit is loaded into the IFAR on the next clock cycle. If the branch prediction circuit detects a predicted-taken branch in the SMT mode, then the selected instruction address is loaded into the IFAR on the first clock cycle following the detection, and the calculated target address is fed back and loaded into the IFAR on the second clock cycle following the detection. This feedback delay effectively switches the pipeline from three stages to four stages.
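
The timing difference between the two modes can be written down as a small sketch. It models only the order in which addresses reach the IFAR after a predicted-taken branch is detected; the function name and its parameters are hypothetical.

```python
from typing import List


def ifar_loads_after_taken(smt_mode: bool, target_addr: int,
                           selected_addr: int) -> List[int]:
    """List the addresses loaded into the IFAR, cycle by cycle, after a
    predicted-taken branch is detected.

    Single-thread mode: the calculated target is loaded on the next cycle.
    SMT mode: the already selected instruction address is loaded first and
    the calculated target follows one cycle later via the delay path.
    """
    if not smt_mode:
        return [target_addr]
    return [selected_addr, target_addr]
```

The extra cycle in SMT mode is what the abstract describes as lengthening the pipeline from three stages to four.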


    Apparatus and method of branch prediction utilizing a comparison of a branch history table to an aliasing table
    6.
    Granted patent
    Status: lapsed

    Publication No.: US06484256B1

    Publication date: 2002-11-19

    Application No.: US09370680

    Filing date: 1999-08-09

    IPC classification: G06F9/00

    CPC classification: G06F9/3806 G06F9/3848

    Abstract: Conditional branch instruction prediction is improved by detecting branch aliasing in a branch history table. Each entry in an aliasing table is associated with only one of a plurality of conditional branch instructions tracked by the branch history table. Before a conditional branch instruction is executed, its outcome is predicted using the branch history table entry associated with the instruction. The outcome is also predicted using the aliasing table entry associated with the instruction. Branch aliasing is detected by comparing the prediction made using the branch history table with the prediction made using the aliasing table. If the predictions differ, branch aliasing is determined to have occurred, and the prediction made using the aliasing table is used to predict the outcome of the conditional branch instruction.
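
The comparison step in this abstract can be sketched as follows. The data structures are assumptions: the branch history table is modelled as a list of taken/not-taken predictions indexed by address modulo its length, and the aliasing table as a dictionary keyed by the full branch address so that each entry belongs to exactly one branch.

```python
from typing import Dict, List, Tuple


def predict(bht: List[bool], alias_table: Dict[int, bool],
            branch_addr: int) -> Tuple[bool, bool]:
    """Return (prediction, aliasing_detected) for a conditional branch.

    The shared BHT entry may be polluted by other branches mapping to the
    same index. If the branch has its own aliasing-table entry and the two
    predictions differ, aliasing is assumed and the aliasing-table
    prediction is used instead.
    """
    bht_pred = bht[branch_addr % len(bht)]
    alias_pred = alias_table.get(branch_addr)
    if alias_pred is None:
        return bht_pred, False
    if alias_pred != bht_pred:
        return alias_pred, True
    return bht_pred, False
```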


    Apparatus and method for instruction fetching using a multi-port instruction cache directory
    7.
    Granted patent
    Status: lapsed

    Publication No.: US5918044A

    Publication date: 1999-06-29

    Application No.: US741465

    Filing date: 1996-10-31

    Abstract: In an instruction fetch unit for an information handling system which decodes instructions, calculates target addresses of multiple branch instructions, and resolves multiple branch instructions in parallel instead of sequentially, the critical path through a multiple-way set-associative instruction cache is through a directory and compare circuit which selects the way from which instructions will be retrieved. This path is known as the late select path. A multi-ported effective address (EA) directory is provided and is accessed prior to selection of a fetch address which fetches the next set of instructions from the cache. In this manner, the time required for the late select path can be reduced.


    Method and system of addressing which minimize memory utilized to store logical addresses by storing high order bits within a register
    8.
    Granted patent
    Status: lapsed

    Publication No.: US5765221A

    Publication date: 1998-06-09

    Application No.: US767568

    Filing date: 1996-12-16

    Abstract: An improved method of addressing within a pipelined processor having an address bit width of m+n bits is disclosed. The method includes storing m high-order bits corresponding to a first range of addresses, which encompasses a selected plurality of data executing within the pipelined processor. The n low-order bits of the addresses associated with each of the selected plurality of data are also stored. After the address of a subsequent datum to be executed within the processor is determined, the subsequent datum is fetched. In response to fetching a subsequent datum having an address outside of the first range of addresses, a status register is set to a first of two states to indicate that an update to the first address register is required. In response to the status register being set to the second of the two states, the subsequent datum is dispatched for execution within the pipelined processor. The n low-order bits of the subsequent datum are then stored, so that the memory required to store addresses of instructions executing within the pipelined processor is decreased.
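
The storage saving described in this abstract comes from keeping the m high-order bits once and only the n low-order bits per entry. A minimal sketch, assuming n = 12 and using hypothetical names (`AddressTable`, `record`, `full_address`); the boolean `update_required` stands in for the two-state status register.

```python
N_LOW_BITS = 12  # assumed value of n; the abstract leaves m and n open


class AddressTable:
    """Stores one shared high-order part and n low-order bits per entry."""

    def __init__(self, first_addr: int) -> None:
        self.high = first_addr >> N_LOW_BITS  # shared m high-order bits
        self.low = []                         # n low-order bits per entry
        self.update_required = False          # stand-in for status register

    def record(self, addr: int) -> bool:
        """Store addr if it lies in the current range.

        Returns False and flags an update when the address falls outside
        the range covered by the stored high-order bits.
        """
        if addr >> N_LOW_BITS != self.high:
            self.update_required = True
            return False
        self.low.append(addr & ((1 << N_LOW_BITS) - 1))
        return True

    def full_address(self, index: int) -> int:
        """Rebuild the full m+n bit address of a stored entry."""
        return (self.high << N_LOW_BITS) | self.low[index]
```

With 32-bit addresses and n = 12, each entry costs 12 bits instead of 32, at the price of one shared 20-bit register and an update step whenever execution leaves the current range.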


    METHODS FOR STORING BRANCH INFORMATION IN AN ADDRESS TABLE OF A PROCESSOR
    9.
    Patent application
    Status: in force

    Publication No.: US20080276080A1

    Publication date: 2008-11-06

    Application No.: US12171370

    Filing date: 2008-07-11

    IPC classification: G06F9/38 G06F9/44

    CPC classification: G06F9/3806

    Abstract: Methods for storing branch information in an address table of a processor are disclosed. A processor of the disclosed embodiments may generally include an instruction fetch unit connected to an instruction cache, a branch execution unit, and an address table connected to the instruction fetch unit and the branch execution unit. The address table may generally be adapted to store a plurality of entries, with each entry adapted to store a base address and a base instruction tag. In a further embodiment, the branch execution unit may be adapted to determine the address of a branch instruction having an instruction tag based on the base address and the base instruction tag of the address table entry associated with the instruction tag. In some embodiments, the address table may further be adapted to store branch information.
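
One plausible way to derive a branch address from a base address and a base instruction tag is sketched below. The abstract does not give the formula; this sketch assumes fixed 4-byte instructions and tags assigned sequentially to consecutive instructions, and the names `AddressTableEntry` and `branch_address` are hypothetical.

```python
from dataclasses import dataclass

INSTR_BYTES = 4  # assumed fixed instruction width


@dataclass(frozen=True)
class AddressTableEntry:
    """One address table entry (hypothetical layout)."""
    base_address: int  # address of the first instruction in the entry
    base_itag: int     # instruction tag of that first instruction


def branch_address(entry: AddressTableEntry, itag: int) -> int:
    """Compute a branch's address from its tag and the entry's base values.

    Assumes tags increase by one per sequential instruction, so the tag
    difference is the instruction offset from the base address.
    """
    return entry.base_address + (itag - entry.base_itag) * INSTR_BYTES
```

Under these assumptions, an entry with base address 0x2000 and base tag 16 places the instruction tagged 19 at 0x200C.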


    Fencing off instruction buffer until re-circulation of rejected preceding and branch instructions to avoid mispredict flush
    10.
    Granted patent
    Status: lapsed

    Publication No.: US07254700B2

    Publication date: 2007-08-07

    Application No.: US11056512

    Filing date: 2005-02-11

    IPC classification: G06F9/38

    Abstract: Systems and methods for handling a wrong branch prediction together with an instruction rejection in a digital processor are disclosed. More particularly, hardware and software are disclosed for detecting a condition where a branch instruction was mispredicted and an instruction that preceded the branch instruction is rejected after the branch instruction is executed. When the condition is detected, the branch instruction and the rejected instruction are recirculated for execution. Until the branch instruction is re-executed, control circuitry can prevent instructions from being received into an instruction buffer that feeds instructions to the execution units of the processor, by fencing the instruction buffer off from the fetcher. The instruction fetcher may continue fetching instructions along the branch target path into a local cache until the fence is dropped.
