DATA PROCESSING SYSTEM, PROCESSOR AND METHOD OF DATA PROCESSING HAVING IMPROVED BRANCH TARGET ADDRESS CACHE
    1.
    发明申请
    DATA PROCESSING SYSTEM, PROCESSOR AND METHOD OF DATA PROCESSING HAVING IMPROVED BRANCH TARGET ADDRESS CACHE 失效
    数据处理系统,具有改进的分支目标地址高速缓存的数据处理的处理器和方法

    公开(公告)号:US20090049286A1

    公开(公告)日:2009-02-19

    申请号:US11837893

    申请日:2007-08-13

    IPC分类号: G06F9/38

    CPC分类号: G06F9/3804 G06F9/3844

    摘要: A processor includes an execution unit and instruction sequencing logic that fetches instructions from a memory system for execution. The instruction sequencing logic includes branch logic that outputs predicted branch target addresses for use as instruction fetch addresses. The branch logic includes a level one branch target address cache (BTAC) and a level two BTAC each having a respective plurality of entries each associating at least a tag with a predicted branch target address. The branch logic accesses the level one and level two BTACs in parallel with a tag portion of a first instruction fetch address to obtain a first predicted branch target address from the level one BTAC for use as a second instruction fetch address in a first processor clock cycle and a second predicted branch target address from the level two BTAC for use as a third instruction fetch address in a later second processor clock cycle.

    摘要翻译: 处理器包括执行单元和从存储器系统执行指令的指令排序逻辑。 指令排序逻辑包括分支逻辑,该分支逻辑输出用作指令获取地址的预测分支目标地址。 分支逻辑包括一级分支目标地址高速缓存(BTAC)和二级BTAC,每级具有相应的多个条目,每个条目将至少一个标签与预测的分支目标地址相关联。 分支逻辑与第一指令获取地址的标签部分并行地访问一级和二级BTAC以从第一级BTAC获得第一预测分支目标地址,以在第一处理器时钟周期中用作第二指令获取地址 以及来自第二级BTAC的第二预测分支目标地址,以在随后的第二处理器时钟周期中用作第三指令提取地址。

    Branch target address cache
    2.
    发明授权
    Branch target address cache 失效
    分支目标地址缓存

    公开(公告)号:US07783870B2

    公开(公告)日:2010-08-24

    申请号:US11837893

    申请日:2007-08-13

    IPC分类号: G06F9/38 G06F9/32

    CPC分类号: G06F9/3804 G06F9/3844

    摘要: A processor includes an execution unit and instruction sequencing logic that fetches instructions from a memory system for execution. The instruction sequencing logic includes branch logic that outputs predicted branch target addresses for use as instruction fetch addresses. The branch logic includes a level one branch target address cache (BTAC) and a level two BTAC each having a respective plurality of entries each associating at least a tag with a predicted branch target address. The branch logic accesses the level one and level two BTACs in parallel with a tag portion of a first instruction fetch address to obtain a first predicted branch target address from the level one BTAC for use as a second instruction fetch address in a first processor clock cycle and a second predicted branch target address from the level two BTAC for use as a third instruction fetch address in a later second processor clock cycle.

    摘要翻译: 处理器包括执行单元和从存储器系统执行指令的指令排序逻辑。 指令排序逻辑包括分支逻辑,该分支逻辑输出用作指令获取地址的预测分支目标地址。 分支逻辑包括一级分支目标地址高速缓存(BTAC)和二级BTAC,每级具有相应的多个条目,每个条目将至少一个标签与预测的分支目标地址相关联。 分支逻辑与第一指令获取地址的标签部分并行地访问一级和二级BTAC以从第一级BTAC获得第一预测分支目标地址,以在第一处理器时钟周期中用作第二指令获取地址 以及来自第二级BTAC的第二预测分支目标地址,以在随后的第二处理器时钟周期中用作第三指令提取地址。

    DATA PROCESSING SYSTEM, PROCESSOR AND METHOD OF DATA PROCESSING HAVING BRANCH TARGET ADDRESS CACHE SELECTIVELY APPLYING A DELAYED HIT
    3.
    发明申请
    DATA PROCESSING SYSTEM, PROCESSOR AND METHOD OF DATA PROCESSING HAVING BRANCH TARGET ADDRESS CACHE SELECTIVELY APPLYING A DELAYED HIT 有权
    数据处理系统,具有分支目标地址高速缓存的数据处理的处理器和方法选择性地应用延迟的HIT

    公开(公告)号:US20090198982A1

    公开(公告)日:2009-08-06

    申请号:US12024190

    申请日:2008-02-01

    IPC分类号: G06F9/30

    摘要: In at least one embodiment, a processor includes at least one execution unit that executes instructions and instruction sequencing logic, coupled to the at least one execution unit, that fetches instructions from a memory system for execution by the at least one execution unit. The instruction sequencing logic including branch target address prediction circuitry that stores a branch target address prediction associating a first instruction fetch address with a branch target address to be used as a second instruction fetch address. The branch target address prediction circuitry includes delay logic that, in response to at least a tag portion of a third instruction fetch address matching the first instruction fetch address, delays access to the memory system utilizing the second instruction fetch address if no branch target address prediction was made in an immediately previous cycle of operation.

    摘要翻译: 在至少一个实施例中,处理器包括至少一个执行单元,其执行耦合到所述至少一个执行单元的指令和指令排序逻辑,所述指令和指令排序逻辑从存储器系统中取出用于由所述至少一个执行单元执行的指令。 指令排序逻辑包括分支目标地址预测电路,其存储将第一指令取出地址与要用作第二指令取出地址的分支目标地址相关联的分支目标地址预测。 分支目标地址预测电路包括延迟逻辑,响应于与第一指令获取地址匹配的第三指令获取地址的至少一个标签部分,如果没有分支目标地址预测,则使用第二指令获取地址延迟对存储器系统的访问 是在上一个操作循环中进行的。

    Address translation through an intermediate address space
    4.
    发明授权
    Address translation through an intermediate address space 有权
    通过中间地址空间进行地址转换

    公开(公告)号:US08966219B2

    公开(公告)日:2015-02-24

    申请号:US11928125

    申请日:2007-10-30

    IPC分类号: G06F12/10

    CPC分类号: G06F12/1063 G06F12/1072

    摘要: In a data processing system capable of concurrently executing multiple hardware threads of execution, an intermediate address translation unit in a processing unit translates an effective address for a memory access into an intermediate address. A cache memory is accessed utilizing the intermediate address. In response to a miss in cache memory, the intermediate address is translated into a real address by a real address translation unit that performs address translation for multiple hardware threads of execution. The system memory is accessed with the real address.

    摘要翻译: 在能够同时执行多个硬件执行线程的数据处理系统中,处理单元中的中间地址转换单元将存储器访问的有效地址转换为中间地址。 使用中间地址访问高速缓冲存储器。 响应于高速缓冲存储器中的缺失,中间地址被实现地址转换单元转换成实地址,该单元执行多个硬件执行线程的地址转换。 使用实际地址访问系统内存。

    Read and write aware cache with a read portion and a write portion of a tag and status array
    5.
    发明授权
    Read and write aware cache with a read portion and a write portion of a tag and status array 有权
    具有读取部分和标签和状态数组的写入部分的读写感知高速缓存

    公开(公告)号:US08843705B2

    公开(公告)日:2014-09-23

    申请号:US13572916

    申请日:2012-08-13

    IPC分类号: G06F12/08

    摘要: A mechanism is provided in a cache for providing a read and write aware cache. The mechanism partitions a large cache into a read-often region and a write-often region. The mechanism considers read/write frequency in a non-uniform cache architecture replacement policy. A frequently written cache line is placed in one of the farther banks. A frequently read cache line is placed in one of the closer banks. The size ratio between read-often and write-often regions may be static or dynamic. The boundary between the read-often region and the write-often region may be distinct or fuzzy.

    摘要翻译: 在缓存中提供了一种机制,用于提供读写感知高速缓存。 该机制将大型缓存分区分为常读区域和经常写区域。 该机制将读/写频率视为非均匀缓存架构替换策略。 经常写入的高速缓存行放置在更远的存储区之一中。 经常读取的高速缓存行被放置在其中一个较近的存储体中。 常读区域和经常写区域之间的大小比可以是静态的或动态的。 经常读区域和经常写区域之间的边界可能是不同的或模糊的。

    Instruction set architecture extensions for performing power versus performance tradeoffs
    6.
    发明授权
    Instruction set architecture extensions for performing power versus performance tradeoffs 失效
    用于执行功率与性能折衷的指令集架构扩展

    公开(公告)号:US08589665B2

    公开(公告)日:2013-11-19

    申请号:US12788940

    申请日:2010-05-27

    IPC分类号: G06F9/00

    摘要: Mechanisms are provided for processing an instruction in a processor of a data processing system. The mechanisms operate to receive, in a processor of the data processing system, an instruction, the instruction including power/performance tradeoff information associated with the instruction. The mechanisms further operate to determine power/performance tradeoff priorities or criteria, specifying whether power conservation or performance is prioritized with regard to execution of the instruction, based on the power/performance tradeoff information. Moreover, the mechanisms process the instruction in accordance with the power/performance tradeoff priorities or criteria identified based on the power/performance tradeoff information of the instruction.

    摘要翻译: 提供了用于处理数据处理系统的处理器中的指令的机制。 这些机制操作以在数据处理系统的处理器中接收指令,该指令包括与指令相关联的功率/性能权衡信息。 这些机制进一步操作以基于功率/性能折衷信息来确定功率/性能折衷优先级或标准,指定功率节省或关于指令的执行是否优先的性能。 此外,机构根据功率/性能折衷优先级或基于指令的功率/性能折衷信息识别的标准处理指令。

    Techniques for dynamically sharing a fabric to facilitate off-chip communication for multiple on-chip units
    7.
    发明授权
    Techniques for dynamically sharing a fabric to facilitate off-chip communication for multiple on-chip units 失效
    用于动态共享结构以促进多个片上单元的片外通信的技术

    公开(公告)号:US08346988B2

    公开(公告)日:2013-01-01

    申请号:US12786716

    申请日:2010-05-25

    IPC分类号: G06F3/00 G06F13/00

    摘要: A technique for sharing a fabric to facilitate off-chip communication for on-chip units includes dynamically assigning a first unit that implements a first communication protocol to a first portion of the fabric when private fabrics are indicated for the on-chip units. The technique also includes dynamically assigning a second unit that implements a second communication protocol to a second portion of the fabric when the private fabrics are indicated for the on-chip units. In this case, the first and second units are integrated in a same chip and the first and second protocols are different. The technique further includes dynamically assigning, based on off-chip traffic requirements of the first and second units, the first unit or the second unit to the first and second portions of the fabric when the private fabrics are not indicated for the on-chip units.

    摘要翻译: 一种用于共享一个结构以促进片上单元的片外通信的技术包括:当针对片上单元指示专用结构时,动态分配实现第一通信协议的第一单元到该结构的第一部分。 该技术还包括当为片上单元指示专用结构时,动态地将实现第二通信协议的第二单元分配给该结构的第二部分。 在这种情况下,第一和第二单元集成在相同的芯片中,并且第一和第二协议是不同的。 该技术还包括:当私有结构未被指示用于片上单元时,基于第一单元或第二单元的片外流量要求将第一单元或第二单元动态地分配给该结构的第一和第二部分 。

    Fine Grained Cache Allocation
    8.
    发明申请
    Fine Grained Cache Allocation 有权
    细粒度缓存分配

    公开(公告)号:US20110022773A1

    公开(公告)日:2011-01-27

    申请号:US12509752

    申请日:2009-07-27

    IPC分类号: G06F12/08 G06F12/00

    摘要: A mechanism is provided in a virtual machine monitor for fine grained cache allocation in a shared cache. The mechanism partitions a cache tag into a most significant bit (MSB) portion and a least significant bit (LSB) portion. The MSB portion of the tags is shared among the cache lines in a set. The LSB portion of the tags is private, one per cache line. The mechanism allows software to set the MSB portion of tags in a cache to allocate sets of cache lines. The cache controller determines whether a cache line is locked based on the MSB portion of the tag.

    摘要翻译: 在虚拟机监视器中提供了用于共享高速缓存中的细粒度高速缓存分配的机制。 该机制将高速缓存标签分成最高有效位(MSB)部分和最低有效位(LSB)部分。 标签的MSB部分在一组中的高速缓存行之间共享。 标签的LSB部分是私有的,每个缓存行一个。 该机制允许软件将缓存中的标签的MSB部分设置为分配高速缓存行集合。 高速缓存控制器基于标签的MSB部分来确定高速缓存行是否被锁定。

    Techniques for Indirect Data Prefetching
    10.
    发明申请
    Techniques for Indirect Data Prefetching 有权
    间接数据预取技术

    公开(公告)号:US20090198950A1

    公开(公告)日:2009-08-06

    申请号:US12024239

    申请日:2008-02-01

    IPC分类号: G06F12/02 G06F12/10

    CPC分类号: G06F12/0862 G06F2212/6028

    摘要: A processor includes a first address translation engine, a second address translation engine, and a prefetch engine. The first address translation engine is configured to determine a first memory address of a pointer associated with a data prefetch instruction. The prefetch engine is coupled to the first translation engine and is configured to fetch content, included in a first data block (e.g., a first cache line) of a memory, at the first memory address. The second address translation engine is coupled to the prefetch engine and is configured to determine a second memory address based on the content of the memory at the first memory address. The prefetch engine is also configured to fetch (e.g., from the memory or another memory) a second data block (e.g., a second cache line) that includes data at the second memory address.

    摘要翻译: 处理器包括第一地址转换引擎,第二地址转换引擎和预取引擎。 第一地址转换引擎被配置为确定与数据预取指令相关联的指针的第一存储器地址。 预取引擎被耦合到第一翻译引擎,并被配置为在第一存储器地址处提取包含在存储器的第一数据块(例如,第一高速缓存行)中的内容。 第二地址转换引擎耦合到预取引擎,并且被配置为基于第一存储器地址处的存储器的内容来确定第二存储器地址。 预取引擎还被配置为从第二存储器地址提取包括数据的第二数据块(例如,第二高速缓存行)(例如,从存储器或另一存储器)。