Method and apparatus for VLSI clock gated power estimation using LCB counts
    1.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20050159907A1

    Publication date: 2005-07-21

    Application No.: US10759936

    Filing date: 2004-01-16

    IPC classification: G06F17/50 G06F19/00

    CPC classification: G06F17/5036

    Abstract: A method, an apparatus, and a computer program are provided for modeling power consumption with improved accuracy. The improved model allows for variable operation of Local Clock Buffers (LCBs), which consume significant amounts of power. Also, the amount of power consumed by each LCB can be varied. These variations in the operation of LCBs and in the power they consume allow for a more realistic model of the operation of a given circuit or macro than the traditional “worst-case scenario” of power consumption.
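
    The contrast between a worst-case estimate and an activity-aware estimate can be illustrated with a small sketch. The abstract gives no formula, so the model below is an assumption: each LCB contributes its running power weighted by the fraction of cycles it is enabled, plus a residual power while gated. All names (`LCB`, `activity`, `gated_power_mw`) and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LCB:
    """One local clock buffer (every field is an illustrative assumption)."""
    name: str
    active_power_mw: float  # power drawn while the clock is running
    gated_power_mw: float   # residual power while the clock is gated off
    activity: float         # fraction of cycles the LCB is enabled, 0.0..1.0


def worst_case_power(lcbs):
    """Traditional estimate: every LCB runs on every cycle."""
    return sum(lcb.active_power_mw for lcb in lcbs)


def gated_power(lcbs):
    """Activity-weighted estimate: each LCB counts according to how often it runs."""
    total = 0.0
    for lcb in lcbs:
        if not 0.0 <= lcb.activity <= 1.0:
            raise ValueError(f"activity out of range for {lcb.name}")
        total += (lcb.activity * lcb.active_power_mw
                  + (1.0 - lcb.activity) * lcb.gated_power_mw)
    return total


# Hypothetical macro with three LCBs of differing size and activity.
macro = [
    LCB("lcb0", active_power_mw=2.0, gated_power_mw=0.0, activity=1.0),
    LCB("lcb1", active_power_mw=2.0, gated_power_mw=0.0, activity=0.5),
    LCB("lcb2", active_power_mw=4.0, gated_power_mw=1.0, activity=0.25),
]
print(worst_case_power(macro), gated_power(macro))
```

    With these made-up inputs the activity-weighted figure comes out well below the worst-case figure, which is the gap the abstract describes.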

    Dynamically rewriting branch instructions in response to cache line eviction
    2.
    Type: Granted patent
    Status: In force

    Publication No.: US08782381B2

    Publication date: 2014-07-15

    Application No.: US13444890

    Filing date: 2012-04-12

    IPC classification: G06F9/44

    Abstract: Mechanisms are provided for evicting cache lines from an instruction cache of a data processing system. The mechanisms store, for a portion of code in a current cache line, a linked list of call sites that directly or indirectly target the portion of code in the current cache line. A determination is made as to whether the current cache line is to be evicted from the instruction cache. The linked list of call sites is processed to identify one or more rewritten branch instructions, having associated branch stubs, that either directly or indirectly target the portion of code in the current cache line. In addition, the one or more rewritten branch instructions are rewritten to restore them to an original state based on information in the associated branch stubs.
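
    The eviction flow can be sketched as follows. This is a toy model, not the patented implementation: a Python list stands in for the linked list of call sites, branches are plain dictionaries, and the stub is assumed to hold a `saved_target` field recording the branch's pre-rewrite target. All of these names are hypothetical.

```python
class CacheLine:
    """A cache line plus the call sites that were rewritten to jump into it."""

    def __init__(self, code_address):
        self.code_address = code_address
        self.call_sites = []  # stands in for the linked list of call sites


def link_call_site(line, branch):
    """Rewrite `branch` to jump straight into `line` and record it for eviction."""
    branch["target"] = ("icache", line.code_address)
    branch["rewritten"] = True
    line.call_sites.append(branch)


def evict(line):
    """Restore every rewritten branch targeting `line`, then drop the list."""
    restored = 0
    for branch in line.call_sites:
        if branch["rewritten"]:
            # The stub is assumed to remember the branch's original target.
            branch["target"] = branch["stub"]["saved_target"]
            branch["rewritten"] = False
            restored += 1
    line.call_sites.clear()
    return restored


line = CacheLine(0x40)
b1 = {"target": "stub_b1", "rewritten": False, "stub": {"saved_target": "stub_b1"}}
link_call_site(line, b1)
direct_target = b1["target"]   # branch now points into the instruction cache
restored_count = evict(line)   # eviction undoes the rewrite
```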

    Dynamically rewriting branch instructions to directly target an instruction cache location
    3.
    Type: Granted patent
    Status: In force

    Publication No.: US08627051B2

    Publication date: 2014-01-07

    Application No.: US13442919

    Filing date: 2012-04-10

    IPC classification: G06F9/44

    CPC classification: G06F9/3806 G06F12/0875

    Abstract: Mechanisms are provided for dynamically rewriting branch instructions in a portion of code. The mechanisms execute a branch instruction in the portion of code. The mechanisms determine whether a target instruction of the branch instruction, to which the branch instruction branches, is present in an instruction cache associated with a processor. Moreover, the mechanisms directly branch execution of the portion of code to the target instruction in the instruction cache, without intervention from an instruction cache runtime system, in response to a determination that the target instruction is present in the instruction cache. In addition, the mechanisms redirect execution of the portion of code to the instruction cache runtime system in response to a determination that the target instruction cannot be determined to be present in the instruction cache.

    Arranging Binary Code Based on Call Graph Partitioning
    4.
    Type: Patent application
    Status: In force

    Publication No.: US20110321021A1

    Publication date: 2011-12-29

    Application No.: US12823244

    Filing date: 2010-06-25

    IPC classification: G06F9/45

    CPC classification: G06F8/4442

    Abstract: Mechanisms are provided for arranging binary code to reduce instruction cache conflict misses. These mechanisms generate a call graph of a portion of code. Nodes and edges in the call graph are weighted to generate a weighted call graph. The weighted call graph is then partitioned according to the weights, the affinities between nodes of the call graph, and the size of cache lines in an instruction cache of the data processing system, so that binary code associated with one or more subsets of nodes in the call graph is combined into individual cache lines based on the partitioning. The binary code corresponding to the partitioned call graph is then output for execution in a computing device.
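
    One simple way to partition a weighted call graph under a cache-line size limit is a greedy merge over the heaviest edges. The abstract does not specify the partitioning algorithm, so the heuristic below is a stand-in chosen for illustration; the function name, the size and weight tables, and the 128-byte line size are all hypothetical.

```python
def partition_call_graph(sizes, edges, line_size):
    """Greedily group functions so that each group fits in one cache line.

    sizes: {function name: code size in bytes}
    edges: {(caller, callee): call weight}
    Returns a list of sets of function names.
    """
    cluster_of = {name: name for name in sizes}   # function -> cluster id
    members = {name: {name} for name in sizes}    # cluster id -> functions
    cluster_size = dict(sizes)                    # cluster id -> bytes

    # Visit the heaviest (highest-affinity) edges first.
    for (caller, callee), _weight in sorted(edges.items(), key=lambda kv: -kv[1]):
        a, b = cluster_of[caller], cluster_of[callee]
        if a == b:
            continue  # already placed together
        if cluster_size[a] + cluster_size[b] > line_size:
            continue  # merged code would not fit in a single cache line
        # Merge cluster b into cluster a.
        for name in members[b]:
            cluster_of[name] = a
        members[a] |= members.pop(b)
        cluster_size[a] += cluster_size.pop(b)

    return list(members.values())


sizes = {"main": 48, "parse": 64, "emit": 32, "log": 96}
edges = {("main", "parse"): 100, ("parse", "emit"): 80, ("main", "log"): 5}
groups = partition_call_graph(sizes, edges, line_size=128)
```

    In this example `main` and `parse` share a line because their edge is heaviest and their combined size fits; `emit` and `log` stay on their own because adding either would exceed the line size.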

    Rewriting Branch Instructions Using Branch Stubs
    5.
    Type: Patent application
    Status: In force

    Publication No.: US20110321002A1

    Publication date: 2011-12-29

    Application No.: US12823204

    Filing date: 2010-06-25

    IPC classification: G06F9/44 G06F9/45

    Abstract: Mechanisms are provided for rewriting branch instructions in a portion of code. The mechanisms receive a portion of source code having an original branch instruction. The mechanisms generate a branch stub for the original branch instruction. The branch stub stores information about the original branch instruction, including an original target address of the original branch instruction. Moreover, the mechanisms rewrite the original branch instruction so that a target of the rewritten branch instruction references the branch stub. In addition, the mechanisms output compiled code including the rewritten branch instruction and the branch stub for execution by a computing device. The branch stub is utilized by the computing device at runtime to determine if execution of the rewritten branch instruction can be redirected directly to a target instruction corresponding to the original target address in an instruction cache of the computing device without intervention by an instruction cache runtime system.
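
    The two phases in this abstract, a compile-time rewrite and a run-time check through the stub, can be sketched roughly as below. This is a simplified model under stated assumptions: the instruction cache is a dictionary from original address to cache location, and `runtime_load` is a hypothetical callback representing the instruction cache runtime system.

```python
from dataclasses import dataclass


@dataclass
class BranchStub:
    original_target: int  # address the branch pointed to before rewriting


@dataclass
class Branch:
    target: object  # an int address before rewriting, a BranchStub afterwards


def rewrite_with_stubs(branches):
    """Compile-time step: point each branch at a stub that keeps its original target."""
    stubs = []
    for branch in branches:
        stub = BranchStub(original_target=branch.target)
        branch.target = stub
        stubs.append(stub)
    return stubs


def resolve(branch, icache, runtime_load):
    """Run-time step: use cached code directly when present, else call the runtime."""
    stub = branch.target
    location = icache.get(stub.original_target)
    if location is None:
        location = runtime_load(stub.original_target)  # slow path
    return location


branches = [Branch(0x1000), Branch(0x2000)]
stubs = rewrite_with_stubs(branches)

icache = {0x1000: 7}   # only the first target is already cached
runtime_calls = []


def runtime_load(address):
    """Hypothetical runtime: records the request and 'loads' the code at slot 9."""
    runtime_calls.append(address)
    icache[address] = 9
    return 9


fast = resolve(branches[0], icache, runtime_load)
slow = resolve(branches[1], icache, runtime_load)
```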

    Efficient Multi-Level Software Cache Using SIMD Vector Permute Functionality
    6.
    Type: Patent application
    Status: In force

    Publication No.: US20110161548A1

    Publication date: 2011-06-30

    Application No.: US12648667

    Filing date: 2009-12-29

    IPC classification: G06F12/08 G06F12/00

    Abstract: A cache manager receives a request for data, which includes a requested effective address. The cache manager determines whether the requested effective address matches a most recently used effective address stored in a mapped tag vector. When the most recently used effective address matches the requested effective address, the cache manager identifies a corresponding cache location and retrieves the data from the identified cache location. However, when the most recently used effective address fails to match the requested effective address, the cache manager determines whether the requested effective address matches a subsequent effective address stored in the mapped tag vector. When the cache manager determines a match to a subsequent effective address, the cache manager identifies a different cache location corresponding to the subsequent effective address and retrieves the data from the different cache location.
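
    The lookup order described here (most recently used tag first, then the remaining tags) can be modeled in a few lines. This sketch uses an ordinary list in place of a SIMD tag vector, and the step that moves a matched entry to the front is an assumption suggested by the title's mention of vector permutes rather than something the abstract states. The class and field names are hypothetical.

```python
class TagVectorCache:
    """Toy software cache whose tag list is ordered most recently used first."""

    def __init__(self, entries):
        # entries: list of (effective_address, cache_location) pairs
        self.tags = list(entries)
        self.storage = {}  # cache_location -> data

    def lookup(self, address):
        """Return (data, kind) where kind is 'mru', 'subsequent', or 'miss'."""
        if not self.tags:
            return None, "miss"
        mru_address, mru_location = self.tags[0]
        if mru_address == address:
            return self.storage.get(mru_location), "mru"  # fast path
        for index in range(1, len(self.tags)):
            tag_address, location = self.tags[index]
            if tag_address == address:
                # Assumption: promote the match so the next lookup hits the fast path.
                self.tags.insert(0, self.tags.pop(index))
                return self.storage.get(location), "subsequent"
        return None, "miss"


cache = TagVectorCache([(0x100, 0), (0x200, 1), (0x300, 2)])
cache.storage.update({0: "a", 1: "b", 2: "c"})
```

    Calling `cache.lookup(0x300)` twice in a row shows the effect of the promotion: the first call matches a subsequent tag, the second matches the most recently used one.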

    Combination of forwarding/bypass network with history file
    7.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20060224869A1

    Publication date: 2006-10-05

    Application No.: US11095908

    Filing date: 2005-03-31

    IPC classification: G06F9/44

    Abstract: An apparatus, a method, and a processor are provided for recovering the correct state of processor instructions in a processor. The apparatus contains a pipeline of latches, a register file, and a replay loop. The replay loop repairs incorrect results and inserts the repaired results back into the pipeline. A state machine detects incorrect results within the pipeline and sends the incorrect results to the replay loop. A correction module on the replay loop repairs the incorrect results and transmits the repaired results back into the pipeline. When an incorrect result enters the replay loop, a flush operation: ceases other operations within the pipeline; flushes the rest of the data results in the pipeline to the replay loop; opens the pipeline for the repaired results to be inserted; and eliminates any operations within the processor that would utilize the incorrect results.

    System and method for instruction line buffer holding a branch target buffer
    8.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20060179277A1

    Publication date: 2006-08-10

    Application No.: US11052502

    Filing date: 2005-02-04

    IPC classification: G06F9/30

    Abstract: A system and method are provided that maintain a relatively small Instruction Load Buffer (ILB) for scheduling instructions. Instructions are sent from Local Store (LS) to the ILB using either an inline prefetcher or a branch table buffer loader. In one embodiment, the prefetcher is a hardware-based prefetcher that fetches, in address order, the next instructions likely to be scheduled. In one embodiment, the predicted branch instructions are loaded as a result of a software program, such as a dispatcher, issuing a “load branch table buffer (loadbtb)” instruction. Predicted branch instructions are loaded in one area of the ILB and inline instructions are loaded in another area of the ILB. In one embodiment, the loadbtb loads the instruction line that includes the predicted branch target address as well as the instruction line that immediately follows it.

    Method and apparatus for cooperative software multitasking in a processor system with a partitioned register file
    9.
    Type: Granted patent
    Status: In force

    Publication No.: US08677101B2

    Publication date: 2014-03-18

    Application No.: US11759636

    Filing date: 2007-06-07

    IPC classification: G06F9/30 G06F9/40 G06F15/00

    Abstract: A processor system executes multiple applet programs within a software application program in an information handling system. The information handling system includes operating system software that manages processor system hardware and software in a multi-tasking environment. In particular, the operating system software manages partitioning of a register file in the processor system to achieve a cooperative relationship among multiple applet programs within respective partitions of the register file. In one embodiment, the operating system software manages unique applet IDs to modify register file partition sizes and locations during applet program instruction text execution. In one embodiment, applet ID masking hardware provides sharing of register file space among multiple copies of applet program code.

    Arithmetic decoding acceleration
    10.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US08520740B2

    Publication date: 2013-08-27

    Application No.: US12874564

    Filing date: 2010-09-02

    IPC classification: H04N7/12

    Abstract: Mechanisms are provided for decoding context-adaptive binary arithmetic coding (CABAC) encoded data. The mechanisms receive, in a first single instruction multiple data (SIMD) vector register of the data processing system, CABAC encoded data of a bit stream. The CABAC encoded data includes a value to be decoded and bit stream state information. The mechanisms receive, in a second SIMD vector register of the data processing system, CABAC decoder context information. The mechanisms process the value, the bit stream state information, and the CABAC decoder context information in a non-recursive manner to generate a decoded value, updated bit stream state information, and updated CABAC decoder context information. The mechanisms store, in a third SIMD vector register, a result vector that combines the decoded value, updated bit stream state information, and updated CABAC decoder context information. The mechanisms use the decoded value to generate a video output on the data processing system.
