TWO-STAGE COMMIT (TSC) REGION FOR DYNAMIC BINARY OPTIMIZATION IN X86
    71.
    Patent application
    Status: In force

    Publication No.: US20110145551A1

    Publication date: 2011-06-16

    Application No.: US12639251

    Filing date: 2009-12-16

    Applicants: Cheng Wang, Youfeng Wu

    Inventors: Cheng Wang, Youfeng Wu

    IPC classification: G06F12/00 G06F9/312

    Abstract: Generally, the present disclosure provides systems and methods to generate a two-stage commit (TSC) region which has two separate commit stages. Frequently executed code may be identified and combined for the TSC region. Binary optimization operations may be performed on the TSC region to enable the code to run more efficiently by, for example, reordering load and store instructions. In the first stage, load operations in the region may be committed atomically, and in the second stage, store operations in the region may be committed atomically.

    COMPILER TECHNIQUE FOR EFFICIENT REGISTER CHECKPOINTING TO SUPPORT TRANSACTION ROLL-BACK
    72.
    Patent application
    Status: In force

    Publication No.: US20100306512A1

    Publication date: 2010-12-02

    Application No.: US12856505

    Filing date: 2010-08-13

    Applicants: Cheng Wang, Youfeng Wu

    Inventors: Cheng Wang, Youfeng Wu

    IPC classification: G06F9/312

    Abstract: A method and apparatus for efficient register checkpointing is herein described. A transaction is detected in program code. A recovery block is inserted in the program code to perform recovery operations in response to an abort of the transaction. A roll-back edge is potentially inserted from an abort point to the recovery block. A control flow edge is inserted from the recovery block to an entry point of the transaction. Checkpoint code is inserted before the entry point to back up live-in registers in backup storage elements, and recovery code is inserted in the recovery block to restore the live-in registers from the backup storage elements in response to an abort of the transaction.
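
The checkpoint-and-recover flow in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions: registers are modeled as a Python dict, and `Abort`, `run_with_checkpoint`, and `max_attempts` are hypothetical names chosen for illustration, not anything defined by the patent.

```python
class Abort(Exception):
    """Hypothetical signal raised by the transaction body when it aborts."""


def run_with_checkpoint(regs, live_in, body, max_attempts=3):
    """Run `body` as a transaction, restoring live-in registers on abort.

    regs:    dict mapping register name -> value (toy register file)
    live_in: names of registers that are live on entry to the transaction
    body:    callable(regs, attempt) that may raise Abort
    """
    # Checkpoint code: runs once, before the transaction entry point.
    backup = {name: regs[name] for name in live_in}

    for attempt in range(max_attempts):
        try:
            body(regs, attempt)  # transaction entry point
            return True          # transaction committed
        except Abort:
            # Recovery block, reached over the roll-back edge:
            # restore the live-in registers from the backup storage.
            for name, value in backup.items():
                regs[name] = value
            # Continuing the loop plays the role of the control flow
            # edge from the recovery block back to the entry point.
    return False


def example_body(regs, attempt):
    regs["r1"] += 1              # clobbers a live-in register
    if attempt == 0:
        raise Abort()            # first attempt aborts
    regs["r2"] = regs["r1"] * 2
```

Only the registers named in `live_in` are saved, which is the point of the technique: registers that are not live on entry do not need a backup.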

    Method and system for reducing program code size
    73.
    Granted patent
    Status: In force

    Publication No.: US07840953B2

    Publication date: 2010-11-23

    Application No.: US11020481

    Filing date: 2004-12-22

    IPC classification: G06F9/45

    CPC classification: G06F8/4434

    Abstract: In a method for reducing code size, a replaceable subset of instructions at a first location within a set of instructions and a matching target subset of instructions at a second location within the set of instructions are identified. A base offset and a relative offset are determined. The base offset and the relative offset indicate an absolute offset from the first location to the second location. An instruction to cause a base offset storage element to be loaded with the base offset is inserted prior to the first location. The replaceable subset of instructions is replaced with a second instruction to cause a program counter to be modified based on the relative offset and a value in the base offset register so that the modified program counter indicates the second location.

    Software constructed strands for execution on a multi-core architecture
    74.
    Patent application
    Status: In force

    Publication No.: US20090077360A1

    Publication date: 2009-03-19

    Application No.: US11901644

    Filing date: 2007-09-18

    IPC classification: G06F9/44 G06F9/38

    CPC classification: G06F8/433

    Abstract: In one embodiment, the present invention includes a software-controlled method of forming instruction strands. The software may include instructions to obtain code of a superblock including a plurality of basic blocks, build a dependency directed acyclic graph (DAG) for the code, sort nodes coupled by edges of the dependency DAG into a topological order, form strands from the nodes based on hardware constraints, rule constraints, and scheduling constraints, and generate executable code for the strands and store the executable code in a storage. Other embodiments are described and claimed.
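
The DAG-building, topological-sort, and strand-forming steps can be sketched roughly as follows. The assumptions are substantial: instructions are reduced to `(dest, sources)` tuples, only read-after-write dependencies are tracked, and a single hypothetical `max_len` limit stands in for the hardware, rule, and scheduling constraints the abstract mentions. `build_dag` and `form_strands` are illustrative names.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_dag(instrs):
    """Build a dependency DAG from a list of (dest, sources) tuples.

    Returns {node_index: set_of_predecessor_indices}. Only true
    (read-after-write) dependencies are modeled in this sketch.
    """
    last_writer = {}
    preds = {}
    for i, (dest, srcs) in enumerate(instrs):
        preds[i] = {last_writer[s] for s in srcs if s in last_writer}
        last_writer[dest] = i
    return preds


def form_strands(preds, max_len=3):
    """Greedily group nodes into strands (dependent chains).

    A node joins the strand of a predecessor when that predecessor is
    still the strand's last node and the strand is shorter than
    max_len; otherwise the node starts a new strand.
    """
    order = list(TopologicalSorter(preds).static_order())
    strands = []
    tail_of = {}  # node -> index of the strand it currently terminates
    for node in order:
        home = None
        for p in sorted(preds[node]):
            s = tail_of.get(p)
            if s is not None and len(strands[s]) < max_len:
                home = s
                del tail_of[p]  # p is no longer the tail of its strand
                break
        if home is None:
            strands.append([])
            home = len(strands) - 1
        strands[home].append(node)
        tail_of[node] = home
    return strands
```

For the four-instruction input `[("r1", []), ("r2", ["r1"]), ("r3", ["r2"]), ("r4", [])]`, the dependent chain 0, 1, 2 ends up in one strand and the independent instruction 3 in another.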

    Transient Fault Detection by Integrating an SRMT Code and a Non SRMT Code in a Single Application

    Publication No.: US20080282116A1

    Publication date: 2008-11-13

    Application No.: US11770095

    Filing date: 2007-06-28

    Applicants: Cheng Wang, Youfeng Wu

    Inventors: Cheng Wang, Youfeng Wu

    IPC classification: G06F9/44 G06F11/00

    Abstract: Disclosed is a method for running a first code generated by a Software-based Redundant Multi-Threading (SRMT) compiler along with a second code generated by a normal compiler at runtime, the first code including a first function and a second function, the second code including a third function. The method comprises running the first function in a leading thread and a trailing thread (104); running the third function in a single thread (106), wherein the leading thread calls the third function; and running the second function in the leading thread and the trailing thread (108), wherein the third function calls the second function. The present disclosure provides a mechanism for handling function calls wherein SRMT functions and binary functions can call each other irrespective of whether the callee function is an SRMT function or a binary function, and thereby dynamically adjusts the reliability and performance tradeoff based on run-time information and user-selectable policies.

    Genetic algorithm for microcode compression
    76.
    Granted patent
    Status: In force

    Publication No.: US07451121B2

    Publication date: 2008-11-11

    Application No.: US11237562

    Filing date: 2005-09-27

    IPC classification: G06F15/18 G06N3/00 G06N3/12

    CPC classification: G06N3/126 G06F9/3017

    Abstract: A method to compress microcode utilizing a genetic algorithm includes generating a population of chromosomes, each chromosome including one or more elements that indicate a cluster to which a portion of microcode memory belongs. The method further includes determining a fitness value of each chromosome and modifying the population of chromosomes based on the fitness values of the chromosomes to generate a new population of chromosomes. In addition, the method includes compressing the microcode memory using a cluster-based compression technique, wherein clusters are selected according to a chromosome from the new population with the best fitness value. Other embodiments are also disclosed.
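
A toy version of the chromosome encoding and selection loop might look like the sketch below. The abstract does not specify the cluster-based compression technique, so the `cost` function here is an invented stand-in: each cluster stores one full-width template plus, per word, only the bit columns that vary within the cluster. `WIDTH`, `cost`, `evolve`, and all parameter defaults are hypothetical.

```python
import random

WIDTH = 8  # bits per microcode word (toy value, an assumption)


def cost(words, chromosome, n_clusters):
    """Toy compressed size in bits; lower is fitter.

    chromosome[i] is the cluster id assigned to words[i].
    """
    total = 0
    for c in range(n_clusters):
        members = [w for w, g in zip(words, chromosome) if g == c]
        if not members:
            continue
        varying = 0
        for w in members:
            varying |= w ^ members[0]  # bit columns that differ in-cluster
        total += WIDTH + len(members) * bin(varying).count("1")
    return total


def evolve(words, n_clusters=2, pop_size=20, generations=30, seed=0):
    """Return the fittest chromosome found (needs len(words) >= 2)."""
    rng = random.Random(seed)
    n = len(words)
    population = [[rng.randrange(n_clusters) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by fitness and keep the better half unchanged (elitism).
        population.sort(key=lambda ch: cost(words, ch, n_clusters))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n)] = rng.randrange(n_clusters)  # mutation
            children.append(child)
        population = survivors + children
    return min(population, key=lambda ch: cost(words, ch, n_clusters))
```

With the words `[0b00001111, 0b00001110, 0b11110000, 0b11110001]`, grouping similar words as `[0, 0, 1, 1]` costs 20 bits under this toy measure, while putting everything in one cluster costs 40, which is the kind of difference the fitness value is meant to expose.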

    Methods and apparatus to compile a software program to manage parallel μcaches
    77.
    Granted patent
    Status: In force

    Publication No.: US07448031B2

    Publication date: 2008-11-04

    Application No.: US10739500

    Filing date: 2003-12-17

    Applicant: Youfeng Wu

    Inventor: Youfeng Wu

    IPC classification: G06F9/45

    Abstract: Methods and apparatus to compile a software program to manage parallel μcaches are disclosed. In an example method, a compiler attempts to schedule a software program such that load instructions in a first set of load instructions have at least a first predetermined latency greater than the latency of the first cache. The compiler also marks a second set of load instructions with a latency less than the first predetermined latency to access the first cache. The compiler attempts to schedule the software program such that the load instructions in a third set have at least a second predetermined latency greater than the latency of the second cache. The compiler identifies a fourth set of load instructions in the scheduled software program having less than the second predetermined latency and marks the fourth set of load instructions to access the second cache.

    Efficient bloom filter
    78.
    Patent application
    Status: Lapsed

    Publication No.: US20080147714A1

    Publication date: 2008-06-19

    Application No.: US11642314

    Filing date: 2006-12-19

    IPC classification: G06F17/30

    CPC classification: G06F12/0864 Y10S707/99943

    Abstract: Implementation of a Bloom filter using multiple single-ported memory slices. A control value is combined with a hashed address value such that the resultant address value has the property that one, and only one, of the k memories or slices is selected for a given input value, a, for each bank. Collisions are thereby avoided and the multiple hash accesses for a given input value, a, may be performed concurrently. Other embodiments are also described and claimed.
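
The idea that each of the k probes for an input lands in a different slice can be sketched in software as below. How the control value is actually combined with the hashed address is not given in the abstract; this sketch assumes a simple rotation, `(i + control) % k`, and uses SHA-256 as a convenient hash. `SlicedBloomFilter` and its methods are hypothetical names.

```python
import hashlib


class SlicedBloomFilter:
    """Bloom filter split into k slices, one probe per slice per key."""

    def __init__(self, k, bits_per_slice):
        self.k = k
        self.m = bits_per_slice
        # One byte per bit keeps the sketch readable; a real design
        # would use k single-ported bit memories.
        self.slices = [bytearray(bits_per_slice) for _ in range(k)]

    def _digest(self, key, salt):
        h = hashlib.sha256(salt.to_bytes(4, "little") + key).digest()
        return int.from_bytes(h[:8], "little")

    def _probes(self, key):
        # Assumed scheme: a per-key control value rotates the mapping of
        # hash function i to slice, so the k probes hit k distinct slices.
        control = self._digest(key, 0) % self.k
        for i in range(self.k):
            slice_index = (i + control) % self.k
            offset = self._digest(key, i + 1) % self.m
            yield slice_index, offset

    def add(self, key):
        for s, off in self._probes(key):
            self.slices[s][off] = 1

    def __contains__(self, key):
        return all(self.slices[s][off] for s, off in self._probes(key))
```

Because no two probes for the same key share a slice, the k lookups never contend for the same single-ported memory, which is what would let hardware issue them in the same cycle.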

    Using transactional memory for precise exception handling in aggressive dynamic binary optimizations
    79.
    Patent application
    Status: In force

    Publication No.: US20080126764A1

    Publication date: 2008-05-29

    Application No.: US11528801

    Filing date: 2006-09-27

    IPC classification: G06F9/44 G06F9/318

    CPC classification: G06F9/466

    Abstract: Dynamic optimization of application code is performed by selecting a portion of the application code as a possible transaction. A transaction has the property that, when it is executed, it is either atomically committed or atomically aborted. Determining whether to convert the selected portion of the application code to a transaction includes determining whether to apply at least one of a group of code optimizations to the portion of the application code. If it is determined to apply at least one of the code optimizations of the group to the portion of application code, then the optimization is applied to the portion of the code and the portion of the code is converted to a transaction.

    Automatic function call in multithreaded application
    80.
    Patent application
    Status: In force

    Publication No.: US20080120590A1

    Publication date: 2008-05-22

    Application No.: US11603375

    Filing date: 2006-11-22

    IPC classification: G06F9/44

    CPC classification: G06F9/466 G06F8/41

    Abstract: In general, in one aspect, the disclosure describes a method to detect a transaction and direct non-transactional memory (TM) user functions within the transaction. The non-TM user functions are treated as TM functions and added to the TM list.
