OVERLAPPING ATOMIC REGIONS IN A PROCESSOR
    31.
    Patent application
    Legal status: in force

    Publication No.: US20140122845A1

    Publication date: 2014-05-01

    Application No.: US13993364

    Filing date: 2011-12-30

    IPC classification: G06F9/38

    Abstract: In one embodiment, the present invention includes a processor having a core to execute instructions. This core can include various structures and logic that enable instructions of different atomic regions to be executed in an overlapping manner. To this end, the core can include a register file having registers to store data for use in execution of the instructions, and multiple shadow register files each to store a register checkpoint on initiation of a given atomic region. In this way, overlapping execution of atomic regions identified by a programmer or compiler can occur. Other embodiments are described and claimed.
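
The checkpoint mechanism in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the class `Core`, its methods `begin_region`, `commit_region`, and `abort_region`, the pool size of two shadow register files, and the rule that aborting a region also discards regions started after it are all hypothetical choices made for illustration.

```python
class Core:
    """Toy model: one register file plus a fixed pool of shadow register files."""

    def __init__(self, num_regs=4, num_shadows=2):
        self.regs = [0] * num_regs
        self.num_shadows = num_shadows
        # region id -> register checkpoint taken when the region started
        # (dicts preserve insertion order, so older regions come first)
        self.shadows = {}

    def begin_region(self, region_id):
        # Each open atomic region occupies one shadow register file.
        if len(self.shadows) >= self.num_shadows:
            raise RuntimeError("no free shadow register file")
        self.shadows[region_id] = list(self.regs)

    def commit_region(self, region_id):
        # Committing simply frees the checkpoint.
        del self.shadows[region_id]

    def abort_region(self, region_id):
        # Restore the checkpoint. Assumption: regions started later are
        # discarded too, because they ran on top of the aborted state.
        self.regs = list(self.shadows[region_id])
        ids = list(self.shadows)
        for rid in ids[ids.index(region_id):]:
            del self.shadows[rid]


core = Core()
core.regs[0] = 1
core.begin_region("A")
core.regs[0] = 2
core.begin_region("B")          # B starts before A has committed
core.regs[0] = 3
core.abort_region("B")          # roll back to the state at the start of B
regs_after_abort = list(core.regs)
core.commit_region("A")
```

With two shadow files, region B can begin while region A is still open, which is the overlap the abstract describes; a single checkpoint would force A to commit first.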

    FLEXIBLE ACCELERATION OF CODE EXECUTION
    32.
    Patent application
    Legal status: in force

    Publication No.: US20140096132A1

    Publication date: 2014-04-03

    Application No.: US13631408

    Filing date: 2012-09-28

    Applicants: Cheng Wang, Youfeng Wu

    Inventors: Cheng Wang, Youfeng Wu

    IPC classification: G06F9/455, G06F9/00

    Abstract: Technologies for performing flexible code acceleration on a computing device include initializing an accelerator virtual device on the computing device. The computing device allocates memory-mapped input and output (I/O) for the accelerator virtual device and also allocates an accelerator virtual device context for a code to be accelerated. The computing device accesses a bytecode of the code to be accelerated and determines whether the bytecode is an operating system-dependent bytecode. If not, the computing device performs hardware acceleration of the bytecode via the memory-mapped I/O using an internal binary translation module. However, if the bytecode is operating system-dependent, the computing device performs software acceleration of the bytecode.

    Compact trace trees for dynamic binary parallelization
    33.
    Granted patent
    Legal status: in force

    Publication No.: US08332558B2

    Publication date: 2012-12-11

    Application No.: US12242371

    Filing date: 2008-09-30

    IPC classification: G06F9/44, G06F9/00

    CPC classification: G06F9/45516

    Abstract: Methods and apparatus relating to compact trace trees for dynamic binary parallelization are described. In one embodiment, a compact trace tree (CTT) is generated to improve the effectiveness of dynamic binary parallelization. CTT may be used to determine which traces are to be duplicated and specialized for execution on separate processing elements. Other embodiments are also described and claimed.

    Program translation and transactional memory formation
    34.
    Granted patent
    Legal status: in force

    Publication No.: US08296749B2

    Publication date: 2012-10-23

    Application No.: US11966453

    Filing date: 2007-12-28

    IPC classification: G06F9/45

    CPC classification: G06F9/45516

    Abstract: Disclosed are methods, machine-readable media, and systems that dynamically translate binary programs. The dynamic binary translation may include identifying a hot code trace of a program. The translation may further include determining a completion ratio for the hot code trace. The translation may also include packaging the hot code trace into a transactional memory region in response to the completion ratio having a predetermined relationship to a threshold ratio.
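
The threshold test in this abstract can be written as a short decision function. The sketch below rests on assumptions the abstract does not state: that the completion ratio is the fraction of executions that run the hot trace from entry to end without leaving early, that the "predetermined relationship" is "greater than or equal to", and that 0.9 is a reasonable threshold. The function names are hypothetical.

```python
def completion_ratio(completed_runs, total_runs):
    """Fraction of trace executions that reached the end of the trace."""
    if total_runs == 0:
        return 0.0  # no profile data yet: treat the trace as never completing
    return completed_runs / total_runs


def should_package_as_transaction(completed_runs, total_runs, threshold=0.9):
    """Package the hot trace into a transactional region only if it usually completes."""
    return completion_ratio(completed_runs, total_runs) >= threshold


mostly_completes = should_package_as_transaction(95, 100)
often_exits_early = should_package_as_transaction(50, 100)
```

The reasoning behind a test of this kind is that a trace which frequently exits early would abort its transaction often, so wrapping it would cost more than it saves.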

    METHOD, APPARATUS, AND SYSTEM FOR ENERGY EFFICIENCY AND ENERGY CONSERVATION INCLUDING CODE RECIRCULATION TECHNIQUES
    35.
    Patent application
    Legal status: under examination (published)

    Publication No.: US20120185714A1

    Publication date: 2012-07-19

    Application No.: US13327683

    Filing date: 2011-12-15

    Abstract: An apparatus, method, and system are described herein for enabling intelligent recirculation of hot code sections. A hot code section is determined and marked with a begin and end instruction. When the begin instruction is decoded, recirculation logic in a back-end of a processor enters a detection mode and loads decoded loop instructions. When the end instruction is decoded, the recirculation logic enters a recirculation mode. During the recirculation mode, the loop instructions are dispatched directly from the recirculation logic to execution stages for execution. Since the loop is being directly serviced out of the back-end, the front-end may be powered down into a standby state to save power and increase energy efficiency. Upon finishing the loop, the front-end is powered back on and continues normal operation, which potentially includes propagating next instructions after the loop that were prefetched before the front-end entered the standby mode.
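
The mode transitions in this abstract form a small state machine, sketched below as a software model. It is an illustration only: the names `Mode`, `RecirculationLogic`, `on_decode`, `dispatch_iteration`, and `on_loop_exit`, and the string markers `"BEGIN"` and `"END"`, are hypothetical. The abstract says the front-end may be powered down, while this model always powers it down on entering recirculation, which is a simplifying assumption.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    DETECTION = "detection"
    RECIRCULATION = "recirculation"


class RecirculationLogic:
    """Toy model of back-end logic that replays a marked loop."""

    def __init__(self):
        self.mode = Mode.NORMAL
        self.loop_buffer = []
        self.front_end_powered = True

    def on_decode(self, instr):
        if instr == "BEGIN":
            # Begin marker: start capturing decoded loop instructions.
            self.mode = Mode.DETECTION
            self.loop_buffer = []
        elif instr == "END" and self.mode is Mode.DETECTION:
            # End marker: loop fully captured, the front-end can sleep.
            self.mode = Mode.RECIRCULATION
            self.front_end_powered = False
        elif self.mode is Mode.DETECTION:
            self.loop_buffer.append(instr)

    def dispatch_iteration(self):
        # One loop iteration served directly from the captured instructions.
        if self.mode is not Mode.RECIRCULATION:
            raise RuntimeError("not in recirculation mode")
        return list(self.loop_buffer)

    def on_loop_exit(self):
        # Loop finished: wake the front-end and resume normal decoding.
        self.mode = Mode.NORMAL
        self.front_end_powered = True


logic = RecirculationLogic()
for instr in ["BEGIN", "load", "add", "branch", "END"]:
    logic.on_decode(instr)
one_iteration = logic.dispatch_iteration()
powered_during_loop = logic.front_end_powered
logic.on_loop_exit()
```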

    Mechanism for software transactional memory commit/abort in unmanaged runtime environment
    36.
    Granted patent
    Legal status: in force

    Publication No.: US08132158B2

    Publication date: 2012-03-06

    Application No.: US11648005

    Filing date: 2006-12-28

    IPC classification: G06F9/44

    Abstract: A method and apparatus for ensuring integrity of transaction exit functions is described herein. Dead local data in a transaction is prevented from overwriting local variables associated with a transaction exit function. In a write-buffering Software Transactional Memory (STM) system, a commit function is associated with a private stack to store local variables to ensure write-back of local dead data in a write-buffer does not corrupt the commit function. Similarly, in a roll-back STM, an abort function is associated with a private stack to store local variables to ensure the roll-back of a program stack with local dead data from a write log does not corrupt the abort function. Alternatively, one stack may be used for the transaction including a first function and an exit function. Here, local dead variables are detected and prevented from overwriting local variables of the exit function.

    Using transactional memory for precise exception handling in aggressive dynamic binary optimizations
    38.
    Granted patent
    Legal status: in force

    Publication No.: US07865885B2

    Publication date: 2011-01-04

    Application No.: US11528801

    Filing date: 2006-09-27

    IPC classification: G06F9/45

    CPC classification: G06F9/466

    Abstract: Dynamic optimization of application code is performed by selecting a portion of the application code as a possible transaction. A transaction has a property that when it is executed, it is either atomically committed or atomically aborted. Determining whether to convert the selected portion of the application code to a transaction includes determining whether to apply at least one of a group of code optimizations to the portion of the application code. If it is determined to apply at least one of the code optimizations of the group of optimizations to the portion of application code, then the optimization is applied to the portion of the code and the portion of the code is converted to a transaction.
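
The all-or-nothing property that this abstract relies on can be shown with a minimal single-threaded sketch. It is an illustration under assumptions, not the patented mechanism: the helper `run_as_transaction` is hypothetical, program state is modeled as a dictionary, and an abort is triggered by a Python exception. What happens after an abort, such as re-running an unoptimized version of the code, is outside this sketch.

```python
import copy


def run_as_transaction(state, body):
    """Run body on a private copy of state; publish all of its changes or none."""
    working = copy.deepcopy(state)
    try:
        body(working)
    except Exception:
        return False            # abort: the caller's state is untouched
    state.clear()
    state.update(working)       # commit: all changes become visible together
    return True


def optimized_ok(s):
    # Stands in for an optimized code region that completes normally.
    s["x"] += 1
    s["y"] = s["x"] * 2


def optimized_faulting(s):
    # Stands in for a reordered region that faults after a partial update.
    s["x"] = 99
    raise ZeroDivisionError("simulated fault")


state = {"x": 1, "y": 0}
committed = run_as_transaction(state, optimized_ok)
aborted = not run_as_transaction(state, optimized_faulting)
```

Because the faulting region leaves no partial update behind, an optimizer is free to reorder work inside the region without exposing an inconsistent state when an exception occurs.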

    Apparatus and method for redundant software thread computation
    39.
    Granted patent
    Legal status: in force

    Publication No.: US07818744B2

    Publication date: 2010-10-19

    Application No.: US11325925

    Filing date: 2005-12-30

    IPC classification: G06F9/46, G06F5/00

    CPC classification: G06F9/544, G06F11/1497

    Abstract: An apparatus and method for redundant transient fault detection. In one embodiment, the method includes the replication of an application into two communicating threads, a leading thread and a trailing thread. The trailing thread may repeat computations performed by the leading thread to detect transient faults, referred to herein as "soft errors." A first in, first out (FIFO) buffer of shared memory is reserved for passing data between the leading thread and the trailing thread. The FIFO buffer may include a buffer head variable to write data to the FIFO buffer and a buffer tail variable to read data from the FIFO buffer. In one embodiment, data passing between the leading thread and the trailing thread is restricted according to a data unit size, and thread synchronization between the leading thread and the trailing thread is limited to buffer overflow/underflow detection. Other embodiments are described and claimed.
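
The leading/trailing scheme in this abstract can be sketched with a ring buffer that has separate head and tail counters. The sketch is a single-threaded simulation under stated assumptions: the names `FifoBuffer` and `run_redundantly` are hypothetical, the two "threads" are interleaved in one loop rather than run concurrently, and a soft error is simulated by a deliberately wrong leading computation.

```python
class FifoBuffer:
    """Ring buffer: the leader advances head on write, the trailer advances tail on read."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.head = 0   # count of items written so far
        self.tail = 0   # count of items read so far

    def put(self, item):
        if self.head - self.tail == len(self.slots):
            raise BufferError("overflow: trailing thread has fallen behind")
        self.slots[self.head % len(self.slots)] = item
        self.head += 1

    def get(self):
        if self.head == self.tail:
            raise BufferError("underflow: nothing to check yet")
        item = self.slots[self.tail % len(self.slots)]
        self.tail += 1
        return item


def run_redundantly(leading, trailing, inputs, capacity=4):
    """Return the inputs for which the trailing recomputation disagrees."""
    fifo = FifoBuffer(capacity)
    faults = []
    for x in inputs:
        fifo.put((x, leading(x)))       # leading thread publishes its result
        x_seen, result = fifo.get()     # trailing thread picks it up
        if trailing(x_seen) != result:  # recompute and compare
            faults.append(x_seen)
    return faults


def square(x):
    return x * x


def glitchy_square(x):
    # Simulated soft error: a wrong result for one particular input.
    return x * x + (1 if x == 3 else 0)


faults = run_redundantly(glitchy_square, square, [1, 2, 3, 4])
```

In this model the only coordination between the two sides is the full and empty checks on `head` and `tail`, which corresponds to the overflow/underflow detection the abstract mentions.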

    Apparatus and method for dynamic binary translator to support precise exceptions with minimal optimization constraints
    40.
    Granted patent
    Legal status: in force

    Publication No.: US07757221B2

    Publication date: 2010-07-13

    Application No.: US11241610

    Filing date: 2005-09-30

    IPC classification: G06F9/45

    CPC classification: G06F9/45516, G06F8/443

    Abstract: A method and apparatus for a dynamic binary translator to support precise exceptions with minimal optimization constraints. In one embodiment, the method includes the translation of a source binary application generated for a source instruction set architecture (ISA) into a sequential, intermediate representation (IR) of the source binary application. In one embodiment, the sequential IR is modified to incorporate exception recovery information for each of the exception instructions identified from the source binary application to enable a dynamic binary translator (DBT) to represent exception recovery values as regular values used by IR instructions. In one embodiment, the sequential IR may be optimized with a constraint on movement of an exception instruction downward past an irreversible instruction to form a non-sequential IR. In one embodiment, the non-sequential IR is optimized to form a translated binary application for a target ISA. Other embodiments are described and claimed.