Patent search: ap:"Youfeng WU" (results page 4)

31.

Patent application
OVERLAPPING ATOMIC REGIONS IN A PROCESSOR (status: in force)

Publication No.: US20140122845A1

Publication date: 2014-05-01

Application No.: US13993364

Filing date: 2011-12-30

Applicants: Jaewoong Chung, Cheng Wang, Youfeng Wu

Inventors: Jaewoong Chung, Cheng Wang, Youfeng Wu

IPC classification: G06F9/38

CPC classification: G06F9/3861, G06F9/30116, G06F9/3842, G06F9/467, G06F9/528

Abstract: In one embodiment, the present invention includes a processor having a core to execute instructions. This core can include various structures and logic that enable instructions of different atomic regions to be executed in an overlapping manner. To this end, the core can include a register file having registers to store data for use in execution of the instructions, and multiple shadow register files each to store a register checkpoint on initiation of a given atomic region. In this way, overlapping execution of atomic regions identified by a programmer or compiler can occur. Other embodiments are described and claimed.
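
The checkpointing idea in this abstract can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the patented hardware design: the `Core` class, its method names, the pool size of two shadow files, and the rule that aborting a region also discards checkpoints of regions started after it are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Toy model: one register file plus a fixed pool of shadow register files."""
    registers: dict = field(default_factory=dict)
    num_shadow_files: int = 2                         # assumed pool size
    checkpoints: dict = field(default_factory=dict)   # region id -> register snapshot

    def begin_region(self, region_id):
        # Starting an atomic region consumes one shadow file for its checkpoint.
        if len(self.checkpoints) >= self.num_shadow_files:
            raise RuntimeError("no free shadow register file")
        self.checkpoints[region_id] = dict(self.registers)

    def commit_region(self, region_id):
        # Commit keeps the current register values and frees the shadow file.
        del self.checkpoints[region_id]

    def abort_region(self, region_id):
        # Abort restores the snapshot taken when the region began. In this toy
        # model, regions started after it are discarded too (an assumption).
        started = list(self.checkpoints)
        snapshot = self.checkpoints[region_id]
        for rid in started[started.index(region_id):]:
            del self.checkpoints[rid]
        self.registers = dict(snapshot)


core = Core(registers={"r1": 0})
core.begin_region("A")
core.registers["r1"] = 1
core.begin_region("B")        # region B starts before region A has committed
core.registers["r1"] = 2
core.commit_region("A")
core.abort_region("B")        # r1 rolls back to 1, its value when B began
```

Because each region owns its own snapshot, region B can begin while region A is still open, which is the overlap the abstract refers to.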

32.

Patent application
FLEXIBLE ACCELERATION OF CODE EXECUTION (status: in force)

Publication No.: US20140096132A1

Publication date: 2014-04-03

Application No.: US13631408

Filing date: 2012-09-28

Applicants: Cheng Wang, Youfeng Wu

Inventors: Cheng Wang, Youfeng Wu

IPC classification: G06F9/455, G06F9/00

CPC classification: G06F9/4552, G06F8/44, G06F9/45516, G06F9/45533, G06F9/45554, Y02D10/26, Y02D10/28

Abstract: Technologies for performing flexible code acceleration on a computing device include initializing an accelerator virtual device on the computing device. The computing device allocates memory-mapped input and output (I/O) for the accelerator virtual device and also allocates an accelerator virtual device context for a code to be accelerated. The computing device accesses a bytecode of the code to be accelerated and determines whether the bytecode is an operating system-dependent bytecode. If not, the computing device performs hardware acceleration of the bytecode via the memory-mapped I/O using an internal binary translation module. However, if the bytecode is operating system-dependent, the computing device performs software acceleration of the bytecode.
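
The dispatch rule at the end of this abstract can be sketched in a few lines. The `Bytecode` type, the `os_dependent` flag, and the function names are hypothetical; the abstract does not say how OS dependence is determined, so the sketch simply takes it as an input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bytecode:
    name: str
    os_dependent: bool  # hypothetical flag; how it is computed is not given


def choose_acceleration(bytecode: Bytecode) -> str:
    # OS-dependent bytecode takes the software path; everything else is sent
    # to the hardware path (memory-mapped I/O plus binary translation).
    return "software" if bytecode.os_dependent else "hardware"


def plan(code):
    """Return (bytecode name, acceleration path) pairs for a code sequence."""
    return [(bc.name, choose_acceleration(bc)) for bc in code]
```

For example, `plan([Bytecode("add", False), Bytecode("open_file", True)])` routes `add` to the hardware path and `open_file` to the software path.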

33.

Granted patent
Compact trace trees for dynamic binary parallelization (status: in force)

Publication No.: US08332558B2

Publication date: 2012-12-11

Application No.: US12242371

Filing date: 2008-09-30

Applicants: Joao Paulo Porto, Edson Borin, Youfeng Wu, Cheng Wang

Inventors: Joao Paulo Porto, Edson Borin, Youfeng Wu, Cheng Wang

IPC classification: G06F9/44, G06F9/00

CPC classification: G06F9/45516

Abstract: Methods and apparatus relating to compact trace trees for dynamic binary parallelization are described. In one embodiment, a compact trace tree (CTT) is generated to improve the effectiveness of dynamic binary parallelization. CTT may be used to determine which traces are to be duplicated and specialized for execution on separate processing elements. Other embodiments are also described and claimed.

34.

Granted patent
Program translation and transactional memory formation (status: in force)

Publication No.: US08296749B2

Publication date: 2012-10-23

Application No.: US11966453

Filing date: 2007-12-28

Applicants: Chengyan Zhao, Cheng Wang, Youfeng Wu

Inventors: Chengyan Zhao, Cheng Wang, Youfeng Wu

IPC classification: G06F9/45

CPC classification: G06F9/45516

Abstract: Disclosed are methods, machine readable medium and systems that dynamically translate binary programs. The dynamic binary translation may include identifying a hot code trace of a program. The translation may further include determining a completion ratio for the hot code trace. The translation may also include packaging the hot code trace into a transactional memory region in response to the completion ratio having a predetermined relationship to a threshold ratio.
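
The threshold test described here can be written as a short sketch. Two things are assumptions, since the abstract leaves them open: the completion ratio is taken to be the fraction of trace entries that run to the end of the trace, and the "predetermined relationship" is taken to be "greater than or equal to". The function names and the default threshold of 0.9 are hypothetical.

```python
def completion_ratio(completed: int, entered: int) -> float:
    """Fraction of trace entries that reached the end of the trace (assumed definition)."""
    if entered <= 0:
        return 0.0
    return completed / entered


def should_package_as_transaction(completed: int, entered: int,
                                  threshold: float = 0.9) -> bool:
    # Assumed relationship: package the hot trace into a transactional
    # memory region only when it usually runs to completion.
    return completion_ratio(completed, entered) >= threshold
```

Under these assumptions, a trace that completed 95 of 100 entries would be packaged, while one that completed 50 of 100 would not.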

35.

Patent application
METHOD, APPARATUS, AND SYSTEM FOR ENERGY EFFICIENCY AND ENERGY CONSERVATION INCLUDING CODE RECIRCULATION TECHNIQUES (status: under examination, published)

Publication No.: US20120185714A1

Publication date: 2012-07-19

Application No.: US13327683

Filing date: 2011-12-15

Applicants: Jaewoong Chung, Youfeng Wu, Cheng Wang, Hanjun Kim

Inventors: Jaewoong Chung, Youfeng Wu, Cheng Wang, Hanjun Kim

IPC classification: G06F1/32, G06F9/38, G06F9/312, G06F15/76, G06F9/30

CPC classification: G06F1/3203, G06F1/3287, G06F1/329, G06F9/381, Y02D10/171, Y02D10/24, Y02D50/20

Abstract: An apparatus, method and system are described herein for enabling intelligent recirculation of hot code sections. A hot code section is determined and marked with a begin and end instruction. When the begin instruction is decoded, recirculation logic in a back-end of a processor enters a detection mode and loads decoded loop instructions. When the end instruction is decoded, the recirculation logic enters a recirculation mode. And during the recirculation mode, the loop instructions are dispatched directly from the recirculation logic to execution stages for execution. Since the loop is being directly serviced out of the back-end, the front-end may be powered down into a standby state to save power and increase energy efficiency. Upon finishing the loop, the front-end is powered back on and continues normal operation, which potentially includes propagating next instructions after the loop that were prefetched before the front-end entered the standby mode.
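
The mode transitions in this abstract form a small state machine, sketched below as a software model. The class, the marker strings `LOOP_BEGIN` and `LOOP_END`, and the boolean standing in for front-end power state are hypothetical names chosen for illustration; real hardware would differ.

```python
from enum import Enum, auto


class Mode(Enum):
    NORMAL = auto()
    DETECTION = auto()
    RECIRCULATION = auto()


class RecirculationLogic:
    def __init__(self):
        self.mode = Mode.NORMAL
        self.loop_buffer = []          # decoded loop instructions
        self.front_end_powered = True

    def on_decode(self, instr):
        if instr == "LOOP_BEGIN":
            # Begin marker: start capturing decoded loop instructions.
            self.mode = Mode.DETECTION
            self.loop_buffer = []
        elif instr == "LOOP_END" and self.mode is Mode.DETECTION:
            # End marker: serve the loop from the back-end, idle the front-end.
            self.mode = Mode.RECIRCULATION
            self.front_end_powered = False
        elif self.mode is Mode.DETECTION:
            self.loop_buffer.append(instr)

    def dispatch_iteration(self):
        # One loop iteration dispatched straight from the recirculation buffer.
        if self.mode is not Mode.RECIRCULATION:
            raise RuntimeError("not in recirculation mode")
        return list(self.loop_buffer)

    def finish_loop(self):
        # Loop done: power the front-end back on and resume normal operation.
        self.mode = Mode.NORMAL
        self.front_end_powered = True
```

Feeding `LOOP_BEGIN`, two loop instructions, and `LOOP_END` through `on_decode` leaves the model in recirculation mode with the front-end flagged as powered down until `finish_loop` is called.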

36.

Granted patent
Mechanism for software transactional memory commit/abort in unmanaged runtime environment (status: in force)

Publication No.: US08132158B2

Publication date: 2012-03-06

Application No.: US11648005

Filing date: 2006-12-28

Applicants: Cheng Wang, Youfeng Wu, Bratin Saha, Ali-Reza Adl-Tabatabai

Inventors: Cheng Wang, Youfeng Wu, Bratin Saha, Ali-Reza Adl-Tabatabai

IPC classification: G06F9/44

CPC classification: G06F9/3004, G06F9/30087, G06F9/3842, G06F9/3863, G06F9/466

Abstract: A method and apparatus for ensuring integrity of transaction exit functions is herein described. Dead local data in a transaction is prevented from overwriting local variables associated with a transaction exit function. In a write-buffering Software Transactional Memory (STM) system, a commit function is associated with a private stack to store local variables to ensure write-back of local dead data in a write-buffer does not corrupt the commit function. Similarly, in a roll-back STM, an abort function is associated with a private stack to store local variables to ensure the roll-back of a program stack with local dead data from a write log does not corrupt the abort function. Alternatively, one stack may be used for the transaction including a first function and an exit function. Here, local dead variables are detected and prevented from overwriting local variables of the exit function.

37.

Patent application
SYSTEMS, APPARATUSES, AND METHODS FOR A HARDWARE AND SOFTWARE SYSTEM TO AUTOMATICALLY DECOMPOSE A PROGRAM TO MULTIPLE PARALLEL THREADS (status: in force)

Publication No.: US20110167416A1

Publication date: 2011-07-07

Application No.: US12978557

Filing date: 2010-12-25

Applicants: David J. Sager, Ruchira Sasanka, Ron Gabor, Shlomo Raikin, Joseph Nuzman, Leeor Peled, Jason A. Domer, Ho-Seop Kim, Youfeng Wu, Koichi Yamada, Tin-Fook Ngai, Howard H. Chen, Jayaram Bobba, Jeffery J. Cook, Omar M. Shaikh, Suresh Srinivas

Inventors: David J. Sager, Ruchira Sasanka, Ron Gabor, Shlomo Raikin, Joseph Nuzman, Leeor Peled, Jason A. Domer, Ho-Seop Kim, Youfeng Wu, Koichi Yamada, Tin-Fook Ngai, Howard H. Chen, Jayaram Bobba, Jeffery J. Cook, Omar M. Shaikh, Suresh Srinivas

IPC classification: G06F9/45

CPC classification: G06F8/4442, G06F9/3842, G06F9/3851, G06F9/3861, G06F9/54, G06F11/3612, G06F11/3636, G06F11/3648, G06F2213/0038

Abstract: Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. In some embodiments, the systems and apparatuses execute a method of original code decomposition and/or generated thread execution.

38.

Granted patent
Using transactional memory for precise exception handling in aggressive dynamic binary optimizations (status: in force)

Publication No.: US07865885B2

Publication date: 2011-01-04

Application No.: US11528801

Filing date: 2006-09-27

Applicants: Youfeng Wu, Cheng Wang, Ho-seop Kim

Inventors: Youfeng Wu, Cheng Wang, Ho-seop Kim

IPC classification: G06F9/45

CPC classification: G06F9/466

Abstract: Dynamic optimization of application code is performed by selecting a portion of the application code as a possible transaction. A transaction has a property that when it is executed, it is either atomically committed or atomically aborted. Determining whether to convert the selected portion of the application code to a transaction includes determining whether to apply at least one of a group of code optimizations to the portion of the application code. If it is determined to apply at least one of the code optimizations of the group of optimizations to the portion of application code, then the optimization is applied to the portion of the code and the portion of the code is converted to a transaction.
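
The commit-or-abort property this abstract relies on can be illustrated with a small software analogue. This is a sketch of the property only, assuming program state can be modeled as a dictionary; the function name `run_atomically` is hypothetical, and the patent itself concerns transactional memory rather than Python-level copying.

```python
import copy


def run_atomically(state, body):
    """Run body on a private copy of state; publish the result only on success.

    Returns (resulting_state, committed).
    """
    working = copy.deepcopy(state)
    try:
        body(working)
    except Exception:
        # Abort: every partial update is discarded, so the caller sees the
        # state exactly as it was before the region started.
        return state, False
    # Commit: all updates become visible together.
    return working, True


def faulting_region(s):
    s["x"] = 99          # partial update made before the fault
    raise ZeroDivisionError("simulated exception inside the optimized region")


state = {"x": 1}
after, committed = run_atomically(state, faulting_region)
# The abort leaves x at 1, as if the region had never started executing.
```

Since an aborted region leaves no partial effects, reordered code inside the region can fault without exposing an imprecise state, which is why the abstract pairs optimization with conversion to a transaction.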

39.

Granted patent
Apparatus and method for redundant software thread computation (status: in force)

Publication No.: US07818744B2

Publication date: 2010-10-19

Application No.: US11325925

Filing date: 2005-12-30

Applicants: Cheng C. Wang, Youfeng Wu

Inventors: Cheng C. Wang, Youfeng Wu

IPC classification: G06F9/46, G06F5/00

CPC classification: G06F9/544, G06F11/1497

Abstract: An apparatus and method for redundant transient fault detection. In one embodiment, the method includes the replication of an application into two communicating threads, a leading thread and a trailing thread. The trailing thread may repeat computations performed by the leading thread to detect transient faults, referred to herein as “soft errors.” A first in, first out (FIFO) buffer of shared memory is reserved for passing data between the leading thread and the trailing thread. The FIFO buffer may include a buffer head variable to write data to the FIFO buffer and a buffer tail variable to read data from the FIFO buffer. In one embodiment, data passing between the leading thread data buffering is restricted according to a data unit size and thread synchronization between a leading thread and the trailing thread is limited to buffer overflow/underflow detection. Other embodiments are described and claimed.
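
The leading/trailing arrangement can be sketched with two Python threads. This is a minimal illustration, not the patented mechanism: the standard library's `queue.Queue` stands in for the shared-memory FIFO with head and tail variables, and the `fault_at` parameter is a hypothetical hook that flips one bit of one result to simulate a soft error.

```python
import queue
import threading


def run_redundant(inputs, compute, fault_at=None, capacity=4):
    """Return the indices at which the trailing thread's recomputation disagreed."""
    fifo = queue.Queue(maxsize=capacity)   # bounded FIFO between the two threads
    mismatches = []
    done = object()                        # sentinel marking the end of the stream

    def leading():
        for i, x in enumerate(inputs):
            result = compute(x)
            if i == fault_at:
                result ^= 1                # simulated bit flip (integer results only)
            fifo.put((i, x, result))       # blocks while the FIFO is full
        fifo.put(done)

    def trailing():
        while True:
            item = fifo.get()              # blocks while the FIFO is empty
            if item is done:
                break
            i, x, result = item
            if compute(x) != result:       # repeat the computation and compare
                mismatches.append(i)

    threads = [threading.Thread(target=leading), threading.Thread(target=trailing)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return mismatches
```

The two threads synchronize only through the bounded queue, blocking when it is full or empty, which loosely mirrors the abstract's point that synchronization is limited to overflow and underflow detection.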

40.

Granted patent
Apparatus and method for dynamic binary translator to support precise exceptions with minimal optimization constraints (status: in force)

Publication No.: US07757221B2

Publication date: 2010-07-13

Application No.: US11241610

Filing date: 2005-09-30

Applicants: Bixia Zheng, Cheng C. Wang, Ho-seop Kim, Mauricio Breternitz, Jr., Youfeng Wu

Inventors: Bixia Zheng, Cheng C. Wang, Ho-seop Kim, Mauricio Breternitz, Jr., Youfeng Wu

IPC classification: G06F9/45

CPC classification: G06F9/45516, G06F8/443

Abstract: A method and apparatus for dynamic binary translator to support precise exceptions with minimal optimization constraints. In one embodiment, the method includes the translation of a source binary application generated for a source instruction set architecture (ISA) into a sequential, intermediate representation (IR) of the source binary application. In one embodiment, the sequential IR is modified to incorporate exception recovery information for each of the exception instructions identified from the source binary application to enable a dynamic binary translator (DBT) to represent exception recovery values as regular values used by IR instructions. In one embodiment, the sequential IR may be optimized with a constraint on movement of an exception instruction downward past an irreversible instruction to form a non-sequential IR. In one embodiment, the non-sequential IR is optimized to form a translated binary application for a target ISA. Other embodiments are described and claimed.
