Abstract:
In one embodiment, the present invention includes a prefetching engine to detect when data access strides in a memory fall into a range, to compute a predicted next stride, to selectively prefetch a cache line using the predicted next stride, and to dynamically control prefetching. Other embodiments are also described and claimed.
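As an illustration of the stride-detection idea (not taken from the claimed hardware), a minimal C sketch follows; the stride_entry fields, the confidence threshold, and the min/max stride bounds are all assumptions.

    #include <stdint.h>
    #include <stdbool.h>

    /* Illustrative per-stream prefetch state; field names are assumptions. */
    struct stride_entry {
        uint64_t last_addr;   /* address of the previous access            */
        int64_t  last_stride; /* stride observed between the last two hits */
        int      confidence;  /* raised when the same stride repeats       */
    };

    /* Returns a predicted prefetch address, or 0 if prefetching is suppressed. */
    uint64_t on_access(struct stride_entry *e, uint64_t addr,
                       int64_t min_stride, int64_t max_stride)
    {
        int64_t stride  = (int64_t)(addr - e->last_addr);
        bool    in_range = stride >= min_stride && stride <= max_stride;

        if (in_range && stride == e->last_stride)
            e->confidence++;      /* same in-range stride again: grow confidence */
        else
            e->confidence = 0;    /* stride changed or fell out of the range     */

        e->last_stride = stride;
        e->last_addr   = addr;

        /* Selectively prefetch only when confidence is high enough. */
        return (in_range && e->confidence >= 2) ? addr + stride : 0;
    }

The returned address would be issued as a cache-line prefetch only when an in-range stride has repeated, which is one simple way the prefetcher could be dynamically throttled.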
Abstract:
Generally, the present disclosure provides systems and methods to generate a two-stage commit (TSC) region which has two separate commit stages. Frequently executed code may be identified and combined for the TSC region. Binary optimization operations may be performed on the TSC region to enable the code to run more efficiently by, for example, reordering load and store instructions. In the first stage, load operations in the region may be committed atomically and in the second stage, store operations in the region may be committed atomically.
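A rough sketch of how an optimized TSC region might be laid out follows; tsc_begin_region, tsc_commit_loads, and tsc_commit_stores are hypothetical placeholders for the two commit stages, not a real instruction set or API.

    /* Hypothetical shape of a two-stage commit (TSC) region after binary
     * optimization; the tsc_* calls stand in for the two commit stages. */
    extern void tsc_begin_region(void);
    extern void tsc_commit_loads(void);   /* stage 1: loads commit atomically  */
    extern void tsc_commit_stores(void);  /* stage 2: stores commit atomically */

    void hot_region(int *a, int *b, int n)
    {
        tsc_begin_region();

        int sum = 0;
        /* Loads may be hoisted and reordered by the optimizer within the region. */
        for (int i = 0; i < n; i++)
            sum += a[i];

        tsc_commit_loads();     /* stage 1 */

        /* Stores are released only after the loads have committed. */
        for (int i = 0; i < n; i++)
            b[i] = sum;

        tsc_commit_stores();    /* stage 2 */
    }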
Abstract:
An apparatus and method is described herein for conditionally committing and/or speculatively checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enables a dynamic optimizer to more aggressively optimize code. The conditional commit enables efficient execution of the dynamically optimized code while attempting to prevent transactions from running out of hardware resources, and the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, such as by including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both, and is further adapted to perform operations that support conditional commit or speculative checkpointing in response to decoding such instructions.
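The sketch below shows, schematically, where a dynamic optimizer might place a conditional commit inside a hot loop; tx_begin, tx_end, tx_conditional_commit, and tx_speculative_checkpoint are hypothetical stand-ins for the instructions the decoders are described as recognizing.

    /* Hypothetical transactional primitives; not a real ISA or library API. */
    extern void tx_begin(void);
    extern void tx_end(void);
    extern int  tx_conditional_commit(void);     /* commits early if resources are low */
    extern void tx_speculative_checkpoint(void); /* cheap rollback point for aborts    */

    void optimized_loop(int *data, int n)
    {
        tx_begin();
        for (int i = 0; i < n; i++) {
            data[i] *= 2;                 /* aggressively reordered loop body */

            /* If hardware buffers are nearly full, commit what has executed so
             * far and checkpoint, rather than aborting the whole region later. */
            if (tx_conditional_commit())
                tx_speculative_checkpoint();
        }
        tx_end();
    }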
Abstract:
Methods and apparatuses enable on-demand instruction emulation via user-level exception handling. A non-supported instruction triggers an exception during runtime of a program. In response to the exception, a user-level or application-level exception handler is launched instead of a kernel-level handler, so the handler executes at the application layer rather than in the kernel. The handler identifies the instruction and, where it supports emulation of that instruction, emulates it, enabling the program to continue execution. The cost of repeated instruction emulation is amortized via dynamic binary translation of hot code.
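For a concrete illustration (not from the disclosure itself), the C sketch below installs a user-level SIGILL handler with the standard POSIX sigaction API on Linux/x86-64; the fixed 4-byte instruction length and the skip-only "emulation" are simplifying assumptions.

    #define _GNU_SOURCE
    #include <signal.h>
    #include <stdio.h>
    #include <ucontext.h>

    /* User-level (application-level) handler: invoked when the CPU raises an
     * illegal-instruction fault, without handing emulation to the kernel. */
    static void sigill_handler(int sig, siginfo_t *info, void *ctx)
    {
        ucontext_t    *uc = (ucontext_t *)ctx;
        unsigned char *pc = (unsigned char *)info->si_addr;  /* faulting instruction */

        /* A real handler would decode *pc and emulate the instruction here,
         * then advance the program counter past it.  This sketch (Linux
         * x86-64, an assumed 4-byte instruction) only skips the instruction. */
        uc->uc_mcontext.gregs[REG_RIP] += 4;
        (void)sig;
        fprintf(stderr, "emulated unsupported instruction at %p\n", (void *)pc);
    }

    int main(void)
    {
        struct sigaction sa = {0};
        sa.sa_sigaction = sigill_handler;
        sa.sa_flags     = SA_SIGINFO;
        sigaction(SIGILL, &sa, NULL);   /* install the application-level handler */

        /* Program continues running; any SIGILL is now handled in user space. */
        return 0;
    }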
Abstract:
Various embodiments of the invention concern methods and apparatuses for power- and time-efficient load handling. A compiler may identify producer loads, consumer reuse loads, consumer forwarded loads, and producer/consumer hybrid loads. Based on this identification, each load may be directed to a load value buffer, store buffer, data cache, or elsewhere. Consequently, cache accesses are reduced by loading directly from load value buffers and store buffers, so the loads are processed efficiently.
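A minimal sketch of the classification-and-steering idea follows; the enum names and the mapping from load kind to supplying structure are illustrative assumptions, not the claimed mechanism.

    /* Illustrative compiler-side classification of loads; the categories mirror
     * the abstract above, but the names and steering policy are assumptions. */
    enum load_kind {
        PRODUCER_LOAD,        /* fetches a value later loads will reuse      */
        CONSUMER_REUSE_LOAD,  /* re-reads a value a producer load fetched    */
        CONSUMER_FWD_LOAD,    /* receives its value from an in-flight store  */
        HYBRID_LOAD           /* both produces and consumes                  */
    };

    enum load_source { FROM_DATA_CACHE, FROM_LOAD_VALUE_BUFFER, FROM_STORE_BUFFER };

    /* Direct each load to the cheapest structure that can supply the value,
     * so that many loads never touch the data cache at all. */
    enum load_source steer_load(enum load_kind kind)
    {
        switch (kind) {
        case CONSUMER_REUSE_LOAD: return FROM_LOAD_VALUE_BUFFER;
        case CONSUMER_FWD_LOAD:   return FROM_STORE_BUFFER;
        case PRODUCER_LOAD:
        case HYBRID_LOAD:
        default:                  return FROM_DATA_CACHE;
        }
    }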
Abstract:
A method and apparatus for efficient register checkpointing is herein described. A transaction is detected in program code. A recovery block is inserted in the program code to perform recovery operations in response to an abort of the transaction. A roll-back edge is potentially inserted from an abort point to the recovery block. A control flow edge is inserted from the recovery block to an entry point of the transaction. Checkpoint code is inserted before the entry point to back up live-in registers in backup storage elements, and recovery code is inserted in the recovery block to restore the live-in registers from the backup storage elements in response to an abort of the transaction.
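The following C sketch shows one possible shape of the transformed code, assuming a hypothetical tx_begin that returns nonzero when control arrives via the roll-back edge after an abort.

    /* Hypothetical transactional primitives; not a real ISA or library API. */
    extern int  tx_begin(void);  /* nonzero when re-entered after an abort */
    extern void tx_end(void);

    void transformed(int x, int y)
    {
        /* Checkpoint code inserted before the entry point: back up live-ins
         * into backup storage elements (here, ordinary stack slots). */
        int backup_x = x, backup_y = y;

    entry:
        if (tx_begin()) {
            /* Recovery block: an abort rolls back to here; restore the
             * live-in registers from the backups, then re-enter the region. */
            x = backup_x;
            y = backup_y;
            goto entry;          /* control-flow edge back to the entry point */
        }

        x += y;                  /* transaction body may clobber the live-ins */
        tx_end();
        (void)x;
    }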
Abstract:
In a method for reducing code size, a replaceable subset of instructions at a first location within a set of instructions and a matching target subset of instructions at a second location within the set of instructions are identified. A base offset and a relative offset are determined; together, they indicate an absolute offset from the first location to the second location. An instruction to cause a base offset storage element to be loaded with the base offset is inserted prior to the first location. The replaceable subset of instructions is then replaced with a second instruction that causes a program counter to be modified based on the relative offset and the value in the base offset storage element, so that the modified program counter indicates the second location.
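As a worked example of the offset arithmetic only (the addresses and offsets below are made up), the modified program counter must equal the first location plus the base offset plus the relative offset:

    #include <stdint.h>
    #include <assert.h>

    /* The base offset loaded ahead of the first location plus the relative
     * offset encoded in the replacing instruction must sum to the absolute
     * offset from the first location to the second location. */
    uint32_t modified_pc(uint32_t pc_at_first_location,
                         uint32_t base_offset,
                         int32_t  relative_offset)
    {
        return pc_at_first_location + base_offset + relative_offset;
    }

    int main(void)
    {
        uint32_t first  = 0x1000;   /* replaceable subset lives here (assumed) */
        uint32_t second = 0x4FF0;   /* matching target subset lives here       */

        uint32_t base = 0x3000;                                     /* loaded before 'first' */
        int32_t  rel  = (int32_t)(second - first) - (int32_t)base;  /* 0xFF0                 */

        assert(modified_pc(first, base, rel) == second);
        return 0;
    }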
Abstract:
In one embodiment, the present invention includes a software-controlled method of forming instruction strands. The software may include instructions to obtain code of a superblock including a plurality of basic blocks, build a dependency directed acyclic graph (DAG) for the code, sort nodes coupled by edges of the dependency DAG into a topological order, form strands from the nodes based on hardware constraints, rule constraints, and scheduling constraints, and generate executable code for the strands and store the executable code in a storage. Other embodiments are described and claimed.
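A small self-contained sketch of the "topological sort, then cut into strands" step follows; the dependency matrix, the use of Kahn's algorithm, and the MAX_STRAND size limit are illustrative, and the rule and scheduling constraints are omitted.

    #include <stdio.h>

    #define N 6            /* instructions (nodes) in the superblock, assumed */
    #define MAX_STRAND 3   /* illustrative hardware constraint on strand size */

    /* dep[i][j] != 0 means instruction j depends on instruction i (a DAG edge). */
    static const int dep[N][N] = {
        /* 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3, 3 -> 4, 3 -> 5 */
        [0][1] = 1, [0][2] = 1, [1][3] = 1, [2][3] = 1, [3][4] = 1, [3][5] = 1,
    };

    int main(void)
    {
        int indeg[N] = {0}, done[N] = {0}, order[N], n = 0;

        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (dep[i][j]) indeg[j]++;

        /* Kahn's algorithm: repeatedly emit a node with no unsatisfied deps. */
        while (n < N)
            for (int i = 0; i < N; i++)
                if (!done[i] && indeg[i] == 0) {
                    done[i] = 1;
                    order[n++] = i;
                    for (int j = 0; j < N; j++)
                        if (dep[i][j]) indeg[j]--;
                }

        /* Greedy strand formation over the topological order, cutting a strand
         * whenever the illustrative size constraint is reached. */
        for (int i = 0; i < n; i++)
            printf("%d%s", order[i],
                   ((i + 1) % MAX_STRAND == 0 || i == n - 1) ? "\n" : " -> ");
        return 0;
    }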
Abstract:
Disclosed is a method for running, at runtime, first code generated by a Software-based Redundant Multi-Threading (SRMT) compiler alongside second code generated by a normal compiler, the first code including a first function and a second function, and the second code including a third function. The method comprises running the first function in a leading thread and a trailing thread (104); running the third function in a single thread (106), wherein the leading thread calls the third function; and running the second function in the leading thread and the trailing thread (108), wherein the third function calls the second function. The present disclosure provides a mechanism for handling function calls in which SRMT functions and binary functions can call each other regardless of whether the callee is an SRMT function or a binary function, thereby dynamically adjusting the reliability/performance tradeoff based on run-time information and user-selectable policies.
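A minimal sketch of how an SRMT-compiled call site might hand off to a normally compiled (binary) callee is shown below; the srmt_* helpers are hypothetical, standing in for whatever runtime queue the leading and trailing threads share.

    /* Hypothetical runtime helpers for crossing between SRMT (dual-thread)
     * code and normally compiled (single-thread) code; not a real API. */
    extern int  srmt_is_leading_thread(void);
    extern void srmt_send_to_trailing(long value);   /* leading -> trailing queue   */
    extern long srmt_recv_from_leading(void);

    extern long binary_function(long arg);           /* built by the normal compiler */

    /* SRMT-compiled wrapper: both the leading and the trailing thread execute
     * this, but only the leading thread actually runs the binary callee; the
     * trailing thread consumes the forwarded result so the two stay in step. */
    long call_binary_from_srmt(long arg)
    {
        if (srmt_is_leading_thread()) {
            long result = binary_function(arg);   /* runs in a single thread */
            srmt_send_to_trailing(result);
            return result;
        }
        return srmt_recv_from_leading();
    }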