Abstract:
According to one embodiment, a system is disclosed. The system includes a central processing unit (CPU); a first cache memory coupled to the CPU to store only data for vital loads that are to be immediately processed at the CPU; a second cache memory coupled to the CPU to store data for semi-vital loads to be processed at the CPU; and a third cache memory, coupled to the CPU, the first cache memory, and the second cache memory, to store data for non-vital loads to be processed at the CPU.
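The abstract does not give concrete structures, so the following C sketch only models the routing policy it describes: the names vitality_t and route_load are illustrative assumptions, and the hardware routing decision is represented here as a simple software selection.

    #include <stdio.h>

    /* Hypothetical load-criticality classes; names are illustrative only. */
    typedef enum { VITAL, SEMI_VITAL, NON_VITAL } vitality_t;

    /* Route a load to one of three caches based on how soon the CPU needs
     * its data.  In the described system this is done in hardware; this
     * sketch only models the selection policy. */
    static const char *route_load(vitality_t v) {
        switch (v) {
        case VITAL:      return "first cache (vital loads only)";
        case SEMI_VITAL: return "second cache (semi-vital loads)";
        default:         return "third cache (non-vital loads)";
        }
    }

    int main(void) {
        printf("%s\n", route_load(VITAL));      /* needed immediately     */
        printf("%s\n", route_load(SEMI_VITAL)); /* needed soon            */
        printf("%s\n", route_load(NON_VITAL));  /* tolerant of latency    */
        return 0;
    }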
Abstract:
In one embodiment, a method for speculatively reusing regions of code includes identifying a reuse region and a data input to the reuse region, determining whether a data output of the reuse region is contained within reuse region instance information pertaining to a plurality of instances of the reuse region, and when the data output is not contained within the reuse region instance information, predicting the data output of the reuse region based on the reuse region instance information.
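The lookup-or-predict flow can be illustrated with a small C sketch. The instance table, the stride-based prediction, and the names instance_t and reuse_or_predict are assumptions chosen for illustration, not the patent's mechanism; a predicted output would later be verified by actually executing the region.

    #include <stdio.h>

    /* Hypothetical record of one prior instance of the reuse region:
     * its data input and the data output it produced. */
    typedef struct { int input; int output; } instance_t;

    /* Reuse region instance information for a few observed instances. */
    static instance_t table[] = { {1, 10}, {2, 20}, {3, 30} };
    static const int n = sizeof(table) / sizeof(table[0]);

    /* Return the region's output for `input`.  If no stored instance
     * matches, predict the output from the stored instances (here a
     * simple stride prediction) and flag it as speculative. */
    static int reuse_or_predict(int input, int *predicted) {
        for (int i = 0; i < n; i++)
            if (table[i].input == input) { *predicted = 0; return table[i].output; }
        int stride = table[n - 1].output - table[n - 2].output;
        *predicted = 1;
        return table[n - 1].output + stride * (input - table[n - 1].input);
    }

    int main(void) {
        int p;
        printf("input 2 -> %d (predicted=%d)\n", reuse_or_predict(2, &p), p);
        printf("input 5 -> %d (predicted=%d)\n", reuse_or_predict(5, &p), p);
        return 0;
    }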
Abstract:
A method for executing software pipelined executable code generated by compiling a set of unexecutable instructions having an inner loop and an outer loop is disclosed. Instructions are executed that perform the operations specified in the outer loop using a first storage area. A second storage area is allocated for use when performing the operations specified in the inner loop. Instructions are then executed that perform the operations specified in the inner loop using the second storage area, wherein at least certain storage locations in the first storage area are not alterable while the operations specified in the inner loop are being performed.
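A minimal C sketch of the storage discipline described above, under the assumption that plain arrays stand in for the two storage areas: the outer loop's live values sit in outer_state, and a separately allocated inner_scratch buffer keeps the inner loop's stores from altering them while the inner loop runs.

    #include <stdio.h>
    #include <stdlib.h>

    #define N 4
    #define M 3

    int main(void) {
        /* First storage area: values that stay live across the inner loop. */
        int outer_state[N];

        for (int i = 0; i < N; i++) {
            outer_state[i] = i * 100;          /* outer-loop work */

            /* Second storage area, allocated for the inner loop so its
             * stores cannot alter outer_state while it executes. */
            int *inner_scratch = malloc(M * sizeof *inner_scratch);

            for (int j = 0; j < M; j++)
                inner_scratch[j] = outer_state[i] + j;   /* inner-loop work */

            for (int j = 0; j < M; j++)
                printf("%d ", inner_scratch[j]);
            printf("\n");

            free(inner_scratch);
        }
        return 0;
    }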
Abstract:
In one implementation of the invention, a computer implemented method used in compiling a program includes identifying a covering load, which may be one of a set of covering loads, and a redundant load. The covering load and the redundant load have a first and second load type, respectively. The first and the second load type each may be one of a group of load types including a regular load and at least one speculative-type load. In one implementation, the group of load types includes at least one check-type load. One implementation of the invention is in a machine readable medium.
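A hedged C sketch of the covering/redundant-load test: the load kinds and the coverage rule used here (a regular or check-type load covers any later load to the same address, while a speculative load covers only another speculative load) are simplifying assumptions for illustration, not the claimed compiler analysis.

    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical load kinds; names are illustrative, not the patent's terms. */
    typedef enum { LOAD_REGULAR, LOAD_SPECULATIVE, LOAD_CHECK } load_kind_t;

    typedef struct { int address; load_kind_t kind; } load_t;

    /* A later load is redundant if an earlier load (the covering load) reads
     * the same address and its kind makes its result safe to reuse. */
    static bool covers(const load_t *earlier, const load_t *later) {
        if (earlier->address != later->address) return false;
        if (earlier->kind == LOAD_SPECULATIVE)
            return later->kind == LOAD_SPECULATIVE;
        return true;
    }

    int main(void) {
        load_t a = { 0x1000, LOAD_REGULAR };
        load_t b = { 0x1000, LOAD_SPECULATIVE };
        printf("a covers b: %s\n", covers(&a, &b) ? "yes (b is redundant)" : "no");
        return 0;
    }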
Abstract:
Logic and an instruction to monitor loop trip count are disclosed. Loop trip count information for a loop may be stored in a dedicated hardware buffer, and the average loop trip count of the loop may be calculated from the stored information. Based on the average trip count, loop optimizations may be removed from the loop. The stored loop trip count information may include an identifier identifying the loop, a total loop trip count of the loop, and an exit count of the loop.
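The buffer entry and the average computation can be modeled directly in C. The hardware buffer is represented as a plain struct, and the field names below are illustrative assumptions; the average is the total trip count divided by the exit count.

    #include <stdint.h>
    #include <stdio.h>

    /* One entry of the loop trip count buffer, modeled in software. */
    typedef struct {
        uint32_t loop_id;      /* identifier of the loop            */
        uint64_t total_trips;  /* sum of iterations over all visits */
        uint64_t exit_count;   /* number of times the loop exited   */
    } trip_entry_t;

    /* Average trip count = total iterations / number of loop exits. */
    static double average_trip_count(const trip_entry_t *e) {
        return e->exit_count ? (double)e->total_trips / (double)e->exit_count : 0.0;
    }

    int main(void) {
        trip_entry_t e = { .loop_id = 42, .total_trips = 12, .exit_count = 6 };
        printf("loop %u: average trip count %.1f\n",
               (unsigned)e.loop_id, average_trip_count(&e));
        /* A low average (here 2.0) could justify removing aggressive loop
         * optimizations such as unrolling from this loop. */
        return 0;
    }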
Abstract:
In one embodiment, the present invention includes a processor having a core to execute instructions. This core can include various structures and logic that enable instructions of different atomic regions to be executed in an overlapping manner. To this end, the core can include a register file having registers to store data for use in execution of the instructions, and multiple shadow register files each to store a register checkpoint on initiation of a given atomic region. In this way, overlapping execution of atomic regions identified by a programmer or compiler can occur. Other embodiments are described and claimed.
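A software model of the shadow-register-file checkpointing, assuming a fixed number of in-flight atomic regions; the real mechanism lives in hardware, so region_begin and region_abort are illustrative stand-ins for checkpoint capture and rollback.

    #include <string.h>
    #include <stdio.h>

    #define NUM_REGS    8
    #define NUM_SHADOWS 2   /* one shadow register file per in-flight region */

    static int regs[NUM_REGS];                 /* architectural register file */
    static int shadow[NUM_SHADOWS][NUM_REGS];  /* shadow register files       */

    /* Take a register checkpoint when atomic region `r` begins. */
    static void region_begin(int r) { memcpy(shadow[r], regs, sizeof regs); }

    /* Roll back to the checkpoint if region `r` aborts. */
    static void region_abort(int r) { memcpy(regs, shadow[r], sizeof regs); }

    int main(void) {
        regs[0] = 1;
        region_begin(0);          /* region 0 starts                      */
        regs[0] = 2;
        region_begin(1);          /* region 1 starts while 0 is in flight */
        regs[0] = 3;
        region_abort(1);          /* region 1 aborts: regs[0] back to 2   */
        printf("regs[0] = %d\n", regs[0]);
        return 0;
    }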
Abstract:
A processor includes a decoder to decode an instruction, a scheduler to schedule the instruction, and an execution unit to execute the instruction. The instruction is to load a memory operation applicable to a quantity of addresses into an execution vector. The execution vector includes a plurality of vector positions for the respective addresses. The instruction is further to evaluate, for a given address in the execution vector at a vector position, whether a cache indicates that a previous memory operation was performed at a higher vector position than the vector position of the given address. The instruction is also to determine, based on that evaluation, whether the memory operation will cause a memory error.
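A scalar C model of the per-lane evaluation the instruction performs. The cache is represented as a small table recording the highest vector position that has already performed a memory operation on each address, which is an assumption about its contents made only for illustration.

    #include <stdbool.h>
    #include <stdio.h>

    #define VLEN 4

    /* Cache entry: the highest vector position that has already performed
     * a memory operation on this address. */
    typedef struct { int address; int highest_position; } cache_entry_t;

    /* A memory error is flagged when the lane at `pos` finds that a higher
     * vector position already operated on its address. */
    static bool causes_memory_error(const int addrs[VLEN], int pos,
                                    const cache_entry_t *cache, int cache_len) {
        for (int i = 0; i < cache_len; i++)
            if (cache[i].address == addrs[pos] && cache[i].highest_position > pos)
                return true;
        return false;
    }

    int main(void) {
        int addrs[VLEN] = { 0x10, 0x20, 0x10, 0x30 };
        cache_entry_t cache[] = { { 0x10, 2 } };   /* lane 2 already touched 0x10 */
        printf("lane 0 error: %s\n",
               causes_memory_error(addrs, 0, cache, 1) ? "yes" : "no");
        return 0;
    }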
Abstract:
Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. In some embodiments, the systems and apparatuses execute a method of original code decomposition and/or generated thread execution.
Abstract:
Example methods and apparatus to manage partial commit-checkpoints are disclosed. A disclosed example method includes identifying a commit instruction associated with a region of instructions executed by a processor, identifying candidate instructions from the region of instructions, and generating a processor partial commit-checkpoint to save a current state of the processor, the checkpoint being based on calculated register values associated with live instructions and including instruction reference addresses that link the candidate instructions.
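One plausible in-memory layout for such a checkpoint, sketched in C. The field names and sizes are assumptions, chosen only to show the two kinds of saved state the abstract mentions: calculated values of live registers and instruction reference addresses for the candidate instructions.

    #include <stdint.h>
    #include <stdio.h>

    #define MAX_LIVE 4
    #define MAX_CAND 4

    /* Hypothetical layout of a partial commit-checkpoint. */
    typedef struct {
        int      live_reg[MAX_LIVE];   /* calculated values of live registers */
        int      num_live;
        uint64_t cand_addr[MAX_CAND];  /* instruction reference addresses     */
        int      num_cand;
    } partial_checkpoint_t;

    int main(void) {
        partial_checkpoint_t cp = {
            .live_reg  = { 7, 13 },              .num_live = 2,
            .cand_addr = { 0x401000, 0x401008 }, .num_cand = 2,
        };
        printf("checkpoint: %d live registers, %d candidate instructions\n",
               cp.num_live, cp.num_cand);
        return 0;
    }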
Abstract:
Technologies for automatic loop vectorization include a computing device with an optimizing compiler. During an optimization pass, the compiler identifies a loop and generates a transactional code segment including a vectorized implementation of the loop body including one or more vector memory read instructions capable of generating an exception. The compiler also generates a non-transactional fallback code segment including a scalar implementation of the loop body that is executed in response to an exception generated within the transactional code segment. The compiler may detect whether the loop contains a memory read dependent on a condition that may be updated in a previous iteration or whether the loop contains a potential data dependence between two iterations. The compiler may generate a dynamic check for an actual data dependence and an explicit transactional abort instruction to be executed when an actual data dependence exists. Other embodiments are described and claimed.
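The control structure the compiler emits can be sketched in plain C. Hardware transactional begin/commit/abort is modeled here as an if/else, and the loop a[i] = a[i-k] + 1 with an assumed vector width VW is an example chosen for illustration: when the dynamic check finds an actual cross-iteration dependence, execution takes the explicit abort path into the non-transactional scalar fallback.

    #include <stdbool.h>
    #include <stdio.h>

    #define N  8
    #define VW 4   /* assumed vector width */

    /* Dynamic check for an actual loop-carried dependence in a[i] = a[i-k] + 1:
     * lanes in one vector group conflict when 0 < k < VW.  Stands in for the
     * check the compiler would emit ahead of the vectorized body. */
    static bool has_dependence(int k) { return k > 0 && k < VW; }

    /* Non-transactional scalar fallback: the original loop, one iteration
     * at a time. */
    static void scalar_fallback(int *a, int k) {
        for (int i = k; i < N; i++) a[i] = a[i - k] + 1;
    }

    /* Stand-in for the vectorized body that would run inside a transactional
     * region; it simply processes VW iterations per group. */
    static void vectorized_body(int *a, int k) {
        for (int i = k; i < N; i += VW)
            for (int j = i; j < i + VW && j < N; j++) a[j] = a[j - k] + 1;
    }

    int main(void) {
        int a[N] = { 0, 1, 2, 3, 4, 5, 6, 7 };
        int k = 2;

        if (has_dependence(k))
            scalar_fallback(a, k);   /* models the explicit abort path */
        else
            vectorized_body(a, k);   /* models the transactional path  */

        for (int i = 0; i < N; i++) printf("%d ", a[i]);
        printf("\n");
        return 0;
    }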