Patent search: ap:("Alexandre E. Eichenberger" OR "Amy (K. T.) Wang" OR "Peng Wu" OR "Peng Zhao") AND inv:"Peng Wu" (page 1)

1.

Granted invention patent
Method and system for versioning codes based on relative alignment for single instruction multiple data units (status: lapsed)

Publication No.: US07673284B2

Publication date: 2010-03-02

Application No.: US11333614

Filing date: 2006-01-17

Applicants: Alexandre E. Eichenberger, Amy (K. T.) Wang, Peng Wu, Peng Zhao

Inventors: Alexandre E. Eichenberger, Amy (K. T.) Wang, Peng Wu, Peng Zhao

IPC classification: G06F9/44, G06F9/45

CPC classification: G06F8/49

Abstract: A method and system for generating efficient versioned codes for single instruction multiple data units whose memory systems have alignment constraints. The system creates multiple versions of codes based on relative alignments of the data streams involved in the computation. The system also analyzes characteristics of relative alignments (e.g., compile-time or runtime) to determine whether code versioning is beneficial based on a cost model.
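
As a rough illustration of versioning on relative alignment, the sketch below picks a loop version at run time from the offset between two data streams. It is not taken from the patent: the 16-byte vector width, the function names, the version labels, and the trip-count threshold standing in for a cost model are all assumptions made for the example.

```python
VECTOR_BYTES = 16  # assumed SIMD register width in bytes (hypothetical)


def relative_alignment(addr_a: int, addr_b: int, width: int = VECTOR_BYTES) -> int:
    """Offset of stream a relative to stream b, modulo the vector width."""
    return (addr_a - addr_b) % width


def choose_version(addr_a: int, addr_b: int,
                   trip_count: int, min_trip_count: int = 8) -> str:
    """Pick a loop version from the relative alignment of two streams.

    The trip-count threshold is a placeholder for a real cost model.
    """
    if trip_count < min_trip_count:
        return "scalar"           # too short for versioning to pay off
    if relative_alignment(addr_a, addr_b) == 0:
        return "simd_no_shift"    # streams line up with each other
    return "simd_with_shift"      # one stream must be realigned
```

Two streams at addresses 0x1004 and 0x2004 are each misaligned with respect to a 16-byte boundary, yet they are aligned relative to each other, so this sketch would select the version without realignment.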

2.

Granted invention patent
Efficient generation of SIMD code in presence of multi-threading and other false sharing conditions and in machines having memory protection support (status: in force)

Publication No.: US07730463B2

Publication date: 2010-06-01

Application No.: US11358372

Filing date: 2006-02-21

Applicants: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu, Peng Zhao

Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu, Peng Zhao

IPC classification: G06F9/45

CPC classification: G06F9/3851, G06F8/44

Abstract: A computer implemented method, system and computer program product for automatically generating SIMD code. The method begins by analyzing data to be accessed by a targeted loop including at least one statement, where each statement has at least one memory reference, to determine if memory accesses are safe. If memory accesses are safe, the targeted loop is simdized. If not safe, it is determined if a scheme can be applied in which safety need not be guaranteed. If such a scheme can be applied, the targeted loop is simdized according to the scheme. If such a scheme cannot be applied, it is determined if padding is appropriate. If padding is appropriate, the data is padded and the targeted loop is simdized. If padding is not appropriate, non-simdized code is generated based on the targeted loop for handling boundary conditions, and the targeted loop is simdized and combined with the non-simdized code.
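
The decision sequence in this abstract can be written out as a small function. This is only a sketch of the order of the checks: the `LoopInfo` record, its three boolean fields, and the step names are hypothetical, and the analyses that would set those flags are not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LoopInfo:
    """Hypothetical summary of the analyses run on a targeted loop."""
    accesses_safe: bool        # memory accesses were proven safe
    unguarded_scheme_ok: bool  # a scheme not requiring safety can be applied
    padding_ok: bool           # padding the data is appropriate


def plan_simdization(loop: LoopInfo) -> List[str]:
    """Return the code-generation steps in the order the abstract lists them."""
    if loop.accesses_safe:
        return ["simdize"]
    if loop.unguarded_scheme_ok:
        return ["simdize_with_scheme"]
    if loop.padding_ok:
        return ["pad_data", "simdize"]
    # Fallback: scalar code handles the boundaries, SIMD code the rest.
    return ["generate_scalar_boundary_code", "simdize", "combine"]
```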

3.

Invention patent application
METHOD TO EXPLOIT SUPERWORD-LEVEL PARALLELISM USING SEMI-ISOMORPHIC PACKING (status: lapsed)

Publication No.: US20080127144A1

Publication date: 2008-05-29

Application No.: US11536990

Filing date: 2006-09-29

Applicants: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu, Peng Zhao

Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu, Peng Zhao

IPC classification: G06F9/45

CPC classification: G06F8/456

Abstract: A computer program product is provided for extracting SIMD parallelism. The computer program product includes instructions for providing a stream of input code comprising basic blocks; identifying pairs of statements that are semi-isomorphic with respect to each other within a basic block; iteratively combining into packs, pairs of statements that are semi-isomorphic with respect to each other, and combining packs into combined packs; collecting packs whose statements can be scheduled together for processing; and generating SIMD instructions for each pack to provide for extracting the SIMD parallelism.

4.

Granted invention patent
Analyze and reduce number of data reordering operations in SIMD code (status: in force)

Publication No.: US08954943B2

Publication date: 2015-02-10

Application No.: US11340452

Filing date: 2006-01-26

Applicants: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu, Peng Zhao

Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu, Peng Zhao

IPC classification: G06F9/45, G06F15/00, G06F15/76

CPC classification: G06F8/443

Abstract: A method for analyzing data reordering operations in Single Issue Multiple Data source code and generating executable code therefrom is provided. Input is received. One or more data reordering operations in the input are identified and each data reordering operation in the input is abstracted into a corresponding virtual shuffle operation so that each virtual shuffle operation forms part of an expression tree. One or more virtual shuffle trees are collapsed by combining virtual shuffle operations within at least one of the one or more virtual shuffle trees to form one or more combined virtual shuffle operations, wherein each virtual shuffle tree is a subtree of the expression tree that only contains virtual shuffle operations. Then code is generated for the one or more combined virtual shuffle operations.
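
To show why combining shuffles can reduce the number of reordering operations, the sketch below collapses two chained single-input shuffles into one. It is a simplification made for illustration: a shuffle is represented as a list of source indices, only a linear chain is handled rather than a general tree, and the function names are assumptions, not the patent's terminology.

```python
def apply_shuffle(pattern, vector):
    """Reorder vector so that result[k] = vector[pattern[k]]."""
    return [vector[i] for i in pattern]


def combine_shuffles(outer, inner):
    """Return one pattern equivalent to applying inner first, then outer."""
    return [inner[i] for i in outer]


vector = ["a", "b", "c", "d"]
rotate_left = [1, 2, 3, 0]   # inner shuffle
reverse = [3, 2, 1, 0]       # outer shuffle

two_steps = apply_shuffle(reverse, apply_shuffle(rotate_left, vector))
one_step = apply_shuffle(combine_shuffles(reverse, rotate_left), vector)
assert two_steps == one_step  # one reordering operation replaces two
```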

5.

Invention patent application
Code generation for complex arithmetic reduction for architectures lacking cross data-path support (status: in force)

Publication No.: US20080092124A1

Publication date: 2008-04-17

Application No.: US11548851

Filing date: 2006-10-12

Applicants: Roch Georges Archambault, Alexandre E. Eichenberger, Amy Kai-Ting Wang, Peng Wu, Peng Zhao

Inventors: Roch Georges Archambault, Alexandre E. Eichenberger, Amy Kai-Ting Wang, Peng Wu, Peng Zhao

IPC classification: G06F9/45

CPC classification: G06F8/445, G06F8/45

Abstract: A computer implemented method, apparatus, and computer usable program code for compiling source code for performing a complex operation followed by a complex reduction operation. A method is determined for generating executable code for performing the complex operation and the complex reduction operation. Executable code is generated for computing sub-products, reducing the sub-products to intermediate results, and summing the intermediate results to generate a final result in response to a determination that a reduced single instruction multiple data method is appropriate.
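
One possible reading of the sub-product scheme, for a sum of complex products, is sketched below. Treating the four real-valued products as the "sub-products" and their running sums as the "intermediate results" is an interpretation made for this example, not a statement of the patented method; the function name and the tuple representation are likewise hypothetical.

```python
def complex_dot(a, b):
    """Sum of a[i] * b[i] for complex numbers given as (re, im) tuples.

    Each accumulator only ever sees one kind of sub-product, so no data
    has to cross between them until the final combination step.
    """
    rr = ii = ri = ir = 0.0
    for (a_re, a_im), (b_re, b_im) in zip(a, b):
        rr += a_re * b_re   # real * real
        ii += a_im * b_im   # imag * imag
        ri += a_re * b_im   # real * imag
        ir += a_im * b_re   # imag * real
    # Combine the intermediate results once, after the loop.
    return (rr - ii, ri + ir)
```

For the inputs `[(1, 2), (3, 4)]` and `[(5, 6), (7, 8)]`, this agrees with multiplying and summing the values as ordinary Python `complex` numbers.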

6.

Invention patent application
GENERATING OPTIMIZED SIMD CODE IN THE PRESENCE OF DATA DEPENDENCES (status: in force)

Publication No.: US20080127059A1

Publication date: 2008-05-29

Application No.: US11535181

Filing date: 2006-09-26

Applicants: Alexandre E. Eichenberger, Amy K. Wang, Peng Wu, Peng Zhao

Inventors: Alexandre E. Eichenberger, Amy K. Wang, Peng Wu, Peng Zhao

IPC classification: G06F9/44

CPC classification: G06F8/447, G06F8/43

Abstract: A method for generating code, including identifying at least one portion of source code that is simdizable and has a dependence; analyzing the dependence for characteristics; based upon the characteristics, selecting a transformation from a predefined group of transformations; and applying the transformation to the at least one portion to generate SIMD code for the at least one portion.

7.

Granted invention patent
Generating optimized SIMD code in the presence of data dependences (status: in force)

Publication No.: US08037464B2

Publication date: 2011-10-11

Application No.: US11535181

Filing date: 2006-09-26

Applicants: Alexandre E. Eichenberger, Amy K. Wang, Peng Wu, Peng Zhao

Inventors: Alexandre E. Eichenberger, Amy K. Wang, Peng Wu, Peng Zhao

IPC classification: G06F9/45

CPC classification: G06F8/447, G06F8/43

Abstract: A method for generating code, including identifying at least one portion of source code that is simdizable and has a dependence; analyzing the dependence for characteristics; based upon the characteristics, selecting a transformation from a predefined group of transformations; and applying the transformation to the at least one portion to generate SIMD code for the at least one portion.

8.

Granted invention patent
Method to exploit superword-level parallelism using semi-isomorphic packing (status: lapsed)

Publication No.: US08136105B2

Publication date: 2012-03-13

Application No.: US11536990

Filing date: 2006-09-29

Applicants: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu, Peng Zhao

Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu, Peng Zhao

IPC classification: G06F9/45

CPC classification: G06F8/456

Abstract: A computer program product is provided for extracting SIMD parallelism. The computer program product includes instructions for providing a stream of input code comprising basic blocks; identifying pairs of statements that are semi-isomorphic with respect to each other within a basic block; iteratively combining into packs, pairs of statements that are semi-isomorphic with respect to each other, and combining packs into combined packs; collecting packs whose statements can be scheduled together for processing; and generating SIMD instructions for each pack to provide for extracting the SIMD parallelism.

9.

Granted invention patent
Multi-petascale highly efficient parallel supercomputer (status: in force)

Publication No.: US09081501B2

Publication date: 2015-07-14

Application No.: US13004007

Filing date: 2011-01-10

Applicants: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu

Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu

IPC classification: G06F15/173, G06F9/06, G06F15/76

CPC classification: G06F13/287, G06F9/06, G06F9/3004, G06F9/30047, G06F9/3885, G06F12/0811, G06F12/0831, G06F12/0862, G06F12/0864, G06F12/1027, G06F15/17381, G06F15/17387, G06F15/76, G06F15/8069, G06F2212/1016, G06F2212/602, G06F2212/6022, G06F2212/6024, G06F2212/6032, Y02D10/13, Y02D10/14

Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources and enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis, and preferably enabling adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that maximizes the throughput of packet communications between nodes and minimizes latency.

10.

Granted invention patent
Code generation for complex arithmetic reduction for architectures lacking cross data-path support (status: in force)

Publication No.: US08423979B2

Publication date: 2013-04-16

Application No.: US11548851

Filing date: 2006-10-12

Applicants: Roch Georges Archambault, Alexandre E. Eichenberger, Amy Kai-Ting Wang, Peng Wu, Peng P. Zhao

Inventors: Roch Georges Archambault, Alexandre E. Eichenberger, Amy Kai-Ting Wang, Peng Wu, Peng P. Zhao

IPC classification: G06F9/45

CPC classification: G06F8/445, G06F8/45

Abstract: A computer implemented method, apparatus, and computer usable program code for compiling source code for performing a complex operation followed by a complex reduction operation. A method is determined for generating executable code for performing the complex operation and the complex reduction operation. Executable code is generated for computing sub-products, reducing the sub-products to intermediate results, and summing the intermediate results to generate a final result in response to a determination that a reduced single instruction multiple data method is appropriate.
