Abstract:
In an embodiment, a processor includes a plurality of cores to independently execute instructions, a shared cache coupled to the cores and including a plurality of lines to store data, and a power controller including low power control logic to calculate a latency to flush the shared cache based on a state of the plurality of lines. Other embodiments are described and claimed.
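A minimal sketch, with all names and cost constants invented rather than taken from the patent, of how a flush latency might be estimated from per-line state: modified lines need a writeback, clean valid lines only an invalidate.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-line coherence states (not the patent's encoding). */
typedef enum { LINE_INVALID, LINE_SHARED, LINE_EXCLUSIVE, LINE_MODIFIED } line_state_t;

/* Estimate cycles needed to flush the shared cache: modified lines must be
 * written back to memory, clean valid lines only need to be invalidated.
 * The cost parameters are illustrative placeholders. */
uint64_t estimate_flush_latency(const line_state_t *lines, size_t nlines,
                                uint64_t writeback_cycles,
                                uint64_t invalidate_cycles)
{
    uint64_t total = 0;
    for (size_t i = 0; i < nlines; i++) {
        if (lines[i] == LINE_MODIFIED)
            total += writeback_cycles;
        else if (lines[i] != LINE_INVALID)
            total += invalidate_cycles;
    }
    return total;
}
```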
Abstract:
An efficient method for software pipelining (SWP) of loops when translating programs from higher-level languages into equivalent object or machine language code for execution on a computer. In one example embodiment, this is accomplished by spilling and filling multiple computed values in a register that are live across multiple stages of a software-pipelined loop, using multiple rotating stack memory locations to reduce the compile time of SWP and the complexity of the implemented SWP.
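A hedged C-level illustration of the rotating spill-slot idea: a value computed in an early stage of the pipelined loop is spilled to one of several memory slots indexed modulo the number of in-flight stages, and filled back by a later stage a fixed number of iterations later. The stage count, loop, and slot layout are assumptions for illustration, not the patent's implementation.

```c
#include <stdio.h>

#define STAGES 3                /* assumed number of overlapped pipeline stages */

int main(void)
{
    double spill[STAGES];       /* rotating spill slots, one per in-flight value */
    double out[16] = {0};

    for (int i = 0; i < 16 + (STAGES - 1); i++) {
        /* Stage 0: compute a value and "spill" it to the rotating slot. */
        if (i < 16)
            spill[i % STAGES] = i * 0.5;

        /* Final stage: "fill" the value spilled STAGES-1 iterations earlier. */
        if (i >= STAGES - 1)
            out[i - (STAGES - 1)] = spill[(i - (STAGES - 1)) % STAGES] + 1.0;
    }
    printf("%f %f\n", out[0], out[15]);
    return 0;
}
```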
Abstract:
A computing platform may include components to determine performance loss values and energy savings values, and/or a memory boundedness value, for each of a plurality of regions within an application. The computing platform may provide a user interface through which a user can supply an input indicating an acceptable performance loss. For the provided performance loss value, frequency values may be determined, and the processing element may be operated at those frequency values while processing each of the plurality of regions.
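A small hypothetical sketch of the selection step the abstract describes: for each profiled region, pick the lowest frequency whose predicted performance loss stays within the user-supplied bound. The region data, loss model, and frequency table are invented for illustration.

```c
#include <stdio.h>

/* Hypothetical per-region profile: predicted slowdown (fraction) at each
 * available frequency step, lower index = lower frequency. */
typedef struct {
    const char *name;
    double loss_at_freq[4];     /* loss at 1.0, 1.5, 2.0, 2.5 GHz (assumed) */
} region_t;

static const double freqs_ghz[4] = { 1.0, 1.5, 2.0, 2.5 };

/* Pick the lowest frequency whose loss does not exceed the user's bound. */
double pick_frequency(const region_t *r, double max_loss)
{
    for (int f = 0; f < 4; f++)
        if (r->loss_at_freq[f] <= max_loss)
            return freqs_ghz[f];
    return freqs_ghz[3];        /* fall back to the highest frequency */
}

int main(void)
{
    /* Memory-bound regions lose little at low frequency; compute-bound lose a lot. */
    region_t regions[2] = {
        { "memory_bound_loop",  { 0.03, 0.02, 0.01, 0.00 } },
        { "compute_bound_loop", { 0.40, 0.20, 0.08, 0.00 } },
    };
    double user_max_loss = 0.05;    /* user accepts up to 5% performance loss */

    for (int i = 0; i < 2; i++)
        printf("%s -> %.1f GHz\n", regions[i].name,
               pick_frequency(&regions[i], user_max_loss));
    return 0;
}
```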
Abstract:
A method of parallel execution of a first and a second instruction in an in-order processor. Embodiments of the invention enable parallel execution of memory instructions that are stalled by cache memory misses. The in-order processor handles cache memory misses in parallel by overlapping the first cache memory miss with cache memory misses that occur after it. Memory-level parallelism in the in-order processor increases as more parallel, outstanding cache memory misses are generated.
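The effect is easiest to see from the software side: independent loads allow several cache misses to be outstanding at once, while a pointer chase serializes them. A minimal illustrative C fragment (the data structures are assumptions, not from the patent):

```c
#include <stddef.h>

/* Independent loads: the a[idx[i]] misses can overlap, since no load depends
 * on the result of the previous one, so the core can keep issuing past a miss. */
long sum_independent(const long *a, const int *idx, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[idx[i]];
    return s;
}

/* Dependent loads: each load's address comes from the previous load, so the
 * misses serialize and no memory-level parallelism is exposed. */
struct node { struct node *next; long val; };

long sum_dependent(const struct node *p)
{
    long s = 0;
    for (; p != NULL; p = p->next)
        s += p->val;
    return s;
}
```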
Abstract:
A method and system for optimizing the execution of a software loop is provided. The method involves determining an edge in a critical recurrence cycle in the software loop. The edge is a dependency link between two instructions and connects a dependee and a dependent. The dependee is an instruction that produces a result, and the dependent is an instruction that uses the result. The method further involves performing predicate promotion of at least one of the dependee and the dependent if one or more pre-determined conditions are met.
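As a loose source-level analogue (the patent operates on compiler IR with predicate registers; the C loop and names below are invented for illustration), promoting a guarded update so it no longer waits on the guard can remove the compare-to-use edge from the recurrence cycle:

```c
/* Before: the compare that produces the guard depends on x from the previous
 * iteration, so the cycle x -> compare -> guarded add -> x bounds the
 * recurrence initiation interval. */
long clamp_accumulate_before(const long *step, long n, long limit)
{
    long x = 0;
    for (long i = 0; i < n; i++) {
        if (x < limit)
            x = x + step[i];
    }
    return x;
}

/* After promotion (conceptually): the add runs unconditionally into a
 * temporary, removing the compare-to-add dependence edge; the compare now
 * feeds only a select. */
long clamp_accumulate_after(const long *step, long n, long limit)
{
    long x = 0;
    for (long i = 0; i < n; i++) {
        long t = x + step[i];        /* speculatively executed every iteration */
        x = (x < limit) ? t : x;     /* guard applied via a select instead     */
    }
    return x;
}
```

Whether the promoted form actually shortens the cycle depends on the machine's compare-to-use and select latencies, which is presumably the kind of trade-off the pre-determined conditions are meant to capture.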
Abstract:
Code restructuring or reordering based on profiling information and memory hierarchy is provided by constructing a Program Execution Graph (PEG) corresponding to a level of the memory hierarchy, partitioning this PEG to reduce estimated memory overhead costs below an upper bound, and constructing a PEG for the next level of the memory hierarchy from the partitioned PEG. The PEG is constructed from control flow and frequency information from a profile of the program to be restructured. The PEG is a weighted undirected graph comprising nodes representing basic blocks and edges representing transfer of control between pairs of basic blocks. The weight of a node is the size of the basic block it represents, and the weight of an edge is the frequency of transition between the pair of basic blocks it connects.
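A compact hypothetical sketch of the structure the abstract describes: a weighted undirected graph whose nodes carry basic-block sizes and whose edges carry profiled transition frequencies. All names and types are illustrative, not the patent's.

```c
#include <stddef.h>

/* Node of a Program Execution Graph (PEG): one basic block, weighted by its
 * size in bytes. */
typedef struct {
    int    block_id;
    size_t size_bytes;          /* node weight = basic block size */
} peg_node;

/* Undirected edge, weighted by the profiled frequency of transitions between
 * the two basic blocks it connects (in either direction). */
typedef struct {
    int           a, b;         /* endpoints (block ids) */
    unsigned long freq;         /* edge weight = transition frequency */
} peg_edge;

typedef struct {
    peg_node *nodes;  size_t nnodes;
    peg_edge *edges;  size_t nedges;
} peg_graph;

/* Record one profiled transition between blocks x and y: bump the existing
 * undirected edge if present, otherwise append a new one. Assumes the caller
 * preallocated room in g->edges. */
void peg_add_transition(peg_graph *g, int x, int y)
{
    for (size_t i = 0; i < g->nedges; i++) {
        peg_edge *e = &g->edges[i];
        if ((e->a == x && e->b == y) || (e->a == y && e->b == x)) {
            e->freq++;
            return;
        }
    }
    g->edges[g->nedges++] = (peg_edge){ .a = x, .b = y, .freq = 1 };
}
```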
Abstract:
The invention is directed to the transformation of software loops having early exit conditions, thereby allowing the loops to be more effectively converted to a single basic block for software pipelining. The invention assigns a predicate register to each early exit condition of the software loop. A predicate register is set when its corresponding early exit condition is satisfied. In this manner, when the loop terminates, the predicate registers can be examined to determine which early exit conditions were satisfied. The invention produces loops having a lower recurrence II (initiation interval) and resource II than conventional techniques.
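A hedged source-level analogue of the transformation: each early-exit condition gets its own flag (standing in for a predicate register) that is set when the condition fires, the loop body stays a single block, and the flags are examined after the loop to see which exit was taken. The loop and data are invented for illustration.

```c
#include <stdio.h>

int main(void)
{
    int a[8] = { 3, 5, -2, 7, 0, 4, 9, 1 };
    int hit_negative = 0, hit_zero = 0;   /* stand-ins for predicate registers */
    long sum = 0;
    int i;

    /* Single-block body: the early-exit conditions only set flags, and the
     * flags both guard the work and terminate the loop. */
    for (i = 0; i < 8 && !hit_negative && !hit_zero; i++) {
        hit_negative |= (a[i] < 0);
        hit_zero     |= (a[i] == 0);
        if (!hit_negative && !hit_zero)   /* a predicated add in pipelined form */
            sum += a[i];
    }

    /* After the loop, the flags tell us which early exit (if any) fired. */
    if (hit_negative)      printf("stopped on negative at i=%d\n", i - 1);
    else if (hit_zero)     printf("stopped on zero at i=%d\n", i - 1);
    else                   printf("ran to completion, sum=%ld\n", sum);
    return 0;
}
```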
Abstract:
Code restructuring or reordering based on profiling information and memory hierarchy is provided by constructing a Program Execution Graph (PEG) corresponding to a level of the memory hierarchy, partitioning this PEG to reduce estimated memory overhead costs below an upper bound, and constructing a PEG for the next level of the memory hierarchy from the partitioned PEG. The PEG is constructed from control flow and frequency information from a profile of the program to be restructured. The PEG is a weighted undirected graph comprising nodes representing basic blocks and edges representing transfers of control between pairs of basic blocks. The weight of a node is the size of the basic block it represents, and the weight of an edge is the frequency of transitions between the pair of basic blocks it connects. The nodes of the PEG are partitioned, or clustered, into clusters such that the sum of the weights of the nodes in any cluster is no greater than an upper bound. A next PEG is then constructed from the clusters of the partitioned PEG such that a node in the next PEG corresponds to a cluster in the partitioned PEG, and such that there is an edge between two nodes in the next PEG if there is an edge between the clusters represented by the two nodes. Weights are assigned to the nodes and edges of the next PEG to produce a PEG for that level, and the PEG partitioning, basic block reordering, and PEG construction steps may then be repeated for each level of the memory hierarchy. After the clustering is completed, the basic blocks are reordered in memory by grouping all of the nodes of a cluster in adjacent order, beginning at a boundary, for all the levels of the memory hierarchy. Because clusters must not cross the boundaries of memory hierarchy levels, NOPs are added to fill out the portion of a memory hierarchy level that is not filled by the clusters.
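Continuing the hypothetical peg_graph sketch above, one simple stand-in for the partitioning step is to greedily merge the clusters joined by the most frequent edges while the merged size still fits the level's bound (for example, a cache line or page size). This heuristic, the MAX_NODES limit, and the union-find bookkeeping are illustrative assumptions, not the patent's algorithm.

```c
#include <stddef.h>
/* Uses the peg_graph / peg_node / peg_edge types from the sketch above. */

#define MAX_NODES 64              /* assumed upper bound on basic blocks */

int    cluster_of[MAX_NODES];     /* union-find parent, one entry per block */
size_t cluster_size[MAX_NODES];   /* current byte size of each cluster      */

int find_root(int x)
{
    while (cluster_of[x] != x)
        x = cluster_of[x] = cluster_of[cluster_of[x]];   /* path halving */
    return x;
}

/* Greedy clustering: walk the edges from most to least frequent and merge
 * the two clusters an edge connects whenever the merged size still fits the
 * bound for this memory-hierarchy level. Assumes block ids are 0..nnodes-1
 * and that g->edges is already sorted by decreasing freq. */
void cluster_peg(const peg_graph *g, size_t size_bound)
{
    for (size_t i = 0; i < g->nnodes; i++) {
        cluster_of[i]   = (int)i;
        cluster_size[i] = g->nodes[i].size_bytes;
    }
    for (size_t e = 0; e < g->nedges; e++) {
        int ra = find_root(g->edges[e].a);
        int rb = find_root(g->edges[e].b);
        if (ra != rb && cluster_size[ra] + cluster_size[rb] <= size_bound) {
            cluster_of[rb]    = ra;                 /* merge rb into ra */
            cluster_size[ra] += cluster_size[rb];
        }
    }
}
```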
Abstract:
The present invention provides a mechanism that facilitates speculative execution of instructions within software-pipelined loops. In accordance with one embodiment of the invention, a software-pipelined loop is initialized with a speculative instruction deactivated. At least one initiation interval of the software-pipelined loop is executed, and the speculative instruction is activated. Subsequent initiation intervals of the software-pipelined loop are then executed.
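A loose C-level analogue (the patent targets speculative machine instructions in a pipelined loop kernel; the flag, loop, and data here are invented): the consumer of the speculative result is kept deactivated for the first initiation interval, then activated for the remaining ones.

```c
#include <stdio.h>

int main(void)
{
    int a[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    int prefetched = 0;     /* value produced by the "speculative" stage        */
    int spec_on = 0;        /* stands in for the predicate gating that stage's
                               consumer; starts deactivated                     */
    long sum = 0;

    for (int i = 0; i < 8; i++) {
        /* Consumer stage: only uses the speculative result once it is valid,
         * i.e. after the first initiation interval has filled the pipeline. */
        if (spec_on)
            sum += prefetched;

        /* Speculative stage: produces the value one iteration ahead of its use. */
        prefetched = a[i];
        spec_on = 1;        /* activate after the first interval */
    }
    sum += prefetched;      /* drain the last in-flight value */
    printf("sum=%ld\n", sum);
    return 0;
}
```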