Method for using cache prefetch feature to improve garbage collection algorithm

    Publication number: US06662274B2

    Publication date: 2003-12-09

    Application number: US09886068

    Filing date: 2001-06-20

    Abstract: A method for creating a mark stack for use in a moving garbage collection algorithm (MGCA) is described. The algorithm of the present invention creates a mark stack to implement an MGCA. The algorithm makes efficient use of cache-memory prefetch features to reduce the time required to process the mark stack and thus the time required for garbage collection. Instructions are issued to prefetch data objects that will be examined in the future, so that by the time the scan pointer reaches a data object, the cache lines for that object are already filled. At some point after a data object is prefetched, the address locations of its associated data objects are likewise prefetched. Finally, the associated data objects located at the previously fetched addresses are themselves prefetched. This reduces garbage collection time by continually supplying the garbage collector with a stream of preemptively prefetched data objects that require scanning.

    Methods and apparatus to dynamically insert prefetch instructions based on compiler and garbage collector analysis
    2.
    Granted patent
    Methods and apparatus to dynamically insert prefetch instructions based on compiler and garbage collector analysis (Expired)

    Publication number: US07389385B2

    Publication date: 2008-06-17

    Application number: US10742009

    Filing date: 2003-12-19

    CPC classification number: G06F12/0253

    Abstract: Methods and apparatus to insert prefetch instructions based on garbage collector analysis and compiler analysis are disclosed. In an example method, one or more batches of samples associated with cache misses are received from a performance monitoring unit in a processor system. One or more samples are selected from the batches based on delinquent information. A performance impact indicator associated with the selected samples is generated. Based on the indicator, at least one of a garbage collector analysis and a compiler analysis is initiated to identify one or more delinquent paths. Based on that analysis, one or more prefetch points at which to insert prefetch instructions are identified.


    Method for using non-temporal streaming to improve garbage collection algorithm
    4.
    Granted patent
    Method for using non-temporal streaming to improve garbage collection algorithm (Expired)

    Publication number: US06950837B2

    Publication date: 2005-09-27

    Application number: US09885745

    Filing date: 2001-06-19

    CPC classification number: G06F12/0888 G06F12/0253 Y10S707/99957

    Abstract: An improved moving garbage collection algorithm is described. The algorithm makes efficient use of non-temporal stores to reduce the time required for garbage collection. Non-temporal stores (or copies) are a CPU feature that allows data objects to be copied within main memory without disturbing or polluting the cache. The live objects copied to new memory locations will not be accessed again in the near future and therefore need not be brought into the cache. This avoids cache-fill operations and avoids taxing the CPU with cache-allocation decisions. In a preferred embodiment, the algorithm exploits the fact that live data objects are stored to consecutive new memory locations, allowing the copies to be streamed. Since each copy procedure has an associated CPU overhead, streaming the copies reduces the degradation of system performance and thus the time required for garbage collection.


    Method and system performing concurrently mark-sweep garbage collection invoking garbage collection thread to track and mark live objects in heap block using bit vector
    5.
    Granted patent
    Method and system performing concurrently mark-sweep garbage collection invoking garbage collection thread to track and mark live objects in heap block using bit vector (In force)

    Publication number: US07197521B2

    Publication date: 2007-03-27

    Application number: US10719443

    Filing date: 2003-11-21

    CPC classification number: G06F12/0269 Y10S707/99944 Y10S707/99957

    Abstract: An arrangement is provided for using bit-vector toggling to achieve concurrent mark-sweep garbage collection in a managed runtime system. A heap may be divided into a number of heap blocks. Each heap block may contain a mark bit-vector pointer, a sweep bit-vector pointer, and two bit vectors, of which one may be initially pointed to by the mark bit-vector pointer and used for marking, and the other may be initially pointed to by the sweep bit-vector pointer and used for sweeping. At the end of the marking phase for a heap block, the bit vector used for marking and the bit vector used for sweeping may be toggled, so that the marking phase and the sweeping phase may proceed concurrently, and both phases may proceed concurrently with the mutators.


    Instruction and Logic for Managing Cumulative System Bandwidth through Dynamic Request Partitioning
    9.
    Patent application
    Instruction and Logic for Managing Cumulative System Bandwidth through Dynamic Request Partitioning (Pending, published)

    Publication number: US20160179387A1

    Publication date: 2016-06-23

    Application number: US14971057

    Filing date: 2015-12-16

    Abstract: A processor includes an execution unit, a memory subsystem, and a memory management unit (MMU). The MMU includes logic to evaluate a first bandwidth usage of the memory subsystem and logic to evaluate a second bandwidth usage between the processor and a memory. The memory is communicatively coupled to the memory subsystem, and the memory subsystem implements a cache for the memory. The MMU further includes logic to evaluate a request of the memory subsystem and, based upon the first bandwidth usage and the second bandwidth usage, fulfill the request by bypassing the memory subsystem.


    Identifying and prioritizing critical instructions within processor circuitry
    10.
    Granted patent
    Identifying and prioritizing critical instructions within processor circuitry (In force)

    Publication number: US09323678B2

    Publication date: 2016-04-26

    Application number: US13993376

    Filing date: 2011-12-30

    Abstract: In one embodiment, the present invention includes a method for identifying a memory request corresponding to a load instruction as a critical transaction if the instruction pointer of the load instruction is present in a critical instruction table associated with a processor core; sending the memory request to a system agent of the processor with a critical indicator that identifies the memory request as a critical transaction; and, responsive to the critical indicator, prioritizing the memory request ahead of other pending transactions. Other embodiments are described and claimed.

