Method for using cache prefetch feature to improve garbage collection algorithm

    Publication number: US06662274B2

    Publication date: 2003-12-09

    Application number: US09886068

    Filing date: 2001-06-20

    Abstract: A method for creating a mark stack for use in a moving garbage collection algorithm (MGCA) is described. The algorithm of the present invention creates a mark stack to implement an MGCA. The algorithm makes efficient use of cache-memory prefetch features to reduce the time required to process the mark stack and thus the time required for garbage collection. Instructions are issued to prefetch data objects that will be examined in the future, so that by the time the scan pointer reaches a data object, the cache lines for that object are already filled. At some point after a data object is prefetched, the address locations of its associated data objects are likewise prefetched. Finally, the associated data objects located at the previously fetched addresses are themselves prefetched. This reduces garbage collection time by continually supplying the garbage collector with a stream of preemptively prefetched data objects that require scanning.

    Methods and apparatus to dynamically insert prefetch instructions based on compiler and garbage collector analysis
    2.
    Granted patent
    Methods and apparatus to dynamically insert prefetch instructions based on compiler and garbage collector analysis (Expired)

    Publication number: US07389385B2

    Publication date: 2008-06-17

    Application number: US10742009

    Filing date: 2003-12-19

    CPC classification number: G06F12/0253

    Abstract: Methods and apparatus to insert prefetch instructions based on garbage collector analysis and compiler analysis are disclosed. In an example method, one or more batches of samples associated with cache misses are received from a performance monitoring unit in a processor system. One or more samples are selected from the batches based on delinquent information. A performance impact indicator associated with the selected samples is generated. Based on the indicator, at least one of a garbage collector analysis and a compiler analysis is initiated to identify one or more delinquent paths. Based on that analysis, one or more prefetch points at which to insert prefetch instructions are identified.


    Method for using non-temporal streaming to improve garbage collection algorithm
    4.
    Granted patent
    Method for using non-temporal streaming to improve garbage collection algorithm (Expired)

    Publication number: US06950837B2

    Publication date: 2005-09-27

    Application number: US09885745

    Filing date: 2001-06-19

    CPC classification number: G06F12/0888 G06F12/0253 Y10S707/99957

    Abstract: An improved moving garbage collection algorithm is described. The algorithm makes efficient use of non-temporal stores to reduce the time required for garbage collection. Non-temporal stores (or copies) are a CPU feature that allows data objects to be copied within main memory without disturbing or polluting the cache. The live objects copied to new memory locations will not be accessed again in the near future and therefore need not be brought into the cache. This avoids cache-fill operations and avoids taxing the CPU with cache-allocation decisions. In a preferred embodiment, the algorithm exploits the fact that live data objects are stored to consecutive new memory locations, allowing the copies to be streamed. Since each copy procedure has an associated CPU overhead, streaming the copies reduces the degradation of system performance and thus the time required for garbage collection.


    Method and system performing concurrently mark-sweep garbage collection invoking garbage collection thread to track and mark live objects in heap block using bit vector
    5.
    Granted patent
    Method and system performing concurrently mark-sweep garbage collection invoking garbage collection thread to track and mark live objects in heap block using bit vector (In force)

    Publication number: US07197521B2

    Publication date: 2007-03-27

    Application number: US10719443

    Filing date: 2003-11-21

    CPC classification number: G06F12/0269 Y10S707/99944 Y10S707/99957

    Abstract: An arrangement is provided for using bit-vector toggling to achieve concurrent mark-sweep garbage collection in a managed runtime system. A heap may be divided into a number of heap blocks. Each heap block may contain a mark bit-vector pointer, a sweep bit-vector pointer, and two bit vectors, of which one may be initially pointed to by the mark bit-vector pointer and used for marking, and the other may be initially pointed to by the sweep bit-vector pointer and used for sweeping. At the end of the marking phase for a heap block, the bit vector used for marking and the bit vector used for sweeping may be toggled, so that the marking phase and the sweeping phase may proceed concurrently, and both phases may proceed concurrently with the mutators.


    Instruction and Logic for Managing Cumulative System Bandwidth through Dynamic Request Partitioning
    9.
    Patent application
    Instruction and Logic for Managing Cumulative System Bandwidth through Dynamic Request Partitioning (Pending, published)

    Publication number: US20160179387A1

    Publication date: 2016-06-23

    Application number: US14971057

    Filing date: 2015-12-16

    Abstract: A processor includes an execution unit, a memory subsystem, and a memory management unit (MMU). The MMU includes logic to evaluate a first bandwidth usage of the memory subsystem and logic to evaluate a second bandwidth usage between the processor and a memory. The memory is communicatively coupled to the memory subsystem, and the memory subsystem implements a cache for the memory. The MMU further includes logic to evaluate a request of the memory subsystem and, based upon the first bandwidth usage and the second bandwidth usage, fulfill the request by bypassing the memory subsystem.


    Identifying and prioritizing critical instructions within processor circuitry
    10.
    Granted patent
    Identifying and prioritizing critical instructions within processor circuitry (In force)

    Publication number: US09323678B2

    Publication date: 2016-04-26

    Application number: US13993376

    Filing date: 2011-12-30

    Abstract: In one embodiment, the present invention includes a method for identifying a memory request corresponding to a load instruction as a critical transaction if the instruction pointer of the load instruction is present in a critical instruction table associated with a processor core; sending the memory request to a system agent of the processor with a critical indicator that identifies the memory request as a critical transaction; and, responsive to the critical indicator, prioritizing the memory request ahead of other pending transactions. Other embodiments are described and claimed.

