Superpage coalescing which supports read/write access to a new virtual superpage mapping during copying of physical pages
    21.
    Granted Patent (Expired)

    Publication No.: US08417913B2

    Publication Date: 2013-04-09

    Application No.: US10713733

    Filing Date: 2003-11-13

    IPC Class: G06F12/00

    CPC Class: G06F12/1045

    Abstract: A method of assigning virtual memory to physical memory in a data processing system allocates a set of contiguous physical memory pages for a new page mapping, instructs the memory controller to move the virtual memory pages according to the new page mapping, and then allows access to the virtual memory pages using the new page mapping while the memory controller is still copying the virtual memory pages to the set of physical memory pages. The memory controller can use a mapping table which temporarily stores entries of the old and new page addresses, and releases each entry as its copy completes. The translation lookaside buffer (TLB) entries in the processor cores are updated with the new page addresses before the memory controller finishes copying the memory pages. The invention can be extended to non-uniform memory access (NUMA) systems. For systems with cache memory, any cache entry affected by the page move can be updated by modifying its address tag according to the new page mapping. This tag modification may be limited to cache entries in a dirty coherency state. The cache can further relocate a cache entry whose modified address tag changes its congruence class.
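
    The mapping-table idea in the abstract can be sketched as follows. This is a minimal read-path illustration with invented names (`CoalescingController`, `start_move`), not the patent's hardware design; write handling, TLB updates, and cache-tag fixups are omitted. While a page's copy is pending, an access through the new mapping is redirected to the old frame; once the copy completes, the table entry is released.

```python
# Hypothetical sketch of the patent's temporary mapping table: the
# superpage move is declared up front, accesses through the new mapping
# work immediately, and copying proceeds in the background.

class CoalescingController:
    def __init__(self, memory):
        self.memory = memory          # physical page number -> contents
        self.remap = {}               # new page -> old page, while copy pending

    def start_move(self, moves):
        """moves: list of (old_page, new_page) pairs forming the superpage."""
        for old, new in moves:
            self.remap[new] = old     # entry held until this page is copied

    def read(self, page):
        # An access using the new mapping during the move is redirected
        # to the old frame if its copy has not completed yet.
        return self.memory[self.remap.get(page, page)]

    def copy_step(self, new_page):
        old = self.remap.pop(new_page)        # release entry once copied
        self.memory[new_page] = self.memory[old]
```

    Reads return identical data before and after each `copy_step`, which is what lets the OS switch the TLBs to the new mapping before copying finishes.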


    Method and system for managing cache injection in a multiprocessor system
    22.
    Granted Patent (In Force)

    Publication No.: US08255591B2

    Publication Date: 2012-08-28

    Application No.: US10948407

    Filing Date: 2004-09-23

    IPC Class: G06F13/28

    CPC Class: G06F13/28

    Abstract: A method and apparatus for managing cache injection in a multiprocessor system reduce the processing time associated with direct memory access (DMA) transfers in a symmetric multiprocessor (SMP) or non-uniform memory access (NUMA) environment. The method and apparatus either detect the target processor for DMA completion or direct DMA-completion processing to a particular processor, thereby enabling cache injection into a cache coupled with the processor that executes the DMA completion routine, which processes the data injected into the cache. The target processor may be identified by determining which processor handles the interrupt that occurs on completion of the DMA transfer. Alternatively, or in conjunction with target processor identification, an interrupt handler may queue a deferred procedure call to the target processor to process the transferred data. In NUMA multiprocessor systems, the completing processor/target memory is chosen for accessibility of the target memory to the processor and its associated cache.
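
    The deferred-procedure-call routing described above can be illustrated with a small sketch. Names here (`InterruptRouter`, `on_dma_complete`) are invented for illustration; the point is only the control flow: completion work is queued to the consuming processor instead of running on whichever CPU happened to take the interrupt, so cache injection lands in the right cache.

```python
from collections import defaultdict

# Illustrative sketch: per-CPU deferred-call queues. The interrupt
# handler enqueues DMA-completion work on the target processor; that
# processor later drains its own queue and touches the injected data.

class InterruptRouter:
    def __init__(self):
        self.dpc_queues = defaultdict(list)   # cpu id -> pending deferred calls

    def on_dma_complete(self, transfer, target_cpu):
        # Queue a deferred procedure call on the target processor rather
        # than processing on the interrupted CPU.
        self.dpc_queues[target_cpu].append(transfer)

    def run_dpcs(self, cpu):
        # Executed on `cpu`: drain and return its deferred work.
        done = list(self.dpc_queues[cpu])
        self.dpc_queues[cpu].clear()
        return done
```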


    Dynamically adjusting a pre-fetch distance to enable just-in-time prefetching within a processing system
    23.
    Granted Patent (Expired)

    Publication No.: US07487297B2

    Publication Date: 2009-02-03

    Application No.: US11422459

    Filing Date: 2006-06-06

    IPC Class: G06F13/00

    CPC Class: G06F12/0862

    Abstract: A method and an apparatus for performing just-in-time data prefetching within a data processing system comprising a processor, a cache or prefetch buffer, and at least one memory storage device. The apparatus comprises a prefetch engine with means for issuing a data prefetch request that prefetches a data cache line from the memory storage device for use by the processor. The apparatus further comprises logic for dynamically adjusting the prefetch distance between issuance of the data prefetch request by the prefetch engine and issuance by the processor of a demand (load request) targeting the data/cache line returned by that prefetch request, so that the next data prefetch request for a subsequent cache line completes the return of its data/cache line at effectively the same time the processor issues a demand for that subsequent data/cache line.
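
    A feedback loop of this shape can be sketched in a few lines. The class name, thresholds, and step sizes below are assumptions, not the patent's mechanism: if a demand arrives before the prefetched line is ready, the distance grows; if the line sat idle longer than the fetch itself took, the distance shrinks.

```python
# Hedged sketch of dynamic prefetch-distance adjustment. Times are
# abstract ticks; the "too early" test (idle time exceeds fetch time)
# is an illustrative heuristic.

class PrefetchDistance:
    def __init__(self, distance=2, lo=1, hi=32):
        self.distance = distance
        self.lo, self.hi = lo, hi

    def on_demand(self, prefetch_issued_at, data_ready_at, demand_at):
        if demand_at < data_ready_at:
            # Demand beat the prefetch: issue further ahead next time.
            self.distance = min(self.hi, self.distance + 1)
        elif demand_at - data_ready_at > data_ready_at - prefetch_issued_at:
            # Data sat idle longer than the fetch took: pull distance in.
            self.distance = max(self.lo, self.distance - 1)
```

    At steady state the distance hovers where `data_ready_at` roughly coincides with `demand_at`, which is the "just-in-time" goal stated in the abstract.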


    Efficient Multiple-Table Reference Prediction Mechanism
    24.
    Patent Application (Expired)

    Publication No.: US20080016330A1

    Publication Date: 2008-01-17

    Application No.: US11457178

    Filing Date: 2006-07-13

    IPC Class: G06F9/44

    Abstract: A method and an apparatus for enabling a prefetch engine to detect and support hardware prefetching of different streams in received accesses. Multiple simple history tables are provided within (or associated with) the prefetch engine. Each table is utilized to detect a different access pattern. The tables are indexed by different parts of the address and are accessed in a preset order to reduce interference between patterns. When an address does not fit the patterns of the first table, it is passed to the next table to be checked for a match against different patterns. In this manner, different patterns may be detected by different tables within a single prefetch engine.
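
    The cascade can be sketched with stride-detecting tables. The class names, index widths, and the stride heuristic are illustrative assumptions; the structural point from the abstract is that each table is indexed by a different address slice and a miss in one table falls through to the next, in a preset order.

```python
# Sketch of a cascaded multi-table reference predictor. Each table is
# indexed by a different slice of the address; a table "hits" when an
# access repeats the stride it last recorded for that index.

class StrideTable:
    def __init__(self, index_bits, shift):
        self.shift, self.mask = shift, (1 << index_bits) - 1
        self.entries = {}                 # index -> (last_addr, stride)

    def check(self, addr):
        idx = (addr >> self.shift) & self.mask
        last = self.entries.get(idx)
        self.entries[idx] = (addr, addr - last[0] if last else 0)
        if last and last[1] and addr - last[0] == last[1]:
            return addr + last[1]         # pattern matched: predicted next address
        return None

class CascadedPredictor:
    def __init__(self, tables):
        self.tables = tables              # checked in this preset order

    def predict(self, addr):
        for table in self.tables:         # fall through on each miss
            hit = table.check(addr)
            if hit is not None:
                return hit
        return None
```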


    HARDWARE SUPPORT FOR SUPERPAGE COALESCING
    25.
    Patent Application (Pending, Published)

    Publication No.: US20070067604A1

    Publication Date: 2007-03-22

    Application No.: US11551168

    Filing Date: 2006-10-19

    IPC Class: G06F12/00

    CPC Class: G06F12/1045

    Abstract: A method of assigning virtual memory to physical memory in a data processing system allocates a set of contiguous physical memory pages for a new page mapping, instructs the memory controller to move the virtual memory pages according to the new page mapping, and then allows access to the virtual memory pages using the new page mapping while the memory controller is still copying the virtual memory pages to the set of physical memory pages. The memory controller can use a mapping table which temporarily stores entries of the old and new page addresses, and releases each entry as its copy completes. The translation lookaside buffer (TLB) entries in the processor cores are updated with the new page addresses before the memory controller finishes copying the memory pages. The invention can be extended to non-uniform memory access (NUMA) systems. For systems with cache memory, any cache entry affected by the page move can be updated by modifying its address tag according to the new page mapping. This tag modification may be limited to cache entries in a dirty coherency state. The cache can further relocate a cache entry whose modified address tag changes its congruence class.


    Method and memory controller for adaptive row management within a memory subsystem
    26.
    Granted Patent (Expired)

    Publication No.: US07082514B2

    Publication Date: 2006-07-25

    Application No.: US10666814

    Filing Date: 2003-09-18

    IPC Class: G06F12/00

    CPC Class: G06F13/1631 G06F12/0215

    Abstract: A method and memory controller for adaptive row management within a memory subsystem provide metrics for evaluating row access behavior and dynamically adjust the row management policy of the memory subsystem in conformity with the measured metrics, reducing the average latency of the memory subsystem. Counters within the memory controller track the number of consecutive row accesses and, optionally, the total number of accesses over a measurement interval. The counted consecutive row accesses can be used to control the closing of rows for subsequent accesses, reducing memory latency. The count may be validated using a second counter or storage for improved accuracy; alternatively, the row close count may be set via program or logic control based on the ratio of consecutive row hits to the total access count. Row closure may be controlled by a mode selection between always closing a row (non-page mode) and always holding a row open (page mode), or by intelligently closing rows after a count interval (row hold count) determined from the consecutive row access measurements. The logic and counters may be incorporated within the memory controller or within the memory devices, and the controller/memory devices may provide I/O ports or memory locations for reading the count values and/or setting a row management mode or row hold count.
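
    The counter-and-ratio mechanism can be sketched briefly. The class name and the 0.5 threshold are assumptions for illustration: consecutive same-row accesses are counted against total accesses over an interval, and the ratio selects page mode (hold rows open) versus non-page mode (close after each access) for the next interval.

```python
# Illustrative sketch of adaptive row management: one counter for
# consecutive same-row hits, one for total accesses, and a ratio test
# that picks the row policy for the next measurement interval.

class AdaptiveRowPolicy:
    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.last_row = None
        self.row_hits = 0
        self.total = 0

    def access(self, row):
        self.total += 1
        if row == self.last_row:
            self.row_hits += 1        # consecutive access to an open row
        self.last_row = row

    def mode(self):
        # High row locality -> hold rows open; low locality -> auto-close.
        if self.total and self.row_hits / self.total >= self.threshold:
            return "page"
        return "non-page"
```

    A real controller would also reset the counters each interval and could derive a row hold count instead of a binary mode, as the abstract notes.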


    Directory based support for function shipping in a multiprocessor system
    27.
    Granted Patent (Expired)

    Publication No.: US07080214B2

    Publication Date: 2006-07-18

    Application No.: US10687261

    Filing Date: 2003-10-16

    IPC Class: G06F12/08

    CPC Class: G06F12/0813 G06F12/0817

    Abstract: A multiprocessor system includes a plurality of data processing nodes. Each node has a processor coupled to a system memory, a cache memory, and a cache directory. The cache directory contains cache coherency information for a predetermined range of system memory addresses. An interconnection enables the nodes to exchange messages. A node initiating a function shipping request identifies an intermediate destination directory based on a list of the function's operands and sends a message indicating the function and its corresponding operands to the identified destination directory. The destination cache directory determines a target node based, at least in part, on its cache coherency status information, reducing memory access latency by selecting a target node where all or some of the operands are valid in the local cache memory. The destination directory then ships the function to the target node over the interconnection.
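
    The target-selection step can be shown with a toy directory. The function name and the directory representation (address mapped to the set of nodes caching that line in a valid state) are simplifying assumptions; the selection rule follows the abstract: prefer the node with the most operands already valid in its local cache.

```python
# Sketch of directory-based target selection for function shipping:
# count, per candidate node, how many operand lines its cache holds
# valid, and ship the function to the best node.

def pick_target_node(directory, operands, nodes):
    """directory: addr -> set of node ids holding that line valid."""
    def valid_count(node):
        return sum(1 for addr in operands if node in directory.get(addr, ()))
    return max(nodes, key=valid_count)
```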


    Analysis and visualization of application concurrency and processor resource utilization
    28.
    Granted Patent (In Force)

    Publication No.: US09594656B2

    Publication Date: 2017-03-14

    Application No.: US12605932

    Filing Date: 2009-10-26

    Applicant: Hazim Shafi

    Inventor: Hazim Shafi

    Abstract: An analysis and visualization depicts how an application leverages computer processor cores over time. It enables a developer to readily identify the degree of concurrency exploited by an application at runtime. Information regarding the processes or threads running on the processor cores over time is received, analyzed, and presented to indicate which portions of processor core time are used by the application, idle, or used by other processes in the system. The analysis and visualization can help a developer understand contention for processor resources, confirm the degree of concurrency, or identify serial regions of execution that might provide opportunities for exploiting parallelism.
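
    The analysis step behind such a timeline can be sketched as a classification pass over scheduler samples. The input format (one list of per-core PIDs per tick, with `None` meaning idle) and the function name are assumptions; the output is the app/idle/other breakdown per time slice that the visualization would draw.

```python
# Sketch of the per-tick concurrency analysis: classify each core as
# running the profiled application, idle, or running another process.

def concurrency_profile(samples, app_pid, n_cores):
    """samples: list of per-core pid lists (None = idle), one per tick."""
    profile = []
    for tick in samples:
        app = sum(1 for pid in tick if pid == app_pid)
        idle = sum(1 for pid in tick if pid is None)
        profile.append({"app": app, "idle": idle,
                        "other": n_cores - app - idle})
    return profile
```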


    Data processing system and method for reducing cache pollution by write stream memory access patterns
    29.
    Granted Patent (In Force)

    Publication No.: US08909871B2

    Publication Date: 2014-12-09

    Application No.: US11462115

    Filing Date: 2006-08-03

    IPC Class: G06F12/02 G06F12/08

    CPC Class: G06F12/0888

    Abstract: A data processing system includes a system memory and a cache hierarchy that caches contents of the system memory. According to one method of data processing, a storage-modifying operation having a cacheable target real memory address is received. A determination is made whether the storage-modifying operation has an associated bypass indication. If it does, the cache hierarchy is bypassed and the update indicated by the operation is performed in the system memory. If it does not, the update is performed in the cache hierarchy.
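
    The bypass decision reduces to a two-way branch, sketched below with dictionaries standing in for the cache and system memory. The invalidation of a stale cached copy on the bypass path is an assumption added for consistency, not something the abstract specifies.

```python
# Minimal sketch of the bypass decision for a storage-modifying
# operation: a bypass hint sends the update straight to system memory
# (so a write stream does not pollute the cache); otherwise the update
# goes through the cache hierarchy as usual.

def handle_store(addr, value, bypass, cache, memory):
    if bypass:
        memory[addr] = value          # skip the cache hierarchy
        cache.pop(addr, None)         # assumed: drop any stale cached copy
    else:
        cache[addr] = value           # normal cached update
```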


    Measurement and reporting of performance event rates
    30.
    Granted Patent (In Force)

    Publication No.: US08572581B2

    Publication Date: 2013-10-29

    Application No.: US12411435

    Filing Date: 2009-03-26

    IPC Class: G06F9/44

    Abstract: Methods and systems are disclosed for measuring performance event rates at a computer and reporting them using timelines. A particular method tracks, over a time period, the occurrences of a particular event at the computer. Event rates corresponding to different time segments within the period are calculated, and the segments are assigned colors based on their associated event rates. The event rates are used to display a colored timeline for the period, with a timeline portion for each segment drawn in its associated color.
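
    The rate-to-color step can be sketched directly. The function name and the rate buckets (events per unit time mapped to green/yellow/red) are illustrative assumptions; the structure follows the abstract: bucket event timestamps into segments, compute a per-segment rate, and assign each segment a color.

```python
# Sketch of event-rate timeline coloring: divide [start, end) into
# equal segments, count events per segment, convert counts to rates,
# and map each rate to a color for the timeline display.

def color_timeline(event_times, start, end, segments):
    width = (end - start) / segments
    counts = [0] * segments
    for t in event_times:
        if start <= t < end:
            counts[min(segments - 1, int((t - start) / width))] += 1

    def color(rate):
        # Illustrative thresholds, in events per unit time.
        return "red" if rate > 10 else "yellow" if rate > 3 else "green"

    return [color(c / width) for c in counts]
```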
