MECHANISM FOR TRACKING AGE OF COMMON RESOURCE REQUESTS WITHIN A RESOURCE MANAGEMENT SUBSYSTEM
    1. Patent application (in force)

    Publication No.: US20130311686A1

    Publication date: 2013-11-21

    Application No.: US13476825

    Filing date: 2012-05-21

    IPC class: G06F5/00

    CPC class: H04L49/254 G06F9/46

    Abstract: One embodiment of the present disclosure sets forth an effective way to maintain fairness and order in the scheduling of common resource access requests related to replay operations. Specifically, a streaming multiprocessor (SM) includes a total order queue (TOQ) configured to schedule the access requests over one or more execution cycles. Access requests are allowed to make forward progress when needed common resources have been allocated to the request. Where multiple access requests require the same common resource, priority is given to the older access request. Access requests may be placed in a sleep state pending availability of certain common resources. Deadlock may be avoided by allowing an older access request to steal resources from a younger resource request. One advantage of the disclosed technique is that older common resource access requests are not repeatedly blocked from making forward progress by newer access requests.


    RESOURCE MANAGEMENT SUBSYSTEM THAT MAINTAINS FAIRNESS AND ORDER
    2. Patent application (in force)

    Publication No.: US20130311999A1

    Publication date: 2013-11-21

    Application No.: US13476791

    Filing date: 2012-05-21

    IPC class: G06F9/50

    CPC class: G06F9/5011 G06F2209/507

    Abstract: One embodiment of the present disclosure sets forth an effective way to maintain fairness and order in the scheduling of common resource access requests related to replay operations. Specifically, a streaming multiprocessor (SM) includes a total order queue (TOQ) configured to schedule the access requests over one or more execution cycles. Access requests are allowed to make forward progress when needed common resources have been allocated to the request. Where multiple access requests require the same common resource, priority is given to the older access request. Access requests may be placed in a sleep state pending availability of certain common resources. Deadlock may be avoided by allowing an older access request to steal resources from a younger resource request. One advantage of the disclosed technique is that older common resource access requests are not repeatedly blocked from making forward progress by newer access requests.

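The age-ordered scheduling idea that the two abstracts above share can be sketched in a few lines of Python. This is a minimal illustration, not the hardware design: the class and method names are hypothetical, and the "sleep" state is modeled simply as a request staying in the queue until its resources are free. Oldest-first iteration is what gives older requests priority when several need the same common resource.

```python
from collections import deque

class TotalOrderQueue:
    """Toy model of a TOQ: grant common resources in request-age order."""

    def __init__(self, resources):
        self.free = set(resources)   # common resources currently available
        self.pending = deque()       # requests in age order, oldest at the head

    def submit(self, request_id, needed):
        # New requests join the tail, so the head is always the oldest.
        self.pending.append((request_id, set(needed)))

    def schedule_cycle(self):
        """One execution cycle: allocate resources oldest-first."""
        granted = []
        still_waiting = deque()
        for request_id, needed in self.pending:
            if needed <= self.free:
                self.free -= needed          # allocate; request makes forward progress
                granted.append(request_id)
            else:
                # "Sleep": the request stays queued until its resources free up.
                still_waiting.append((request_id, needed))
        self.pending = still_waiting
        return granted

    def release(self, resources):
        # A completed request returns its common resources.
        self.free |= set(resources)
```

With a single contended resource, an older and a younger request both asking for it are served strictly in age order across cycles, which is the fairness property the abstracts emphasize.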

    System and method for cleaning dirty data in a cache via frame buffer logic
    6. Granted patent (in force)

    Publication No.: US08341358B1

    Publication date: 2012-12-25

    Application No.: US12562989

    Filing date: 2009-09-18

    IPC class: G06F13/00

    CPC class: G06F12/0846 G06F12/0804

    Abstract: One embodiment of the invention sets forth a mechanism for efficiently writing dirty data from the L2 cache to a DRAM. A dirty data notification, including a memory address of the dirty data, is transmitted by the L2 cache to a frame buffer logic when dirty data is stored in the L2 cache. The frame buffer logic uses a page-stream sorter to organize dirty data notifications based on the bank page associated with the memory addresses included in the dirty data notifications. The page-stream sorter includes multiple sets with entries that may be associated with different bank pages in the DRAM. The frame buffer logic transmits dirty data associated with an entry to the DRAM once the number of dirty data notifications for that entry reaches a maximum threshold. The frame buffer logic also transmits dirty data associated with the oldest entry when the number of entries in a set reaches a maximum threshold.

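The page-stream sorter behaviour in the abstract above can be modeled roughly as follows. The page size, thresholds, and names are illustrative assumptions, and a single insertion-ordered table stands in for the patent's multiple sets; the two flush triggers (an entry accumulating enough notifications, and the structure filling up so the oldest entry is flushed) are the part being demonstrated.

```python
from collections import OrderedDict

PAGE_SIZE = 1024         # bytes per DRAM bank page (assumed)
NOTIFY_THRESHOLD = 4     # flush an entry after this many notifications (assumed)
MAX_ENTRIES = 8          # flush the oldest entry when the table is full (assumed)

class PageStreamSorter:
    """Toy model: group dirty-data notifications by DRAM bank page."""

    def __init__(self):
        # OrderedDict preserves insertion order, so the first key is the oldest entry.
        self.entries = OrderedDict()   # bank page -> list of dirty addresses

    def notify(self, address):
        """Record a dirty-data notification; return any addresses to write to DRAM."""
        page = address // PAGE_SIZE
        self.entries.setdefault(page, []).append(address)
        if len(self.entries[page]) >= NOTIFY_THRESHOLD:
            return self.entries.pop(page)          # entry reached its threshold
        if len(self.entries) > MAX_ENTRIES:
            _, oldest = self.entries.popitem(last=False)
            return oldest                          # table full: flush the oldest entry
        return []
```

Batching writebacks per bank page like this is what lets the frame buffer logic turn scattered dirty lines into sequential DRAM traffic within an open bank page.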

    Configurable cache occupancy policy
    7. Granted patent (in force)

    Publication No.: US08131931B1

    Publication date: 2012-03-06

    Application No.: US12256378

    Filing date: 2008-10-22

    IPC class: G06F12/00

    CPC class: G06F12/121

    Abstract: One embodiment of the invention is a method for evicting data from an intermediary cache that includes the steps of receiving a command from a client, determining that there is a cache miss relative to the intermediary cache, identifying one or more cache lines within the intermediary cache to store data associated with the command, determining whether any of the data residing in the one or more cache lines includes raster operations data or normal data, and causing the data residing in the one or more cache lines to be evicted or stalling the command based, at least in part, on whether the data includes raster operations data or normal data. Advantageously, the method allows a series of cache eviction policies based on how cached data is categorized and the eviction classes of the data. Consequently, more optimized eviction decisions may be made, leading to fewer command stalls and improved performance.

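The evict-or-stall decision described in the abstract above can be sketched as a small policy function. The policy table here is an assumption for demonstration (raster operations data evicted first, then clean lines, otherwise stall); the patent covers a configurable family of such policies, not this specific one.

```python
from dataclasses import dataclass

@dataclass
class CacheLine:
    tag: int
    kind: str        # "rop" (raster operations data) or "normal"
    dirty: bool

def handle_miss(candidates, prefer_evict_kind="rop"):
    """Pick a victim line on a cache miss, or signal that the command must stall.

    Returns ("evict", line) when a line can be dropped, else ("stall", None).
    """
    # First choice: a line of the kind the policy says to evict first.
    for line in candidates:
        if line.kind == prefer_evict_kind:
            return ("evict", line)
    # Otherwise, a clean line can be dropped without a writeback.
    for line in candidates:
        if not line.dirty:
            return ("evict", line)
    # All candidates are dirty normal data: stall the command instead.
    return ("stall", None)
```

Categorizing lines by data kind at eviction time is what allows low-reuse data (such as ROP output) to be preferentially displaced, keeping higher-value data resident and reducing command stalls.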

    System, method and frame buffer logic for evicting dirty data from a cache using counters and data types
    8. Granted patent (in force)

    Publication No.: US08060700B1

    Publication date: 2011-11-15

    Application No.: US12330469

    Filing date: 2008-12-08

    IPC class: G06F13/00 G06F12/12

    Abstract: A system and method for cleaning dirty data in an intermediate cache are disclosed. A dirty data notification, including a memory address and a data class, is transmitted by a level 2 (L2) cache to frame buffer logic when dirty data is stored in the L2 cache. The data classes include evict first, evict normal and evict last. In one embodiment, data belonging to the evict first data class is raster operations data with little reuse potential. The frame buffer logic uses a notification sorter to organize dirty data notifications, where an entry in the notification sorter stores the DRAM bank page number, a first count of cache lines that have resident dirty data and a second count of cache lines that have resident evict_first dirty data associated with that DRAM bank. The frame buffer logic transmits dirty data associated with an entry when the first count reaches a threshold.

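The counter-per-entry bookkeeping in the abstract above can be modeled loosely as follows: each entry tracks, for one DRAM bank page, how many cache lines hold dirty data and how many of those are evict_first data, and the page is cleaned when the first counter reaches a threshold. The threshold value and names are made up for the example.

```python
CLEAN_THRESHOLD = 3   # clean a bank page after this many dirty lines (assumed)

class NotificationSorter:
    """Toy model: per-bank-page counters driving cache-clean decisions."""

    def __init__(self):
        # bank page -> [dirty line count, evict_first dirty line count]
        self.entries = {}

    def notify(self, page, data_class):
        """Record one dirty-data notification; return True when the page should be cleaned."""
        counts = self.entries.setdefault(page, [0, 0])
        counts[0] += 1                      # first count: any resident dirty line
        if data_class == "evict_first":
            counts[1] += 1                  # second count: evict_first dirty lines
        if counts[0] >= CLEAN_THRESHOLD:
            del self.entries[page]
            return True                     # frame buffer logic writes this page out
        return False
```

Tracking evict_first lines in a separate counter lets the logic distinguish pages that are mostly low-reuse ROP data, which can be written out and dropped more aggressively than pages holding data likely to be reused.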

    Memory addressing controlled by PTE fields
    9. Granted patent (in force)

    Publication No.: US07805587B1

    Publication date: 2010-09-28

    Application No.: US11555628

    Filing date: 2006-11-01

    IPC class: G06F9/34 G06F12/00

    CPC class: G06F12/10 G06F12/0607

    Abstract: Embodiments of the present invention enable virtual-to-physical memory address translation using optimized bank and partition interleave patterns to improve memory bandwidth by distributing data accesses over multiple banks and multiple partitions. Each virtual page has a corresponding page table entry that specifies the physical address of the virtual page in linear physical address space. The page table entry also includes a data kind field that is used to guide and optimize the mapping process from the linear physical address space to the DRAM physical address space, which is used to directly access one or more DRAM. The DRAM physical address space includes a row, bank and column address. The data kind field is also used to optimize the starting partition number and partition interleave pattern that defines the organization of the selected physical page of memory within the DRAM memory system.

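A toy illustration of the mapping the abstract above describes: a data kind field carried in the page table entry steers how a linear physical address is decomposed into DRAM row, bank, and column, so different data kinds get different bank interleave patterns. The field widths, kind names, and the XOR swizzle are invented for the example; the patent's actual mappings are not specified here.

```python
NUM_BANKS = 8        # DRAM banks (assumed)
COLS_PER_ROW = 256   # columns per bank row (assumed)

def linear_to_dram(addr, kind):
    """Decompose a linear physical address into (row, bank, column).

    The hypothetical "pitch" kind XORs a row-derived value into the bank
    bits so that vertically adjacent data lands in different banks.
    """
    col = addr % COLS_PER_ROW
    bank = (addr // COLS_PER_ROW) % NUM_BANKS
    row = addr // (COLS_PER_ROW * NUM_BANKS)
    if kind == "pitch":
        bank ^= row % NUM_BANKS   # kind-dependent bank swizzle
    return row, bank, col
```

The payoff of a kind-dependent swizzle: two addresses exactly one row apart map to the same bank under the plain decomposition, but to different banks under the swizzled one, avoiding a bank conflict for access patterns that stride by rows.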

    Mapping memory partitions to virtual memory pages
    10. Granted patent (in force)

    Publication No.: US07620793B1

    Publication date: 2009-11-17

    Application No.: US11467679

    Filing date: 2006-08-28

    Abstract: Systems and methods for addressing memory using non-power-of-two virtual memory page sizes improve graphics memory bandwidth by distributing graphics data for efficient access during rendering. Various partition strides may be selected for each virtual memory page to modify the number of sequential addresses mapped to each physical memory partition and change the interleaving granularity. The addressing scheme allows for modification of a bank interleave pattern for each virtual memory page to reduce bank conflicts and improve memory bandwidth utilization. The addressing scheme also allows for modification of a partition interleave pattern for each virtual memory page to distribute accesses amongst multiple partitions and improve memory bandwidth utilization.

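The per-page partition stride described in the abstract above can be sketched as a one-line mapping: the stride chosen for a virtual page sets how many consecutive bytes go to one partition before the next partition is used, so a fine stride spreads a small region over many partitions while a coarse stride keeps larger blocks together. The partition count and stride values are illustrative only.

```python
NUM_PARTITIONS = 6   # non-power-of-two partition counts are the point (assumed value)

def partition_for(addr, stride):
    """Which memory partition a physical byte address falls in,
    given the partition stride selected for its virtual page."""
    return (addr // stride) % NUM_PARTITIONS
```

For example, with a 256-byte stride, consecutive 256-byte blocks rotate through all six partitions, so a render target tile touches every partition; with a 1024-byte stride, each kilobyte stays within one partition, which suits large sequential transfers.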