Patent search ap:("NVIDIA CORPORATION") AND inv:"Brian FAHS" Page 2

11.

Invention application
REPLAYING MEMORY TRANSACTIONS WHILE RESOLVING MEMORY ACCESS FAULTS (Status: in force)

Publication number: US20170161206A1

Publication date: 2017-06-08

Application number: US15437400

Filing date: 2017-02-20

Applicant: NVIDIA Corporation

Inventor: James Leroy DEMING , Jerome F. DULUK, JR. , John MASHEY , Mark HAIRGROVE , Lucien DUNNING , Jonathon Stuart Ramsey EVANS , Samuel H. DUNCAN , Cameron BUSCHARDT , Brian FAHS

IPC: G06F12/1027 , G06F9/46

CPC classification number: G06F12/1027 , G06F9/467 , G06F12/08 , G06F2212/301 , G06F2212/684

Abstract: One embodiment of the present invention is a parallel processing unit (PPU) that includes one or more streaming multiprocessors (SMs) and implements a replay unit per SM. Upon detecting a page fault associated with a memory transaction issued by a particular SM, the corresponding replay unit causes the SM, but not any unaffected SMs, to cease issuing new memory transactions. The replay unit then stores the faulting memory transaction and any faulting in-flight memory transaction in a replay buffer. As page faults are resolved, the replay unit replays the memory transactions in the replay buffer—removing successful memory transactions from the replay buffer—until all of the stored memory transactions have successfully executed. Advantageously, the overall performance of the PPU is improved compared to conventional PPUs that, upon detecting a page fault, stop performing memory transactions across all SMs included in the PPU until the fault is resolved.
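The per-SM replay flow described in this abstract can be illustrated with a minimal Python sketch. It is not the patented hardware design: the `ReplayUnit` class, its method names, and the modelling of a memory transaction as a bare page number are all assumptions made for illustration.

```python
from collections import deque


class ReplayUnit:
    """Hypothetical per-SM replay unit (illustrative names, not from the patent)."""

    def __init__(self):
        self.replay_buffer = deque()  # faulting transactions awaiting replay
        self.issue_stalled = False    # True while this SM may not issue new transactions

    def on_fault(self, faulting, in_flight_faulting=()):
        """Stall only this SM and store the faulting transactions."""
        self.issue_stalled = True
        self.replay_buffer.append(faulting)
        self.replay_buffer.extend(in_flight_faulting)

    def replay(self, resident_pages):
        """Replay buffered transactions; keep only those that still fault.

        Assumption: a transaction is a page number and succeeds once that
        page is in `resident_pages`. Returns True when the SM may resume.
        """
        self.replay_buffer = deque(
            page for page in self.replay_buffer if page not in resident_pages
        )
        if not self.replay_buffer:
            self.issue_stalled = False
        return not self.issue_stalled
```

With one `ReplayUnit` per SM, a fault on SM 0 sets `issue_stalled` only on that unit, which mirrors the abstract's point that unaffected SMs keep issuing memory transactions.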

12.

Invention application
COOPERATIVE THREAD ARRAY REDUCTION AND SCAN OPERATIONS (Status: in force)

Publication number: US20160357560A1

Publication date: 2016-12-08

Application number: US15238428

Filing date: 2016-08-16

Applicant: NVIDIA Corporation

Inventor: Brian FAHS , Ming Y. SIU , Brett W. COON , John R. NICKOLLS , Lars NYLAND

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/522 , G06F8/458 , G06F9/3004 , G06F9/30087 , G06F9/30145 , G06F9/3851

Abstract: One embodiment of the present invention sets forth a technique for performing aggregation operations across multiple threads that execute independently. Aggregation is specified as part of a barrier synchronization or barrier arrival instruction, where in addition to performing the barrier synchronization or arrival, the instruction aggregates (using reduction or scan operations) values supplied by each thread. When a thread executes the barrier aggregation instruction the thread contributes to a scan or reduction result, and waits to execute any more instructions until after all of the threads have executed the barrier aggregation instruction. A reduction result is communicated to each thread after all of the threads have executed the barrier aggregation instruction and a scan result is communicated to each thread as the barrier aggregation instruction is executed by the thread.
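The difference between the two aggregation results can be shown with a small software analogue using Python's `threading.Barrier`. This is only a sketch: the patent describes a hardware instruction, and the `AggregatingBarrier` class, its single-use restriction, and the choice of addition as the operator are assumptions. The scan value is fixed at the moment a thread arrives, while the reduction value is only read after every thread has arrived.

```python
import threading


class AggregatingBarrier:
    """Hypothetical single-use barrier that also sums the values supplied by each thread."""

    def __init__(self, n_threads):
        self._barrier = threading.Barrier(n_threads)
        self._lock = threading.Lock()
        self._running = 0

    def arrive_and_aggregate(self, value):
        # Scan: determined as this thread executes the instruction
        # (inclusive prefix sum in arrival order).
        with self._lock:
            self._running += value
            scan = self._running
        # Wait until all threads have contributed.
        self._barrier.wait()
        # Reduction: the full total, valid only after the barrier releases.
        return scan, self._running


def run_threads(values):
    """Run one thread per value and return a (scan, reduction) pair per thread."""
    barrier = AggregatingBarrier(len(values))
    results = [None] * len(values)

    def worker(i):
        results[i] = barrier.arrive_and_aggregate(values[i])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(values))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_threads([1, 2, 3, 4])` gives every thread the same reduction value, while the scan values depend on the order in which the threads happened to arrive.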

13.

Invention application
MIGRATING PAGES OF DIFFERENT SIZES BETWEEN HETEROGENEOUS PROCESSORS (Status: in force)

Publication number: US20160357482A1

Publication date: 2016-12-08

Application number: US15243909

Filing date: 2016-08-22

Applicant: NVIDIA Corporation

Inventor: Jerome F. DULUK, JR. , Cameron BUSCHARDT , James Leroy DEMING , Lucien DUNNING , Brian FAHS , Mark HAIRGROVE , Chenghuan JIA , John MASHEY , James M. VAN DYKE

IPC: G06F3/06 , G06F12/1009

CPC classification number: G06F3/0647 , G06F3/061 , G06F3/0655 , G06F3/0683 , G06F12/08 , G06F12/1009 , G06F12/122 , G06F2212/652

Abstract: One embodiment of the present invention sets forth a computer-implemented method for migrating a memory page from a first memory to a second memory. The method includes determining a first page size supported by the first memory. The method also includes determining a second page size supported by the second memory. The method further includes determining a use history of the memory page based on an entry in a page state directory associated with the memory page. The method also includes migrating the memory page between the first memory and the second memory based on the first page size, the second page size, and the use history.

14.

Invention application
FRAME BUFFER ACCESS TRACKING VIA A SLIDING WINDOW IN A UNIFIED VIRTUAL MEMORY SYSTEM (Status: in force)

Publication number: US20140281365A1

Publication date: 2014-09-18

Application number: US14105015

Filing date: 2013-12-12

Applicant: NVIDIA CORPORATION

Inventor: John MASHEY , Cameron BUSCHARDT , James Leroy DEMING , Jerome F. DULUK, JR. , Brian FAHS

IPC: G06F12/10

CPC classification number: G06F3/0622 , G06F3/0631 , G06F3/0647 , G06F3/0685 , G06F12/1009 , G06F12/1027 , G06F2212/656 , G06F2212/684

Abstract: One embodiment of the present invention is a memory subsystem that includes a sliding window tracker that tracks memory accesses associated with a sliding window of memory page groups. When the sliding window tracker detects an access operation associated with a memory page group within the sliding window, the sliding window tracker sets a reference bit that is associated with the memory page group and is included in a reference vector that represents accesses to the memory page groups within the sliding window. Based on the values of the reference bits, the sliding window tracker causes the selection a memory page in a memory page group that has fallen into disuse from a first memory to a second memory. Because the sliding window tracker tunes the memory pages that are resident in the first memory to reflect memory access patterns, the overall performance of the memory subsystem is improved.
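The reference-vector bookkeeping in this abstract can be sketched in Python as follows. The `SlidingWindowTracker` class, its parameters, and the rule that a page group with a clear reference bit counts as disused are assumptions for illustration; the abstract does not specify how migration candidates are chosen from the bits.

```python
class SlidingWindowTracker:
    """Hypothetical tracker with one reference bit per page group in the window."""

    def __init__(self, window_start, window_groups, pages_per_group):
        self.window_start = window_start          # index of the first page group in the window
        self.pages_per_group = pages_per_group
        self.reference = [False] * window_groups  # the reference vector

    def record_access(self, page):
        """Set the reference bit if the accessed page's group lies inside the window."""
        index = page // self.pages_per_group - self.window_start
        if 0 <= index < len(self.reference):
            self.reference[index] = True

    def disused_groups(self):
        """Page groups in the window with a clear reference bit (migration candidates)."""
        return [self.window_start + i for i, bit in enumerate(self.reference) if not bit]

    def slide(self, new_start):
        """Move the window and clear the reference vector (assumed behaviour)."""
        self.window_start = new_start
        self.reference = [False] * len(self.reference)
```

Accesses that fall outside the window are ignored, so the vector stays a fixed size regardless of how large the tracked memory is.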

15.

Invention application
MIGRATION COUNTERS FOR HYBRID MEMORIES IN A UNIFIED VIRTUAL MEMORY SYSTEM (Status: in force)

Publication number: US20140281264A1

Publication date: 2014-09-18

Application number: US14133488

Filing date: 2013-12-18

Applicant: NVIDIA CORPORATION

Inventor: Jerome F. DULUK, JR. , Cameron BUSCHARDT , James Leroy DEMING , Brian FAHS

IPC: G06F12/08

CPC classification number: G06F12/08 , G06F11/3037 , G06F11/3442 , G06F11/3471 , G06F2201/81 , G06F2201/815 , G06F2201/88 , G06F2212/205

Abstract: Embodiments of the approaches disclosed herein include a subsystem that includes an access tracking mechanism configured to monitor access operations directed to a first memory and a second memory. The access tracking mechanism detects an access operation generated by a processor for accessing a first memory page residing on the second memory. The access tracking mechanism further determines that the first memory page is included in a first subset of memory pages residing on the second memory. The access tracking mechanism further locates, within a reference vector, a reference bit that corresponds to the first memory page, and sets the reference bit. One advantage of the present invention is that memory pages in a hybrid system migrate as needed to increase overall memory performance.

16.

Invention application
PCIE TRAFFIC TRACKING HARDWARE IN A UNIFIED VIRTUAL MEMORY SYSTEM (Status: in force)

Publication number: US20140281110A1

Publication date: 2014-09-18

Application number: US14101246

Filing date: 2013-12-09

Applicant: NVIDIA CORPORATION

Inventor: Jerome F. DULUK, JR. , Cameron BUSCHARDT , James Leroy DEMING , Brian FAHS , Mark HAIRGROVE , John MASHEY

IPC: G06F12/08 , G06F13/40

CPC classification number: G06F13/404 , G06F12/0806 , G06F12/0864 , G06F12/0875 , G06F12/123 , G06F2212/1016 , G06F2212/1041

Abstract: Techniques are disclosed for tracking memory page accesses in a unified virtual memory system. An access tracking unit detects a memory page access generated by a first processor for accessing a memory page in a memory system of a second processor. The access tracking unit determines whether a cache memory includes an entry for the memory page. If so, then the access tracking unit increments an associated access counter. Otherwise, the access tracking unit attempts to find an unused entry in the cache memory that is available for allocation. If so, then the access tracking unit associates the second entry with the memory page, and sets an access counter associated with the second entry to an initial value. Otherwise, the access tracking unit selects a valid entry in the cache memory; clears an associated valid bit; associates the entry with the memory page; and initializes an associated access counter.
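The three-way decision in this abstract (hit, allocate an unused entry, or recycle a valid entry) can be sketched in Python. The `AccessTracker` class, the entry layout, the default initial count of 1, and the round-robin choice of which valid entry to recycle are assumptions; the abstract does not say how the valid entry is selected.

```python
class AccessTracker:
    """Hypothetical access-counter cache following the hit / allocate / recycle steps."""

    def __init__(self, n_entries, initial_count=1):
        self.entries = [{"valid": False, "page": None, "count": 0} for _ in range(n_entries)]
        self.initial_count = initial_count
        self._next_victim = 0  # round-robin pointer (an assumption, not from the abstract)

    def _fill(self, entry, page):
        entry.update(valid=True, page=page, count=self.initial_count)

    def record_access(self, page):
        # 1. The cache already has an entry for this page: increment its counter.
        for entry in self.entries:
            if entry["valid"] and entry["page"] == page:
                entry["count"] += 1
                return
        # 2. Otherwise use an unused entry, if one is available.
        for entry in self.entries:
            if not entry["valid"]:
                self._fill(entry, page)
                return
        # 3. Otherwise select a valid entry, clear its valid bit, and reuse it.
        victim = self.entries[self._next_victim]
        self._next_victim = (self._next_victim + 1) % len(self.entries)
        victim["valid"] = False
        self._fill(victim, page)

    def count(self, page):
        """Current counter for a tracked page, or 0 if the page is not tracked."""
        for entry in self.entries:
            if entry["valid"] and entry["page"] == page:
                return entry["count"]
        return 0
```

Recycling an entry discards the count accumulated for the evicted page, so a small cache trades accuracy for hardware cost.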

17.

Invention application
COOPERATIVE THREAD ARRAY REDUCTION AND SCAN OPERATIONS (Status: in force)

Publication number: US20140019724A1

Publication date: 2014-01-16

Application number: US14025482

Filing date: 2013-09-12

Applicant: NVIDIA Corporation

Inventor: Brian FAHS , Ming Y. SIU , Brett W. COON , John R. NICKOLLS , Lars NYLAND

IPC: G06F9/30

CPC classification number: G06F9/522 , G06F8/458 , G06F9/3004 , G06F9/30087 , G06F9/30145 , G06F9/3851

Abstract: One embodiment of the present invention sets forth a technique for performing aggregation operations across multiple threads that execute independently. Aggregation is specified as part of a barrier synchronization or barrier arrival instruction, where in addition to performing the barrier synchronization or arrival, the instruction aggregates (using reduction or scan operations) values supplied by each thread. When a thread executes the barrier aggregation instruction the thread contributes to a scan or reduction result, and waits to execute any more instructions until after all of the threads have executed the barrier aggregation instruction. A reduction result is communicated to each thread after all of the threads have executed the barrier aggregation instruction and a scan result is communicated to each thread as the barrier aggregation instruction is executed by the thread.

18.

Invention application
PCIE TRAFFIC TRACKING HARDWARE IN A UNIFIED VIRTUAL MEMORY SYSTEM (Status: under examination, published)

Publication number: US20190340145A1

Publication date: 2019-11-07

Application number: US16450830

Filing date: 2019-06-24

Applicant: NVIDIA CORPORATION

Inventor: Jerome F. DULUK, JR. , Cameron BUSCHARDT , James Leroy DEMING , Brian FAHS , Mark HAIRGROVE , John MASHEY

IPC: G06F13/40 , G06F12/123

Abstract: Techniques are disclosed for tracking memory page accesses in a unified virtual memory system. An access tracking unit detects a memory page access generated by a first processor for accessing a memory page in a memory system of a second processor. The access tracking unit determines whether a cache memory includes an entry for the memory page. If so, then the access tracking unit increments an associated access counter. Otherwise, the access tracking unit attempts to find an unused entry in the cache memory that is available for allocation. If so, then the access tracking unit associates the second entry with the memory page, and sets an access counter associated with the second entry to an initial value. Otherwise, the access tracking unit selects a valid entry in the cache memory; clears an associated valid bit; associates the entry with the memory page; and initializes an associated access counter.

19.

Invention application
FRAME BUFFER ACCESS TRACKING VIA A SLIDING WINDOW IN A UNIFIED VIRTUAL MEMORY SYSTEM (Status: under examination, published)

Publication number: US20170199689A1

Publication date: 2017-07-13

Application number: US15169532

Filing date: 2016-05-31

Applicant: NVIDIA Corporation

Inventor: John MASHEY , Cameron BUSCHARDT , James Leroy DEMING , Jerome F. DULUK, JR. , Brian FAHS

IPC: G06F3/06 , G06F12/1009 , G06F12/1027

CPC classification number: G06F3/0622 , G06F3/0631 , G06F3/0647 , G06F3/0685 , G06F12/1009 , G06F12/1027 , G06F2212/656 , G06F2212/684

Abstract: One embodiment of the present invention is a memory subsystem that includes a sliding window tracker that tracks memory accesses associated with a sliding window of memory page groups. When the sliding window tracker detects an access operation associated with a memory page group within the sliding window, the sliding window tracker sets a reference bit that is associated with the memory page group and is included in a reference vector that represents accesses to the memory page groups within the sliding window. Based on the values of the reference bits, the sliding window tracker causes the selection a memory page in a memory page group that has fallen into disuse from a first memory to a second memory. Because the sliding window tracker tunes the memory pages that are resident in the first memory to reflect memory access patterns, the overall performance of the memory subsystem is improved.

20.

Invention application
MICROCONTROLLER FOR MEMORY MANAGEMENT UNIT (Status: in force)

Publication number: US20140281356A1

Publication date: 2014-09-18

Application number: US14011655

Filing date: 2013-08-27

Applicant: NVIDIA CORPORATION

Inventor: Cameron BUSCHARDT , Jerome F. DULUK, JR. , John MASHEY , Mark HAIRGROVE , James Leroy DEMING , Brian FAHS

IPC: G06F12/10

CPC classification number: G06F12/1009 , G06F2212/301

Abstract: One embodiment of the present invention includes a microcontroller coupled to a memory management unit (MMU). The MMU is coupled to a page table included in a physical memory, and the microcontroller is configured to perform one or more virtual memory operations associated with the physical memory and the page table. In operation, the microcontroller receives a page fault generated by the MMU in response to an invalid memory access via a virtual memory address. To remedy such a page fault, the microcontroller performs actions to map the virtual memory address to an appropriate location in the physical memory. By contrast, in prior-art systems, a fault handler would typically remedy the page fault. Advantageously, because the microcontroller executes these tasks locally with respect to the MMU and the physical memory, latency associated with remedying page faults may be decreased. Consequently, overall system performance may be increased.
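The fault-then-map sequence in this abstract can be sketched in Python. The `MMU`, `Microcontroller`, and `access` names, the 4096-byte page size, and the free-frame list are all assumptions for illustration; the abstract does not describe how the microcontroller picks the physical location.

```python
PAGE_SIZE = 4096  # assumed page size for the sketch


class PageFault(Exception):
    def __init__(self, vpage):
        super().__init__(f"no mapping for virtual page {vpage}")
        self.vpage = vpage


class MMU:
    """Translates virtual addresses through a shared page table (a plain dict here)."""

    def __init__(self, page_table):
        self.page_table = page_table

    def translate(self, vaddr):
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        if vpage not in self.page_table:
            raise PageFault(vpage)  # invalid access: signal a page fault
        return self.page_table[vpage] * PAGE_SIZE + offset


class Microcontroller:
    """Hypothetical fault handler sitting next to the MMU and the page table."""

    def __init__(self, page_table, free_frames):
        self.page_table = page_table
        self.free_frames = list(free_frames)

    def handle(self, fault):
        # Map the faulting virtual page to a free physical frame.
        self.page_table[fault.vpage] = self.free_frames.pop()


def access(mmu, microcontroller, vaddr):
    """Translate an address, letting the microcontroller resolve a page fault once."""
    try:
        return mmu.translate(vaddr)
    except PageFault as fault:
        microcontroller.handle(fault)
        return mmu.translate(vaddr)
```

Both objects share the same page table, which stands in for the abstract's point that the microcontroller works locally with respect to the MMU and physical memory.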
