-
Publication No.: US11768686B2
Publication Date: 2023-09-26
Application No.: US16940363
Application Date: 2020-07-27
Applicant: NVIDIA Corporation
Inventor: Michael A Fetterman , Mark Gebhart , Shirish Gadre , Mitchell Hayenga , Steven Heinrich , Ramesh Jandhyala , Raghavan Madhavan , Omkar Paranjape , James Robertson , Jeff Schottmiller
IPC: G06F9/38 , G06F9/30 , G06F12/084 , G06F12/0873 , G06F9/54 , G06F12/0842 , G06F12/0846 , G06F5/06
CPC classification number: G06F9/3836 , G06F5/065 , G06F9/30047 , G06F9/3867 , G06F9/546 , G06F12/084 , G06F12/0842 , G06F12/0846 , G06F12/0873 , G06F2212/1021
Abstract: In a streaming cache, multiple, dynamically sized tracking queues are employed. Request tracking information is distributed among the plural tracking queues to selectively enable out-of-order memory request returns. A dynamically controlled policy assigns pending requests to tracking queues, providing, for example, in-order memory returns in some contexts and/or for some traffic, and out-of-order memory returns in other contexts and/or for other traffic.
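As a rough illustration of the idea in this abstract, the following C++ sketch (my own simplification, not the patented hardware; names such as TrackingQueues and TrafficClass are assumptions) shows one way a policy could pin ordering-sensitive traffic to a single queue while spreading other traffic across queues so it can retire out of order.

```cpp
// Minimal sketch (not the patented design): request tracking with multiple
// queues, where a policy decides whether a traffic class must return in order.
#include <cstddef>
#include <cstdio>
#include <deque>
#include <vector>

enum class TrafficClass { InOrder, OutOfOrder };

struct PendingRequest {
    int id;            // identifies the memory request
    TrafficClass cls;  // traffic type used by the assignment policy
};

class TrackingQueues {
public:
    explicit TrackingQueues(std::size_t count) : queues_(count) {}

    // Policy: in-order traffic always maps to queue 0, so its returns pop in
    // FIFO order; other traffic is spread round-robin so it may complete out
    // of order relative to requests sitting in different queues.
    // (Assumes at least two queues.)
    void track(const PendingRequest& req) {
        std::size_t q = (req.cls == TrafficClass::InOrder)
                            ? 0
                            : 1 + (next_++ % (queues_.size() - 1));
        queues_[q].push_back(req);
    }

    // Retire whatever request is at the head of the given queue, if any.
    bool retire(std::size_t q, PendingRequest& out) {
        if (q >= queues_.size() || queues_[q].empty()) return false;
        out = queues_[q].front();
        queues_[q].pop_front();
        return true;
    }

private:
    std::vector<std::deque<PendingRequest>> queues_;  // dynamically sized set
    std::size_t next_ = 0;                            // round-robin cursor
};

int main() {
    TrackingQueues tq(3);
    tq.track({0, TrafficClass::InOrder});
    tq.track({1, TrafficClass::OutOfOrder});
    tq.track({2, TrafficClass::InOrder});

    PendingRequest r;
    while (tq.retire(0, r)) std::printf("in-order return: %d\n", r.id);
    while (tq.retire(1, r)) std::printf("out-of-order return: %d\n", r.id);
    return 0;
}
```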
-
Publication No.: US10705994B2
Publication Date: 2020-07-07
Application No.: US15587213
Application Date: 2017-05-04
Applicant: NVIDIA Corporation
Inventor: Xiaogang Qiu , Ronny Krashinsky , Steven Heinrich , Shirish Gadre , John Edmondson , Jack Choquette , Mark Gebhart , Ramesh Jandhyala , Poornachandra Rao , Omkar Paranjape , Michael Siu
IPC: G06F12/084 , G06F13/28 , G06F12/0891 , G06F12/0811 , G06F12/0895 , G06F12/122 , G11C7/10
Abstract: A unified cache subsystem includes a data memory configured as both a shared memory and a local cache memory. The unified cache subsystem processes different types of memory transactions using different data pathways. To process memory transactions that target shared memory, the unified cache subsystem includes a direct pathway to the data memory. To process memory transactions that do not target shared memory, the unified cache subsystem includes a tag processing pipeline configured to identify cache hits and cache misses. When the tag processing pipeline identifies a cache hit for a given memory transaction, the transaction is rerouted to the direct pathway to data memory. When the tag processing pipeline identifies a cache miss for a given memory transaction, the transaction is pushed into a first-in first-out (FIFO) until miss data is returned from external memory. The tag processing pipeline is also configured to process texture-oriented memory transactions.
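A minimal software sketch of the data flow described above, under assumed names (UnifiedCache, Transaction) and ignoring the texture path: shared-memory transactions take a direct path into the data store, tag hits are rerouted to that same path, and misses wait in a FIFO until fill data arrives.

```cpp
// Minimal sketch (an illustration, not the actual hardware): one data store
// backing both shared memory and cache, a direct path for shared-memory
// transactions, and a tag check that reroutes hits to the direct path or
// parks misses in a FIFO until a simulated fill returns.
#include <cstdio>
#include <optional>
#include <queue>
#include <unordered_map>
#include <vector>

struct Transaction {
    unsigned addr;
    bool targets_shared;  // shared-memory transactions skip the tag pipeline
};

class UnifiedCache {
public:
    UnifiedCache() : data_(1024, 0) {}

    // Direct pathway: shared memory is just an offset into the data memory.
    int direct_read(unsigned addr) const { return data_[addr % data_.size()]; }

    // Tag pipeline: hits reroute to the direct pathway, misses wait in a FIFO.
    std::optional<int> access(const Transaction& t) {
        if (t.targets_shared) return direct_read(t.addr);
        auto it = tags_.find(t.addr);
        if (it != tags_.end()) return direct_read(it->second);  // cache hit
        miss_fifo_.push(t);                                     // cache miss
        return std::nullopt;
    }

    // Simulated fill from external memory: the oldest miss gets a line and data.
    void fill(int value) {
        if (miss_fifo_.empty()) return;
        Transaction t = miss_fifo_.front();
        miss_fifo_.pop();
        unsigned line = t.addr % data_.size();
        data_[line] = value;
        tags_[t.addr] = line;
    }

private:
    std::vector<int> data_;                        // shared memory + cache lines
    std::unordered_map<unsigned, unsigned> tags_;  // addr -> line (tag store)
    std::queue<Transaction> miss_fifo_;            // pending misses
};

int main() {
    UnifiedCache c;
    if (!c.access({0x40, false})) std::puts("miss queued");
    c.fill(7);
    std::printf("hit after fill: %d\n", *c.access({0x40, false}));
    return 0;
}
```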
-
Publication No.: US20150046662A1
Publication Date: 2015-02-12
Application No.: US13960719
Application Date: 2013-08-06
Applicant: NVIDIA Corporation
Inventor: Steven James Heinrich , Ramesh Jandhyala , Bengt-Olaf Schneider
CPC classification number: G06F13/1621 , G06F12/00 , G06F13/1626 , Y02D10/14
Abstract: A system, method, and computer program product are provided for coalescing memory access requests. A plurality of memory access requests is received in a thread execution order, and a portion of the memory access requests is coalesced into memory order, where the memory access requests included in the portion are generated by threads in a thread block. A memory operation is generated and transmitted to a memory system, where the memory operation represents the coalesced portion of memory access requests.
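The following sketch illustrates the coalescing idea in plain C++ (the types AccessRequest and MemoryOperation are hypothetical): requests arrive in thread-execution order, the ones belonging to a given thread block are gathered and sorted into memory (address) order, and a single combined operation is produced.

```cpp
// Minimal sketch of the coalescing idea (illustrative only): requests arrive
// in thread-execution order; those from one thread block are sorted into
// memory (address) order, and one combined operation is emitted per block.
#include <algorithm>
#include <cstdio>
#include <vector>

struct AccessRequest {
    int thread_id;
    int block_id;      // requests from the same thread block may be coalesced
    unsigned address;
};

struct MemoryOperation {
    int block_id;
    std::vector<unsigned> addresses;  // the coalesced portion, in memory order
};

MemoryOperation coalesce(const std::vector<AccessRequest>& reqs, int block_id) {
    MemoryOperation op{block_id, {}};
    for (const auto& r : reqs)
        if (r.block_id == block_id) op.addresses.push_back(r.address);
    std::sort(op.addresses.begin(), op.addresses.end());  // memory order
    return op;
}

int main() {
    // Requests listed in thread-execution order, not address order.
    std::vector<AccessRequest> in_thread_order = {
        {0, 0, 0x120}, {1, 0, 0x100}, {2, 1, 0x200}, {3, 0, 0x110}};
    MemoryOperation op = coalesce(in_thread_order, 0);
    for (unsigned a : op.addresses) std::printf("0x%x\n", a);
    return 0;
}
```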
-
Publication No.: US11379944B2
Publication Date: 2022-07-05
Application No.: US16910029
Application Date: 2020-06-23
Applicant: NVIDIA CORPORATION
Inventor: Michael Fetterman , Shirish Gadre , Mark Gebhart , Steven J. Heinrich , Ramesh Jandhyala , William Newhall , Omkar Paranjape , Stefano Pescador , Poorna Rao
IPC: G06T1/20 , G06T1/60 , G06F16/245
Abstract: A texture processing pipeline in a graphics processing unit generates the surface appearance for objects in a computer-generated scene. This texture processing pipeline determines, at multiple stages within the texture processing pipeline, whether texture operations and texture loads may be processed at an accelerated rate. At each stage that includes a decision point, the texture processing pipeline assumes that the current texture operation or texture load can be accelerated unless specific, known information indicates that the texture operation or texture load cannot be accelerated. As a result, the texture processing pipeline increases the number of texture operations and texture loads that are accelerated relative to the number of texture operations and texture loads that are not accelerated.
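One way to picture the optimistic decision points is the hedged C++ sketch below (the disqualifying conditions are made up for illustration and are not the pipeline's real criteria): every stage starts from an "accelerate" default of true and may only clear the flag when it has specific information showing the operation cannot take the fast path.

```cpp
// Minimal sketch (assumed structure, not NVIDIA's pipeline): each decision
// point may only clear the acceleration flag, never speculatively set it.
#include <cstdio>
#include <functional>
#include <vector>

struct TextureOp {
    bool unsupported_format = false;    // example of disqualifying information
    bool crosses_tile_boundary = false; // another illustrative disqualifier
};

// Walk the decision points; the operation stays accelerated unless some stage
// has specific, known information that it cannot be accelerated.
bool can_accelerate(const TextureOp& op,
                    const std::vector<std::function<bool(const TextureOp&)>>& stages) {
    bool accelerate = true;  // optimistic default
    for (const auto& stage_disqualifies : stages)
        if (stage_disqualifies(op)) accelerate = false;
    return accelerate;
}

int main() {
    std::vector<std::function<bool(const TextureOp&)>> stages = {
        [](const TextureOp& op) { return op.unsupported_format; },
        [](const TextureOp& op) { return op.crosses_tile_boundary; },
    };
    std::printf("fast path: %s\n", can_accelerate({}, stages) ? "yes" : "no");
    std::printf("fast path: %s\n",
                can_accelerate({true, false}, stages) ? "yes" : "no");
    return 0;
}
```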
-
Publication No.: US09946666B2
Publication Date: 2018-04-17
Application No.: US13960719
Application Date: 2013-08-06
Applicant: NVIDIA Corporation
Inventor: Steven James Heinrich , Ramesh Jandhyala , Bengt-Olaf Schneider
CPC classification number: G06F13/1621 , G06F12/00 , G06F13/1626 , Y02D10/14
Abstract: A system, method, and computer program product are provided for coalescing memory access requests. A plurality of memory access requests is received in a thread execution order, and a portion of the memory access requests is coalesced into memory order, where the memory access requests included in the portion are generated by threads in a thread block. A memory operation is generated and transmitted to a memory system, where the memory operation represents the coalesced portion of memory access requests.
-
Publication No.: US11347668B2
Publication Date: 2022-05-31
Application No.: US16921795
Application Date: 2020-07-06
Applicant: NVIDIA Corporation
Inventor: Xiaogang Qiu , Ronny Krashinsky , Steven Heinrich , Shirish Gadre , John Edmondson , Jack Choquette , Mark Gebhart , Ramesh Jandhyala , Poornachandra Rao , Omkar Paranjape , Michael Siu
IPC: G06F13/28 , G06F12/0891 , G06F12/0811 , G06F12/084 , G06F12/0895 , G06F12/122 , G11C7/10
Abstract: A unified cache subsystem includes a data memory configured as both a shared memory and a local cache memory. The unified cache subsystem processes different types of memory transactions using different data pathways. To process memory transactions that target shared memory, the unified cache subsystem includes a direct pathway to the data memory. To process memory transactions that do not target shared memory, the unified cache subsystem includes a tag processing pipeline configured to identify cache hits and cache misses. When the tag processing pipeline identifies a cache hit for a given memory transaction, the transaction is rerouted to the direct pathway to data memory. When the tag processing pipeline identifies a cache miss for a given memory transaction, the transaction is pushed into a first-in first-out (FIFO) until miss data is returned from external memory. The tag processing pipeline is also configured to process texture-oriented memory transactions.
-
Publication No.: US10459861B2
Publication Date: 2019-10-29
Application No.: US15716461
Application Date: 2017-09-26
Applicant: NVIDIA Corporation
Inventor: Xiaogang Qiu , Ronny Krashinsky , Steven Heinrich , Shirish Gadre , John Edmondson , Jack Choquette , Mark Gebhart , Ramesh Jandhyala , Poornachandra Rao , Omkar Paranjape , Michael Siu
IPC: G06F12/02 , G06F13/28 , G06F12/0891 , G06F12/0811 , G06F12/084
Abstract: A unified cache subsystem includes a data memory configured as both a shared memory and a local cache memory. The unified cache subsystem processes different types of memory transactions using different data pathways. To process memory transactions that target shared memory, the unified cache subsystem includes a direct pathway to the data memory. To process memory transactions that do not target shared memory, the unified cache subsystem includes a tag processing pipeline configured to identify cache hits and cache misses. When the tag processing pipeline identifies a cache hit for a given memory transaction, the transaction is rerouted to the direct pathway to data memory. When the tag processing pipeline identifies a cache miss for a given memory transaction, the transaction is pushed into a first-in first-out (FIFO) until miss data is returned from external memory. The tag processing pipeline is also configured to process texture-oriented memory transactions.
-
Publication No.: US09595075B2
Publication Date: 2017-03-14
Application No.: US14038599
Application Date: 2013-09-26
Applicant: NVIDIA CORPORATION
Inventor: Steven J. Heinrich , Eric T. Anderson , Jeffrey A. Bolz , Jonathan Dunaisky , Ramesh Jandhyala , Joel McCormack , Alexander L. Minkin , Bryon S. Nordquist , Poornachandra Rao
CPC classification number: G06T1/60 , G06F2212/302 , G06T1/20 , G06T15/04 , G09G5/363
Abstract: Approaches are disclosed for performing memory access operations in a texture processing pipeline having a first portion configured to process texture memory access operations and a second portion configured to process non-texture memory access operations. A texture unit receives a memory access request. The texture unit determines whether the memory access request includes a texture memory access operation. If the memory access request includes a texture memory access operation, then the texture unit processes the memory access request via at least the first portion of the texture processing pipeline, otherwise, the texture unit processes the memory access request via at least the second portion of the texture processing pipeline. One advantage of the disclosed approach is that the same processing and cache memory may be used for both texture operations and load/store operations to various other address spaces, leading to reduced surface area and power consumption.
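A simple sketch of the routing decision described in this abstract (function and type names are illustrative assumptions, and the shared cache behind both portions is omitted): the texture unit checks whether a request includes a texture memory access operation and dispatches it to the corresponding portion of the pipeline.

```cpp
// Minimal sketch (illustrative names, not the actual unit): dispatch each
// request either to the texture portion or to the load/store portion of the
// pipeline; in the patent both portions share the same cache memory.
#include <cstdio>
#include <string>

enum class AccessKind { Texture, LoadStore };

struct MemoryAccessRequest {
    AccessKind kind;
    unsigned address;
};

// First portion: texture memory access operations.
std::string texture_portion(const MemoryAccessRequest& r) {
    return "texture portion handled address " + std::to_string(r.address);
}

// Second portion: non-texture (load/store) memory access operations.
std::string load_store_portion(const MemoryAccessRequest& r) {
    return "load/store portion handled address " + std::to_string(r.address);
}

// The texture unit determines whether the request is a texture operation and
// routes it to the matching portion of the pipeline.
std::string route(const MemoryAccessRequest& r) {
    return (r.kind == AccessKind::Texture) ? texture_portion(r)
                                           : load_store_portion(r);
}

int main() {
    std::printf("%s\n", route({AccessKind::Texture, 128}).c_str());
    std::printf("%s\n", route({AccessKind::LoadStore, 144}).c_str());
    return 0;
}
```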