-
Publication number: US10515011B2
Publication date: 2019-12-24
Application number: US14157159
Application date: 2014-01-16
Applicant: NVIDIA CORPORATION
Inventor: David B. Glasco , Peter B. Holmqvist , George R. Lynch , Patrick R. Marchand , Karan Mehra , James Roberts
IPC: G06F12/00 , G06F12/0813 , G06F12/0802 , G06F12/1009 , G06F12/1045 , G06F12/0875 , G06F12/128 , G06F12/08
Abstract: One embodiment of the present invention sets forth a technique for increasing available storage space within compressed blocks of memory attached to data processing chips, without requiring a proportional increase in on-chip compression status bits. A compression status bit cache provides on-chip availability of compression status bits used to determine how many bits are needed to access a potentially compressed block of memory. A backing store residing in a reserved region of attached memory provides storage for a complete set of compression status bits used to represent compression status of an arbitrarily large number of blocks residing in attached memory. Physical address remapping (“swizzling”) used to distribute memory access patterns over a plurality of physical memory devices is partially replicated by the compression status bit cache to efficiently integrate allocation and access of the backing store data with other user data.
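The relationship between the on-chip compression status bit cache and the backing store in attached memory can be illustrated with a small model. This is only a sketch under stated assumptions: the class name `StatusBitCache`, the one-entry-per-block granularity, and the LRU replacement policy are all hypothetical and are not taken from the patent, and the address swizzling step is not modeled.

```python
from collections import OrderedDict


class StatusBitCache:
    """Toy model: a small on-chip cache of compression status bits,
    backed by a complete table held in attached memory (assumed layout)."""

    def __init__(self, backing_store, capacity):
        # backing_store stands in for the reserved region of attached memory.
        self.backing_store = backing_store
        self.capacity = capacity
        self.lines = OrderedDict()  # block index -> status bits, LRU order
        self.misses = 0

    def status(self, block):
        """Return the compression status for a block, filling on a miss."""
        if block in self.lines:
            self.lines.move_to_end(block)  # mark as most recently used
            return self.lines[block]
        self.misses += 1
        value = self.backing_store[block]  # fetch from the backing store
        self.lines[block] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict least recently used entry
        return value
```

The point of the model is that the backing store can describe arbitrarily many blocks while the cache capacity stays fixed; only the miss count grows with the working set.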
-
Publication number: US10402323B2
Publication date: 2019-09-03
Application number: US14925920
Application date: 2015-10-28
Applicant: NVIDIA CORPORATION
Inventor: Praveen Krishnamurthy , Peter B. Holmquist , Wishwesh Gandhi , Timothy Purcell , Karan Mehra , Lacky Shah
IPC: G06F12/08 , G06F12/0802 , G06F3/06
Abstract: In one embodiment of the present invention, a cache unit organizes data stored in an attached memory to optimize accesses to compressed data. In operation, the cache unit introduces a layer of indirection between a physical address associated with a memory access request and groups of blocks in the attached memory. The layer of indirection (virtual tiles) enables the cache unit to selectively store, in a single physical tile, compressed data that would conventionally be stored in separate physical tiles included in a group of blocks. Because the cache unit stores compressed data associated with multiple physical tiles in a single physical tile and, more specifically, in adjacent locations within the single physical tile, the cache unit coalesces the compressed data into contiguous blocks. Subsequently, upon performing a read operation, the cache unit may retrieve the compressed data conventionally associated with separate physical tiles in a single read operation.
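The coalescing idea in this abstract can be sketched as packing several compressed payloads into adjacent locations of one physical tile, with a small directory recording where each virtual tile landed. The tile size, the function names `pack_group` and `read_group`, and the directory format are hypothetical choices made for illustration; the patent abstract does not specify them.

```python
TILE_SIZE = 128  # bytes per physical tile (hypothetical value)


def pack_group(compressed_tiles):
    """Pack the compressed payloads of several tiles into one physical tile.

    Returns (tile_bytes, directory), where directory maps a virtual tile
    index to (offset, length), or None if the payloads do not fit.
    """
    if sum(len(p) for p in compressed_tiles) > TILE_SIZE:
        return None  # fall back to separate physical tiles (not modeled)
    tile = bytearray(TILE_SIZE)
    directory = {}
    offset = 0
    for index, payload in enumerate(compressed_tiles):
        tile[offset:offset + len(payload)] = payload  # adjacent placement
        directory[index] = (offset, len(payload))
        offset += len(payload)
    return bytes(tile), directory


def read_group(tile, directory):
    """Recover every payload from a single read of the packed tile."""
    return [tile[off:off + n] for off, n in
            (directory[i] for i in sorted(directory))]
```

With four 32-byte payloads, one 128-byte tile read returns data that would otherwise require four separate tile reads.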
-
Publication number: US09639466B2
Publication date: 2017-05-02
Application number: US13664387
Application date: 2012-10-30
Applicant: NVIDIA Corporation
Inventor: James Patrick Robertson , Gregory Alan Muthler , Hemayet Hossain , Timothy John Purcell , Karan Mehra , Peter B. Holmqvist , George R. Lynch
IPC: G06F12/08 , G06F12/0804 , G06F12/084 , G06F12/0895 , G06F12/0868 , G06F12/0866
CPC classification number: G06F12/0804 , G06F12/084 , G06F12/0866 , G06F12/0868 , G06F12/0895
Abstract: One embodiment of the present invention sets forth a technique for processing commands received by an intermediary cache from one or more clients. The technique involves receiving a first write command from an arbiter unit, where the first write command specifies a first memory address, determining that a first cache line related to a set of cache lines included in the intermediary cache is associated with the first memory address, causing data associated with the first write command to be written into the first cache line, and marking the first cache line as dirty. The technique further involves determining whether a total number of cache lines marked as dirty in the set of cache lines is less than, equal to, or greater than a first threshold value, and: not transmitting a dirty data notification to the frame buffer logic when the total number is less than the threshold value, or transmitting a dirty data notification to the frame buffer logic when the total number is equal to or greater than the first threshold value.
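The threshold rule described above can be sketched as follows. The class name `CacheSet` and its interface are hypothetical; the sketch only models the dirty-line count within one set of cache lines and the decision to notify, not the arbiter or the frame buffer logic.

```python
class CacheSet:
    """Toy model of one set of cache lines with a dirty-line threshold."""

    def __init__(self, num_lines, threshold):
        self.dirty = [False] * num_lines
        self.threshold = threshold

    def write(self, line):
        """Mark a line dirty after a write.

        Returns True when a dirty data notification should be sent,
        i.e. when the number of dirty lines in the set is equal to or
        greater than the threshold, and False otherwise.
        """
        self.dirty[line] = True
        return sum(self.dirty) >= self.threshold
```

Writing repeatedly to the same line does not raise the count, so a set with threshold 2 stays silent until a second distinct line becomes dirty.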
-
Publication number: US09110809B2
Publication date: 2015-08-18
Application number: US13935414
Application date: 2013-07-03
Applicant: NVIDIA Corporation
Inventor: Peter B. Holmqvist , Karan Mehra , George R. Lynch , James Patrick Robertson , Gregory Alan Muthler , Wishwesh Anil Gandhi , Nick Barrow-Williams
CPC classification number: G06F12/0842 , G06F11/1004 , G06F12/0886 , G06F2212/1016 , G11C7/1006 , G11C7/1072
Abstract: A method for managing memory traffic includes causing first data to be written to a data cache memory, where a first write request comprises a partial write and writes the first data to a first portion of the data cache memory, and further includes tracking the number of partial writes in the data cache memory. The method further includes issuing a fill request for one or more partial writes in the data cache memory if the number of partial writes in the data cache memory is greater than a predetermined first threshold.
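The partial-write bookkeeping in this abstract can be sketched as a counter with a threshold. This is a simplified illustration under assumptions that are not in the source: the tracker counts cache lines holding partial writes rather than individual write requests, and it picks the oldest such line to fill. The name `PartialWriteTracker` is hypothetical.

```python
class PartialWriteTracker:
    """Toy model: track partially written lines and trigger fills."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.partial_lines = []  # lines holding partial writes, oldest first

    def partial_write(self, line):
        """Record a partial write to a line.

        Returns the line for which a fill request should be issued when
        the number of tracked partial writes exceeds the threshold,
        otherwise None.
        """
        if line not in self.partial_lines:
            self.partial_lines.append(line)
        if len(self.partial_lines) > self.threshold:
            return self.partial_lines.pop(0)  # oldest-first is an assumption
        return None
```

Deferring the fill until the threshold is exceeded gives later writes a chance to complete the line, which is how such a scheme could reduce read traffic to memory.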
-
Publication number: US10338820B2
Publication date: 2019-07-02
Application number: US15176082
Application date: 2016-06-07
Applicant: NVIDIA Corporation
Inventor: Rouslan Dimitrov , Jeff Pool , Praveen Krishnamurthy , Chris Amsinck , Karan Mehra , Scott Cutler
Abstract: A system architecture conserves memory bandwidth by including a compression utility to process data transfers from the cache into external memory. The cache decompresses transfers from external memory and transfers full format data to naive clients that lack decompression capability and directly transfers compressed data to savvy clients that include decompression capability. An improved compression algorithm includes software that computes the difference between the current data word and each of a number of prior data words. Software selects the prior data word with the smallest difference as the nearest match and encodes the bit width of the difference to this data word. Software then encodes the difference between the current stride and the closest previous stride. Software combines the stride, bit width, and difference to yield the final encoded data word. Software may encode the stride of one data word as a value relative to the stride of a previous data word.
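The encoding steps listed in this abstract can be sketched as a small encoder and decoder. The details below are assumptions made for illustration and are not taken from the patent: words are non-negative integers, the search window is four prior words, the stride is the distance back to the matched word, the first word is stored raw, and the bit width is that of the difference's magnitude with the sign carried separately. The names `encode` and `decode` are hypothetical.

```python
def encode(words, window=4):
    """Encode each word as (stride_delta, bit_width, difference)."""
    out = []
    prev_stride = 0
    for i, word in enumerate(words):
        if i == 0:
            # No prior word exists, so store the first word as-is (assumption).
            out.append((0, word.bit_length(), word))
            continue
        # Pick the prior word with the smallest absolute difference.
        candidates = range(1, min(window, i) + 1)
        stride = min(candidates, key=lambda s: abs(word - words[i - s]))
        diff = word - words[i - stride]
        # The stride is stored relative to the previous stride.
        out.append((stride - prev_stride, abs(diff).bit_length(), diff))
        prev_stride = stride
    return out


def decode(encoded):
    """Invert encode() by replaying strides and differences."""
    words = []
    prev_stride = 0
    for i, (stride_delta, _width, diff) in enumerate(encoded):
        if i == 0:
            words.append(diff)
            continue
        stride = prev_stride + stride_delta
        words.append(words[i - stride] + diff)
        prev_stride = stride
    return words
```

For interleaved data such as `[100, 7, 101, 9, 103]`, the nearest match settles on a stride of 2, so later words carry a stride delta of 0 and a difference only one or two bits wide.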
-
Publication number: US09934145B2
Publication date: 2018-04-03
Application number: US14925922
Application date: 2015-10-28
Applicant: NVIDIA CORPORATION
Inventor: Praveen Krishnamurthy , Peter B. Holmquist , Wishwesh Gandhi , Timothy Purcell , Karan Mehra , Lacky Shah
IPC: G06F12/08 , G06F12/0802 , G06F3/06
CPC classification number: G06F12/0802 , G06F3/0608 , G06F3/064 , G06F3/0673 , G06F12/0842 , G06F12/0844 , G06F12/0848 , G06F12/0851 , G06F12/0853 , G06F12/0895 , G06F2212/1016 , G06F2212/401 , G06F2212/608
Abstract: In one embodiment of the present invention, a cache unit organizes data stored in an attached memory to optimize accesses to compressed data. In operation, the cache unit introduces a layer of indirection between a physical address associated with a memory access request and groups of blocks in the attached memory. The layer of indirection (virtual tiles) enables the cache unit to selectively store, in a single physical tile, compressed data that would conventionally be stored in separate physical tiles included in a group of blocks. Because the cache unit stores compressed data associated with multiple physical tiles in a single physical tile and, more specifically, in adjacent locations within the single physical tile, the cache unit coalesces the compressed data into contiguous blocks. Subsequently, upon performing a read operation, the cache unit may retrieve the compressed data conventionally associated with separate physical tiles in a single read operation.