-
Publication number: US20140168245A1
Publication date: 2014-06-19
Application number: US13720745
Filing date: 2012-12-19
Applicant: NVIDIA CORPORATION
Inventor: Brian Fahs , Eric T. Anderson , Nick Barrow-Williams , Shirish Gadre , Joel James McCormack , Bryon S. Nordquist , Nirmal Raj Saxena , Lacky V. Shah
IPC: G06F13/14
CPC classification number: G06F13/14 , G06T1/20 , G06T1/60 , G06T15/005 , G06T2210/36
Abstract: A texture processing pipeline can be configured to service memory access requests that represent texture data access operations or generic data access operations. When the texture processing pipeline receives a memory access request that represents a texture data access operation, the texture processing pipeline may retrieve texture data based on texture coordinates. When the memory access request represents a generic data access operation, the texture pipeline extracts a virtual address from the memory access request and then retrieves data based on the virtual address. The texture processing pipeline is also configured to cache generic data retrieved on behalf of a group of threads and to then invalidate that generic data when the group of threads exits.
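A minimal sketch in C of the two-path request handling this abstract describes. The request layout, field names, and the toy address arithmetic are illustrative assumptions rather than details from the patent; the point is only that a single pipeline services both kinds of request and differs in how the address is formed.

```c
/* Sketch only: one pipeline, two address-formation paths (assumed layout). */
#include <stdint.h>
#include <stdio.h>

typedef enum { REQ_TEXTURE, REQ_GENERIC } req_kind;

typedef struct {
    req_kind kind;
    uint32_t u, v;          /* texel coordinates (texture requests) */
    uint64_t vaddr;         /* virtual address   (generic requests) */
    uint32_t group;         /* issuing thread group */
} mem_request;

/* Toy address formation: a texture request derives its address from
 * coordinates, a generic request carries the virtual address directly. */
static uint64_t form_address(const mem_request *r, uint64_t tex_base, uint32_t pitch)
{
    if (r->kind == REQ_TEXTURE)
        return tex_base + (uint64_t)r->v * pitch + (uint64_t)r->u * 4; /* 4-byte texels */
    return r->vaddr;
}

int main(void)
{
    mem_request tex = { REQ_TEXTURE, 7, 3, 0, 0 };
    mem_request gen = { REQ_GENERIC, 0, 0, 0x1000, 0 };

    printf("texture access -> 0x%llx\n",
           (unsigned long long)form_address(&tex, 0x80000000ull, 1024));
    printf("generic access -> 0x%llx\n",
           (unsigned long long)form_address(&gen, 0x80000000ull, 1024));
    return 0;
}
```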
-
Publication number: US10037228B2
Publication date: 2018-07-31
Application number: US13660763
Filing date: 2012-10-25
Applicant: NVIDIA Corporation
Inventor: Nick Barrow-Williams , Brian Fahs , Jerome F. Duluk, Jr. , James Leroy Deming , Timothy John Purcell , Lucien Dunning , Mark Hairgrove
IPC: G06F12/00 , G06F9/50 , G06F12/1045 , G06F12/109
CPC classification number: G06F9/5027 , G06F12/1036 , G06F12/1045 , G06F12/109
Abstract: A technique for simultaneously executing multiple tasks, each having an independent virtual address space, involves assigning an address space identifier (ASID) to each task and constructing each virtual memory access request to include both a virtual address and the ASID. During virtual to physical address translation, the ASID selects a corresponding page table, which includes virtual to physical address mappings for the ASID and associated task. Entries for a translation look-aside buffer (TLB) include both the virtual address and ASID to complete each mapping to a physical address. Deep scheduling of tasks sharing a virtual address space may be implemented to improve cache affinity for both TLB and data caches.
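A minimal sketch in C of ASID-tagged translation as described in this abstract. The table sizes, page size, refill policy, and structure names are assumptions for illustration; the essential behavior is that a TLB entry matches only when both the ASID and the virtual page number match, and a miss walks the page table selected by the ASID.

```c
/* Sketch only: per-ASID page tables and an ASID-tagged TLB (assumed sizes). */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT      12
#define TLB_ENTRIES     8
#define NUM_ASIDS       4
#define PAGES_PER_TABLE 16

typedef struct {
    int      valid;
    uint32_t asid;     /* address space identifier */
    uint64_t vpn;      /* virtual page number      */
    uint64_t pfn;      /* physical frame number    */
} tlb_entry;

static tlb_entry tlb[TLB_ENTRIES];
/* One page table per ASID: virtual page -> physical frame. */
static uint64_t page_table[NUM_ASIDS][PAGES_PER_TABLE];

/* Translate (asid, vaddr): hit only if both ASID and VPN match a TLB entry;
 * otherwise walk the page table selected by the ASID and refill the TLB. */
static uint64_t translate(uint32_t asid, uint64_t vaddr)
{
    uint64_t vpn = vaddr >> PAGE_SHIFT;
    uint64_t off = vaddr & ((1u << PAGE_SHIFT) - 1);

    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].asid == asid && tlb[i].vpn == vpn)
            return (tlb[i].pfn << PAGE_SHIFT) | off;

    uint64_t pfn = page_table[asid][vpn];        /* page table chosen by ASID */
    tlb[0] = (tlb_entry){ 1, asid, vpn, pfn };   /* naive refill into slot 0  */
    return (pfn << PAGE_SHIFT) | off;
}

int main(void)
{
    page_table[0][1] = 0x10;   /* task with ASID 0 */
    page_table[1][1] = 0x20;   /* task with ASID 1: same VA, different mapping */
    printf("ASID 0: 0x%llx\n", (unsigned long long)translate(0, 0x1234));
    printf("ASID 1: 0x%llx\n", (unsigned long long)translate(1, 0x1234));
    return 0;
}
```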
-
Publication number: US09697006B2
Publication date: 2017-07-04
Application number: US13720746
Filing date: 2012-12-19
Applicant: NVIDIA CORPORATION
Inventor: Brian Fahs , Eric T. Anderson , Nick Barrow-Williams , Shirish Gadre , Joel James McCormack , Bryon S. Nordquist , Nirmal Raj Saxena , Lacky V. Shah
IPC: G06F12/10 , G06F9/38 , G06F12/0844 , G06F12/0815 , G06F12/02
CPC classification number: G06F9/3887 , G06F9/3851 , G06F12/0284 , G06F12/0815 , G06F12/0844 , G06F2209/5018 , G06F2212/604
Abstract: A texture processing pipeline can be configured to service memory access requests that represent texture data access operations or generic data access operations. When the texture processing pipeline receives a memory access request that represents a texture data access operation, the texture processing pipeline may retrieve texture data based on texture coordinates. When the memory access request represents a generic data access operation, the texture pipeline extracts a virtual address from the memory access request and then retrieves data based on the virtual address. The texture processing pipeline is also configured to cache generic data retrieved on behalf of a group of threads and to then invalidate that generic data when the group of threads exits.
-
Publication number: US09348762B2
Publication date: 2016-05-24
Application number: US13720755
Filing date: 2012-12-19
Applicant: NVIDIA CORPORATION
Inventor: Brian Fahs , Eric T. Anderson , Nick Barrow-Williams , Shirish Gadre , Joel James McCormack , Bryon S. Nordquist , Nirmal Raj Saxena , Lacky V. Shah
IPC: G06F12/10
CPC classification number: G06F12/1027 , G06F12/1018
Abstract: A tag unit configured to manage a cache unit includes a coalescer that implements a set hashing function. The set hashing function maps a virtual address to a particular content-addressable memory unit (CAM). The coalescer implements the set hashing function by splitting the virtual address into upper, middle, and lower portions. The upper portion is further divided into even-indexed bits and odd-indexed bits. The even-indexed bits are reduced to a single bit using an XOR tree, and the odd-indexed bits are reduced in like fashion. Those single bits are combined with the middle portion of the virtual address to provide a CAM number that identifies a particular CAM. The identified CAM is queried to determine the presence of a tag portion of the virtual address, indicating a cache hit or cache miss.
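A minimal sketch in C of the set-hashing function described in this abstract. The widths chosen for the lower and middle portions, and the way the two parity bits are concatenated with the middle bits, are assumptions; the abstract does not specify them.

```c
/* Sketch only: virtual address -> CAM number via split + XOR reduction
 * (portion widths and the final bit packing are assumed). */
#include <stdint.h>
#include <stdio.h>

#define LOWER_BITS  7   /* byte offset within a line (assumed width)      */
#define MIDDLE_BITS 3   /* CAM-select bits taken directly (assumed width) */

/* XOR-reduce a word to a single parity bit (stands in for the XOR tree). */
static unsigned parity(uint64_t x)
{
    unsigned p = 0;
    while (x) { p ^= (unsigned)(x & 1); x >>= 1; }
    return p;
}

/* Map a virtual address to a CAM number. */
static unsigned cam_number(uint64_t vaddr)
{
    uint64_t middle = (vaddr >> LOWER_BITS) & ((1u << MIDDLE_BITS) - 1);
    uint64_t upper  =  vaddr >> (LOWER_BITS + MIDDLE_BITS);

    /* Separate the upper portion into even-indexed and odd-indexed bits. */
    uint64_t even = 0, odd = 0;
    for (int i = 0; i < 64 - LOWER_BITS - MIDDLE_BITS; i++) {
        uint64_t bit = (upper >> i) & 1;
        if (i % 2 == 0) even |= bit << (i / 2);
        else            odd  |= bit << (i / 2);
    }

    /* Each half collapses to one bit; those two bits are combined with the
     * middle portion to select the CAM (concatenation is an assumption). */
    unsigned e = parity(even), o = parity(odd);
    return (unsigned)((e << (MIDDLE_BITS + 1)) | (o << MIDDLE_BITS) | middle);
}

int main(void)
{
    printf("CAM for 0x%llx -> %u\n", 0xDEADBEEFull, cam_number(0xDEADBEEFull));
    return 0;
}
```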
-
Publication number: US20140173193A1
Publication date: 2014-06-19
Application number: US13720755
Filing date: 2012-12-19
Applicant: NVIDIA CORPORATION
Inventor: Brian Fahs , Eric T. Anderson , Nick Barrow-Williams , Shirish Gadre , Joel James McCormack , Bryon S. Nordquist , Nirmal Raj Saxena , Lacky V. Shah
CPC classification number: G06F12/1027 , G06F12/1018
Abstract: A tag unit configured to manage a cache unit includes a coalescer that implements a set hashing function. The set hashing function maps a virtual address to a particular content-addressable memory unit (CAM). The coalescer implements the set hashing function by splitting the virtual address into upper, middle, and lower portions. The upper portion is further divided into even-indexed bits and odd-indexed bits. The even-indexed bits are reduced to a single bit using an XOR tree, and the odd-indexed bits are reduced in like fashion. Those single bits are combined with the middle portion of the virtual address to provide a CAM number that identifies a particular CAM. The identified CAM is queried to determine the presence of a tag portion of the virtual address, indicating a cache hit or cache miss.
-
Publication number: US10310973B2
Publication date: 2019-06-04
Application number: US13660815
Filing date: 2012-10-25
Applicant: NVIDIA Corporation
Inventor: Nick Barrow-Williams , Brian Fahs , Jerome F. Duluk, Jr. , James Leroy Deming , Timothy John Purcell , Lucien Dunning , Mark Hairgrove
IPC: G06F12/00 , G06F12/08 , G06F12/1009
Abstract: A technique for simultaneously executing multiple tasks, each having an independent virtual address space, involves assigning an address space identifier (ASID) to each task and constructing each virtual memory access request to include both a virtual address and the ASID. During virtual to physical address translation, the ASID selects a corresponding page table, which includes virtual to physical address mappings for the ASID and associated task. Entries for a translation look-aside buffer (TLB) include both the virtual address and ASID to complete each mapping to a physical address. Deep scheduling of tasks sharing a virtual address space may be implemented to improve cache affinity for both TLB and data caches.
-
Publication number: US10169091B2
Publication date: 2019-01-01
Application number: US13660799
Filing date: 2012-10-25
Applicant: NVIDIA Corporation
Inventor: Nick Barrow-Williams , Brian Fahs , Jerome F. Duluk, Jr. , James Leroy Deming , Timothy John Purcell , Lucien Dunning , Mark Hairgrove
IPC: G06F9/46 , G06F15/173 , G06F9/50 , G06F12/1045 , G06F9/48 , G06F9/455 , G06F12/109 , G06F12/1036
Abstract: A technique for simultaneously executing multiple tasks, each having an independent virtual address space, involves assigning an address space identifier (ASID) to each task and constructing each virtual memory access request to include both a virtual address and the ASID. During virtual to physical address translation, the ASID selects a corresponding page table, which includes virtual to physical address mappings for the ASID and associated task. Entries for a translation look-aside buffer (TLB) include both the virtual address and ASID to complete each mapping to a physical address. Deep scheduling of tasks sharing a virtual address space may be implemented to improve cache affinity for both TLB and data caches.
-
Publication number: US09720858B2
Publication date: 2017-08-01
Application number: US13720745
Filing date: 2012-12-19
Applicant: NVIDIA CORPORATION
Inventor: Brian Fahs , Eric T. Anderson , Nick Barrow-Williams , Shirish Gadre , Joel James McCormack , Bryon S. Nordquist , Nirmal Raj Saxena , Lacky V. Shah
CPC classification number: G06F13/14 , G06T1/20 , G06T1/60 , G06T15/005 , G06T2210/36
Abstract: A texture processing pipeline can be configured to service memory access requests that represent texture data access operations or generic data access operations. When the texture processing pipeline receives a memory access request that represents a texture data access operation, the texture processing pipeline may retrieve texture data based on texture coordinates. When the memory access request represents a generic data access operation, the texture pipeline extracts a virtual address from the memory access request and then retrieves data based on the virtual address. The texture processing pipeline is also configured to cache generic data retrieved on behalf of a group of threads and to then invalidate that generic data when the group of threads exits.
-
Publication number: US09110809B2
Publication date: 2015-08-18
Application number: US13935414
Filing date: 2013-07-03
Applicant: NVIDIA Corporation
Inventor: Peter B. Holmqvist , Karan Mehra , George R. Lynch , James Patrick Robertson , Gregory Alan Muthler , Wishwesh Anil Gandhi , Nick Barrow-Williams
CPC classification number: G06F12/0842 , G06F11/1004 , G06F12/0886 , G06F2212/1016 , G11C7/1006 , G11C7/1072
Abstract: A method for managing memory traffic includes causing first data to be written to a data cache memory, where a first write request comprises a partial write and writes the first data to a first portion of the data cache memory, and further includes tracking the number of partial writes in the data cache memory. The method further includes issuing a fill request for one or more partial writes in the data cache memory if the number of partial writes in the data cache memory is greater than a predetermined first threshold.
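A minimal sketch in C of the partial-write tracking described in this abstract. The line count, threshold value, byte-mask representation, and fill-request hook are illustrative assumptions.

```c
/* Sketch only: count partially written cache lines and issue fill requests
 * once the count exceeds a threshold (all sizes and values assumed). */
#include <stdint.h>
#include <stdio.h>

#define CACHE_LINES    4
#define FILL_THRESHOLD 2   /* stands in for the "predetermined first threshold" */

typedef struct {
    int      partial;       /* line currently holds a partial write      */
    uint32_t valid_bytes;   /* bitmask of bytes written so far (32B line) */
} cache_line;

static cache_line cache[CACHE_LINES];
static int partial_count;

/* Stand-in for issuing a fill request to backing memory for line `idx`. */
static void issue_fill(int idx)
{
    printf("fill request for line %d\n", idx);
    cache[idx].partial = 0;
    cache[idx].valid_bytes = 0xFFFFFFFFu;   /* line now fully valid */
    partial_count--;
}

/* Write the bytes in `byte_mask` to line `idx`; track partial writes and
 * issue fills once the number of partial lines exceeds the threshold. */
static void write_partial(int idx, uint32_t byte_mask)
{
    if (!cache[idx].partial && byte_mask != 0xFFFFFFFFu) {
        cache[idx].partial = 1;
        partial_count++;
    }
    cache[idx].valid_bytes |= byte_mask;

    if (partial_count > FILL_THRESHOLD)
        for (int i = 0; i < CACHE_LINES && partial_count > FILL_THRESHOLD; i++)
            if (cache[i].partial)
                issue_fill(i);
}

int main(void)
{
    write_partial(0, 0x000000FF);
    write_partial(1, 0x0000FF00);
    write_partial(2, 0x00FF0000);   /* count exceeds threshold -> fill issued */
    return 0;
}
```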