-
Publication No.: US12001725B2
Publication Date: 2024-06-04
Application No.: US18454693
Filing Date: 2023-08-23
Applicant: NVIDIA Corporation
Inventor: Niladrish Chatterjee , James Michael O'Connor , Donghyuk Lee , Gaurav Uttreja , Wishwesh Anil Gandhi
CPC classification number: G06F3/0659 , G06F3/0604 , G06F3/0673 , G06F12/0607 , G06F12/10 , G06F2212/151 , G06F2212/154 , G06F2212/657 , H01L25/18
Abstract: A combined on-package and off-package memory system uses a custom base-layer within which are fabricated one or more dedicated interfaces to off-package memories. An on-package processor and on-package memories are also directly coupled to the custom base-layer. The custom base-layer includes memory management logic between the processor and memories (both off and on package) to steer requests. The memories are exposed as a combined memory space having greater bandwidth and capacity compared with either the off-package memories or the on-package memories alone. The memory management logic services requests while maintaining quality of service (QoS) to satisfy bandwidth requirements for each allocation. An allocation may include any combination of the on and/or off package memories. The memory management logic also manages data migration between the on and off package memories.
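The request-steering idea in this abstract can be illustrated with a minimal sketch. All names (`MemoryManager`, the address-range split, the sizes) are hypothetical illustrations, not details from the patent, which describes hardware logic in a base-layer rather than software:

```python
# Minimal sketch: steer memory requests to on-package or off-package
# backing storage based on address range, exposing both as one space.
# (Hypothetical model; the patent describes hardware memory management
# logic fabricated in a custom base-layer, not software.)

class MemoryManager:
    """Presents on- and off-package memories as a combined space."""

    def __init__(self, on_pkg_size):
        self.on_pkg_size = on_pkg_size  # bytes mapped on-package
        self.on_pkg = {}    # stands in for on-package memory
        self.off_pkg = {}   # stands in for off-package memory

    def _backing(self, addr):
        # Low addresses map on-package; the remainder off-package.
        return self.on_pkg if addr < self.on_pkg_size else self.off_pkg

    def write(self, addr, value):
        self._backing(addr)[addr] = value

    def read(self, addr):
        return self._backing(addr).get(addr)

mm = MemoryManager(on_pkg_size=1024)
mm.write(16, "fast")     # steered to on-package memory
mm.write(4096, "bulk")   # steered to off-package memory
```

A real implementation would also track per-allocation bandwidth to enforce QoS and migrate hot data on-package; this sketch shows only the steering decision.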
-
Publication No.: US11789649B2
Publication Date: 2023-10-17
Application No.: US17237165
Filing Date: 2021-04-22
Applicant: NVIDIA Corporation
Inventor: Niladrish Chatterjee , James Michael O'Connor , Donghyuk Lee , Gaurav Uttreja , Wishwesh Anil Gandhi
CPC classification number: G06F3/0659 , G06F3/0604 , G06F3/0673 , G06F12/0607 , G06F12/10 , G06F2212/151 , G06F2212/154 , G06F2212/657 , H01L25/18
Abstract: A combined on-package and off-package memory system uses a custom base-layer within which are fabricated one or more dedicated interfaces to off-package memories. An on-package processor and on-package memories are also directly coupled to the custom base-layer. The custom base-layer includes memory management logic between the processor and memories (both off and on package) to steer requests. The memories are exposed as a combined memory space having greater bandwidth and capacity compared with either the off-package memories or the on-package memories alone. The memory management logic services requests while maintaining quality of service (QoS) to satisfy bandwidth requirements for each allocation. An allocation may include any combination of the on and/or off package memories. The memory management logic also manages data migration between the on and off package memories.
-
Publication No.: US11635986B2
Publication Date: 2023-04-25
Application No.: US16562359
Filing Date: 2019-09-05
Applicant: NVIDIA CORPORATION
Inventor: Jerome F. Duluk, Jr. , Gregory Scott Palmer , Jonathon Stuart Ramsey Evans , Shailendra Singh , Samuel H. Duncan , Wishwesh Anil Gandhi , Lacky V. Shah , Eric Rock , Feiqi Su , James Leroy Deming , Alan Menezes , Pranav Vaidya , Praveen Joginipally , Timothy John Purcell , Manas Mandal
Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.
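The partitioning scheme described above can be sketched as a resource-accounting model. The class and field names (`PPU`, `sms`, `memory_gb`) are illustrative assumptions, not terms from the patent:

```python
# Sketch: divide a fixed pool of compute/memory resources into
# isolated partitions, each assignable to one guest user.
# (Hypothetical names; the patent covers hardware PPU partitioning.)

class PPU:
    def __init__(self, sms, memory_gb):
        self.sms = sms                # total streaming multiprocessors
        self.memory_gb = memory_gb    # total memory
        self.partitions = {}

    def create_partition(self, name, sms, memory_gb):
        """Admin operation: carve out a subset of PPU resources."""
        used_sms = sum(p["sms"] for p in self.partitions.values())
        used_mem = sum(p["memory_gb"] for p in self.partitions.values())
        if used_sms + sms > self.sms or used_mem + memory_gb > self.memory_gb:
            raise ValueError("insufficient PPU resources")
        self.partitions[name] = {"sms": sms, "memory_gb": memory_gb,
                                 "guest": None}

    def assign_guest(self, name, guest):
        """Guests run in their partition, isolated from other guests."""
        self.partitions[name]["guest"] = guest

ppu = PPU(sms=8, memory_gb=32)
ppu.create_partition("p0", sms=4, memory_gb=16)
ppu.create_partition("p1", sms=4, memory_gb=16)
ppu.assign_guest("p0", "guest_a")
```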
-
Publication No.: US11263051B2
Publication Date: 2022-03-01
Application No.: US16866811
Filing Date: 2020-05-05
Applicant: NVIDIA Corporation
Inventor: Ram Rangan , Suryakant Patidar , Praveen Krishnamurthy , Wishwesh Anil Gandhi
Abstract: Accesses between a processor and its external memory are reduced when the processor internally maintains a compressed version of values stored in the external memory. The processor can then refer to the compressed version rather than access the external memory. One compression technique involves maintaining a dictionary on the processor mapping portions of a memory to values. When all of the values of a portion of memory are uniform (i.e., the same), the value is stored in the dictionary for that portion of memory. Thereafter, when the processor needs to access that portion of memory, the value is retrieved from the dictionary rather than from external memory. Techniques are disclosed herein to extend, for example, the capabilities of such dictionary-based compression so that the number of accesses between the processor and its external memory is further reduced.
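The dictionary scheme in this abstract can be sketched as follows. The structure and names (`UniformDictCache`, the region scan) are illustrative assumptions; the patent describes on-processor hardware, not software:

```python
# Sketch of dictionary-based compression for uniform memory regions:
# when every value in a region is identical, record that value in an
# on-chip dictionary and serve later reads without touching external
# memory. (Hypothetical model and names.)

class UniformDictCache:
    def __init__(self, external_memory, region_size):
        self.mem = external_memory      # list standing in for DRAM
        self.region_size = region_size
        self.dictionary = {}            # region index -> uniform value
        self.external_reads = 0         # counts external-memory accesses

    def _scan_region(self, region):
        start = region * self.region_size
        values = self.mem[start:start + self.region_size]
        self.external_reads += 1
        if len(set(values)) == 1:       # region is uniform: remember it
            self.dictionary[region] = values[0]
        return values

    def read(self, addr):
        region, offset = divmod(addr, self.region_size)
        if region in self.dictionary:   # hit: no external access needed
            return self.dictionary[region]
        return self._scan_region(region)[offset]

mem = [0] * 8 + [1, 2, 3, 4]
cache = UniformDictCache(mem, region_size=4)
cache.read(0)   # scans region 0, finds it uniform, fills dictionary
cache.read(3)   # served from the dictionary: no external access
```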
-
Publication No.: US09110809B2
Publication Date: 2015-08-18
Application No.: US13935414
Filing Date: 2013-07-03
Applicant: NVIDIA Corporation
Inventor: Peter B. Holmqvist , Karan Mehra , George R. Lynch , James Patrick Robertson , Gregory Alan Muthler , Wishwesh Anil Gandhi , Nick Barrow-Williams
CPC classification number: G06F12/0842 , G06F11/1004 , G06F12/0886 , G06F2212/1016 , G11C7/1006 , G11C7/1072
Abstract: A method for managing memory traffic includes causing first data to be written to a data cache memory, where a first write request comprises a partial write and writes the first data to a first portion of the data cache memory, and further includes tracking the number of partial writes in the data cache memory. The method further includes issuing a fill request for one or more partial writes in the data cache memory if the number of partial writes in the data cache memory is greater than a predetermined first threshold.
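The threshold-triggered fill behavior can be sketched as follows. The names and the threshold value are hypothetical; the patent concerns cache hardware, not software:

```python
# Sketch: track cache lines holding partial writes, and issue fill
# requests once their count exceeds a threshold, so the partially
# written lines can be completed from memory. (Hypothetical model.)

class WriteCache:
    def __init__(self, fill_threshold):
        self.fill_threshold = fill_threshold
        self.partial_lines = set()   # lines containing partial writes
        self.fill_requests = []      # fills issued to backing memory

    def partial_write(self, line):
        self.partial_lines.add(line)
        if len(self.partial_lines) > self.fill_threshold:
            # Too many incomplete lines: fill them all from memory.
            self.fill_requests.extend(sorted(self.partial_lines))
            self.partial_lines.clear()

wc = WriteCache(fill_threshold=2)
wc.partial_write(0x10)
wc.partial_write(0x20)   # count is 2: still at the threshold
wc.partial_write(0x30)   # count is 3, exceeds threshold: fills issued
```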
-
Publication No.: US11893423B2
Publication Date: 2024-02-06
Application No.: US16562367
Filing Date: 2019-09-05
Applicant: NVIDIA CORPORATION
Inventor: Jerome F. Duluk, Jr. , Gregory Scott Palmer , Jonathon Stuart Ramsey Evans , Shailendra Singh , Samuel H. Duncan , Wishwesh Anil Gandhi , Lacky V. Shah , Sonata Gale Wen , Feiqi Su , James Leroy Deming , Alan Menezes , Pranav Vaidya , Praveen Joginipally , Timothy John Purcell , Manas Mandal
IPC: G06F9/50 , G06F9/38 , G06F1/3296 , G06F1/04
CPC classification number: G06F9/5061 , G06F1/04 , G06F1/3296 , G06F9/3877 , G06F9/5027
Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.
-
Publication No.: US11663036B2
Publication Date: 2023-05-30
Application No.: US16562359
Filing Date: 2019-09-05
Applicant: NVIDIA CORPORATION
Inventor: Jerome F. Duluk, Jr. , Gregory Scott Palmer , Jonathon Stuart Ramsey Evans , Shailendra Singh , Samuel H. Duncan , Wishwesh Anil Gandhi , Lacky V. Shah , Eric Rock , Feiqi Su , James Leroy Deming , Alan Menezes , Pranav Vaidya , Praveen Joginipally , Timothy John Purcell , Manas Mandal
Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.
-
Publication No.: US20220342595A1
Publication Date: 2022-10-27
Application No.: US17237165
Filing Date: 2021-04-22
Applicant: NVIDIA Corporation
Inventor: Niladrish Chatterjee , James Michael O'Connor , Donghyuk Lee , Gaurav Uttreja , Wishwesh Anil Gandhi
Abstract: A combined on-package and off-package memory system uses a custom base-layer within which are fabricated one or more dedicated interfaces to off-package memories. An on-package processor and on-package memories are also directly coupled to the custom base-layer. The custom base-layer includes memory management logic between the processor and memories (both off and on package) to steer requests. The memories are exposed as a combined memory space having greater bandwidth and capacity compared with either the off-package memories or the on-package memories alone. The memory management logic services requests while maintaining quality of service (QoS) to satisfy bandwidth requirements for each allocation. An allocation may include any combination of the on and/or off package memories. The memory management logic also manages data migration between the on and off package memories.
-
Publication No.: US20200089611A1
Publication Date: 2020-03-19
Application No.: US16134379
Filing Date: 2018-09-18
Applicant: NVIDIA Corporation
Inventor: Wishwesh Anil Gandhi , Tanmoy Mandal , Ravi Kiran Manyam , Supriya Shrihari Rao
IPC: G06F12/0815 , G06F12/0808 , G06F12/0813 , G06F13/16 , G06F13/40
Abstract: A method, computer readable medium, and system are disclosed for a distributed cache that provides multiple processing units with fast access to a portion of data, which is stored in local memory. The distributed cache is composed of multiple smaller caches, and each of the smaller caches is associated with at least one processing unit. In addition to a shared crossbar network through which data is transferred between processing units and the smaller caches, a dedicated connection is provided between two or more smaller caches that form a partner cache set. Transferring data through the dedicated connections reduces congestion on the shared crossbar network. Reducing congestion on the shared crossbar network increases the available bandwidth and allows the number of processing units to increase. A coherence protocol is defined for accessing data stored in the distributed cache and for transferring data between the smaller caches of a partner cache set.
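The partner-link routing decision can be sketched as follows. The names (`SmallCache`, `transfer`) and the logging of crossbar traffic are illustrative assumptions, not terms from the patent:

```python
# Sketch: when two small caches form a partner set, transfer data over
# their dedicated link; otherwise fall back to the shared crossbar.
# (Hypothetical model of the routing choice only.)

class SmallCache:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.partner = None   # the other member of this partner set

def transfer(src, dst, key, crossbar_log):
    """Move a cache line; prefer the dedicated partner link."""
    value = src.data.pop(key)
    if src.partner is dst:
        pass                       # dedicated link: no crossbar traffic
    else:
        crossbar_log.append(key)   # must cross the shared crossbar
    dst.data[key] = value

a, b, c = SmallCache("a"), SmallCache("b"), SmallCache("c")
a.partner, b.partner = b, a        # a and b form a partner cache set
log = []
a.data["line0"] = 42
transfer(a, b, "line0", log)       # partner link: nothing logged
b.data["line1"] = 7
transfer(b, c, "line1", log)       # no partner link: uses crossbar
```

Keeping partner-set traffic off the crossbar is what frees shared bandwidth as the number of processing units grows.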
-
Publication No.: US08984372B2
Publication Date: 2015-03-17
Application No.: US13683599
Filing Date: 2012-11-21
Applicant: NVIDIA Corporation
Inventor: Wishwesh Anil Gandhi , Nirmal Raj Saxena
CPC classification number: G06F11/10 , G06F11/1064
Abstract: A partition unit that includes a cache for storing both data and error-correcting code (ECC) checkbits associated with the data is disclosed. When a read command corresponding to particular data stored in a memory unit results in a cache miss, the partition unit transmits a read request to the memory unit to fetch the data and store the data in the cache. The partition unit checks the cache to determine if ECC checkbits associated with the data are stored in the cache and, if the ECC checkbits are not in the cache, the partition unit transmits a read request to the memory unit to fetch the ECC checkbits and store the ECC checkbits in the cache. The ECC checkbits and the data may then be compared to determine the reliability of the data using an error-correcting scheme such as SEC-DED (i.e., single error-correcting, double error-detecting).
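The miss-handling flow for data and checkbits can be sketched as follows. All names are hypothetical, and a simple parity bit stands in for a real SEC-DED code, which uses multiple checkbits and can correct single-bit errors:

```python
# Sketch: a cache holding both data lines and their ECC checkbits,
# each fetched from memory independently on a miss, then compared to
# judge data reliability. (Hypothetical model; parity stands in for
# a real SEC-DED error-correcting code.)

class EccCache:
    def __init__(self, memory, ecc_memory):
        self.memory = memory            # data backing store
        self.ecc_memory = ecc_memory    # checkbit backing store
        self.data_cache = {}
        self.ecc_cache = {}
        self.memory_reads = 0

    def read_checked(self, addr):
        if addr not in self.data_cache:      # data miss: fetch data
            self.data_cache[addr] = self.memory[addr]
            self.memory_reads += 1
        if addr not in self.ecc_cache:       # checkbit miss: fetch ECC
            self.ecc_cache[addr] = self.ecc_memory[addr]
            self.memory_reads += 1
        data = self.data_cache[addr]
        # Stand-in check: recomputed parity must match stored checkbit.
        ok = bin(data).count("1") % 2 == self.ecc_cache[addr]
        return data, ok

mem = {0: 0b1011}
ecc = {0: 1}    # 0b1011 has three set bits, so its parity bit is 1
cache = EccCache(mem, ecc)
value, ok = cache.read_checked(0)   # two memory reads: data + checkbits
```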
-