-
Publication No.: US10468093B2
Publication date: 2019-11-05
Application No.: US15448416
Filing date: 2017-03-02
Applicant: NVIDIA Corporation
Inventor: Niladrish Chatterjee, James Michael O'Connor, Daniel Robert Johnson
IPC: G06F12/02, G11C11/4091, G11C7/10, G11C8/12, G11C11/408, G11C11/4093, G11C11/4096, G06F12/06
Abstract: A method and system for a DRAM having a first bank that includes a first sub-array (SA) and a second SA. The first SA includes a first storage unit coupled to a first row-buffer in a first sub-channel (FSC) and a second storage unit in a second sub-channel (SSC). The second SA includes a third storage unit and a fourth storage unit coupled to a second row-buffer. The first SA is associated with a first row address (RA) and the FSC is associated with a first column address (CA) stored in the FSC. The second SA is associated with a second RA and the SSC is associated with a second CA stored in the SSC. The first and second CAs are used to select portions of data from the first and second row-buffers, respectively, for output to a data bus.
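The column selection described in this abstract can be illustrated with a small model. This is a minimal sketch, not the patented circuit: the names `SubChannel`, `select_slice`, `read_bus`, and the 4-byte slice width are assumptions made for illustration. Each sub-channel latches its own column address, and that address picks a portion of one row-buffer for output to the shared data bus.

```python
from dataclasses import dataclass

SLICE_BYTES = 4  # hypothetical width of the portion each sub-channel drives


@dataclass
class SubChannel:
    """Holds the column address (CA) latched in this sub-channel."""
    column_address: int = 0


def select_slice(row_buffer: bytes, column_address: int,
                 width: int = SLICE_BYTES) -> bytes:
    """Return the portion of a row-buffer selected by a column address."""
    start = column_address * width
    if start + width > len(row_buffer):
        raise IndexError("column address outside the row-buffer")
    return row_buffer[start:start + width]


def read_bus(first_row_buffer: bytes, second_row_buffer: bytes,
             fsc: SubChannel, ssc: SubChannel) -> bytes:
    """Combine the two selected portions into one data-bus transfer.

    The first CA (stored in the FSC) indexes the first row-buffer and the
    second CA (stored in the SSC) indexes the second row-buffer.
    """
    return (select_slice(first_row_buffer, fsc.column_address)
            + select_slice(second_row_buffer, ssc.column_address))


# Two open rows, one per sub-array, with independent column addresses.
rb1 = bytes(range(16))
rb2 = bytes(range(100, 116))
bus = read_bus(rb1, rb2, SubChannel(column_address=1), SubChannel(column_address=3))
```

The point of the sketch is only that the two column addresses are independent, so the two halves of a bus transfer can come from different rows in different sub-arrays.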
-
Publication No.: US20240281300A1
Publication date: 2024-08-22
Application No.: US18528333
Filing date: 2023-12-04
Applicant: NVIDIA Corporation
Inventor: Donghyuk Lee, Leul Wuletaw Belayneh, Niladrish Chatterjee, James Michael O'Connor
CPC classification number: G06F9/5083, G06F9/542, G06F2209/509
Abstract: An initiating processing tile generates an offload request that may include a processing tile ID, source data needed for the computation, program counter, and destination location where the computation result is stored. The offload processing tile may execute the offloaded computation. Alternatively, the offload processing tile may deny the offload request based on congestion criteria. The congestion criteria may include a processing workload measure, whether a resource needed to perform the computation is available, and an offload request buffer fullness. In an embodiment, the denial message that is returned to the initiating processing tile may include the data needed to perform the computation (read from the local memory of the offload processing tile). Returning the data with the denial message results in the same inter-processing tile traffic that would occur if no attempt to offload the computation were initiated.
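The accept-or-deny decision described in this abstract might be sketched as follows. All class names, field names, and the 0.8 workload threshold are hypothetical; the abstract names the three congestion criteria but gives no values, and `operand_address` is an assumed way of identifying the data held in the offload tile's local memory.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OffloadRequest:
    tile_id: int            # ID of the initiating processing tile
    source_data: bytes      # data supplied by the initiator
    program_counter: int    # where the offloaded computation starts
    destination: int        # where the result should be stored
    operand_address: int    # hypothetical: operand held by the offload tile


@dataclass
class OffloadResponse:
    accepted: bool
    data: Optional[bytes] = None  # local operand, returned only on denial


@dataclass
class OffloadTile:
    local_memory: dict
    workload: float
    resource_available: bool
    buffer_used: int
    buffer_capacity: int
    max_workload: float = 0.8  # hypothetical threshold

    def congested(self) -> bool:
        """Apply the three congestion criteria named in the abstract."""
        return (self.workload > self.max_workload
                or not self.resource_available
                or self.buffer_used >= self.buffer_capacity)

    def handle(self, request: OffloadRequest) -> OffloadResponse:
        if self.congested():
            # Deny, but return the local data so the initiator can compute
            # without issuing a separate read.
            data = self.local_memory.get(request.operand_address)
            return OffloadResponse(accepted=False, data=data)
        self.buffer_used += 1  # queue the computation for execution
        return OffloadResponse(accepted=True)


request = OffloadRequest(tile_id=0, source_data=b"\x01", program_counter=0x40,
                         destination=0x80, operand_address=0x10)
busy = OffloadTile({0x10: b"abc"}, workload=0.9, resource_available=True,
                   buffer_used=0, buffer_capacity=4)
idle = OffloadTile({0x10: b"abc"}, workload=0.1, resource_available=True,
                   buffer_used=0, buffer_capacity=4)
denied = busy.handle(request)
granted = idle.handle(request)
```

Carrying the operand in the denial is what keeps the inter-tile traffic equal to the no-offload case: the initiator would have had to fetch that data anyway.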
-
Publication No.: US20240256153A1
Publication date: 2024-08-01
Application No.: US18163167
Filing date: 2023-02-01
Applicant: NVIDIA Corporation
IPC: G06F3/06
CPC classification number: G06F3/0625, G06F3/0644, G06F3/0659, G06F3/0673
Abstract: Embodiments of the present disclosure relate to memory page access instrumentation for generating a memory access profile. The memory access profile may be used to co-locate data near the processing unit that accesses the data, reducing memory access energy by minimizing distances to access data that is co-located with a different processing unit (i.e., remote data). Execution thread arrays and memory pages for execution of a program are partitioned across multiple processing units. The partitions are then each mapped to a specific processing unit to minimize inter-partition traffic given the processing unit physical topology.
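The final mapping step can be illustrated with a toy example. This is a sketch under stated assumptions, not the method claimed in the application: it uses an exhaustive search, which is only practical for a handful of partitions, and the `traffic` and `distance` matrices are invented inputs standing in for a measured memory access profile and a physical topology.

```python
from itertools import permutations


def placement_cost(traffic, distance, assignment):
    """Sum of traffic between each partition pair, weighted by the distance
    between the processing units the two partitions are assigned to."""
    n = len(traffic)
    return sum(traffic[i][j] * distance[assignment[i]][assignment[j]]
               for i in range(n) for j in range(n))


def map_partitions(traffic, distance):
    """Try every partition-to-unit assignment and keep the cheapest one."""
    n = len(traffic)
    return min(permutations(range(n)),
               key=lambda assignment: placement_cost(traffic, distance, assignment))


# Hypothetical profile: partitions 0 and 2 exchange the most data.
traffic = [[0, 1, 10],
           [1, 0, 1],
           [10, 1, 0]]
# Hypothetical topology: three processing units in a line (0 - 1 - 2).
distance = [[0, 1, 2],
            [1, 0, 1],
            [2, 1, 0]]

best = map_partitions(traffic, distance)  # best[p] is the unit for partition p
```

In this toy case any optimal assignment places partitions 0 and 2 on adjacent units, which is the behavior the abstract describes as minimizing inter-partition traffic for a given topology.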
-
Publication No.: US12141451B2
Publication date: 2024-11-12
Application No.: US18163167
Filing date: 2023-02-01
Applicant: NVIDIA Corporation
Abstract: Embodiments of the present disclosure relate to memory page access instrumentation for generating a memory access profile. The memory access profile may be used to co-locate data near the processing unit that accesses the data, reducing memory access energy by minimizing distances to access data that is co-located with a different processing unit (i.e., remote data). Execution thread arrays and memory pages for execution of a program are partitioned across multiple processing units. The partitions are then each mapped to a specific processing unit to minimize inter-partition traffic given the processing unit physical topology.
-
Publication No.: US11609879B2
Publication date: 2023-03-21
Application No.: US17365315
Filing date: 2021-07-01
Applicant: NVIDIA CORPORATION
Inventor: Yaosheng Fu, Evgeny Bolotin, Niladrish Chatterjee, Stephen William Keckler, David Nellans
IPC: G06F15/78, G06F12/0811, G06F12/12, G06F13/40
Abstract: In various embodiments, a parallel processor includes a parallel processor module implemented within a first die and a memory system module implemented within a second die. The memory system module is coupled to the parallel processor module via an on-package link. The parallel processor module includes multiple processor cores and multiple cache memories. The memory system module includes a memory controller for accessing a DRAM. Advantageously, the performance of the parallel processor module can be effectively tailored for memory bandwidth demands that typify one or more application domains via the memory system module.
-
Publication No.: US20220342595A1
Publication date: 2022-10-27
Application No.: US17237165
Filing date: 2021-04-22
Applicant: NVIDIA Corporation
Inventor: Niladrish Chatterjee, James Michael O'Connor, Donghyuk Lee, Gaurav Uttreja, Wishwesh Anil Gandhi
Abstract: A combined on-package and off-package memory system uses a custom base-layer within which are fabricated one or more dedicated interfaces to off-package memories. An on-package processor and on-package memories are also directly coupled to the custom base-layer. The custom base-layer includes memory management logic between the processor and memories (both off and on package) to steer requests. The memories are exposed as a combined memory space having greater bandwidth and capacity compared with either the off-package memories or the on-package memories alone. The memory management logic services requests while maintaining quality of service (QoS) to satisfy bandwidth requirements for each allocation. An allocation may include any combination of the on and/or off package memories. The memory management logic also manages data migration between the on and off package memories.
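Request steering over a combined memory space could look like the sketch below. The abstract does not say how addresses are divided between the two kinds of memory, so the linear split, the capacities, and the function name `steer` are all assumptions chosen for illustration; QoS handling and data migration are left out.

```python
ON_PACKAGE_BYTES = 1 << 20   # hypothetical on-package capacity
OFF_PACKAGE_BYTES = 4 << 20  # hypothetical off-package capacity
COMBINED_BYTES = ON_PACKAGE_BYTES + OFF_PACKAGE_BYTES


def steer(address: int) -> tuple:
    """Map an address in the combined space to (memory, local offset).

    Assumes a simple linear layout: low addresses are on-package and the
    remainder is reached through the off-package interface.
    """
    if not 0 <= address < COMBINED_BYTES:
        raise ValueError("address outside the combined memory space")
    if address < ON_PACKAGE_BYTES:
        return ("on_package", address)
    return ("off_package", address - ON_PACKAGE_BYTES)


first = steer(0)
boundary = steer(ON_PACKAGE_BYTES)
```

The processor sees one address range of `COMBINED_BYTES`; only the steering logic knows which physical memory serves a given request.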
-
Publication No.: US20170255552A1
Publication date: 2017-09-07
Application No.: US15448416
Filing date: 2017-03-02
Applicant: NVIDIA Corporation
Inventor: Niladrish Chatterjee, James Michael O'Connor, Daniel Robert Johnson
IPC: G06F12/06, G11C11/4091
CPC classification number: G11C11/4091, G06F12/0215, G06F12/0607, G06F2212/1028, G06F2212/1041, G11C7/1012, G11C8/12, G11C11/4087, G11C11/4093, G11C11/4096, Y02D10/13
Abstract: A method and system for a DRAM having a first bank that includes a first sub-array (SA) and a second SA. The first SA includes a first storage unit coupled to a first row-buffer in a first sub-channel (FSC) and a second storage unit in a second sub-channel (SSC). The second SA includes a third storage unit and a fourth storage unit coupled to a second row-buffer. The first SA is associated with a first row address (RA) and the FSC is associated with a first column address (CA) stored in the FSC. The second SA is associated with a second RA and the SSC is associated with a second CA stored in the SSC. The first and second CAs are used to select portions of data from the first and second row-buffers, respectively, for output to a data bus.
-
Publication No.: US12001725B2
Publication date: 2024-06-04
Application No.: US18454693
Filing date: 2023-08-23
Applicant: NVIDIA Corporation
Inventor: Niladrish Chatterjee, James Michael O'Connor, Donghyuk Lee, Gaurav Uttreja, Wishwesh Anil Gandhi
CPC classification number: G06F3/0659, G06F3/0604, G06F3/0673, G06F12/0607, G06F12/10, G06F2212/151, G06F2212/154, G06F2212/657, H01L25/18
Abstract: A combined on-package and off-package memory system uses a custom base-layer within which are fabricated one or more dedicated interfaces to off-package memories. An on-package processor and on-package memories are also directly coupled to the custom base-layer. The custom base-layer includes memory management logic between the processor and memories (both off and on package) to steer requests. The memories are exposed as a combined memory space having greater bandwidth and capacity compared with either the off-package memories or the on-package memories alone. The memory management logic services requests while maintaining quality of service (QoS) to satisfy bandwidth requirements for each allocation. An allocation may include any combination of the on and/or off package memories. The memory management logic also manages data migration between the on and off package memories.
-
Publication No.: US11789649B2
Publication date: 2023-10-17
Application No.: US17237165
Filing date: 2021-04-22
Applicant: NVIDIA Corporation
Inventor: Niladrish Chatterjee, James Michael O'Connor, Donghyuk Lee, Gaurav Uttreja, Wishwesh Anil Gandhi
CPC classification number: G06F3/0659, G06F3/0604, G06F3/0673, G06F12/0607, G06F12/10, G06F2212/151, G06F2212/154, G06F2212/657, H01L25/18
Abstract: A combined on-package and off-package memory system uses a custom base-layer within which are fabricated one or more dedicated interfaces to off-package memories. An on-package processor and on-package memories are also directly coupled to the custom base-layer. The custom base-layer includes memory management logic between the processor and memories (both off and on package) to steer requests. The memories are exposed as a combined memory space having greater bandwidth and capacity compared with either the off-package memories or the on-package memories alone. The memory management logic services requests while maintaining quality of service (QoS) to satisfy bandwidth requirements for each allocation. An allocation may include any combination of the on and/or off package memories. The memory management logic also manages data migration between the on and off package memories.