-
Publication number: US20230297269A1
Publication date: 2023-09-21
Application number: US17683292
Filing date: 2022-02-28
Applicant: NVIDIA Corporation
Inventor: William James Dally, Carl Thomas Gray, Stephen W. Keckler, James Michael O’Connor
IPC: G06F3/06
CPC classification number: G06F3/0655, G06F3/0604, G06F3/0679
Abstract: A hierarchical network enables access for a stacked memory system including one or more memory dies that each include multiple memory tiles. A processor die includes multiple processing tiles and is stacked with the one or more memory dies. The memory tiles that are vertically aligned with a processing tile are directly coupled to the processing tile and comprise the local memory block for the processing tile. The hierarchical network provides access paths for each processing tile to access the processing tile’s local memory block, the local memory block coupled to a different processing tile within the same processing die, memory tiles in a different die stack, and memory tiles in a different device. The ratio of memory bandwidth (byte) to floating-point operation (B:F) may improve 50x for accessing the local memory block compared with conventional memory. Additionally, the energy consumed to transfer each bit may be reduced by 10x.
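The four access paths named in the abstract can be illustrated with a minimal Python sketch. The `TileAddress` type, its `device`/`stack`/`tile` fields, and the `access_path` function are hypothetical names introduced here for illustration only, and the sketch assumes one processor die per stack; none of this is taken from the patent.

```python
from typing import NamedTuple


class TileAddress(NamedTuple):
    """Hypothetical address of a tile: which device, which die stack, which tile."""
    device: int
    stack: int
    tile: int


def access_path(requester: TileAddress, target: TileAddress) -> str:
    """Classify a memory access into one of the four levels listed in the abstract."""
    if requester.device != target.device:
        return "memory tiles in a different device"
    if requester.stack != target.stack:
        return "memory tiles in a different die stack"
    if requester.tile != target.tile:
        return "local memory block of a different processing tile, same die"
    return "local memory block"


if __name__ == "__main__":
    me = TileAddress(device=0, stack=0, tile=3)
    for target in (me, TileAddress(0, 0, 5), TileAddress(0, 1, 3), TileAddress(1, 0, 3)):
        print(target, "->", access_path(me, target))
```

The checks run from the outermost level inward, so an access is attributed to the most distant boundary it crosses.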
-
Publication number: US20230275068A1
Publication date: 2023-08-31
Application number: US17683290
Filing date: 2022-02-28
Applicant: NVIDIA Corporation
Inventor: William James Dally, Carl Thomas Gray, Stephen W. Keckler, James Michael O'Connor
IPC: H01L25/065
CPC classification number: H01L25/0657, H01L2225/06565, H01L27/11517
Abstract: Embodiments of the present disclosure relate to memory stacked on processor for high bandwidth. Systems and methods are disclosed for providing a one-level memory for a processing system by stacking bulk memory on a processor die. In an embodiment, one or more memory dies are stacked on the processor die. The processor die includes multiple processing tiles, where each tile includes a processing unit, mapper, and tile network. Each memory die includes multiple memory tiles. The processing tile is coupled to each memory tile that is above or below the processing tile. The vertically aligned memory tiles comprise the local memory block for the processing tile. The ratio of memory bandwidth (byte) to floating-point operation (B:F) may improve 50× for accessing the local memory block compared with conventional memory. Additionally, the energy consumed to transfer each bit may be reduced by 10×.
-
Publication number: US11476852B2
Publication date: 2022-10-18
Application number: US17324941
Filing date: 2021-05-19
Applicant: NVIDIA Corporation
Inventor: William James Dally
IPC: H03K19/003, G06N3/063
Abstract: When a signal glitches, logic receiving the signal may change in response, thereby charging and/or discharging nodes within the logic and dissipating power. Providing a glitch-free signal may reduce the number of times the nodes are charged and/or discharged, thereby reducing the power dissipation. A technique for eliminating glitches in a signal is to insert a storage element that samples the signal after it is done changing to produce a glitch-free output signal. The storage element is enabled by a “ready” signal having a delay that matches the delay of circuitry generating the signal. The technique prevents the output signal from changing until the final value of the signal is achieved. The output signal changes only once, typically reducing the number of times nodes in the logic receiving the signal are charged and/or discharged so that power dissipation is also reduced.
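The effect described in the abstract can be sketched as a small discrete-time simulation in Python. The waveform, the `ready_time` parameter, and both function names are assumptions made for illustration; the sketch models the storage element as capturing the signal at the single time step when the delay-matched "ready" signal fires, and it assumes the signal has settled by then.

```python
def count_transitions(wave):
    """Number of times the signal changes value between consecutive samples."""
    return sum(1 for a, b in zip(wave, wave[1:]) if a != b)


def sample_on_ready(wave, ready_time, initial):
    """Hold the previous output until the 'ready' signal fires, then capture the input.

    ready_time is assumed to match the delay of the logic producing `wave`,
    so the input has reached its final value when it is sampled.
    """
    held = initial
    out = []
    for t, value in enumerate(wave):
        if t == ready_time:
            held = value  # storage element samples the settled signal
        out.append(held)
    return out


if __name__ == "__main__":
    raw = [0, 1, 0, 1, 1, 1]  # glitches twice before settling at 1
    clean = sample_on_ready(raw, ready_time=3, initial=0)
    print("raw transitions:  ", count_transitions(raw))
    print("clean transitions:", count_transitions(clean))
```

Downstream logic driven by `clean` sees one transition instead of three, which is the mechanism the abstract credits for the reduced power dissipation.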
-
Publication number: US20220006457A1
Publication date: 2022-01-06
Application number: US17324941
Filing date: 2021-05-19
Applicant: NVIDIA Corporation
Inventor: William James Dally
IPC: H03K19/003
Abstract: When a signal glitches, logic receiving the signal may change in response, thereby charging and/or discharging nodes within the logic and dissipating power. Providing a glitch-free signal may reduce the number of times the nodes are charged and/or discharged, thereby reducing the power dissipation. A technique for eliminating glitches in a signal is to insert a storage element that samples the signal after it is done changing to produce a glitch-free output signal. The storage element is enabled by a “ready” signal having a delay that matches the delay of circuitry generating the signal. The technique prevents the output signal from changing until the final value of the signal is achieved. The output signal changes only once, typically reducing the number of times nodes in the logic receiving the signal are charged and/or discharged so that power dissipation is also reduced.
-
Publication number: US20210056397A1
Publication date: 2021-02-25
Application number: US16549683
Filing date: 2019-08-23
Applicant: NVIDIA Corporation
Inventor: William James Dally, Rangharajan Venkatesan, Brucek Kurdo Khailany
Abstract: Neural networks, in many cases, include convolution layers that are configured to perform many convolution operations that require multiplication and addition operations. Compared with performing multiplication on integer, fixed-point, or floating-point format values, performing multiplication on logarithmic format values is straightforward and energy efficient as the exponents are simply added. However, performing addition on logarithmic format values is more complex. Conventionally, addition is performed by converting the logarithmic format values to integers, computing the sum, and then converting the sum back into the logarithmic format. Instead, logarithmic format values may be added by decomposing the exponents into separate quotient and remainder components, sorting the quotient components based on the remainder components, summing the sorted quotient components to produce partial sums, and multiplying the partial sums by the remainder components to produce a sum. The sum may then be converted back into the logarithmic format.
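The quotient/remainder decomposition in the abstract can be illustrated with a short Python sketch. It assumes a particular unsigned logarithmic format in which an integer exponent `e` encodes the value 2^(e/n); the format, the parameter `n`, and the function names `log_sum` and `to_log` are illustrative assumptions, not details from the patent. Writing e = q·n + r gives 2^(e/n) = 2^q · 2^(r/n), so values sharing a remainder r can be summed as plain powers of two and scaled once per remainder.

```python
import math
from collections import defaultdict


def log_sum(exponents, n):
    """Sum the values 2**(e/n) for integer exponents e, without converting each value.

    Each exponent is split as e = q*n + r (0 <= r < n). Terms are grouped by
    remainder r, the 2**q factors in each group are accumulated into a partial
    sum, and each partial sum is multiplied once by the constant 2**(r/n).
    """
    partial = defaultdict(float)
    for e in exponents:
        q, r = divmod(e, n)
        partial[r] += math.ldexp(1.0, q)  # adds 2**q (a shift in hardware)
    return sum(p * 2.0 ** (r / n) for r, p in partial.items())


def to_log(value, n):
    """Convert a positive value back to the nearest exponent in the same format."""
    return round(n * math.log2(value))


if __name__ == "__main__":
    n = 4
    exps = [1, 5, 2, -3]
    total = log_sum(exps, n)
    print(total, sum(2.0 ** (e / n) for e in exps))
    print(to_log(total, n))
```

Only `n` multiplications by the remainder constants are needed regardless of how many values are summed, which is where the saving over converting every operand individually would come from under these assumptions.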
-
Publication number: US09287778B2
Publication date: 2016-03-15
Application number: US13647202
Filing date: 2012-10-08
Applicant: NVIDIA Corporation
Inventor: William James Dally
CPC classification number: H02M3/158, H02M1/15, H02M3/1582, H02M3/1588, H02M2003/1566, Y02B70/1425
Abstract: Embodiments are disclosed relating to an electric power conversion device and methods for controlling the operation thereof. One disclosed embodiment provides an electric power conversion device comprising a first current control mechanism coupled to an electric power source and an upstream end of an inductor, where the first current control mechanism is operable to control inductor current. The electric power conversion device further comprises a second current control mechanism coupled between the downstream end of the inductor and a load, where the second current control mechanism is operable to control how much of the inductor current is delivered to the load.
-
Publication number: US09178421B2
Publication date: 2015-11-03
Application number: US13663903
Filing date: 2012-10-30
Applicant: NVIDIA Corporation
Inventor: William James Dally
CPC classification number: H02M3/158, H02M3/1582, H02M2001/007, H02M2003/1566
Abstract: Embodiments are disclosed relating to an electric power conversion device and methods for controlling the operation thereof. One disclosed embodiment provides a multi-stage electric power conversion device including a first regulator stage including a first stage energy storage device and a second regulator stage including a second stage energy storage device, the second stage energy storage device being operatively coupled between the first stage energy storage device and the load. The device further includes a control mechanism operative to control (i) a first stage output voltage on a node between the first stage energy storage device and the second stage energy storage device and (ii) a second stage output voltage on a node between the second stage energy storage device and the load.
-
Publication number: US20140219007A1
Publication date: 2014-08-07
Application number: US13761996
Filing date: 2013-02-07
Applicant: NVIDIA CORPORATION
Inventor: William James Dally
IPC: G11C11/4063
CPC classification number: G11C11/4091, G11C11/4063, G11C11/4085, G11C11/4087, G11C11/4099
Abstract: This description is directed to a dynamic random access memory (DRAM) array having a plurality of rows and a plurality of columns. The array further includes a plurality of cells, each of which is associated with one of the columns and one of the rows. Each cell includes a capacitor that is selectively coupled to a bit line of its associated column so as to share charge with the bit line when the cell is selected. There is a segmented word line circuit for each row, which is controllable to cause selection of only a portion of the cells in the row.
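A minimal Python sketch of the two ideas in the abstract follows: selecting only one segment of a row, and the charge sharing between a selected cell and its bit line. The equal-width segment layout, the function names, and the example capacitance and voltage values are assumptions for illustration; the charge-sharing expression is the standard result for two capacitors connected together, not a formula quoted from the patent.

```python
def selected_columns(num_columns, segment_width, segment):
    """Columns whose cells are selected when one word-line segment is driven.

    Assumes the row is split into equal-width segments (the last may be shorter).
    """
    num_segments = -(-num_columns // segment_width)  # ceiling division
    if not 0 <= segment < num_segments:
        raise ValueError("segment index out of range")
    start = segment * segment_width
    return range(start, min(start + segment_width, num_columns))


def shared_voltage(v_cell, v_bitline, c_cell, c_bitline):
    """Bit-line voltage after a selected cell's capacitor shares charge with it."""
    return (c_cell * v_cell + c_bitline * v_bitline) / (c_cell + c_bitline)


if __name__ == "__main__":
    cols = selected_columns(num_columns=16, segment_width=4, segment=2)
    print("columns selected:", list(cols))
    # Hypothetical values: cell at 1.0 V, bit line precharged to 0.5 V,
    # bit-line capacitance nine times the cell capacitance.
    print("bit-line voltage:", shared_voltage(1.0, 0.5, 1.0, 9.0))
```

Bit lines outside the driven segment see no charge sharing, so only the selected portion of the row is disturbed and needs to be sensed and restored.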
-