-
公开(公告)号:US20210247978A1
公开(公告)日:2021-08-12
申请号:US16859829
申请日:2020-04-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. Malladi , Wenqin Huangfu
Abstract: According to one embodiment, a memory module includes: a memory die including a dynamic random access memory (DRAM) banks, each including: an array of DRAM cells arranged in pages; a row buffer to store values of one of the pages; an input/output (IO) module; and an in-memory compute (IMC) module including: an arithmetic logic unit (ALU) to receive operands from the row buffer or the IO module and to compute an output based on the operands and one of a plurality of ALU operations; and a result register to store the output of the ALU; and a controller to: receive, from a host processor, operands and an instruction; determine, based on the instruction, a data layout; supply the operands to the DRAM banks in accordance with the data layout; and control an IMC module to perform one of the ALU operations on the operands in accordance with the instruction.
-
公开(公告)号:US20200334012A1
公开(公告)日:2020-10-22
申请号:US16919043
申请日:2020-07-01
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. Malladi , Peng Gu , Hongzhong Zheng , Robert Brennan
Abstract: A computing accelerator using a lookup table. The accelerator may accelerate floating point multiplications by retrieving the fraction portion of the product of two floating-point operands from a lookup table, or by retrieving the product of two floating-point operands of two floating-point operands from a lookup table, or it may retrieve dot products of floating point vectors from a lookup table. The accelerator may be implemented in a three-dimensional memory assembly. It may use approximation, the symmetry of a multiplication lookup table, and zero-skipping to improve performance.
-
公开(公告)号:US10732929B2
公开(公告)日:2020-08-04
申请号:US15916196
申请日:2018-03-08
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. Malladi , Peng Gu , Hongzhong Zheng , Robert Brennan
Abstract: A computing accelerator using a lookup table. The accelerator may accelerate floating point multiplications by retrieving the fraction portion of the product of two floating-point operands from a lookup table, or by retrieving the product of two floating-point operands of two floating-point operands from a lookup table, or it may retrieve dot products of floating point vectors from a lookup table. The accelerator may be implemented in a three-dimensional memory assembly. It may use approximation, the symmetry of a multiplication lookup table, and zero-skipping to improve performance.
-
公开(公告)号:US10545860B2
公开(公告)日:2020-01-28
申请号:US15796743
申请日:2017-10-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. Malladi , Hongzhong Zheng , Robert Brennan , Hyungseuk Kim , Jinhyun Kim
IPC: G06F12/02 , G06F12/0862 , G06F9/30
Abstract: Inventive aspects include An HBM+ system, comprising a host including at least one of a CPU, a GPU, an ASIC, or an FPGA; and an HBM+ stack including a plurality of HBM modules arranged one atop another, and a logic die disposed beneath the plurality of HBM modules. The logic die is configured to offload processing operations from the host. A system architecture is disclosed that provides specific compute capabilities in the logic die of high bandwidth memory along with the supporting hardware and software architectures, logic die microarchitecture, and memory interface signaling options. Various new methods are provided for using in-memory processing abilities of the logic die beneath an HBM memory stack. In addition, various new signaling protocols are disclosed to use an HBM interface. The logic die microarchitecture and supporting system framework are also described.
-
公开(公告)号:US10515006B2
公开(公告)日:2019-12-24
申请号:US15663619
申请日:2017-07-28
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. Malladi , Jongmin Gim , Hongzhong Zheng
IPC: G06F12/02 , G06F13/16 , G06F12/121
Abstract: A pseudo main memory system. The system includes a memory adapter circuit for performing memory augmentation using compression, deduplication, and/or error correction. The memory adapter circuit is connected to a memory, and employs the memory augmentation methods to increase the effective storage capacity of the memory. The memory adapter circuit is also connected to a memory bus and implements an NVDIMM-F or modified NVDIMM-F interface for connecting to the memory bus.
-
公开(公告)号:US20190187898A1
公开(公告)日:2019-06-20
申请号:US15916228
申请日:2018-03-08
Applicant: Samsung Electronics Co., Ltd.
Inventor: Peng Gu , Krishna T. Malladi , Hongzhong Zheng
CPC classification number: G06F3/064 , G06F3/0604 , G06F3/0673 , G06N3/08
Abstract: A storage device and method of controlling a storage device are disclosed. The storage device includes a host, a logic die, and a high bandwidth memory stack including a memory die. A computation lookup table is stored on a memory array of the memory die. The host sends a command to perform an operation utilizing a kernel and a plurality of input feature maps, includes finding the product of a weight of the kernel and values of multiple input feature maps. The computation lookup table includes a row corresponding to a weight of the kernel, and a column corresponding to a value of the input feature maps. A result value stored at a position corresponding to a row and a column is the product of the weight corresponding to the row and the value corresponding to the column.
-
公开(公告)号:US20180188968A1
公开(公告)日:2018-07-05
申请号:US15470709
申请日:2017-03-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. Malladi , Hongzhong Zheng
CPC classification number: G06F17/30312 , G06F17/30985
Abstract: A method of searching for data stored in a memory, the method including receiving a regex search request, generating a parse tree including fundamental regex operations corresponding to the regex search request, individually analyzing each of the fundamental regex operations of the generated parse tree in a respective time-step, determining a memory address location of data corresponding to the analyzed fundamental regex operations by using a translation table to determine whether the data exists, and using a reverse translation table to determine the memory address location of the data, and outputting data matching the regex search request after analyzing all of the fundamental regex operations of the generated parse tree.
-
公开(公告)号:US12056388B2
公开(公告)日:2024-08-06
申请号:US17898207
申请日:2022-08-29
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. Malladi
IPC: G06F3/06
CPC classification number: G06F3/0655 , G06F3/0604 , G06F3/0673
Abstract: A method for in-memory computing. In some embodiments, the method includes: executing, by a first function-in-memory circuit, a first instruction, to produce, as a result, a first value, wherein a first computing task includes a second computing task and a third computing task, the second computing task including the first instruction; storing, by the first function-in-memory circuit, the first value in a first buffer; reading, by a second function-in-memory circuit, the first value from the first buffer; and executing, by a second function-in-memory circuit, a second instruction, the second instruction using the first value as an argument, the third computing task including the second instruction, wherein: the storing, by the first function-in-memory circuit, of the first value in the first buffer includes directly storing the first value in the first buffer.
-
公开(公告)号:US12056379B2
公开(公告)日:2024-08-06
申请号:US18315821
申请日:2023-05-11
Applicant: Samsung Electronics Co., Ltd.
Inventor: Peng Gu , Krishna T. Malladi , Hongzhong Zheng
CPC classification number: G06F3/064 , G06F3/0604 , G06F3/0673 , G06N3/08
Abstract: A storage device and method of controlling a storage device are disclosed. The storage device includes a host, a logic die, and a high bandwidth memory stack including a memory die. A computation lookup table is stored on a memory array of the memory die. The host sends a command to perform an operation utilizing a kernel and a plurality of input feature maps, includes finding the product of a weight of the kernel and values of multiple input feature maps. The computation lookup table includes a row corresponding to a weight of the kernel, and a column corresponding to a value of the input feature maps. A result value stored at a position corresponding to a row and a column is the product of the weight corresponding to the row and the value corresponding to the column.
-
公开(公告)号:US20240211149A1
公开(公告)日:2024-06-27
申请号:US18593885
申请日:2024-03-02
Applicant: Samsung Electronics Co., Ltd.
Inventor: Dimin Niu , Shuangchen Li , Bob Brennan , Krishna T. Malladi , Hongzhong Zheng
IPC: G06F3/06 , G06F15/78 , G11C11/4096
CPC classification number: G06F3/0631 , G06F3/0604 , G06F3/067 , G06F15/7821 , G11C11/4096
Abstract: A processor includes a plurality of memory units, each of the memory units including a plurality of memory cells, wherein each of the memory units is configurable to operate as memory, as a computation unit, or as a hybrid memory-computation unit.
-
-
-
-
-
-
-
-
-