-
Publication No.: US11995009B2
Publication date: 2024-05-28
Application No.: US17469769
Application date: 2021-09-08
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. Malladi, Hongzhong Zheng, Dimin Niu, Peng Gu
CPC classification number: G06F13/1652, G06F7/5443, G06F9/30014, G06F9/30036, G06F13/1694
Abstract: A high bandwidth memory (HBM) system includes a first HBM+ card. The first HBM+ card includes a plurality of HBM+ cubes. Each HBM+ cube has a logic die and a memory die. The first HBM+ card also includes an HBM+ card controller coupled to each of the plurality of HBM+ cubes and configured to interface with a host, a pin connection configured to connect to the host, and a fabric connection configured to connect to at least one HBM+ card.
-
Publication No.: US11921656B2
Publication date: 2024-03-05
Application No.: US17577370
Application date: 2022-01-17
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. Malladi, Hongzhong Zheng
CPC classification number: G06F13/28, G06F9/445, G06F9/4806, G06F2015/768
Abstract: An apparatus may include a heterogeneous computing environment that may be controlled, at least in part, by a task scheduler. The heterogeneous computing environment may include a processing unit having fixed logical circuits configured to execute instructions; a reprogrammable processing unit having reprogrammable logical circuits configured to execute instructions that include instructions to control processing-in-memory functionality; and a stack of high-bandwidth memory dies, each of which may be configured to store data and to provide processing-in-memory functionality controllable by the reprogrammable processing unit, such that the reprogrammable processing unit is at least partially stacked with the high-bandwidth memory dies. The task scheduler may be configured to schedule computational tasks between the processing unit and the reprogrammable processing unit.
-
Publication No.: US11914903B2
Publication date: 2024-02-27
Application No.: US17497882
Application date: 2021-10-08
Applicant: Samsung Electronics Co., Ltd.
Inventor: Yang Seok Ki, Krishna T. Malladi, Rekha Pitchumani
IPC: G06F3/06
CPC classification number: G06F3/0664, G06F3/0604, G06F3/0679
Abstract: A device may include an interconnect interface, a memory system including one or more first type memory devices to receive first data, one or more second type memory devices to receive second data, and an accelerator configured to perform an operation using the first data and the second data. The memory system may further include a cache configured to cache the second data for the one or more second type memory devices. A device may include an interconnect interface, a memory system coupled to the interconnect interface to receive data, an accelerator coupled to the memory system, and virtualization logic configured to partition one or more resources of the accelerator into one or more virtual accelerators, wherein a first one of the one or more virtual accelerators may be configured to perform a first operation on a first portion of the data.
-
Publication No.: US11775294B2
Publication date: 2023-10-03
Application No.: US17538556
Application date: 2021-11-30
Applicant: Samsung Electronics Co., Ltd.
Inventor: Peng Gu, Krishna T. Malladi, Hongzhong Zheng
CPC classification number: G06F9/3001, G06F7/00, G06F7/4876, G06F9/3004, G06F12/0207, G06F17/16, G06F2212/1024
Abstract: According to some example embodiments of the present disclosure, in a method for a memory lookup mechanism in a high-bandwidth memory system, the method includes: using a memory die to conduct a multiplication operation using a lookup table (LUT) methodology by accessing a LUT, which includes floating point operation results, stored on the memory die; sending, by the memory die, a result of the multiplication operation to a logic die including a processor and a buffer; and conducting, by the logic die, a matrix multiplication operation using computation units.
-
Publication No.: US11681451B2
Publication date: 2023-06-20
Application No.: US17473532
Application date: 2021-09-13
Applicant: Samsung Electronics Co., Ltd.
Inventor: Peng Gu, Krishna T. Malladi, Hongzhong Zheng
CPC classification number: G06F3/064, G06F3/0604, G06F3/0673, G06N3/08
Abstract: A storage device and a method of controlling a storage device are disclosed. The storage device includes a host, a logic die, and a high bandwidth memory stack including a memory die. A computation lookup table is stored on a memory array of the memory die. The host sends a command to perform an operation utilizing a kernel and a plurality of input feature maps, which includes finding the product of a weight of the kernel and values of multiple input feature maps. The computation lookup table includes a row corresponding to a weight of the kernel, and a column corresponding to a value of the input feature maps. A result value stored at a position corresponding to a row and a column is the product of the weight corresponding to the row and the value corresponding to the column.
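
The table layout described in this abstract (one row per kernel weight, one column per input value, each cell holding their product) can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `build_product_lut`, `lut_multiply`, and `lut_dot`, the weight set, and the 3-bit value range are all assumptions chosen for illustration.

```python
def build_product_lut(weights, values):
    """Precompute weight * value for every (row, column) pair.

    Row i corresponds to weights[i]; column j corresponds to values[j].
    """
    return [[w * v for v in values] for w in weights]


def lut_multiply(lut, weight_index, value_index):
    """Return a product by table lookup instead of by multiplying."""
    return lut[weight_index][value_index]


def lut_dot(lut, weight_indices, value_indices):
    """Accumulate looked-up products, as a convolution step would."""
    return sum(lut_multiply(lut, r, c)
               for r, c in zip(weight_indices, value_indices))


# Hypothetical quantized operands (assumptions, not from the patent).
weights = [-2, -1, 0, 1, 2]      # one LUT row per kernel weight
values = list(range(8))          # one LUT column per 3-bit input value
lut = build_product_lut(weights, values)

print(lut_multiply(lut, 4, 3))   # weight 2 times value 3, prints 6
```

The sketch only works when operands are quantized to a small set, since the table needs one cell per (weight, value) pair.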
-
Publication No.: US20230119291A1
Publication date: 2023-04-20
Application No.: US18081488
Application date: 2022-12-14
Applicant: Samsung Electronics Co., Ltd.
Inventor: Mu-Tien Chang, Krishna T. Malladi, Dimin Niu, Hongzhong Zheng
IPC: G06F12/0875, G06F13/16, G06F13/12
Abstract: A method of processing in-memory commands in a high-bandwidth memory (HBM) system includes sending a function-in-HBM (FIM) instruction to the HBM by an HBM memory controller of a GPU. A logic component of the HBM receives the FIM instruction and coordinates the instruction's execution using the controller, an ALU, and an SRAM located on the logic component.
-
Publication No.: US20220164187A1
Publication date: 2022-05-26
Application No.: US17538556
Application date: 2021-11-30
Applicant: Samsung Electronics Co., Ltd.
Inventor: Peng Gu, Krishna T. Malladi, Hongzhong Zheng
Abstract: According to some example embodiments of the present disclosure, in a method for a memory lookup mechanism in a high-bandwidth memory system, the method includes: using a memory die to conduct a multiplication operation using a lookup table (LUT) methodology by accessing a LUT, which includes floating point operation results, stored on the memory die; sending, by the memory die, a result of the multiplication operation to a logic die including a processor and a buffer; and conducting, by the logic die, a matrix multiplication operation using computation units.
-
Publication No.: US11226816B2
Publication date: 2022-01-18
Application No.: US16859829
Application date: 2020-04-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. Malladi, Wenqin Huangfu
Abstract: According to one embodiment, a memory module includes: a memory die including dynamic random access memory (DRAM) banks, each including: an array of DRAM cells arranged in pages; a row buffer to store values of one of the pages; an input/output (IO) module; and an in-memory compute (IMC) module including: an arithmetic logic unit (ALU) to receive operands from the row buffer or the IO module and to compute an output based on the operands and one of a plurality of ALU operations; and a result register to store the output of the ALU; and a controller to: receive, from a host processor, operands and an instruction; determine, based on the instruction, a data layout; supply the operands to the DRAM banks in accordance with the data layout; and control an IMC module to perform one of the ALU operations on the operands in accordance with the instruction.
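
As a rough illustration of the bank structure in this abstract, the following Python sketch models a page array, a row buffer, an ALU that takes one operand from the row buffer and one from the IO side, and a result register. The class name `Bank`, the method names, and the particular set of ALU operations are assumptions for illustration; the abstract does not specify them, and the controller's data-layout step is not modeled.

```python
import operator

# Hypothetical set of ALU operations (the abstract only says "a plurality").
ALU_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "max": max,
}


class Bank:
    """Toy model of one DRAM bank with an in-memory compute module."""

    def __init__(self, pages):
        self.pages = pages            # array of DRAM cells arranged in pages
        self.row_buffer = []          # holds the values of one open page
        self.result_register = None   # holds the latest ALU output

    def activate(self, page_index):
        """Copy one page into the row buffer."""
        self.row_buffer = list(self.pages[page_index])

    def compute(self, op, column, io_operand):
        """Apply an ALU operation to a row-buffer value and an IO operand."""
        result = ALU_OPS[op](self.row_buffer[column], io_operand)
        self.result_register = result
        return result


bank = Bank(pages=[[1, 2, 3], [4, 5, 6]])
bank.activate(1)                      # open the second page
print(bank.compute("mul", 2, 10))     # 6 * 10, prints 60
```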
-
Publication No.: US11030088B2
Publication date: 2021-06-08
Application No.: US16600313
Application date: 2019-10-11
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. Malladi, Jongmin Gim, Hongzhong Zheng
IPC: G06F12/02, G06F13/16, G06F12/121
Abstract: A pseudo main memory system. The system includes a memory adapter circuit for performing memory augmentation using compression, deduplication, and/or error correction. The memory adapter circuit is connected to a memory, and employs the memory augmentation methods to increase the effective storage capacity of the memory. The memory adapter circuit is also connected to a memory bus and implements an NVDIMM-F or modified NVDIMM-F interface for connecting to the memory bus.
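
Of the augmentation methods named in this abstract, deduplication is the simplest to illustrate. The Python sketch below shows how storing identical blocks only once lets the number of logical blocks exceed the number of physical blocks. It is a sketch under stated assumptions, not the adapter circuit's design: the class `DedupStore`, the use of SHA-256 as the content fingerprint, and the absence of garbage collection for overwritten blocks are all simplifications introduced here.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical blocks share one physical copy."""

    def __init__(self):
        self.blocks = {}   # digest -> block bytes (physical storage)
        self.table = {}    # logical address -> digest (translation table)

    def write(self, address, block):
        digest = hashlib.sha256(block).hexdigest()
        self.blocks.setdefault(digest, block)   # store only the first copy
        self.table[address] = digest

    def read(self, address):
        return self.blocks[self.table[address]]

    def logical_blocks(self):
        return len(self.table)

    def physical_blocks(self):
        return len(self.blocks)


store = DedupStore()
store.write(0, b"\x00" * 64)
store.write(1, b"\x00" * 64)   # duplicate of the block at address 0
store.write(2, b"\xff" * 64)
print(store.logical_blocks(), store.physical_blocks())   # prints 3 2
```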
-
Publication No.: US10908820B2
Publication date: 2021-02-02
Application No.: US15821686
Application date: 2017-11-22
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. Malladi, Hongzhong Zheng, Robert Brennan
Abstract: A high-bandwidth memory (HBM) system includes an HBM device and a logic circuit. The logic circuit receives a first command from the host device and converts the received first command to a processing-in-memory (PIM) command that is sent to the HBM device through the second interface. A time between when the first command is received from the host device and when the HBM system is ready to receive another command from the host device is deterministic. The logic circuit further receives a fourth command and a fifth command from the host device. The fifth command requests time-estimate information relating to a time between when the fifth command is received and when the HBM system is ready to receive another command from the host device. The time-estimate information includes a deterministic period of time and an estimated period of time for a non-deterministic period of time.