-
Publication No.: US20230146611A1
Publication date: 2023-05-11
Application No.: US17668345
Application date: 2022-02-09
Applicant: Samsung Electronics Co., Ltd.
Inventor: Shiyu LI , Krishna T. MALLADI , Andrew CHANG , Yang Seok KI
CPC classification number: G06T1/20 , G06T1/60 , G06K9/6257 , G06F9/30036
Abstract: A system and method for training a neural network. In some embodiments, the system includes a computational storage device including a backing store. The computational storage device may be configured to: store, in the backing store, an embedding table for a neural network embedding operation; receive a first index vector including a first index and a second index; retrieve, from the backing store: a first row of the embedding table, corresponding to the first index, and a second row of the embedding table, corresponding to the second index; and calculate a first embedded vector based on the first row and the second row.
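The lookup described in this abstract can be illustrated with a minimal sketch. The abstract only says the embedded vector is calculated "based on" the retrieved rows; the element-wise sum used here is an assumption (a common pooling choice), and the names `embed`, `table`, and `index_vector` are hypothetical, not taken from the application.

```python
from typing import List, Sequence


def embed(table: Sequence[Sequence[float]], index_vector: Sequence[int]) -> List[float]:
    """Gather the rows named by index_vector and pool them into one vector.

    Assumption: pooling is an element-wise sum. The abstract does not
    specify how the rows are combined.
    """
    rows = [table[i] for i in index_vector]      # retrieve rows from the "backing store"
    return [sum(column) for column in zip(*rows)]  # element-wise sum across the rows


# A toy embedding table with three rows of width two.
table = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
embedded = embed(table, [0, 2])  # first index = 0, second index = 2
print(embedded)
```

In the arrangement the abstract describes, the gather and the pooling would run inside the computational storage device, so only the pooled vector, not the individual rows, needs to leave the device.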
-
Publication No.: US20220113915A1
Publication date: 2022-04-14
Application No.: US17497882
Application date: 2021-10-08
Applicant: Samsung Electronics Co., Ltd.
Inventor: Yang Seok KI , Krishna T. MALLADI , Rekha PITCHUMANI
IPC: G06F3/06
Abstract: A device may include an interconnect interface, a memory system including one or more first type memory devices to receive first data, one or more second type memory devices to receive second data, and an accelerator configured to perform an operation using the first data and the second data. The memory system may further include a cache configured to cache the second data for the one or more second type memory devices. A device may include an interconnect interface, a memory system coupled to the interconnect interface to receive data, an accelerator coupled to the memory system, and virtualization logic configured to partition one or more resources of the accelerator into one or more virtual accelerators, wherein a first one of the one or more virtual accelerators may be configured to perform a first operation on a first portion of the data.
-
Publication No.: US20200174676A1
Publication date: 2020-06-04
Application No.: US16787002
Application date: 2020-02-10
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. MALLADI , Hongzhong ZHENG
Abstract: A high-bandwidth memory (HBM) system includes an HBM device and a logic circuit. The logic circuit includes a first interface coupled to a host device and a second interface coupled to the HBM device. The logic circuit receives a first command from the host device through the first interface and converts the received first command to a first processing-in-memory (PIM) command that is sent to the HBM device through the second interface. The first PIM command has a deterministic latency for completion. The logic circuit further receives a second command from the host device through the first interface and converts the received second command to a second PIM command that is sent to the HBM device through the second interface. The second PIM command has a non-deterministic latency for completion.
-
Publication No.: US20190050325A1
Publication date: 2019-02-14
Application No.: US15796743
Application date: 2017-10-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. MALLADI , Hongzhong ZHENG , Robert BRENNAN , Hyungseuk KIM , Jinhyun KIM
IPC: G06F12/02 , G06F12/0862
Abstract: Inventive aspects include an HBM+ system, comprising a host including at least one of a CPU, a GPU, an ASIC, or an FPGA; and an HBM+ stack including a plurality of HBM modules arranged one atop another, and a logic die disposed beneath the plurality of HBM modules. The logic die is configured to offload processing operations from the host. A system architecture is disclosed that provides specific compute capabilities in the logic die of high bandwidth memory along with the supporting hardware and software architectures, logic die microarchitecture, and memory interface signaling options. Various new methods are provided for using in-memory processing abilities of the logic die beneath an HBM memory stack. In addition, various new signaling protocols are disclosed to use an HBM interface. The logic die microarchitecture and supporting system framework are also described.
-
Publication No.: US20170242822A1
Publication date: 2017-08-24
Application No.: US15136775
Application date: 2016-04-22
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Krishna T. MALLADI , Hongzhong ZHENG
IPC: G06F15/173 , H04L29/08 , G06F1/30 , G06F3/06
CPC classification number: G06F15/17331 , G06F1/30 , G06F3/0619 , G06F3/065 , G06F3/067 , G06F3/0685 , H04L67/1095 , H04L67/1097 , H04L69/16
Abstract: A memory device includes: a plurality of volatile memories for storing data; a non-volatile memory buffer configured to store data associated with workloads received from a host computer; and a memory controller configured to store the data to both the plurality of volatile memories and the non-volatile memory buffer and replicate the data to a remote node. The non-volatile memory buffer is configured to store the data in a table including an acknowledgement bit that is set by the remote node.
-
Publication No.: US20240201909A1
Publication date: 2024-06-20
Application No.: US18587929
Application date: 2024-02-26
Applicant: Samsung Electronics Co., Ltd.
Inventor: Yang Seok KI , Krishna T. MALLADI , Rekha PITCHUMANI
IPC: G06F3/06
CPC classification number: G06F3/0664 , G06F3/0604 , G06F3/0679
Abstract: A device may include an interconnect interface, a memory system including one or more first type memory devices to receive first data, one or more second type memory devices to receive second data, and an accelerator configured to perform an operation using the first data and the second data. The memory system may further include a cache configured to cache the second data for the one or more second type memory devices. A device may include an interconnect interface, a memory system coupled to the interconnect interface to receive data, an accelerator coupled to the memory system, and virtualization logic configured to partition one or more resources of the accelerator into one or more virtual accelerators, wherein a first one of the one or more virtual accelerators may be configured to perform a first operation on a first portion of the data.
-
Publication No.: US20230087747A1
Publication date: 2023-03-23
Application No.: US18070328
Application date: 2022-11-28
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. MALLADI , Mu-Tien CHANG , Dimin NIU , Hongzhong ZHENG
IPC: G06F12/0879 , G11C11/417
Abstract: A high bandwidth memory system. In some embodiments, the system includes: a memory stack having a plurality of memory dies and eight 128-bit channels; and a logic die, the memory dies being stacked on, and connected to, the logic die; wherein the logic die may be configured to operate a first channel of the 128-bit channels in: a first mode, in which a first 64 bits operate in pseudo-channel mode, and a second 64 bits operate as two 32-bit fine-grain channels, or a second mode, in which the first 64 bits operate as two 32-bit fine-grain channels, and the second 64 bits operate as two 32-bit fine-grain channels.
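The two operating modes in this abstract amount to two ways of splitting one 128-bit channel. The sketch below only models the bit-width bookkeeping stated in the abstract; the function name `channel_layout` and the tuple representation are hypothetical and say nothing about how the logic die actually implements the modes.

```python
from typing import List, Tuple

# Each sub-channel is modelled as (kind, width_in_bits).
SubChannel = Tuple[str, int]


def channel_layout(mode: int) -> List[SubChannel]:
    """Return the sub-channel split of one 128-bit channel for a given mode."""
    fine_pair = [("fine-grain", 32), ("fine-grain", 32)]  # one 64-bit half as 2 x 32 bits
    if mode == 1:
        # First 64 bits in pseudo-channel mode, second 64 bits as two fine-grain channels.
        layout = [("pseudo-channel", 64)] + fine_pair
    elif mode == 2:
        # Both 64-bit halves operate as two 32-bit fine-grain channels each.
        layout = fine_pair + fine_pair
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Sanity check: every layout must cover the full 128-bit channel.
    assert sum(width for _, width in layout) == 128
    return layout


for mode in (1, 2):
    print(mode, channel_layout(mode))
```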
-
Publication No.: US20220414030A1
Publication date: 2022-12-29
Application No.: US17901846
Application date: 2022-09-01
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. MALLADI , Dimin NIU , Hongzhong ZHENG
Abstract: A high-bandwidth memory (HBM) includes a memory and a controller. The controller receives a data write request from a processor external to the HBM and the controller stores an entry in the memory indicating at least one address of data of the data write request and generates an indication that a data bus is available for an operation during a cycle time of the data write request based on the data write request comprising sparse data or data-value similarity. Sparse data includes a predetermined percentage of data values equal to zero, and data-value similarity includes a predetermined amount of spatial value locality of the data values. The predetermined percentage of data values equal to zero of sparse data and the predetermined amount of spatial value locality of the special-value pattern are both based on a predetermined data granularity.
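As a rough illustration of the two conditions named in this abstract, the sketch below classifies a block of write data as sparse or as showing data-value similarity. The 75% zero threshold, the granularity of four values, and the rule that each aligned group must hold a single repeated value are all assumptions chosen for the example; the abstract only says these quantities are "predetermined". All function names are hypothetical.

```python
from typing import Sequence


def is_sparse(values: Sequence[int], zero_fraction: float = 0.75) -> bool:
    """True if at least `zero_fraction` of the values are zero (assumed threshold)."""
    if not values:
        return False
    zeros = sum(1 for v in values if v == 0)
    return zeros / len(values) >= zero_fraction


def has_value_similarity(values: Sequence[int], granularity: int = 4) -> bool:
    """True if every aligned group of `granularity` values repeats one value.

    This is one simple reading of "spatial value locality"; the abstract
    does not define the measure.
    """
    if not values:
        return False
    groups = (values[i:i + granularity] for i in range(0, len(values), granularity))
    return all(len(set(group)) == 1 for group in groups)


def bus_available(values: Sequence[int]) -> bool:
    """Indicate that the data bus could be freed for another operation."""
    return is_sparse(values) or has_value_similarity(values)


print(bus_available([0, 0, 0, 1]))              # sparse block
print(bus_available([7, 7, 7, 7, 2, 2, 2, 2]))  # repeated values per group
print(bus_available([1, 2, 3, 4]))              # neither condition holds
```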
-
Publication No.: US20220358042A1
Publication date: 2022-11-10
Application No.: US17372309
Application date: 2021-07-09
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. MALLADI , Andrew Z. CHANG , Ehsan NAJAFABADI
IPC: G06F12/0817
Abstract: A coherent memory system. In some embodiments, the coherent memory system includes a first memory device. The first memory device may include a cache coherent controller; a volatile memory controller; a volatile memory; a nonvolatile memory controller; and a nonvolatile memory. The first memory device may be configured to receive a quality of service requirement and to selectively enable a first feature in response to the quality of service requirement.
-
Publication No.: US20220138132A1
Publication date: 2022-05-05
Application No.: US17577370
Application date: 2022-01-17
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. MALLADI , Hongzhong ZHENG
Abstract: An apparatus may include a heterogeneous computing environment that may be controlled, at least in part, by a task scheduler in which the heterogeneous computing environment may include a processing unit having fixed logical circuits configured to execute instructions; a reprogrammable processing unit having reprogrammable logical circuits configured to execute instructions that include instructions to control processing-in-memory functionality; and a stack of high-bandwidth memory dies in which each may be configured to store data and to provide processing-in-memory functionality controllable by the reprogrammable processing unit such that the reprogrammable processing unit is at least partially stacked with the high-bandwidth memory dies. The task scheduler may be configured to schedule computational tasks between the processing unit, and the reprogrammable processing unit.