-
Publication Number: US20240428853A1
Publication Date: 2024-12-26
Application Number: US18825829
Application Date: 2024-09-05
Applicant: Micron Technology, Inc.
Inventor: Aliasger Tayeb Zaidy , Patrick Alan Estep , David Andrew Roberts
IPC: G11C11/54 , G06F12/0862 , G06F12/0897 , G06N3/063 , G06N3/08
Abstract: Systems, devices, and methods related to a deep learning accelerator and memory are described. For example, the accelerator can have processing units to perform at least matrix computations of an artificial neural network via execution of instructions. The processing units have a local memory to store operands of the instructions. The accelerator can access a random access memory via a system buffer, or without going through the system buffer. A fetch instruction can request an item, available at a memory address in the random access memory, to be loaded into the local memory at a local address. The fetch instruction can include a hint for the caching of the item in the system buffer. During execution of the instruction, the hint can be used to determine whether to load the item through the system buffer or to bypass the system buffer in loading the item.
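Below is a minimal sketch, assuming hypothetical names throughout (nothing here is taken from the patent's claims or figures), of how an executor might honor the fetch instruction's caching hint by loading either through the system buffer or directly from RAM:

```python
# Hypothetical sketch: a fetch instruction carries a caching hint that is
# evaluated at execution time to route the load through the system buffer
# or to bypass it. All names here are illustrative.
from dataclasses import dataclass
from enum import Enum

class CacheHint(Enum):
    CACHE = "cache"    # stage the item in the system buffer for reuse
    BYPASS = "bypass"  # load straight from RAM into local memory

@dataclass
class FetchInstruction:
    mem_addr: int      # source address in the random access memory
    local_addr: int    # destination in the processing unit's local memory
    hint: CacheHint    # caching hint consulted during execution

def execute_fetch(instr, ram, system_buffer, local_memory):
    """Load one item into local memory, honoring the instruction's hint."""
    if instr.hint is CacheHint.CACHE:
        # Route through the system buffer so later fetches can hit it.
        if instr.mem_addr not in system_buffer:
            system_buffer[instr.mem_addr] = ram[instr.mem_addr]
        item = system_buffer[instr.mem_addr]
    else:
        # Bypass the system buffer, e.g. for data with no expected reuse.
        item = ram[instr.mem_addr]
    local_memory[instr.local_addr] = item
```

Under this reading, a weight matrix reused across many matrix operations might be fetched with a cache hint, while streaming input data with no reuse might bypass the buffer to avoid polluting it.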
-
Publication Number: US12094531B2
Publication Date: 2024-09-17
Application Number: US17146314
Application Date: 2021-01-11
Applicant: Micron Technology, Inc.
Inventor: Aliasger Tayeb Zaidy , Patrick Alan Estep , David Andrew Roberts
IPC: G06F3/06 , G06F12/0862 , G06F12/0897 , G06N3/063 , G06N3/08 , G11C11/54
CPC classification number: G11C11/54 , G06F12/0862 , G06F12/0897 , G06N3/063 , G06N3/08
Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, the accelerator can have processing units to perform at least matrix computations of an artificial neural network via execution of instructions. The processing units have a local memory to store operands of the instructions. The accelerator can access a random access memory via a system buffer, or without going through the system buffer. A fetch instruction can request an item, available at a memory address in the random access memory, to be loaded into the local memory at a local address. The fetch instruction can include a hint for the caching of the item in the system buffer. During execution of the instruction, the hint can be used to determine whether to load the item through the system buffer or to bypass the system buffer in loading the item.
-
Publication Number: US12007899B2
Publication Date: 2024-06-11
Application Number: US17867371
Application Date: 2022-07-18
Applicant: Micron Technology, Inc.
IPC: G06F12/0882 , G06F12/06
CPC classification number: G06F12/0882 , G06F12/0646
Abstract: Disclosed in some examples are improved address prediction and memory preloading techniques that leverage next-delta prediction and/or far-delta prediction for scheduling using a DNN. Previous memory access sequence data that identify one or more memory addresses previously accessed by one or more processors of a system may be processed and then converted into a sequence of delta values. The sequence of delta values is then mapped to one or more classes that are then input to a DNN. The DNN then outputs a predicted future class identifier sequence that represents addresses that the DNN predicts will be accessed by the processor in the future. The predicted future class identifier sequence is then converted back to a predicted delta value sequence and back into a set of one or more predicted addresses.
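A hedged Python sketch of the described pipeline, with all identifiers invented for illustration and a trivial stub standing in for the DNN: the address trace becomes a delta sequence, deltas become class IDs, the (stubbed) model predicts future class IDs, and those are converted back through deltas into predicted addresses.

```python
# Illustrative pipeline: addresses -> deltas -> class IDs -> model ->
# predicted class IDs -> predicted deltas -> predicted addresses.
# A trivial stub stands in for the DNN inference step.

def to_deltas(addresses):
    """Convert an address trace into successive differences (deltas)."""
    return [b - a for a, b in zip(addresses, addresses[1:])]

def build_class_map(deltas):
    """Map each distinct delta value to a small integer class ID."""
    classes = {d: i for i, d in enumerate(sorted(set(deltas)))}
    inverse = {i: d for d, i in classes.items()}
    return classes, inverse

def predict_classes(class_ids):
    """Stand-in for the DNN: naively repeat the last observed class."""
    return [class_ids[-1]] * 4  # predict four future accesses

def predict_addresses(addresses):
    deltas = to_deltas(addresses)
    classes, inverse = build_class_map(deltas)
    class_ids = [classes[d] for d in deltas]
    predicted_ids = predict_classes(class_ids)
    # Convert predicted class IDs back to deltas, then to absolute addresses.
    addr = addresses[-1]
    out = []
    for cid in predicted_ids:
        addr += inverse[cid]
        out.append(addr)
    return out

# A strided trace yields one constant delta, so the stub predicts the
# stride continuing: ['0x1100', '0x1140', '0x1180', '0x11c0'].
print([hex(a) for a in predict_addresses([0x1000, 0x1040, 0x1080, 0x10C0])])
```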
-
Publication Number: US20220223201A1
Publication Date: 2022-07-14
Application Number: US17146314
Application Date: 2021-01-11
Applicant: Micron Technology, Inc.
Inventor: Aliasger Tayeb Zaidy , Patrick Alan Estep , David Andrew Roberts
IPC: G11C11/54 , G06N3/063 , G06N3/08 , G06F12/0862 , G06F12/0897
Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, the accelerator can have processing units to perform at least matrix computations of an artificial neural network via execution of instructions. The processing units have a local memory to store operands of the instructions. The accelerator can access a random access memory via a system buffer, or without going through the system buffer. A fetch instruction can request an item, available at a memory address in the random access memory, to be loaded into the local memory at a local address. The fetch instruction can include a hint for the caching of the item in the system buffer. During execution of the instruction, the hint can be used to determine whether to load the item through the system buffer or to bypass the system buffer in loading the item.
-
Publication Number: US20220147812A1
Publication Date: 2022-05-12
Application Number: US17092040
Application Date: 2020-11-06
Applicant: Micron Technology, Inc.
Inventor: Andre Xian Ming Chang , Aliasger Tayeb Zaidy , Marko Vitez , Michael Cody Glapa , Abhishek Chaurasia , Eugenio Culurciello
Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, an integrated circuit device may be configured to execute instructions with matrix operands and configured with random access memory (RAM). A compiler includes its own artificial neural network configured to identify an optimized compilation option for the artificial neural network being compiled and/or for a hardware platform of Deep Learning Accelerators. The compiler's artificial neural network can be trained via machine learning to identify the optimized compilation option based on features of the artificial neural network to be compiled and/or features of the hardware platform on which the compiler output will be executed.
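A speculative sketch of the selection step, assuming hypothetical option names and a stand-in scoring function where the compiler's trained network would actually run inference:

```python
# Illustrative sketch (all names hypothetical): a compiler consults a trained
# model to rank candidate compilation options for a given network and target
# hardware platform, then compiles with the top-ranked option.

CANDIDATE_OPTIONS = ["tile_8x8", "tile_16x16", "fuse_conv_relu"]

def extract_features(network_desc, platform_desc):
    """Flatten network/platform properties into a feature vector."""
    return [
        network_desc["num_layers"],
        network_desc["max_matrix_dim"],
        platform_desc["num_processing_units"],
        platform_desc["local_memory_kb"],
    ]

def score_option(features, option):
    """Stand-in for the trained network: score one option for these features.

    A real implementation would run inference on a model trained on
    (features, option) -> measured-performance pairs.
    """
    return hash((tuple(features), option)) % 100

def choose_compilation_option(network_desc, platform_desc):
    features = extract_features(network_desc, platform_desc)
    return max(CANDIDATE_OPTIONS, key=lambda opt: score_option(features, opt))
```

The essential shape is that features of both the network being compiled and the target platform drive the choice, rather than a fixed heuristic baked into the compiler.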
-
Publication Number: US20220147811A1
Publication Date: 2022-05-12
Application Number: US17092038
Application Date: 2020-11-06
Applicant: Micron Technology, Inc.
Inventor: Jaime Cummins , Marko Vitez , Eugenio Culurciello , Andre Xian Ming Chang , Aliasger Tayeb Zaidy
Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, an integrated circuit device may be configured to execute instructions with matrix operands and configured with random access memory (RAM). A compiler can identify a plurality of portions of an artificial neural network for implementation on a plurality of such integrated circuit devices respectively. The compiler converts a description of the artificial neural network into a plurality of compiler outputs executable on the plurality of devices to generate an output of the artificial neural network responsive to an input to the artificial neural network. Intermediate results are communicated among the devices in generating the output of the artificial neural network.
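A minimal sketch, with plain Python callables standing in for compiled portions and devices (none of these names come from the patent): the layers are split into contiguous portions, and each portion's intermediate result is handed to the next portion, mirroring the inter-device communication the abstract describes.

```python
# Illustrative partitioning: split a network's layers across several
# devices; each device runs its portion and forwards its intermediate
# result to the next, producing the final network output.

def partition(layers, num_devices):
    """Split the layer list into contiguous portions, one per device."""
    size = -(-len(layers) // num_devices)  # ceiling division
    return [layers[i:i + size] for i in range(0, len(layers), size)]

def run_portion(portion, activation):
    """Execute one device's layers; each layer is a callable here."""
    for layer in portion:
        activation = layer(activation)
    return activation

def run_pipeline(layers, num_devices, network_input):
    intermediate = network_input
    for portion in partition(layers, num_devices):
        # In hardware, this hand-off is the inter-device communication
        # of intermediate results described in the abstract.
        intermediate = run_portion(portion, intermediate)
    return intermediate

# Toy "layers" as plain functions: ((5*2)+1)*3 - 4 = 29.
layers = [lambda x: x * 2, lambda x: x + 1, lambda x: x * 3, lambda x: x - 4]
print(run_pipeline(layers, 2, 5))  # -> 29
```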
-