Discovery of hardware characteristics of deep learning accelerators for optimization via compiler

    Publication Number: US12118460B2

    Publication Date: 2024-10-15

    Application Number: US17092033

    Application Date: 2020-11-06

    CPC classification number: G06N3/08 G06F8/41 G06N3/04

    Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, an integrated circuit device may be configured to execute instructions with matrix operands and configured with random access memory. A computing device running a compiler can interact and/or probe an integrated circuit device to identify hardware characteristics of the integrated circuit device in performing matrix computations. The compiler can generate and optimize a result of compilation from a description of an artificial neural network based at least in part on the hardware characteristics of the integrated circuit device. The result of compilation can include first data representative of parameters of the artificial neural network and second data representative of instructions executable by the integrated circuit device to generate an output of the artificial neural network based on the first data and an input to the artificial neural network.
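The workflow this abstract describes — probing the integrated circuit device for hardware characteristics, then producing a compilation result split into parameter data and instruction data — can be sketched roughly as follows. All class names, fields, and functions here are hypothetical illustrations for the general idea, not taken from the patent:

```python
from dataclasses import dataclass

@dataclass
class HardwareCharacteristics:
    # Hypothetical characteristics a compiler might discover by probing.
    matrix_unit_width: int   # operand width of the matrix compute unit
    ram_bytes: int           # size of the attached random access memory

@dataclass
class CompilationResult:
    # "First data": parameters of the artificial neural network.
    parameters: list
    # "Second data": instructions executable by the integrated circuit device.
    instructions: list

def probe_device(device: dict) -> HardwareCharacteristics:
    # A real compiler would run test matrix computations on the device and
    # observe their behavior; this sketch simply reads reported attributes.
    return HardwareCharacteristics(
        matrix_unit_width=device["matrix_unit_width"],
        ram_bytes=device["ram_bytes"],
    )

def compile_network(description: dict, hw: HardwareCharacteristics) -> CompilationResult:
    # Tile each layer's matrix operands to the probed matrix-unit width so
    # the emitted instructions match what the hardware executes natively.
    instructions = []
    for layer in description["layers"]:
        tiles = -(-layer["width"] // hw.matrix_unit_width)  # ceiling division
        instructions.append(("matmul", layer["name"], tiles))
    return CompilationResult(
        parameters=[layer["weights"] for layer in description["layers"]],
        instructions=instructions,
    )
```

The point of the split output is that the same instruction stream can be reused while parameter data is swapped, and the tiling decision is made once per discovered device rather than hard-coded into the compiler.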

    Deep neural networks compiler for a trace-based accelerator

    Publication Number: US11861337B2

    Publication Date: 2024-01-02

    Application Number: US17003476

    Application Date: 2020-08-26

    CPC classification number: G06F8/458 G06F9/30087 G06F9/5027 G06N3/02

    Abstract: A method of compiling neural network code to executable instructions for execution by a computational acceleration system having a memory circuit and one or more acceleration circuits having a maps data buffer and a kernel data buffer is disclosed, such as for execution by an inference engine circuit architecture which includes a matrix-matrix (MM) accelerator circuit having multiple operating modes to provide a complete matrix multiplication. A representative compiling method includes generating a list of neural network layer model objects; fusing available functions and layers in the list; selecting a cooperative mode, an independent mode, or a combined cooperative and independent mode for execution; selecting a data movement mode and an ordering of computations which reduces usage of the memory circuit; generating an ordered sequence of load objects, compute objects, and store objects; and converting the ordered sequence of load objects, compute objects, and store objects into the executable instructions.
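The compiling method enumerated above — building layer objects, fusing adjacent functions, then emitting an ordered sequence of load, compute, and store objects that is finally converted to instructions — can be sketched as below. The object shapes and fusion rule are illustrative assumptions, not the patent's actual data structures:

```python
from dataclasses import dataclass

@dataclass
class Layer:
    # Hypothetical neural network layer model object.
    name: str
    op: str  # e.g. "conv", "relu", "fc"

def fuse(layers):
    # Fuse an activation into the preceding layer where possible,
    # shrinking the list the scheduler must order.
    fused, i = [], 0
    while i < len(layers):
        if i + 1 < len(layers) and layers[i + 1].op == "relu":
            fused.append(Layer(layers[i].name + "+relu", layers[i].op))
            i += 2
        else:
            fused.append(layers[i])
            i += 1
    return fused

def schedule(layers):
    # Emit an ordered load -> compute -> store sequence per fused layer;
    # a real scheduler would also reorder to reduce memory-circuit traffic.
    seq = []
    for layer in layers:
        seq += [("load", layer.name), ("compute", layer.name), ("store", layer.name)]
    return seq

def to_instructions(sequence):
    # Convert the abstract objects into executable-instruction placeholders.
    return [f"{kind.upper()} {name}" for kind, name in sequence]
```

Fusing before scheduling means fewer intermediate results ever travel through the maps data buffer, which is the stated motivation for reducing memory-circuit usage.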

    DEEP LEARNING ACCELERATORS WITH CONFIGURABLE HARDWARE OPTIONS OPTIMIZABLE VIA COMPILER

    Publication Number: US20220147809A1

    Publication Date: 2022-05-12

    Application Number: US17092023

    Application Date: 2020-11-06

    Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, an integrated circuit device may be configured to execute instructions with matrix operands and configured with random access memory. A compiler can convert a description of an artificial neural network into a compiler output through optimization and/or selection of hardware options of the integrated circuit device. The compiler output can include parameters of the artificial neural network, instructions executable by processing units of the Deep Learning Accelerator to generate an output of the artificial neural network responsive to an input to the artificial neural network, and hardware options to be stored in registers connected to control hardware configurations of the processing units.
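The third element of the compiler output described here — hardware options destined for control registers of the processing units — might be encoded along these lines. The register map and option names are invented for illustration; the patent does not specify them:

```python
# Hypothetical control-register map: the compiler selects hardware options
# and emits (address, value) pairs for the Deep Learning Accelerator to
# load into its configuration registers before executing instructions.
REGISTER_MAP = {
    "matrix_unit_mode": 0x00,   # e.g. 0 = 8-bit operands, 1 = 16-bit
    "pipeline_depth":   0x04,
    "prefetch_enable":  0x08,
}

def encode_hardware_options(options: dict) -> list:
    # Produce deterministic (register_address, value) pairs so the same
    # compiler output always configures the processing units identically.
    return [(REGISTER_MAP[name], value) for name, value in sorted(options.items())]
```

Keeping the options as data alongside the parameters and instructions lets one compiled artifact carry its own hardware configuration, which is the optimization lever the abstract describes.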

    DISCOVERY OF HARDWARE CHARACTERISTICS OF DEEP LEARNING ACCELERATORS FOR OPTIMIZATION VIA COMPILER

    Publication Number: US20250036950A1

    Publication Date: 2025-01-30

    Application Number: US18912182

    Application Date: 2024-10-10

    Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, an integrated circuit device may be configured to execute instructions with matrix operands and configured with random access memory. A computing device running a compiler can interact and/or probe an integrated circuit device to identify hardware characteristics of the integrated circuit device in performing matrix computations. The compiler can generate and optimize a result of compilation from a description of an artificial neural network based at least in part on the hardware characteristics of the integrated circuit device. The result of compilation can include first data representative of parameters of the artificial neural network and second data representative of instructions executable by the integrated circuit device to generate an output of the artificial neural network based on the first data and an input to the artificial neural network.

    DISCOVERY OF HARDWARE CHARACTERISTICS OF DEEP LEARNING ACCELERATORS FOR OPTIMIZATION VIA COMPILER

    Publication Number: US20220147810A1

    Publication Date: 2022-05-12

    Application Number: US17092033

    Application Date: 2020-11-06

    Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, an integrated circuit device may be configured to execute instructions with matrix operands and configured with random access memory. A computing device running a compiler can interact and/or probe an integrated circuit device to identify hardware characteristics of the integrated circuit device in performing matrix computations. The compiler can generate and optimize a result of compilation from a description of an artificial neural network based at least in part on the hardware characteristics of the integrated circuit device. The result of compilation can include first data representative of parameters of the artificial neural network and second data representative of instructions executable by the integrated circuit device to generate an output of the artificial neural network based on the first data and an input to the artificial neural network.

    Deep Neural Networks Compiler for a Trace-Based Accelerator

    Publication Number: US20220066760A1

    Publication Date: 2022-03-03

    Application Number: US17003476

    Application Date: 2020-08-26

    Abstract: A method of compiling neural network code to executable instructions for execution by a computational acceleration system having a memory circuit and one or more acceleration circuits having a maps data buffer and a kernel data buffer is disclosed, such as for execution by an inference engine circuit architecture which includes a matrix-matrix (MM) accelerator circuit having multiple operating modes to provide a complete matrix multiplication. A representative compiling method includes generating a list of neural network layer model objects; fusing available functions and layers in the list; selecting a cooperative mode, an independent mode, or a combined cooperative and independent mode for execution; selecting a data movement mode and an ordering of computations which reduces usage of the memory circuit; generating an ordered sequence of load objects, compute objects, and store objects; and converting the ordered sequence of load objects, compute objects, and store objects into the executable instructions.
