Patent search ap:("INTEL CORPORATION") AND inv:"Moshe Maor" Page 2

11.

Invention grant
Methods and apparatus to tile walk a tensor for convolution operations (Status: in force)

Publication No.: US11494608B2

Publication date: 2022-11-08

Application No.: US16540581

Application date: 2019-08-14

Applicant: Intel Corporation

Inventor: Yaniv Fais, Moshe Maor

IPC: G06N3/02, G06N3/04, G06F17/15, G06F8/41, G06N3/063

Abstract: An example apparatus to perform a convolution on an input tensor includes a parameters generator to: generate a horizontal hardware execution parameter for a horizontal dimension of the input tensor based on a kernel parameter and a layer parameter; and generate a vertical hardware execution parameter for a vertical dimension of the input tensor based on the kernel parameter and the layer parameter; an accelerator interface to configure a hardware accelerator circuitry based on the horizontal and vertical hardware execution parameters; a horizontal Iterator controller to determine when the hardware accelerator circuitry completes the first horizontal iteration of the convolution; and a vertical Iterator controller to determine when the hardware accelerator circuitry completes the first vertical iteration of the convolution.
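Illustrative note (not taken from the patent): the abstract describes deriving one execution parameter per tensor dimension from a kernel parameter and a layer parameter, then iterating horizontally and vertically. A minimal sketch of that idea follows, assuming the kernel parameter is the kernel size and the layer parameters are stride and padding. `AxisParams`, `axis_params`, `tile_walk`, and `tile_out` are hypothetical names, and the formulas are the standard convolution size relations rather than the patent's own.

```python
from dataclasses import dataclass


@dataclass
class AxisParams:
    """Hypothetical per-axis execution parameters (not the patent's fields)."""
    out_size: int   # output elements along this axis
    num_tiles: int  # iterations needed to cover the axis
    tile_in: int    # input elements read by one full tile


def axis_params(in_size, kernel, stride, pad, tile_out):
    # Standard convolution output-size relation.
    out_size = (in_size + 2 * pad - kernel) // stride + 1
    # Ceiling division: the last tile may be partial.
    num_tiles = -(-out_size // tile_out)
    # Input span needed to produce tile_out outputs.
    tile_in = (tile_out - 1) * stride + kernel
    return AxisParams(out_size, num_tiles, tile_in)


def tile_walk(horizontal, vertical):
    """Yield (tile_row, tile_col); the horizontal axis is the inner loop."""
    for ty in range(vertical.num_tiles):
        for tx in range(horizontal.num_tiles):
            yield ty, tx
```

For example, `axis_params(224, 3, 1, 1, 32)` covers a 224-wide axis in 7 tiles, each reading 34 input elements. Treating the horizontal axis as the inner loop is an assumption of this sketch.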

12.

Invention application
METHODS AND APPARATUS TO ENABLE OUT-OF-ORDER PIPELINED EXECUTION OF STATIC MAPPING OF A WORKLOAD (Status: in force)

Publication No.: US20220197703A1

Publication date: 2022-06-23

Application No.: US17561500

Application date: 2021-12-23

Applicant: Intel Corporation

Inventor: Michael Behar, Moshe Maor, Ronen Gabbai, Roni Rosner, Zigi Walter, Oren Agam

IPC: G06F9/50, G06F3/06

Abstract: Methods, apparatus, systems and articles of manufacture are disclosed that enable out-of-order pipelined execution of static mapping of a workload to one or more computational building blocks of an accelerator. An example apparatus includes an interface to load a first number of credits into memory; a comparator to compare the first number of credits to a threshold number of credits associated with memory availability in a buffer; and a dispatcher to, when the first number of credits meets the threshold number of credits, select a workload node of the workload to be executed at a first one of the one or more computational building blocks.
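Illustrative note (not taken from the patent): the comparator-and-dispatcher behaviour in the abstract can be sketched as a credit check against a per-node threshold. `WorkloadNode` and `select_node` are hypothetical names. Reading "meets the threshold" as "greater than or equal to" is an assumption, as is attaching the threshold to each node.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WorkloadNode:
    name: str
    threshold: int  # credits required before dispatch (assumed per node)


def select_node(credits: int, pending: List[WorkloadNode]) -> Optional[WorkloadNode]:
    """Return the first pending node whose credit threshold is met.

    A later node can be chosen ahead of an earlier one that still lacks
    credits, which is what makes the dispatch order out-of-order.
    """
    for node in pending:
        if credits >= node.threshold:
            return node
    return None
```

With pending nodes `A` (threshold 4) and `B` (threshold 2) and 3 credits available, the sketch selects `B` even though `A` was queued first.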

13.

Invention grant
Methods and apparatus to enable out-of-order pipelined execution of static mapping of a workload (Status: in force)

Publication No.: US11231963B2

Publication date: 2022-01-25

Application No.: US16542012

Application date: 2019-08-15

Applicant: Intel Corporation

Inventor: Michael Behar, Moshe Maor, Ronen Gabbai, Roni Rosner, Zigi Walter, Oren Agam

IPC: G06F9/46, G06F9/50, G06F3/06

Abstract: Methods, apparatus, systems and articles of manufacture are disclosed that enable out-of-order pipelined execution of static mapping of a workload to one or more computational building blocks of an accelerator. An example apparatus includes an interface to load a first number of credits into memory; a comparator to compare the first number of credits to a threshold number of credits associated with memory availability in a buffer; and a dispatcher to, when the first number of credits meets the threshold number of credits, select a workload node of the workload to be executed at a first one of the one or more computational building blocks.

14.

Invention grant
Methods, systems, and apparatus for a generic firmware-based kernel library mechanism (Status: in force)

Publication No.: US11093226B2

Publication date: 2021-08-17

Application No.: US16541131

Application date: 2019-08-14

Applicant: Intel Corporation

Inventor: Moshe Maor

IPC: G06F8/54, G06F11/36, G06F9/445

Abstract: Apparatus, systems, and methods for a generic firmware-based kernel library mechanism are disclosed. An example apparatus includes a compiler to compile kernels into an executable and linkable format, an image generator to generate library images from executable and linkable format locations, a reducer to retrieve a library image, the library image retrieved starting from a first section of an existing library, the retrieved library image to be used as a platform for developing a new kernel library, a selector to select kernels to include in the new kernel library, one or more libraries organized into a defined number of kernel banks, the kernels combined based on intended application development, and a linker to link a library start function pointer to the library start function, the library start function positioned within the library image, the pointer incorporated in a first section of the library image.

15.

Invention grant
Methods and apparatus to implement efficient communications between components of computing systems (Status: in force)

Publication No.: US10990399B2

Publication date: 2021-04-27

Application No.: US16539005

Application date: 2019-08-13

Applicant: Intel Corporation

Inventor: Moshe Maor, Yaniv Fais

IPC: G06F9/30, G06F9/54, G06F9/38

Abstract: Methods and apparatus to implement efficient communications between components of computing systems are disclosed. An example apparatus includes a message generator to: add a first value associated with a first field of a message to a shift register based on a first push operation, the message including multiple fields, at least two of the fields having different bit widths; and add a second value associated with a second field of the message to the shift register based on a second push operation, the second value to be adjacent the first value in the shift register in accordance with a structure of the message. The example apparatus further includes a communications interface to transmit content stored in the shift register to a hardware device via a bus having a width corresponding to a width of the shift register, the content including the message.
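Illustrative note (not taken from the patent): the push operations in the abstract place fields of different bit widths next to each other in a shift register whose width matches the bus. A minimal software sketch of that packing follows. `MessageBuilder` is a hypothetical name, and filling the register from the least significant bit upward is an assumption of this sketch.

```python
class MessageBuilder:
    """Packs variable-width fields back to back into a fixed-width register."""

    def __init__(self, width_bits):
        self.width = width_bits  # register width, assumed equal to bus width
        self.value = 0           # packed register contents
        self.used = 0            # bits occupied so far

    def push(self, field_value, field_bits):
        # Reject values that do not fit in the declared field width.
        if field_value < 0 or field_value >> field_bits:
            raise ValueError("value does not fit in field")
        if self.used + field_bits > self.width:
            raise OverflowError("message exceeds register width")
        # Place the new field directly above the previously pushed one.
        self.value |= field_value << self.used
        self.used += field_bits

    def to_bytes(self):
        # Little-endian byte image of the whole register.
        return self.value.to_bytes((self.width + 7) // 8, "little")
```

Pushing a 3-bit, a 5-bit, and a 16-bit field into a 32-bit register leaves 8 bits unused, and `to_bytes()` yields the image that would be sent over the bus in one transfer.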

16.

Invention application
INNER PRODUCT CONVOLUTIONAL NEURAL NETWORK ACCELERATOR (Status: in force)

Publication No.: US20250086445A1

Publication date: 2025-03-13

Application No.: US18888744

Application date: 2024-09-18

Applicant: Intel Corporation

Inventor: Ehud Cohen, Moshe Maor, Ashutosh Parkhi, Michael Behar, Yaniv Fais

IPC: G06N3/063, G06F16/17, G06F18/21, G06N3/045, G06N3/08, G06V10/44, G06V10/82, G06V10/94

Abstract: A convolutional neural network (CNN) accelerator, including: a CNN circuit for performing a multiple-layer CNN computation, wherein the multiple layers are to receive an input feature according to an input feature map (IFM) and a weight matrix per output feature, wherein an output of a first layer provides an input for a next layer; and a mapping circuit to access a three-dimensional input matrix stored as a Z-major matrix; wherein the CNN circuit is to perform an inner-product direct convolution on the Z-major matrix, wherein the direct convolution lacks a lowering operation.
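Illustrative note (not taken from the patent): a direct convolution on a Z-major tensor can read each window in place instead of first copying windows into a lowered matrix (as an im2col-style lowering would). The sketch below assumes "Z-major" means the channel index varies fastest in memory, with stride 1, no padding, and a single output feature. `conv_z_major` and its parameters are hypothetical names.

```python
def conv_z_major(ifm, H, W, C, weights, K):
    """Direct convolution for one output feature, stride 1, no padding.

    ifm:     flat list; element (y, x, z) is at index (y * W + x) * C + z
    weights: flat list over a K x K x C window, same ordering
    """
    out_h, out_w = H - K + 1, W - K + 1
    out = [[0] * out_w for _ in range(out_h)]
    for oy in range(out_h):
        for ox in range(out_w):
            acc = 0
            for ky in range(K):
                for kx in range(K):
                    i = ((oy + ky) * W + (ox + kx)) * C
                    w = (ky * K + kx) * C
                    # Inner product over the contiguous run of C channels.
                    acc += sum(a * b for a, b in zip(ifm[i:i + C], weights[w:w + C]))
            out[oy][ox] = acc
    return out
```

Because the C channel values of a pixel sit next to each other, the innermost step is an inner product over one contiguous run, and no intermediate matrix is built.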

17.

Invention grant
Inner product convolutional neural network accelerator (Status: in force)

Publication No.: US12131250B2

Publication date: 2024-10-29

Application No.: US15720982

Application date: 2017-09-29

Applicant: Intel Corporation

Inventor: Ehud Cohen, Moshe Maor, Ashutosh Parkhi, Michael Behar, Yaniv Fais

IPC: G06N3/063, G06F16/17, G06F18/21, G06N3/045, G06N3/08, G06V10/44, G06V10/82, G06V10/94

CPC classification number: G06N3/063, G06F16/17, G06F18/21, G06N3/045, G06N3/08, G06V10/454, G06V10/82, G06V10/955

Abstract: A convolutional neural network (CNN) accelerator, including: a CNN circuit for performing a multiple-layer CNN computation, wherein the multiple layers are to receive an input feature according to an input feature map (IFM) and a weight matrix per output feature, wherein an output of a first layer provides an input for a next layer; and a mapping circuit to access a three-dimensional input matrix stored as a Z-major matrix; wherein the CNN circuit is to perform an inner-product direct convolution on the Z-major matrix, wherein the direct convolution lacks a lowering operation.

18.

Invention publication
METHODS AND APPARATUS TO CONFIGURE HETEROGENOUS COMPONENTS IN AN ACCELERATOR (Status: under examination, published)

Publication No.: US20230333913A1

Publication date: 2023-10-19

Application No.: US18309650

Application date: 2023-04-28

Applicant: INTEL CORPORATION

Inventor: Michael Behar, Moshe Maor, Ronen Gabbai, Roni Rosner, Zigi Walter, Oren Agam

IPC: G06F9/50, G06F16/901, G06N3/044, G06N3/045

CPC classification number: G06F9/5083, G06F16/9024, G06N3/044, G06N3/045

Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to configure heterogenous components in an accelerator. An example apparatus includes a graph compiler to identify a workload node in a workload and generate a selector for the workload node, and the selector to identify an input condition and an output condition of a compute building block, wherein the graph compiler is to, in response to obtaining the identified input condition and output condition from the selector, map the workload node to the compute building block.

19.

Invention application
METHODS AND APPARATUS TO TILE WALK A TENSOR FOR CONVOLUTION OPERATIONS (Status: in force)

Publication No.: US20230067421A1

Publication date: 2023-03-02

Application No.: US17954846

Application date: 2022-09-28

Applicant: Intel Corporation

Inventor: Yaniv Fais, Moshe Maor

IPC: G06N3/04, G06F17/15, G06F8/41, G06N3/063

Abstract: An example apparatus to perform a convolution on an input tensor includes a parameters generator to: generate a horizontal hardware execution parameter for a horizontal dimension of the input tensor based on a kernel parameter and a layer parameter; and generate a vertical hardware execution parameter for a vertical dimension of the input tensor based on the kernel parameter and the layer parameter; an accelerator interface to configure a hardware accelerator circuitry based on the horizontal and vertical hardware execution parameters; a horizontal Iterator controller to determine when the hardware accelerator circuitry completes the first horizontal iteration of the convolution; and a vertical Iterator controller to determine when the hardware accelerator circuitry completes the first vertical iteration of the convolution.

20.

Invention grant
Cyclic buffer pointer fixing (Status: in force)

Publication No.: US10572404B2

Publication date: 2020-02-25

Application No.: US15638429

Application date: 2017-06-30

Applicant: Intel Corporation

Inventor: Moshe Maor

IPC: G06F12/00, G06F13/16, G06F9/355, G06F13/28

Abstract: A processor device is provided with hardware-implemented logic to receive an instruction including a pointer identifier and a pointer change value, the pointer identifier including a pointer address field encoded with an address of a line of memory corresponding to a location of a pointer of a particular one of the one or more cyclic buffers, one or more cushion bits, and a buffer identifier field encoded with a buffer identifier assigned to the particular cyclic buffer. The logic further enables the processor to identify that the instruction is to apply to the particular cyclic buffer based on the buffer identifier, determine that the pointer change value causes a wraparound of the pointer in the particular cyclic buffer, and fix location of the pointer in the particular cyclic buffer based on the wraparound.
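Illustrative note (not taken from the patent): the pointer fix in the abstract amounts to detecting that a pointer update leaves the buffer and wrapping it back inside. A minimal software sketch follows, assuming each buffer identifier maps to a base address and a size. `BUFFERS`, `fix_pointer`, and the addresses are hypothetical, and the patent's pointer encoding (address field, cushion bits, buffer identifier field) is not modelled.

```python
# Hypothetical table: buffer id -> (base address, size in bytes).
BUFFERS = {0: (0x1000, 0x40), 1: (0x2000, 0x100)}


def fix_pointer(buffer_id, ptr, change):
    """Apply change to ptr and wrap it back into the identified buffer.

    Returns (fixed_pointer, wrapped), where wrapped reports whether the
    raw update left the range [base, base + size).
    """
    base, size = BUFFERS[buffer_id]
    moved = ptr + change
    wrapped = not (base <= moved < base + size)
    # Python's % yields a non-negative result, so negative changes wrap too.
    fixed = base + (moved - base) % size
    return fixed, wrapped
```

For buffer 0, advancing `0x1030` by `0x20` overshoots the end at `0x1040` and is fixed to `0x1010`. A negative change that undershoots the base wraps to the end of the buffer in the same way.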
