Patent search ap:("Facebook Page Inc.") AND inv:"Krishnakumar Narayanan Nair"

1.

Patent application
FLOATING POINT MULTIPLY HARDWARE USING DECOMPOSED COMPONENT NUMBERS
Legal status: Active

Publication No.: US20220107782A1

Publication date: 2022-04-07

Application No.: US17506506

Application date: 2021-10-20

Applicant: Facebook, Inc.

Inventors: Krishnakumar Narayanan Nair, Anup Ramesh Kadkol, Ehsan Khish Ardestani Zadeh, Olivia Wu, Yuchen Hao, Thomas Mark Ulrich, Rakesh Komuravelli

IPC: G06F7/487, G06N3/02, G06F17/16, G06F7/485

Abstract: A processor system comprises one or more logic units configured to receive a processor instruction identifying a first floating point number to be multiplied with a second floating point number. The floating point numbers are each decomposed into a group of a plurality of component numbers, wherein a number of bits used to represent each floating point number is greater than a number of bits used to represent any component number in each group of the plurality of component numbers. The component numbers of the first group are multiplied with the component numbers of the second group to determine intermediate multiplication results that are summed together to determine an effective result that represents a result of multiplying the first floating point number with the second floating point number.
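The decomposition described in this abstract can be illustrated in software. The following is only a minimal sketch, not the patented hardware: the 8-bit component width, the three-component split, and the helper names `truncate`, `decompose`, and `decomposed_multiply` are all assumptions chosen for illustration.

```python
import math


def truncate(x, bits):
    """Keep only the top `bits` significand bits of x (round toward zero)."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= |m| < 1
    scaled = math.trunc(m * (1 << bits))
    return math.ldexp(scaled, e - bits)


def decompose(x, bits=8, parts=3):
    """Split x into `parts` components, each with at most `bits` significand bits."""
    components = []
    residual = x
    for _ in range(parts):
        c = truncate(residual, bits)
        components.append(c)
        residual -= c
    return components


def decomposed_multiply(a, b, bits=8, parts=3):
    """Multiply every component of a with every component of b and sum the partial products."""
    a_parts = decompose(a, bits, parts)
    b_parts = decompose(b, bits, parts)
    return sum(x * y for x in a_parts for y in b_parts)
```

With three 8-bit components, any value with a 24-bit significand is reconstructed exactly, so the summed partial products match a direct multiply for such inputs; for wider inputs the result is an approximation.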

2.

Patent application
MAPPING CONVOLUTION TO A CHANNEL CONVOLUTION ENGINE
Legal status: Active

Publication No.: US20210256363A1

Publication date: 2021-08-19

Application No.: US16793961

Application date: 2020-02-18

Applicant: Facebook, Inc.

Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian

IPC: G06N3/063, G06N3/08, G06F9/30, G06F17/16

Abstract: A processor system comprises a first and second group of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate convolution weight matrix for each channel. Each register stores at least one data element from each convolution weight matrix. The hardware channel convolution processor unit is configured to multiply each data element in the first group of registers with a corresponding data element in the second group of registers and sum together the multiplication results for each specific channel to determine corresponding channel convolution result data elements in a corresponding channel convolution result matrix.
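The per-channel multiply-and-sum described here resembles what is commonly called a depthwise convolution. The sketch below is a plain software illustration under that assumption; the function name `channel_convolution`, the nested-list layout `data[channel][row][col]`, and the square kernel are hypothetical choices and say nothing about how the registers are actually organized.

```python
def channel_convolution(data, weights):
    """Convolve each channel of `data` with its own weight matrix.

    data:    data[c][i][j], one 2D plane per channel
    weights: weights[c][di][dj], one square k x k matrix per channel
    Returns one result plane per channel (no padding, stride 1).
    """
    height, width = len(data[0]), len(data[0][0])
    k = len(weights[0])
    out_h, out_w = height - k + 1, width - k + 1
    result = []
    for c in range(len(data)):
        plane = []
        for i in range(out_h):
            row = []
            for j in range(out_w):
                # Multiply element-wise and sum within this channel only.
                acc = 0
                for di in range(k):
                    for dj in range(k):
                        acc += data[c][i + di][j + dj] * weights[c][di][dj]
                row.append(acc)
            plane.append(row)
        result.append(plane)
    return result
```

Because each channel uses only its own weight matrix, results never mix across channels, which is the property the abstract attributes to the channel convolution result matrix.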

3.

Patent application
HIGH BANDWIDTH MEMORY SYSTEM WITH DISTRIBUTED REQUEST BROADCASTING MASTERS
Legal status: Active

Publication No.: US20210181957A1

Publication date: 2021-06-17

Application No.: US16712253

Application date: 2019-12-12

Applicant: Facebook, Inc.

Inventors: Abdulkadir Utku Diril, Olivia Wu, Krishnakumar Narayanan Nair, Aravind Kalaiah, Anup Ramesh Kadkol, Pankaj Kansal

IPC: G06F3/06, G06N3/02

Abstract: A system comprises a processor and a plurality of memory units. The processor is coupled to each of the plurality of memory units by a plurality of network connections. The processor includes a plurality of processing elements arranged in a two-dimensional array and a corresponding two-dimensional communication network communicatively connecting each of the plurality of processing elements to other processing elements on same axes of the two-dimensional array. Each processing element that is located along a diagonal of the two-dimensional array is configured as a request broadcasting master for a respective group of processing elements located along a same axis of the two-dimensional array.
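One way to picture the diagonal arrangement is the small sketch below. It assumes a square n x n array and assumes that each group is a row; the abstract only says "a same axis", so the choice of rows, and the names `broadcasting_master` and `master_groups`, are illustrative assumptions.

```python
def broadcasting_master(row, col):
    """Return the coordinates of the diagonal element that serves (row, col).

    Assumption: groups are rows, so the master of row r sits at (r, r).
    """
    return (row, row)


def master_groups(n):
    """Map each diagonal master to the processing elements in its row."""
    return {(i, i): [(i, j) for j in range(n)] for i in range(n)}
```

Under this assumption, an n x n array has exactly n masters, one per row, and every processing element belongs to exactly one group.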

4.

Patent application
HIGH BANDWIDTH MEMORY SYSTEM WITH DISTRIBUTED REQUEST BROADCASTING MASTERS
Legal status: Active

Publication No.: US20210326051A1

Publication date: 2021-10-21

Application No.: US17307828

Application date: 2021-05-04

Applicant: Facebook, Inc.

Inventors: Abdulkadir Utku Diril, Olivia Wu, Krishnakumar Narayanan Nair, Aravind Kalaiah, Anup Ramesh Kadkol, Pankaj Kansal

IPC: G06F3/06, G06F13/16, G06F15/80, G06N3/02

Abstract: A system comprises a processor and a plurality of memory units. The processor is coupled to each of the plurality of memory units by a plurality of network connections. The processor includes a plurality of processing elements arranged in a two-dimensional array and a corresponding two-dimensional communication network communicatively connecting each of the plurality of processing elements to other processing elements on same axes of the two-dimensional array. Each processing element that is located along a diagonal of the two-dimensional array is configured as a request broadcasting master for a respective group of processing elements located along a same axis of the two-dimensional array.

5.

Patent application
MAPPING CONVOLUTION TO A PARTITION CHANNEL CONVOLUTION ENGINE
Legal status: Active

Publication No.: US20210271451A1

Publication date: 2021-09-02

Application No.: US16805339

Application date: 2020-02-28

Applicant: Facebook, Inc.

Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian

IPC: G06F7/544, G06F17/15, G06N20/00

Abstract: A processor system comprises two groups of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate matrix for each channel. Each register stores at least one data element from each matrix. The hardware channel convolution processor unit is configured to multiply each data element in a first and second portion of the first group of registers with a corresponding data element in the second group of registers to determine corresponding multiplication results and sum together the multiplication results for each specific channel to determine two corresponding channel convolution result data elements in a corresponding channel convolution result matrix.

6.

Patent application
HARDWARE FOR FLOATING-POINT ARITHMETIC IN MULTIPLE FORMATS
Legal status: Active

Publication No.: US20210255830A1

Publication date: 2021-08-19

Application No.: US16795097

Application date: 2020-02-19

Applicant: Facebook, Inc.

Inventors: Thomas Mark Ulrich, Abdulkadir Utku Diril, Krishnakumar Narayanan Nair, Zhao Wang, Rakesh Komuravelli

IPC: G06F7/487, G06F7/485

Abstract: A floating-point number in a first format representation is received. Based on an identification of a floating-point format type of the floating-point number, different components of the first format representation are identified. The different components of the first format representation are placed in corresponding components of a second format representation of the floating-point number, wherein a total number of bits of the second format representation is larger than a total number of bits of the first format representation. At least one of the components of the second format representation is padded with one or more zero bits. The floating-point number in the second format representation is stored in a register. A multiplication using the second format representation of the floating-point number is performed.
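The idea of placing the fields of a narrower format into a wider one and padding with zero bits can be shown with a well-known pair of formats. The abstract does not name its formats, so bfloat16 (1 sign, 8 exponent, 7 mantissa bits) and IEEE 754 binary32 (1 sign, 8 exponent, 23 mantissa bits) are used here purely as stand-ins, and the function names are hypothetical.

```python
import struct


def bfloat16_to_float32_bits(bits16):
    """Place bfloat16 fields into a 32-bit layout, zero-padding the mantissa."""
    sign = (bits16 >> 15) & 0x1
    exponent = (bits16 >> 7) & 0xFF
    mantissa = bits16 & 0x7F
    # The 7 mantissa bits occupy the top of the 23-bit field;
    # the remaining 16 low-order bits are zero padding.
    return (sign << 31) | (exponent << 23) | (mantissa << 16)


def bits_to_float(bits32):
    """Interpret a 32-bit pattern as an IEEE 754 binary32 value."""
    return struct.unpack(">f", struct.pack(">I", bits32))[0]
```

Because both stand-in formats share the same exponent width, no rebiasing is needed here; a format with a narrower exponent would also need its exponent adjusted, which this sketch does not cover.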

7.

Granted patent
High bandwidth memory system with distributed request broadcasting masters
Legal status: Active

Publication No.: US11054998B1

Publication date: 2021-07-06

Application No.: US16712253

Application date: 2019-12-12

Applicant: Facebook, Inc.

Inventors: Abdulkadir Utku Diril, Olivia Wu, Krishnakumar Narayanan Nair, Aravind Kalaiah, Anup Ramesh Kadkol, Pankaj Kansal

IPC: G06F12/00, G06F3/06, G06N3/02, G06F13/16, G06F15/80

Abstract: A system comprises a processor and a plurality of memory units. The processor is coupled to each of the plurality of memory units by a plurality of network connections. The processor includes a plurality of processing elements arranged in a two-dimensional array and a corresponding two-dimensional communication network communicatively connecting each of the plurality of processing elements to other processing elements on same axes of the two-dimensional array. Each processing element that is located along a diagonal of the two-dimensional array is configured as a request broadcasting master for a respective group of processing elements located along a same axis of the two-dimensional array.

8.

Patent application
SYSTEMS AND METHODS FOR REDUCING DATA MOVEMENT DURING CONVOLUTION OPERATIONS IN ARTIFICIAL NEURAL NETWORKS
Legal status: Active

Publication No.: US20210192359A1

Publication date: 2021-06-24

Application No.: US16722636

Application date: 2019-12-20

Applicant: Facebook, Inc.

Inventors: Ehsan Khish Ardestani Zadeh, Martin Schatz, Krishnakumar Narayanan Nair, Yuchen Hao, Abdulkadir Utku Diril, Rakesh Komuravelli

IPC: G06N3/10, G06F17/15, G06N3/04

Abstract: The disclosed computer-implemented method may include (1) receiving, at a hardware accelerator that supports an ANN, an activation data set that is to undergo a convolution operation via a filter kernel of the ANN, (2) receiving, at the hardware accelerator, an argument indicating that the filter kernel exceeds at least one boundary of the activation data set when slid across a certain position during the convolution operation, (3) determining, based at least in part on the argument, that the hardware accelerator is to generate padding data at the boundary of the activation data set in connection with the certain position of the filter kernel, and then (4) performing, at the hardware accelerator, the convolution operation by processing a portion of the activation data set and the padding data when the filter kernel slides across the certain position. Various other systems and methods are also disclosed.
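The four steps in this abstract amount to generating padding where the kernel overhangs the data, instead of storing and moving a pre-padded copy. A minimal software sketch of that idea follows; it assumes zero-valued padding, a square kernel, and stride 1, and the name `convolve_with_generated_padding` and its `pad` argument are hypothetical.

```python
def convolve_with_generated_padding(activations, kernel, pad):
    """2D convolution that synthesizes padding on the fly.

    activations: 2D list holding only the real data (no stored padding)
    kernel:      square k x k filter
    pad:         number of padding rows/columns on each side (assumed zero-valued)
    """
    h, w = len(activations), len(activations[0])
    k = len(kernel)

    def read(i, j):
        # Inside the data set: return the stored value.
        if 0 <= i < h and 0 <= j < w:
            return activations[i][j]
        # Outside the boundary: generate padding instead of reading memory.
        return 0

    out = []
    for i in range(-pad, h + pad - k + 1):
        row = []
        for j in range(-pad, w + pad - k + 1):
            row.append(sum(read(i + di, j + dj) * kernel[di][dj]
                           for di in range(k) for dj in range(k)))
        out.append(row)
    return out
```

Only the unpadded activations are ever stored, so the padded border costs no memory traffic in this model.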

9.

Patent application
HIGH BANDWIDTH MEMORY SYSTEM WITH DYNAMICALLY PROGRAMMABLE DISTRIBUTION SCHEME
Legal status: Active

Publication No.: US20210165691A1

Publication date: 2021-06-03

Application No.: US16701019

Application date: 2019-12-02

Applicant: Facebook, Inc.

Inventors: Abdulkadir Utku Diril, Olivia Wu, Krishnakumar Narayanan Nair, Anup Ramesh Kadkol, Aravind Kalaiah, Pankaj Kansal

IPC: G06F9/50, G06F9/54, G06F9/445, G06F9/38, G06F12/02

Abstract: A system comprises a processor coupled to a plurality of memory units. Each of the plurality of memory units includes a request processing unit and a plurality of memory banks. The processor includes a plurality of processing elements and a communication network communicatively connecting the plurality of processing elements to the plurality of memory units. At least a first processing element of the plurality of processing elements includes a control logic unit and a matrix compute engine. The control logic unit is configured to access data from the plurality of memory units using a dynamically programmable distribution scheme.

10.

Granted patent
Floating point multiply hardware using decomposed component numbers
Legal status: Active

Publication No.: US11188303B2

Publication date: 2021-11-30

Application No.: US16591042

Application date: 2019-10-02

Applicant: Facebook, Inc.

Inventors: Krishnakumar Narayanan Nair, Anup Ramesh Kadkol, Ehsan Khish Ardestani Zadeh, Olivia Wu, Yuchen Hao, Thomas Mark Ulrich, Rakesh Komuravelli

IPC: G06F7/487, G06F7/485, G06F17/16, G06N3/02

Abstract: A processor system comprises one or more logic units configured to receive a processor instruction identifying a first floating point number to be multiplied with a second floating point number. The floating point numbers are each decomposed into a group of a plurality of component numbers, wherein a number of bits used to represent each floating point number is greater than a number of bits used to represent any component number in each group of the plurality of component numbers. The component numbers of the first group are multiplied with the component numbers of the second group to determine intermediate multiplication results that are summed together to determine an effective result that represents a result of multiplying the first floating point number with the second floating point number.
