-
Publication number: US20240168913A1
Publication date: 2024-05-23
Application number: US18518695
Filing date: 2023-11-24
Applicant: SambaNova Systems, Inc.
Inventor: Tejas Nagendra Babu NAMA , Ruddhi CHAPHEKAR , Ram SIVARAMAKRISHNAN , Raghu PRABHAKAR , Sumti JAIRATH , Junjue WANG , Kaizhao LIANG , Adi FUCHS , Matheen MUSADDIQ , Arvind Krishna SUJEETH
IPC: G06F15/78 , G06F16/901 , G06F17/16
CPC classification number: G06F15/7885 , G06F15/7839 , G06F16/9024 , G06F17/16
Abstract: Disclosed is a method that includes sectioning a graph into a sequence of sections, the sequence of sections including at least a first section followed by a second section. The first section is configured to generate a first output in a first target tiling configuration in response to processing a first input in a first input tiling configuration. The graph is configured to reconfigure the first output in the first target tiling configuration to a second input in a second input tiling configuration. The second section is configured to generate a second output in a second target tiling configuration in response to processing the second input in the second input tiling configuration.
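The reconfiguration step between two sections can be pictured with a small sketch. It assumes 2-D data, non-overlapping rectangular tiles, and a reconfiguration done by reassembling and re-splitting; the helper names `tile`, `untile`, and `retile` are hypothetical and do not come from the application.

```python
def tile(matrix, tile_h, tile_w):
    """Split a 2-D list into non-overlapping tiles, listed in row-major order."""
    return [[row[c:c + tile_w] for row in matrix[r:r + tile_h]]
            for r in range(0, len(matrix), tile_h)
            for c in range(0, len(matrix[0]), tile_w)]


def untile(tiles, tiles_per_row):
    """Reassemble row-major tiles into the full 2-D list."""
    matrix = []
    for start in range(0, len(tiles), tiles_per_row):
        band = tiles[start:start + tiles_per_row]
        for r in range(len(band[0])):
            matrix.append([value for t in band for value in t[r]])
    return matrix


def retile(tiles, tiles_per_row, new_h, new_w):
    """Convert one tiling configuration into another (assumed approach)."""
    return tile(untile(tiles, tiles_per_row), new_h, new_w)


data = [[r * 4 + c for c in range(4)] for r in range(4)]

# Output of the first section, held as four 2x2 tiles (its target tiling).
first_output = tile(data, 2, 2)

# Input of the second section, expected as two 4x2 tiles (its input tiling).
second_input = retile(first_output, 2, 4, 2)
```

Here the first section's output tiling and the second section's input tiling differ only in tile shape; the same data is visible to both sections.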
-
Publication number: US20230367844A1
Publication date: 2023-11-16
Application number: US18225339
Filing date: 2023-07-24
Applicant: SambaNova Systems, Inc.
Inventor: Pramod NATARAJA , Raghu PRABHAKAR , David Brian JACKSON , Ram SIVARAMAKRISHNAN
IPC: G06F17/16
CPC classification number: G06F17/16
Abstract: A computing method comprises generating an integrated matrix having (K+P) number of columns, columns 1 through K of the integrated matrix comprising columns 1 through K of a multiplicand matrix and columns (K+1) through P of the integrated matrix comprising addend columns. The method computes K number of products of elements of a row of the integrated matrix multiplied by elements of a column of a second multiplicand matrix; computes a (K+1) product comprising an element of an addend column multiplied by a constant; and computes a sum of the K number of products added to the (K+1) product. The sum is equivalent to a sum of products of a column of the M×K matrix multiplied by a row of the K×N matrix added to an element of an addend column of the integrated matrix. A computing system and a computer program product can implement the method.
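The equivalence stated in this abstract can be checked with a small sketch. It assumes a single addend column (P = 1) and a constant of 1 supplied as an extra row of the second multiplicand matrix; the function names `matmul`, `integrate`, and `extend_with_constant` are hypothetical and not taken from the application.

```python
def matmul(a, b):
    """Plain row-by-column product of two matrices stored as lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def integrate(multiplicand, addend_column):
    """Append one addend element to each row of the multiplicand (assumes P = 1)."""
    return [row + [addend] for row, addend in zip(multiplicand, addend_column)]


def extend_with_constant(second, constant=1):
    """Append a row of constants so the (K+1)th product is addend * constant."""
    return second + [[constant] * len(second[0])]


A = [[1, 2], [3, 4]]   # M x K multiplicand matrix (M = 2, K = 2)
B = [[5, 6], [7, 8]]   # K x N second multiplicand matrix (N = 2)
addend = [10, 20]      # one addend per row of A

# Single pass over the integrated matrix: K products plus the (K+1)th product.
fused = matmul(integrate(A, addend), extend_with_constant(B))

# Reference: multiply first, then add the addend in a separate step.
separate = [[value + addend[i] for value in row]
            for i, row in enumerate(matmul(A, B))]

assert fused == separate
```

Folding the addend into the matrix lets the addition ride along with the multiply-accumulate loop instead of needing a second pass over the result.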
-
Publication number: US20240378147A1
Publication date: 2024-11-14
Application number: US18144819
Filing date: 2023-05-08
Applicant: SambaNova Systems, Inc.
Inventor: Mark William Gottscho , Ram SIVARAMAKRISHNAN , David Brian JACKSON , Ruddhi CHAPHEKAR , Tuowen Zhao , Lei Xia
Abstract: A convolution calculation engine includes a kernel element counter for a convolution operation between a kernel and an input tensor. The kernel element counter wraps back to an initial kernel count value after reaching a maximum kernel count value. The convolution calculation engine also includes an offset look-up table (LUT) that provides a relative input offset into the input tensor based on an output of the kernel element counter and input location calculation logic that provides an input location within an input tensor for the convolution operation based on the relative input offset provided by the offset LUT.
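A software sketch of the counter and look-up table described here might look as follows. It assumes a row-major flattened 2-D input, stride 1, and no padding or dilation; the names `build_offset_lut`, `input_locations`, and `conv2d_flat` are hypothetical, and the actual engine is hardware rather than Python.

```python
def build_offset_lut(kernel_h, kernel_w, input_w):
    """Relative offset into a row-major input for each kernel element."""
    return [r * input_w + c for r in range(kernel_h) for c in range(kernel_w)]


def input_locations(out_row, out_col, kernel_h, kernel_w, input_w):
    """Yield the flat input index used by each kernel element for one output."""
    lut = build_offset_lut(kernel_h, kernel_w, input_w)
    base = out_row * input_w + out_col      # input location of kernel element 0
    max_count = kernel_h * kernel_w - 1
    counter = 0                             # kernel element counter
    for _ in range(kernel_h * kernel_w):
        yield base + lut[counter]
        # Wrap back to the initial count after reaching the maximum count.
        counter = 0 if counter == max_count else counter + 1


def conv2d_flat(inp, input_h, input_w, kernel, kernel_h, kernel_w):
    """Convolve a flat input with a flat kernel using the LUT-based addressing."""
    out = []
    for r in range(input_h - kernel_h + 1):
        for c in range(input_w - kernel_w + 1):
            locs = input_locations(r, c, kernel_h, kernel_w, input_w)
            out.append(sum(inp[loc] * w for loc, w in zip(locs, kernel)))
    return out


image = list(range(9))       # 3x3 input stored row-major
kernel = [1, 0, 0, 1]        # 2x2 kernel stored row-major
result = conv2d_flat(image, 3, 3, kernel, 2, 2)
```

The table replaces per-element multiplications and divisions with a single indexed read, which is the usual motivation for a LUT in address generation.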
-
Publication number: US20240070111A1
Publication date: 2024-02-29
Application number: US18383744
Filing date: 2023-10-25
Applicant: SambaNova Systems, Inc.
Inventor: Manish K. SHAH , Ram SIVARAMAKRISHNAN , Gregory Frederick GROHOSKI , Raghu PRABHAKAR
CPC classification number: G06F15/7885 , G06F15/8023
Abstract: A reconfigurable processing unit is disclosed, comprising a first internal network and a second internal network with different protocols, an interface to an external network with a different protocol, a first configurable unit connected to the first internal network, a second configurable unit connected to both the first internal network and the second internal network, and a third configurable unit connected to both the second internal network and the interface to the external network. The third configurable unit is configured to receive a payload from the external network and send the transaction type identifier and the source application ID to the second configurable unit over the second internal network. The second configurable unit sends information to the first configurable unit based on the transaction type identifier and the source application ID matching the local application ID retrieved from the register.
-
Publication number: US20230367845A1
Publication date: 2023-11-16
Application number: US18225365
Filing date: 2023-07-24
Applicant: SambaNova Systems, Inc.
Inventor: Pramod NATARAJA , Raghu PRABHAKAR , David Brian JACKSON , Ram SIVARAMAKRISHNAN
IPC: G06F17/16
CPC classification number: G06F17/16
Abstract: A method comprises executing (K+P) number of transposition cycles to generate a transpose-extended matrix having N rows and (K+P) columns, in which columns 1 to K comprise a transposition of a first matrix having K rows and N columns, and columns (K+1) to (K+P) comprise constants or elements of an N×1 matrix. The method includes computing a sum-product of a row of a second matrix, having M rows and N columns, multiplied by a column among columns 1 to K of the transpose-extended matrix; and, computing a second sum-product of the row of the second matrix multiplied by a column among columns (K+1) to (K+P) of the transpose-extended matrix. The sum-products can comprise gradients of input matrices. A transpose processing unit can execute the transposition cycles to read K rows of the first matrix and insert P number of constant or N×1 columns to generate the transpose-extended matrix.
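The shape of the transpose-extended matrix and the two sum-products can be illustrated with a short sketch. It assumes P = 1 with a column of ones as the inserted constant column; the names `transpose_extend` and `matmul` are hypothetical, and the cycle-by-cycle behavior of the transpose processing unit is not modeled.

```python
def matmul(a, b):
    """Plain row-by-column product of two matrices stored as lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose_extend(first, extra_columns):
    """Build an N x (K+P) matrix from a K x N matrix and P extra N x 1 columns."""
    n = len(first[0])
    transposed = [[first[k][i] for k in range(len(first))] for i in range(n)]
    return [transposed[i] + [col[i] for col in extra_columns] for i in range(n)]


first = [[1, 2, 3],
         [4, 5, 6]]            # K x N first matrix (K = 2, N = 3)
ones = [1, 1, 1]               # one constant column (P = 1)
second = [[1, 0, 2],
          [0, 1, 1]]           # M x N second matrix (M = 2)

extended = transpose_extend(first, [ones])   # N x (K+P) = 3 x 3
result = matmul(second, extended)            # M x (K+P) = 2 x 3

# Columns 1..K hold second @ transpose(first); column K+1 holds the
# sum-product of each row of `second` with the constant column.
```

With a column of ones, the extra output column is simply the sum of each row of the second matrix, computed in the same pass as the transposed product.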
-
Publication number: US20230195686A1
Publication date: 2023-06-22
Application number: US18109817
Filing date: 2023-02-14
Applicant: SambaNova Systems, Inc.
Inventor: Raghu PRABHAKAR , Manish K. SHAH , Ram SIVARAMAKRISHNAN , Pramod NATARAJA , David Brian JACKSON , Gregory Frederick GROHOSKI
IPC: G06F15/78 , G06F13/20 , G06F15/80 , G06F9/52 , G06F15/173
CPC classification number: G06F15/7867 , G06F13/20 , G06F15/80 , G06F9/522 , G06F15/17325 , G06F2213/40
Abstract: A logic unit in an array of processing units is configurable to consume source tokens and a status signal and to produce barrier tokens and an enable signal based on the source tokens and the status signal.
-
Publication number: US20220309316A1
Publication date: 2022-09-29
Application number: US17364110
Filing date: 2021-06-30
Applicant: SambaNova Systems, Inc.
Inventor: Tejas Nagendra Babu NAMA , Ruddhi CHAPHEKAR , Ram SIVARAMAKRISHNAN , Raghu PRABHAKAR , Sumti JAIRATH , Junjue WANG , Kaizhao LIANG , Adi FUCHS , Matheen MUSADDIQ , Arvind Krishna SUJEETH
IPC: G06N3/04
Abstract: Disclosed is a data processing system that includes compile time logic to section a graph into a sequence of sections including a first section and a second section. The compile time logic is to configure the first section with a first topology of tiling configurations in which to tile inputs, intermediate outputs, and final outputs of the first section, and configure the second section with a second topology of tiling configurations in which to tile inputs, intermediate outputs, and final outputs of the second section. The data processing system further includes runtime logic configured with the compile time logic to execute the first section to generate the inputs, intermediate outputs, and final outputs of the first section in the first topology of tiling configurations, and execute the second section to generate the inputs, intermediate outputs, and final outputs of the second section in the second topology of tiling configurations.
-
Publication number: US20220197709A1
Publication date: 2022-06-23
Application number: US17522655
Filing date: 2021-11-09
Applicant: SambaNova Systems, Inc.
Inventor: Ram SIVARAMAKRISHNAN , Sumti JAIRATH , Emre Ali BURHAN , Manish K. SHAH , Raghu PRABHAKAR , Ravinder KUMAR , Arnav GOEL , Ranen CHATTERJEE , Gregory Frederick GROHOSKI , Kin Hing LEUNG , Dawei HUANG , Manoj UNNIKRISHNAN , Martin Russell RAUMANN , Bandish B. SHAH
Abstract: The technology disclosed relates to runtime execution of configuration files on reconfigurable processors with varying configuration granularity. In particular, the technology disclosed relates to a runtime logic that is configured to receive a set of configuration files for an application, and load and execute a first subset of configuration files in the set of configuration files and associated application data on a first reconfigurable processor. The first reconfigurable processor has a first level of configurable granularity. The runtime logic is further configured to load and execute a second subset of configuration files in the set of configuration files and associated application data on a second reconfigurable processor. The second reconfigurable processor has a second level of configurable granularity that is different from the first level of configurable granularity.
-
Publication number: US20240073129A1
Publication date: 2024-02-29
Application number: US18383718
Filing date: 2023-10-25
Applicant: SambaNova Systems, Inc.
Inventor: Manish K. SHAH , Ram SIVARAMAKRISHNAN , Gregory Frederick GROHOSKI , Raghu PRABHAKAR
IPC: H04L45/00 , H04L45/44 , H04L45/745
CPC classification number: H04L45/566 , H04L45/44 , H04L45/745
Abstract: A computing system is disclosed, comprising a plurality of interconnected reconfigurable dataflow units (RDUs). Each RDU includes configurable units, internal networks, and external interfaces. The first configurable unit of the first RDU sends a request to access an external memory attached to the second RDU over its first internal network. The second configurable unit of the first RDU obtains a memory address for the request, determines an identifier for the second RDU, and sends the request, identifier, and memory address to the third configurable unit of the first RDU over its second internal network. The third configurable unit of the first RDU generates a routable address on the external network, synthesizes a payload, and sends it through an external network interface. The third configurable unit of the second RDU receives the payload, and the fourth configurable unit of the second RDU uses the address to access the external memory.
-
Publication number: US20230289310A1
Publication date: 2023-09-14
Application number: US18199361
Filing date: 2023-05-18
Applicant: SambaNova Systems, Inc.
Inventor: Gregory Frederick GROHOSKI , Sumti JAIRATH , Mark LUTTRELL , Raghu PRABHAKAR , Ram SIVARAMAKRISHNAN , Manish K. SHAH
CPC classification number: G06F13/4027 , G06F9/45533 , G06F12/10 , G06F13/1668 , G06F15/7839 , G06F15/7882 , G06F2212/657
Abstract: A reconfigurable data processor comprises an array of configurable units and a bus system. The bus system is connected to the array of configurable units. The bus system includes a top level network and an array level network. The top level network is connected to an external data interface for communication with memory outside of the array of configurable units. The array level network is connected to configurable units in the array of configurable units.