SEARCHING NEURAL NETWORK PIPELINES BASED ON NAMED MATRIX DIMENSIONS

    Publication No.: US20230315414A1

    Publication Date: 2023-10-05

    Application No.: US18128104

    Filing Date: 2023-03-29

    IPC Classification: G06F8/41 G06F16/22

    CPC Classification: G06F8/45 G06F16/2237

    Abstract: A method comprises a compiler determining operators and matrices of an application model. The compiler generates a dimension-based search space (DBSS) comprising Named Nodes corresponding to the operators. The Named Nodes comprise a Named DIM corresponding to a matrix associated with an operator. The Named DIM comprises a DIM Name corresponding to a dimension of a row or column of the matrix. The DBSS comprises an application programming interface (API) to determine operators, matrices, and/or attributes of operators/matrices of the application model using the DIM Names. The method includes the compiler determining the operator, the matrix, and the Named DIM, and generating an entry in the DBSS that includes a Named Node corresponding to the operator and a Named DIM corresponding to the matrix and including the DIM Name. A computing system and/or a computer program product can implement the method.
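
To make the idea of a search space keyed by dimension names concrete, here is a minimal Python sketch. It is not the patented implementation: the class names (`NamedDim`, `NamedNode`, `DimSearchSpace`), the method names, and the sample operators and dimension names are all hypothetical, chosen only to mirror the terms in the abstract.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NamedDim:
    """Hypothetical record: one matrix and the names of its (row, column) dimensions."""
    matrix: str
    dim_names: tuple


@dataclass
class NamedNode:
    """Hypothetical record: one operator and the Named DIMs of its matrices."""
    operator: str
    named_dims: list = field(default_factory=list)


class DimSearchSpace:
    """Toy dimension-based search space; the query API is an assumption."""

    def __init__(self):
        self._nodes = {}

    def add_entry(self, operator, matrix, dim_names):
        # One entry ties an operator to a matrix and that matrix's DIM Names.
        node = self._nodes.setdefault(operator, NamedNode(operator))
        node.named_dims.append(NamedDim(matrix, tuple(dim_names)))

    def operators_with_dim(self, dim_name):
        # Look up operators by DIM Name instead of walking the model graph.
        return sorted(
            op for op, node in self._nodes.items()
            if any(dim_name in nd.dim_names for nd in node.named_dims)
        )

    def matrices_with_dim(self, dim_name):
        return sorted({
            nd.matrix
            for node in self._nodes.values()
            for nd in node.named_dims
            if dim_name in nd.dim_names
        })


dbss = DimSearchSpace()
dbss.add_entry("matmul_0", "X", ("batch", "hidden"))
dbss.add_entry("matmul_0", "W", ("hidden", "out"))
dbss.add_entry("softmax_0", "Y", ("batch", "out"))
```

With these sample entries, `dbss.operators_with_dim("batch")` finds both operators, while `dbss.matrices_with_dim("out")` finds matrices `W` and `Y`.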

    TENSOR CHECKPOINT OPTIMIZATION IN DATAFLOW COMPUTING APPLICATIONS

    Publication No.: US20230315407A1

    Publication Date: 2023-10-05

    Application No.: US18129722

    Filing Date: 2023-03-31

    IPC Classification: G06F8/41

    CPC Classification: G06F8/433

    Abstract: According to a computing method, a compiler determines a recompute node included in a dataflow application and a checkpoint tensor produced by the recompute node. The compiler determines a recompute cost to recompute the checkpoint tensor, and a memory cost to checkpoint the checkpoint tensor in a memory. Based on the recompute cost and/or the memory cost, the compiler determines a solution cost and compares the solution cost to a solution threshold. Based on comparing the solution cost to the solution threshold, the compiler determines a checkpoint solution to execute the dataflow application. The checkpoint solution can comprise recomputing or checkpointing the checkpoint tensor. In some implementations, the compiler can determine a recompute ratio of the recompute cost to the memory cost and can compare the recompute ratio to the solution threshold. A computer program product and a computing system can implement aspects of the method.
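
The ratio-based variant described above can be sketched in a few lines of Python. The abstract does not say which side of the threshold selects recomputation, nor what units the costs use, so both are assumptions here: a ratio below the threshold means recomputation is cheap relative to the memory it saves. The names `CheckpointCandidate` and `choose_solution` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CheckpointCandidate:
    name: str
    recompute_cost: float  # assumed unit: estimated cycles to recompute the tensor
    memory_cost: float     # assumed unit: bytes needed to keep the tensor resident


def choose_solution(candidate, ratio_threshold):
    """Return "recompute" or "checkpoint" for one checkpoint tensor.

    Assumption: a low recompute-to-memory ratio favors recomputing,
    because little work buys back a lot of memory.
    """
    if candidate.memory_cost <= 0:
        # Nothing to save by recomputing, so keep the tensor.
        return "checkpoint"
    ratio = candidate.recompute_cost / candidate.memory_cost
    return "recompute" if ratio < ratio_threshold else "checkpoint"


cheap = CheckpointCandidate("activation_a", recompute_cost=10.0, memory_cost=100.0)
costly = CheckpointCandidate("activation_b", recompute_cost=80.0, memory_cost=100.0)
decisions = {c.name: choose_solution(c, ratio_threshold=0.5) for c in (cheap, costly)}
```

A real compiler would presumably derive both costs from its hardware model; the constants above are placeholders.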

    Merging Skip-Buffers
    Published invention application

    Publication No.: US20230305823A1

    Publication Date: 2023-09-28

    Application No.: US18126610

    Filing Date: 2023-03-27

    IPC Classification: G06F8/41

    CPC Classification: G06F8/45 G06F8/4434

    Abstract: A method in a reconfigurable computing system includes connecting a plurality of tensor consumers to their corresponding tensor producers via skip-buffers, which generates a plurality of skip-buffers. The method includes determining that at least one skip-buffer of the plurality of skip-buffers corresponding to a first set of tensor consumers and at least one skip-buffer of the plurality of skip-buffers corresponding to a second set of tensor consumers are compatible to wholly or partially merge. The method also includes merging, wholly or partially, the compatible skip-buffers to produce a merged skip-buffer having a minimal buffer depth. The described method may reduce memory unit consumption and latency.
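
One way to picture the merge is the following Python sketch. The abstract does not define compatibility or how the minimal depth is computed, so this rests on two assumptions: skip-buffers are compatible when they carry the same producer's tensor, and a merged buffer needs only the depth of its deepest consumer, with shallower consumers tapping an earlier stage. `SkipBuffer` and `merge_skip_buffers` are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SkipBuffer:
    producer: str
    consumer: str
    depth: int  # pipeline stages the tensor must be held before the consumer reads it


def merge_skip_buffers(buffers):
    """Merge buffers that share a producer (assumed compatibility rule).

    Each merged buffer is as deep as its deepest consumer requires; every
    consumer records the stage ("tap") at which it reads.
    """
    by_producer = defaultdict(list)
    for buf in buffers:
        by_producer[buf.producer].append(buf)

    merged = {}
    for producer, group in by_producer.items():
        merged[producer] = {
            "depth": max(buf.depth for buf in group),
            "taps": {buf.consumer: buf.depth for buf in group},
        }
    return merged


buffers = [
    SkipBuffer("conv1", "add3", 2),
    SkipBuffer("conv1", "add5", 4),
    SkipBuffer("conv2", "add5", 1),
]
merged = merge_skip_buffers(buffers)
depth_before = sum(buf.depth for buf in buffers)
depth_after = sum(entry["depth"] for entry in merged.values())
```

In this toy case the two `conv1` buffers collapse into one buffer of depth 4, so total buffer depth drops from 7 to 5.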

    Bandwidth-Aware Computational Graph Mapping
    Published invention application

    Publication No.: US20230297349A1

    Publication Date: 2023-09-21

    Application No.: US18121766

    Filing Date: 2023-03-15

    IPC Classification: G06F8/41

    CPC Classification: G06F8/433

    Abstract: A computer-implemented method of transforming a high-level program for mapping onto a coarse-grained reconfigurable (CGR) processor with an array of CGR units, the method including: sectioning a dataflow graph into a plurality of sections; extracting performance information for each of the plurality of sections; on a CGR unit: assigning to a section at least two computations dependent on a first data element; scheduling an additional load of the first data element in response to available memory bandwidth for that section; eliminating a buffer between the additional load of the first data element and one of the two computations, for that section; generating configuration data for the and communication channels, wherein the configuration data, when loaded onto an instance of the array of CGR units, causes the array of CGR units to implement the dataflow graph; and storing the configuration data in a non-transitory computer-readable storage medium.
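
The trade between a buffer and an extra load can be illustrated with a small Python sketch. It covers only the scheduling decision, under assumptions the abstract does not state: bandwidth is a single scalar per section, every load of the data element costs the same, and the first consumer always uses the original load. `SectionPlan` and `plan_loads` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SectionPlan:
    loads: int                 # total loads of the shared data element
    buffered_consumers: list   # consumers still fed through a buffer


def plan_loads(consumers, spare_bandwidth, bandwidth_per_load):
    """Decide, per consumer, between an additional load and a buffer.

    The first consumer uses the original load. Each later consumer gets its
    own load while spare memory bandwidth remains (so its buffer can be
    eliminated); otherwise it keeps reading from a buffer.
    """
    loads = 1
    buffered = []
    for consumer in consumers[1:]:
        if spare_bandwidth >= bandwidth_per_load:
            spare_bandwidth -= bandwidth_per_load
            loads += 1
        else:
            buffered.append(consumer)
    return SectionPlan(loads=loads, buffered_consumers=buffered)


# Section with bandwidth headroom: reload the element, no buffer needed.
roomy = plan_loads(["mul", "add"], spare_bandwidth=10.0, bandwidth_per_load=4.0)
# Bandwidth-bound section: keep the buffer in front of the second consumer.
tight = plan_loads(["mul", "add"], spare_bandwidth=2.0, bandwidth_per_load=4.0)
```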

    Operation Fusion in Nested Meta-pipeline Loops

    Publication No.: US20230315411A1

    Publication Date: 2023-10-05

    Application No.: US18130642

    Filing Date: 2023-04-04

    IPC Classification: G06F8/41

    Abstract: A method for improving throughput in a reconfigurable computing system includes detecting, in an algebraic representation of a computing task for a reconfigurable dataflow processor, an outer meta-pipeline loop, detecting an inner meta-pipeline loop nested within the outer meta-pipeline loop, and determining that the inner meta-pipeline loop and the outer meta-pipeline loop each conduct a common operation. The method also includes fusing the common operation for the inner meta-pipeline loop and the outer meta-pipeline loop into a single operation within the inner meta-pipeline loop. The instances of the common operation may be fused if the output of a first instance of the common operation is the source for a second instance of the common operation. Examples of the common operation include an accumulator operation, a re-read operation, and a temporal (chip buffer synchronized) operation such as a temporal concatenation operation and a temporal slicing operation.
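
For the accumulator case, the fusion can be illustrated with ordinary nested loops in Python. This is an analogy rather than a meta-pipeline implementation: plain `for` loops stand in for the outer and inner meta-pipeline loops, and the function names are hypothetical. The inner accumulator's output is the source of the outer accumulator, which is the condition the abstract gives for fusing.

```python
def nested_accumulate(tiles):
    """Unfused form: one accumulator per loop level."""
    total = 0
    for tile in tiles:        # outer loop
        partial = 0
        for value in tile:    # inner loop
            partial += value  # inner accumulator
        total += partial      # outer accumulator consumes the inner result
    return total


def fused_accumulate(tiles):
    """Fused form: a single accumulator placed inside the inner loop."""
    total = 0
    for tile in tiles:
        for value in tile:
            total += value    # both accumulations collapsed into one operation
    return total


tiles = [[1, 2, 3], [4, 5], [6]]
unfused_result = nested_accumulate(tiles)
fused_result = fused_accumulate(tiles)
```

Integers are used so both forms agree exactly; with floating-point values the reordering of additions could change the rounding.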

    OPTIMIZING TENSOR TILING IN NEURAL NETWORKS BASED ON A TILING COST MODEL

    Publication No.: US20230315410A1

    Publication Date: 2023-10-05

    Application No.: US18129714

    Filing Date: 2023-03-31

    IPC Classification: G06F8/41

    CPC Classification: G06F8/443

    Abstract: A method comprises a compiler analyzing a graph to determine a pipeline of operators based on a shared dimension of input and output tensors among the operators. The operators are included in the graph and the graph corresponds to a dataflow application. The compiler determines a tiling decision associated with the pipeline and a tiling cost associated with the tiling decision. The tiling decision can comprise a tile shape to slice tensors of operators of the pipeline. Based on the tiling cost, the compiler determines that the tiling decision improves an optimization objective and includes the pipeline and tiling decision in mapping decisions associated with executing the application on a computing system. The compiler can apply a tiling cost model to determine the tiling cost. A computer program product and a computing system can implement the method.
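
A toy cost model shows how a tiling decision might be scored and selected. The cost formula is invented for illustration and is not taken from the patent: a tile that exceeds an assumed on-chip memory limit is infeasible, and otherwise cost grows with the number of tiles times the tile size plus a fixed per-tile overhead. `tiling_cost` and `best_tiling` are hypothetical names.

```python
import math


def tiling_cost(tensor_shape, tile_shape, memory_limit, per_tile_overhead=1.0):
    """Score one tile shape under an assumed, simplified cost model."""
    tile_elems = math.prod(tile_shape)
    if tile_elems > memory_limit:
        # The tile does not fit in the assumed on-chip memory.
        return math.inf
    num_tiles = math.prod(
        math.ceil(dim / tile) for dim, tile in zip(tensor_shape, tile_shape)
    )
    return num_tiles * (tile_elems + per_tile_overhead)


def best_tiling(tensor_shape, candidates, memory_limit):
    """Pick the candidate tile shape with the lowest modeled cost."""
    return min(candidates, key=lambda t: tiling_cost(tensor_shape, t, memory_limit))


shape = (64, 64)
candidates = [(64, 64), (32, 32), (16, 16)]
chosen = best_tiling(shape, candidates, memory_limit=1024)
```

Here the untiled `(64, 64)` shape is rejected as too large, and `(32, 32)` wins over `(16, 16)` because fewer tiles incur less fixed overhead.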