SYNTHESIS FOR MATRIX MULTIPLICATION USING A DATA PROCESSING ARRAY

    Publication No.: US20240193225A1

    Publication date: 2024-06-13

    Application No.: US18065491

    Filing date: 2022-12-13

    Applicant: Xilinx, Inc.

    CPC classification number: G06F17/16 G06F7/4876 G06F7/727

    Abstract: Parameters defining a matrix multiply operation to be implemented in a data processing array can be received. A formulation of the matrix multiply operation is generated based on the parameters. A matrix multiply solution is determined for performing the matrix multiply operation in the data processing array. The matrix multiply solution specifies a spatial and temporal partitioning of the matrix multiply operation for implementation in the data processing array. Synthesizable program code is generated that defines an interface for the data processing array based on the matrix multiply solution. The interface is configured to partition and transfer input data to the data processing array from an external memory and convey output data from the data processing array to the external memory.
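
The spatial and temporal partitioning named in the abstract above can be illustrated with a plain blocked matrix multiply. This is only a minimal sketch, not the patented synthesis flow: the function name `tiled_matmul` and the tile-size parameters are hypothetical, and ordinary Python loops stand in for the data processing array and its interface to external memory.

```python
def tiled_matmul(a, b, tile_m, tile_n, tile_k):
    """Multiply a (M x K) by b (K x N) one block at a time."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    # Spatial partitioning (assumed mapping): each (i0, j0) output block
    # could be assigned to a different compute tile of the array.
    for i0 in range(0, m, tile_m):
        for j0 in range(0, n, tile_n):
            # Temporal partitioning (assumed mapping): the shared K
            # dimension is consumed in successive steps of size tile_k.
            for k0 in range(0, k, tile_k):
                for i in range(i0, min(i0 + tile_m, m)):
                    for j in range(j0, min(j0 + tile_n, n)):
                        acc = 0
                        for p in range(k0, min(k0 + tile_k, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] += acc  # accumulate partial products
    return c
```

Any tile sizes give the same product; only the order in which partial sums are formed changes, which is what leaves room to choose a partitioning that fits a given array.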

    Data flow graph refinement using range set information for improved synthesis

    Publication No.: US11755801B1

    Publication date: 2023-09-12

    Application No.: US17107367

    Filing date: 2020-11-30

    Applicant: Xilinx, Inc.

    CPC classification number: G06F30/3312

    Abstract: Implementing a circuit design within an integrated circuit can include converting the circuit design, specified in a hardware description language, into a data flow graph and creating range set data structures in a memory. The range set data structures correspond to nodes of the data flow graph. Each range set data structure can be initialized with a range of values the corresponding node can take as specified by the circuit design. The method can include determining actual values the nodes are capable of taking by propagating the values through the data flow graph. The range set data structures are updated to store the actual values for the corresponding nodes. The method also can include modifying a selected node of the data flow graph based on the actual values stored in the range set data structure of the selected node and semantics of the selected node.
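
The range set idea in the abstract above can be sketched on a toy data flow graph. The node layout (a dict with `op`, `inputs`, and `values`), the two supported operations, and the constant-folding rule are all assumptions made for illustration; the abstract does not specify them.

```python
from itertools import product

# Hypothetical node semantics used only for this sketch.
OPS = {
    "add": lambda x, y: x + y,
    "lt": lambda x, y: int(x < y),
}

def propagate(nodes, order):
    """Narrow each node's range set to the values it can actually take.

    nodes: name -> {"op", "inputs", "values"}; order: topological order.
    """
    for name in order:
        node = nodes[name]
        if node["op"] in ("input", "const"):
            continue  # keep the declared range
        ins = [nodes[i]["values"] for i in node["inputs"]]
        actual = {OPS[node["op"]](*combo) for combo in product(*ins)}
        node["values"] &= actual  # drop values that can never occur
    return nodes

def fold_constants(nodes):
    """Replace any computed node with a single possible value by a constant."""
    for node in nodes.values():
        if node["op"] not in ("input", "const") and len(node["values"]) == 1:
            node["op"] = "const"
            node["inputs"] = []
    return nodes

def example():
    nodes = {
        "a": {"op": "input", "inputs": [], "values": set(range(4))},
        "b": {"op": "input", "inputs": [], "values": set(range(4))},
        # Declared as a 4-bit sum, so the initial range is 0..15.
        "s": {"op": "add", "inputs": ["a", "b"], "values": set(range(16))},
        "limit": {"op": "input", "inputs": [], "values": {8}},
        "c": {"op": "lt", "inputs": ["s", "limit"], "values": {0, 1}},
    }
    return fold_constants(propagate(nodes, ["a", "b", "s", "limit", "c"]))
```

In `example()`, the sum can only reach 0 through 6, so the comparison against 8 is always true and the comparator node collapses to a constant.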

    Data flow graph optimization techniques for RTL loops with conditional-exit statements

    Publication No.: US10943042B1

    Publication date: 2021-03-09

    Application No.: US16733568

    Filing date: 2020-01-03

    Applicant: Xilinx, Inc.

    Abstract: A computer-implemented method includes compiling a Register Transfer Level (RTL) code to form a data flow graph (DFG). The computer-implemented method includes identifying a chain of multiplexers in the DFG, wherein the chain of multiplexers includes exit multiplexers associated with a loop exit path and non-exit multiplexers. The computer-implemented method also includes traversing a topological order of the DFG in reverse. The computer-implemented method also includes computing fanin-cones for each pair of consecutive exit multiplexers. The computer-implemented method includes generating a truth table responsive to valid fanin-cones and back propagating select conditions for each pair of consecutive exit multiplexers. The computer-implemented method includes eliminating an exit multiplexer from each pair of consecutive exit multiplexers based on the truth table. The computer-implemented method further includes transforming the DFG to a new DFG based on the truth table.
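
To make the truth-table step concrete, the sketch below covers one simple case only: two consecutive exit multiplexers that forward the same exit value. That the multiplexers share a data input, and all function names here, are assumptions for illustration; the abstract does not describe the fanin-cone analysis in enough detail to reproduce it.

```python
from itertools import product

def mux(sel, when_true, when_false):
    """2:1 multiplexer."""
    return when_true if sel else when_false

def chain(s1, s2, exit_val, body_val):
    """Two consecutive exit multiplexers that forward the same exit value."""
    return mux(s2, exit_val, mux(s1, exit_val, body_val))

def merged(s1, s2, exit_val, body_val):
    """One multiplexer whose select is the OR of both exit conditions."""
    return mux(s1 or s2, exit_val, body_val)

def truth_table(fn):
    """Enumerate outputs over all select values, with symbolic data inputs."""
    return {
        (s1, s2): fn(s1, s2, "exit", "body")
        for s1, s2 in product((0, 1), repeat=2)
    }
```

Comparing `truth_table(chain)` with `truth_table(merged)` shows the two structures agree on every select combination, so in this case one of the two exit multiplexers can be removed.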

    Folding multiply-and-accumulate logic

    Publication No.: US10789401B1

    Publication date: 2020-09-29

    Application No.: US16294520

    Filing date: 2019-03-06

    Applicant: Xilinx, Inc.

    Abstract: Approaches for folding multiply-and-accumulate (MAC) logic in a circuit design involve a design tool recognizing a first instance of the MAC logic and a second instance of the MAC logic. The design tool replaces the first instance of the MAC logic and the second instance of the MAC logic with one instance of pipelined MAC logic. The design tool configures the pipelined MAC logic to input data signals of the first instance of the MAC logic and the second instance of the MAC logic to the pipelined MAC logic at a first clock rate, and switch between selection of the data signals of the first instance of the MAC logic and the second instance of the MAC logic at a second clock rate that is double the first clock rate. The design tool further configures the pipelined MAC logic to pipeline input data signals at the second clock rate, and to capture intermediate results at the second clock rate. The design tool further configures a register to capture output of the pipelined MAC logic at the first clock rate.
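
The folding described above can be checked with a small behavioral model: two MAC units running at the slow clock should produce the same results as one shared unit that alternates between the two input streams at twice that rate. This is a sketch under simplifying assumptions. It models each MAC as a running sum of products, ignores the internal pipeline registers, and uses hypothetical function names.

```python
def two_macs(stream0, stream1):
    """Reference: two separate MAC units, one sample each per slow clock."""
    acc0 = acc1 = 0
    out = []
    for (a0, b0), (a1, b1) in zip(stream0, stream1):
        acc0 += a0 * b0
        acc1 += a1 * b1
        out.append((acc0, acc1))
    return out

def folded_mac(stream0, stream1):
    """One shared MAC unit ticking twice per slow-clock cycle."""
    acc = [0, 0]  # intermediate results, one slot per original instance
    out = []
    for pair in zip(stream0, stream1):  # inputs arrive at the slow clock
        for sel in (0, 1):              # select toggles at the fast clock
            a, b = pair[sel]
            acc[sel] += a * b           # single multiplier and adder reused
        out.append((acc[0], acc[1]))    # output captured at the slow clock
    return out
```

Because the shared unit completes both products within one slow-clock period, the values seen at the slow-clock output are identical to those of the two-unit design.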

    Parallelizing timing-based operations for circuit designs

    Publication No.: US10303833B1

    Publication date: 2019-05-28

    Application No.: US15429014

    Filing date: 2017-02-09

    Applicant: Xilinx, Inc.

    Abstract: Parallelizing operations for implementing a circuit design can include dividing, using a processor, the circuit design into a plurality of partitions, wherein each partition is stored as a separate file; for each partition, generating, using the processor, a timing arc file specifying boundary delays for the partition; and generating, using the processor, a partition design file specifying interfaces of the partitions. Using the processor, a plurality of processes executing in parallel can be initiated. Each process is adapted to operate on a selected partition using the partition design file and the timing arc files for the other partitions to generate an updated file for the selected partition.
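
The per-partition workflow above can be sketched as follows. Several things are assumptions made to keep the example small: in-memory dicts stand in for the partition and timing arc files, worker threads stand in for the parallel processes, the "boundary delay" is reduced to a single number per partition, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def boundary_delay(partition):
    """Hypothetical timing arc: the worst delay inside the partition."""
    return max(partition["delays"])

def update_partition(name, partitions, arcs):
    """Work on one partition, seeing the others only through their arcs."""
    external = sum(d for other, d in arcs.items() if other != name)
    own = boundary_delay(partitions[name])
    updated = {
        "delays": list(partitions[name]["delays"]),
        "path_delay": own + external,
    }
    return name, updated

def run_parallel(partitions):
    # One timing arc per partition, computed before any worker starts.
    arcs = {name: boundary_delay(p) for name, p in partitions.items()}
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(update_partition, name, partitions, arcs)
            for name in partitions
        ]
        return dict(f.result() for f in futures)
```

Each worker reads the full contents of its own partition only; every other partition is visible solely as a precomputed arc, which is what allows the workers to run independently.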
