SYNTHESIS FLOW FOR DATA PROCESSING ENGINE ARRAY APPLICATIONS RELYING ON HARDWARE LIBRARY PACKAGES

    Publication No.: US20230161569A1

    Publication Date: 2023-05-25

    Application No.: US17456002

    Filing Date: 2021-11-22

    Applicant: Xilinx, Inc.

    Abstract: Implementing an application for a data processing engine (DPE) array can include detecting, using computer hardware, a component of a hardware library package instantiated by an application. The application is specified in source code and is configured to execute on a DPE array. An instance of the component is extracted from the application. The extracted instance specifies values of parameters for the instance of the component. The instance can be partitioned by generating program code defining one or more kernels corresponding to the instance of the component. The partitioning is based on a defined performance metric of the component and a defined performance requirement of the application. The application is transformed by replacing the instance of the component with the program code generated by the partitioning. The application, as transformed, is compiled into program code executable by the DPE array.
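The partitioning step in this abstract (splitting one component instance into one or more kernels based on a performance metric and a performance requirement) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `ComponentInstance` fields, the use of throughput as the metric, and the even split of samples across kernels.

```python
import math
from dataclasses import dataclass


@dataclass
class ComponentInstance:
    """Hypothetical extracted instance of a library component."""
    name: str
    num_samples: int              # assumed parameter: work items per invocation
    throughput_per_kernel: float  # assumed metric: samples/s one kernel sustains


def partition(instance, required_throughput):
    """Split an instance into kernels so the combined throughput meets the requirement.

    Returns a list of (kernel_name, start, end) half-open sample ranges.
    """
    # Number of kernels needed so that n * per-kernel throughput >= requirement.
    n_kernels = max(1, math.ceil(required_throughput / instance.throughput_per_kernel))
    # Never create more kernels than there are samples to distribute.
    n_kernels = min(n_kernels, instance.num_samples)

    base, extra = divmod(instance.num_samples, n_kernels)
    kernels = []
    start = 0
    for i in range(n_kernels):
        # The first `extra` kernels take one additional sample each.
        size = base + (1 if i < extra else 0)
        kernels.append((f"{instance.name}_k{i}", start, start + size))
        start += size
    return kernels


fir = ComponentInstance(name="fir", num_samples=1024, throughput_per_kernel=250e6)
for kernel in partition(fir, required_throughput=1e9):
    print(kernel)
```

In this sketch a requirement of 1e9 samples/s against 250e6 samples/s per kernel yields four kernels of 256 samples each. A real flow would then emit kernel source code for each slice and substitute it for the original instance.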

    SYNTHESIS FOR MATRIX MULTIPLICATION USING A DATA PROCESSING ARRAY

    Publication No.: US20240193225A1

    Publication Date: 2024-06-13

    Application No.: US18065491

    Filing Date: 2022-12-13

    Applicant: Xilinx, Inc.

    IPC Classification: G06F17/16 G06F7/487 G06F7/72

    Abstract: Parameters defining a matrix multiply operation to be implemented in a data processing array can be received. A formulation of the matrix multiply operation is generated based on the parameters. A matrix multiply solution is determined for performing the matrix multiply operation in the data processing array. The matrix multiply solution specifies a spatial and temporal partitioning of the matrix multiply operation for implementation in the data processing array. Synthesizable program code is generated that defines an interface for the data processing array based on the matrix multiply solution. The interface is configured to partition and transfer input data to the data processing array from an external memory and convey output data from the data processing array to the external memory.
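To make "spatial and temporal partitioning" concrete, the sketch below tiles a matrix multiply so that each output tile maps to one compute tile (spatial) and the shared dimension is walked in steps on that tile (temporal). This is a generic tiling illustration, not the solver or interface generator described in the patent. The function name `tiled_matmul`, the tile-size parameters, and the `schedule` record are hypothetical.

```python
def tiled_matmul(A, B, tile_m, tile_n, tile_k):
    """Multiply A (M x K) by B (K x N) tile by tile.

    Returns (C, schedule), where schedule lists ((tile_row, tile_col), step)
    entries: the spatial coordinate of the compute tile and the temporal step
    at which it consumes one slice of the K dimension.
    """
    M, K, N = len(A), len(A[0]), len(B[0])
    C = [[0] * N for _ in range(M)]
    schedule = []

    # Spatial partitioning: each (i0, j0) output tile belongs to one compute tile.
    for i0 in range(0, M, tile_m):
        for j0 in range(0, N, tile_n):
            # Temporal partitioning: the K dimension is processed in successive steps.
            for step, k0 in enumerate(range(0, K, tile_k)):
                schedule.append(((i0 // tile_m, j0 // tile_n), step))
                for i in range(i0, min(i0 + tile_m, M)):
                    for j in range(j0, min(j0 + tile_n, N)):
                        acc = C[i][j]  # partial sum carried between steps
                        for k in range(k0, min(k0 + tile_k, K)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C, schedule


A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]
C, schedule = tiled_matmul(A, B, tile_m=1, tile_n=2, tile_k=2)
print(C)
print(schedule)
```

The loops here run sequentially. On an actual array the spatial tiles would execute in parallel, and the interface logic would stream the matching slices of `A` and `B` from external memory at each step.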

    Folding multiply-and-accumulate logic

    Publication No.: US10789401B1

    Publication Date: 2020-09-29

    Application No.: US16294520

    Filing Date: 2019-03-06

    Applicant: Xilinx, Inc.

    Abstract: Approaches for folding multiply-and-accumulate (MAC) logic in a circuit design involve a design tool recognizing a first instance of the MAC logic and a second instance of the MAC logic. The design tool replaces the first instance of the MAC logic and the second instance of the MAC logic with one instance of pipelined MAC logic. The design tool configures the pipelined MAC logic to input data signals of the first instance of the MAC logic and the second instance of the MAC logic to the pipelined MAC logic at a first clock rate, and switch between selection of the data signals of the first instance of the MAC logic and the second instance of the MAC logic at a second clock rate that is double the first clock rate. The design tool further configures the pipelined MAC logic to pipeline input data signals at the second clock rate, and to capture intermediate results at the second clock rate. The design tool further configures a register to capture output of the pipelined MAC logic at the first clock rate.
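The folding idea can be checked with a small behavioural model: two MAC instances updated once per slow clock cycle should produce the same results as a single shared MAC that alternates between the two input pairs at twice the clock rate. This is a software sketch only. The function names and the stream representation are hypothetical, and the pipeline registers described in the abstract are not modelled.

```python
def two_macs(stream0, stream1):
    """Reference model: two independent MACs, each updated once per slow cycle."""
    acc0 = acc1 = 0
    out = []
    for (a0, b0), (a1, b1) in zip(stream0, stream1):
        acc0 += a0 * b0
        acc1 += a1 * b1
        out.append((acc0, acc1))
    return out


def folded_mac(stream0, stream1):
    """Folded model: one shared multiplier/adder used twice per slow cycle."""
    acc = [0, 0]  # intermediate results, one per original MAC instance
    out = []
    for pair in zip(stream0, stream1):  # inputs arrive at the slow clock rate
        for sel in (0, 1):              # selection toggles at the fast (2x) rate
            a, b = pair[sel]
            acc[sel] += a * b           # the single shared MAC operation
        out.append((acc[0], acc[1]))    # output captured at the slow clock rate
    return out


s0 = [(1, 2), (3, 4)]
s1 = [(5, 6), (7, 8)]
print(two_macs(s0, s1))
print(folded_mac(s0, s1))
```

Both functions return the same sequence of accumulator pairs, which is the equivalence a folding transformation has to preserve while halving the number of MAC blocks.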

    Synthesis flow for data processing engine array applications relying on hardware library packages

    Publication No.: US11829733B2

    Publication Date: 2023-11-28

    Application No.: US17456002

    Filing Date: 2021-11-22

    Applicant: Xilinx, Inc.

    Abstract: Implementing an application for a data processing engine (DPE) array can include detecting, using computer hardware, a component of a hardware library package instantiated by an application. The application is specified in source code and is configured to execute on a DPE array. An instance of the component is extracted from the application. The extracted instance specifies values of parameters for the instance of the component. The instance can be partitioned by generating program code defining one or more kernels corresponding to the instance of the component. The partitioning is based on a defined performance metric of the component and a defined performance requirement of the application. The application is transformed by replacing the instance of the component with the program code generated by the partitioning. The application, as transformed, is compiled into program code executable by the DPE array.

    Compaction of multiplier and adder circuits

    Publication No.: US11768663B1

    Publication Date: 2023-09-26

    Application No.: US17014410

    Filing Date: 2020-09-08

    Applicant: Xilinx, Inc.

    Abstract: Approaches for logic compaction include inputting an optimization directive that specifies one of area optimization or speed optimization to a synthesis tool executing on a computer processor. The synthesis tool identifies a multiplier and/or an adder specified in a circuit design and synthesizes the multiplier into logic having LUT-to-LUT connections between LUTs on separate slices of a programmable integrated circuit (IC) in response to the optimization directive specifying speed optimization. The synthesis tool synthesizes the multiplier and/or adder into logic having LUT-carry connections between LUTs and carry logic within a single slice of the programmable IC in response to the optimization directive specifying area optimization. The method includes implementing a circuit on the programmable IC from the logic having LUT-carry connections in response to the optimization directive specifying area optimization.