-
Publication number: US11204745B2
Publication date: 2021-12-21
Application number: US16420831
Filing date: 2019-05-23
Applicant: Xilinx, Inc.
Inventor: Shail Aditya Gupta, Samuel R. Bayliss, Vinod K. Kathail, Ralph D. Wittig, Philip B. James-Roxby, Akella Sastry
IPC: G06F9/44, G06F8/41, G06F16/901, G06F9/54, G06F15/78
Abstract: Examples herein describe techniques for generating dataflow graphs using source code for defining kernels and communication links between those kernels. In one embodiment, the graph is formed using nodes (e.g., kernels) which are communicatively coupled by edges (e.g., the communication links between the kernels). A compiler converts the source code into a bitstream and/or binary code which configures a heterogeneous processing system of a SoC to execute the graph. The compiler uses the graph expressed in source code to determine where to assign the kernels in the heterogeneous processing system. Further, the compiler can select the specific communication techniques to establish the communication links between the kernels and decide whether synchronization should be used in a communication link. Thus, the programmer can express the dataflow graph at a high level (using source code) without understanding how the operator graph is implemented using the heterogeneous hardware in the SoC.
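The idea of expressing kernels as nodes and communication links as edges can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class `DataflowGraph`, the function `choose_links`, the target names, and the rule that picks a link type are hypothetical and are not taken from the patent or from any Xilinx tool. For simplicity the kernel's target is supplied as a hint rather than chosen by a compiler.

```python
from dataclasses import dataclass, field


@dataclass
class DataflowGraph:
    """Kernels are nodes; communication links are directed edges."""
    kernels: dict = field(default_factory=dict)  # kernel name -> target hint
    edges: list = field(default_factory=list)    # (producer, consumer) pairs

    def add_kernel(self, name, target):
        self.kernels[name] = target

    def connect(self, producer, consumer):
        for name in (producer, consumer):
            if name not in self.kernels:
                raise KeyError(f"unknown kernel: {name}")
        self.edges.append((producer, consumer))


def choose_links(graph):
    """Toy stand-in for a compiler's link selection (assumed rule)."""
    plan = {}
    for producer, consumer in graph.edges:
        same_target = graph.kernels[producer] == graph.kernels[consumer]
        # Assumption: kernels on the same kind of hardware share a buffer and
        # need synchronization; kernels on different hardware use a stream.
        plan[(producer, consumer)] = {
            "link": "shared_buffer" if same_target else "stream",
            "synchronized": same_target,
        }
    return plan


graph = DataflowGraph()
graph.add_kernel("filter", "vector_engine")
graph.add_kernel("transform", "vector_engine")
graph.add_kernel("packetize", "programmable_logic")
graph.connect("filter", "transform")
graph.connect("transform", "packetize")
link_plan = choose_links(graph)
```

The programmer only declares kernels and connections; the choice of link type and synchronization is left to `choose_links`, which mirrors the division of labor the abstract describes.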
-
Publication number: US11422781B1
Publication date: 2022-08-23
Application number: US16781323
Filing date: 2020-02-04
Applicant: Xilinx, Inc.
Inventor: Stephen A. Neuendorffer, Prasanth Chatarasi, Samuel R. Bayliss
Abstract: Disclosed approaches for generating vector codes include inputting tensor processing statements. Each statement specifies an output variable, an initial variable, and multiply-and-accumulate (MAC) operations, and each MAC operation references the output variable, elements of a first tensor, and one or more elements of a second tensor. The MAC operations are organized into groups, and the MAC operations in each group reference the same output variable and have overlapping references to elements of the first tensor. For each group of MAC operations, at least one instruction is generated to load elements of the first tensor into a first register and at least one instruction is generated to load one or more elements of the second tensor into a second register. For each group of MAC operations, instructions are generated to select, for each MAC operation in the group, elements from the first register and one or more elements from the second register as input to an array of MAC circuits.
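The grouping step can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented method: each MAC is modeled as referencing a contiguous index range of the first tensor `A` and a single element of the second tensor `B`, and the names `Mac`, `group_macs`, `emit`, and the pseudo-instructions are all hypothetical.

```python
from collections import namedtuple

# One MAC operation: out += A[a_lo..a_hi] * B[b_idx] (hypothetical model).
Mac = namedtuple("Mac", "out a_lo a_hi b_idx")


def group_macs(macs):
    """Group MACs that share an output variable and overlap in their A range."""
    groups = []
    for mac in macs:
        for group in groups:
            overlaps = mac.a_lo <= group["a_hi"] and group["a_lo"] <= mac.a_hi
            if group["out"] == mac.out and overlaps:
                # Widen the group's A window so one load covers every member.
                group["a_lo"] = min(group["a_lo"], mac.a_lo)
                group["a_hi"] = max(group["a_hi"], mac.a_hi)
                group["ops"].append(mac)
                break
        else:
            groups.append({"out": mac.out, "a_lo": mac.a_lo,
                           "a_hi": mac.a_hi, "ops": [mac]})
    return groups


def emit(group):
    """Emit pseudo-instructions: two loads, then one select-and-MAC per op."""
    lo, hi, out = group["a_lo"], group["a_hi"], group["out"]
    b_indices = sorted({mac.b_idx for mac in group["ops"]})
    code = [f"load r0, A[{lo}:{hi}]", f"load r1, B{b_indices}"]
    for mac in group["ops"]:
        a_offset = mac.a_lo - lo                # position inside register r0
        b_lane = b_indices.index(mac.b_idx)     # position inside register r1
        code.append(f"mac {out}, select(r0, {a_offset}), select(r1, {b_lane})")
    return code


macs = [
    Mac("acc0", 0, 3, 0),
    Mac("acc0", 1, 4, 1),   # overlaps the first MAC's A range
    Mac("acc0", 8, 11, 2),  # same output, but no overlap
    Mac("acc1", 0, 3, 0),   # different output variable
]
groups = group_macs(macs)
program = emit(groups[0])
```

Grouping by overlap means each group needs only one load of `A` into a register, with per-operation selects picking the right elements.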
-
Publication number: US20200372200A1
Publication date: 2020-11-26
Application number: US16420881
Filing date: 2019-05-23
Applicant: Xilinx, Inc.
Inventor: Mukund Sivaraman, Shail Aditya Gupta, Akella Sastry, Rishi Surendran, Philip B. James-Roxby, Samuel R. Bayliss, Vinod K. Kathail, Ajit K. Agarwal, Ralph D. Wittig
IPC: G06F30/347, G06F8/41, G06F16/901, G06F12/1081, G06F30/394
Abstract: An example method of implementing an application for a system-on-chip (SOC) having a data processing engine (DPE) array includes determining a graph representation of the application, the graph representation including nodes representing kernels of the application and edges representing communication between the kernels, mapping, based on the graph, the kernels onto DPEs of the DPE array and data structures of the kernels onto memory in the DPE array, routing communication channels between DPEs and circuitry of the application configured in programmable logic of the SOC, and generating implementation data for programming the SOC to implement the application based on results of the mapping and the routing.
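The map-then-route flow can be illustrated with a deliberately naive sketch. The row-major placement, the Manhattan-distance route estimate, and the names `map_kernels` and `route` are assumptions chosen for brevity; the abstract does not specify the algorithms, and this sketch omits memory mapping and connections to programmable logic.

```python
def map_kernels(kernels, rows, cols):
    """Place kernels on DPE tiles in row-major order (naive placement)."""
    if len(kernels) > rows * cols:
        raise ValueError("more kernels than DPE tiles")
    return {name: divmod(i, cols) for i, name in enumerate(kernels)}


def route(edges, placement):
    """Estimate each channel's length as Manhattan hops between tiles."""
    hops = {}
    for src, dst in edges:
        (r0, c0), (r1, c1) = placement[src], placement[dst]
        hops[(src, dst)] = abs(r0 - r1) + abs(c0 - c1)
    return hops


kernels = ["a", "b", "c", "d"]                 # graph nodes
edges = [("a", "b"), ("b", "c"), ("c", "d")]   # graph edges
placement = map_kernels(kernels, rows=2, cols=2)
channel_hops = route(edges, placement)

# "Implementation data" here is just the combined result of both steps.
implementation = {"placement": placement, "routes": channel_hops}
```

A real mapper would optimize placement so that communicating kernels sit close together; here the `b` to `c` channel costs two hops because row-major order ignores the edges.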
-
Publication number: US11687327B2
Publication date: 2023-06-27
Application number: US17695895
Filing date: 2022-03-16
Applicant: XILINX, INC.
Inventor: Chia-Jui Hsu, Shail Aditya Gupta, Samuel R. Bayliss, Philip B. James-Roxby, Ralph D. Wittig, Vinod Kathail
IPC: G06F8/41, G06F16/901, G06F9/54, G06F11/34
CPC classification number: G06F8/433, G06F9/54, G06F11/3495, G06F16/9024
Abstract: Embodiments herein use control application programming interfaces (APIs) to control the execution of a dataflow graph in a heterogeneous processing system. That is, embodiments herein describe a programming model along with associated APIs and methods that can control, interact with, and at least partially reconfigure a user application (e.g., the dataflow graph) executing on the heterogeneous processing system through a locally executing control program. Using the control APIs, users can manipulate such remotely executing graphs directly as local objects and perform control operations on them (e.g., for loading and initializing the graphs; dynamically adjusting parameters for adaptive control; monitoring application parameters, system states and events; scheduling operations to read and write data across the distributed memory boundary of the platform; controlling the execution life-cycle of a subsystem; and partially reconfiguring the computing resources for a new subsystem).
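Treating a remotely executing graph as a local object can be sketched as a small proxy class. The class `GraphHandle` and its methods `load`, `run`, `update`, `read`, and `end` are hypothetical names invented for this illustration; they are not the API claimed in the patent, and the "remote" execution is simulated with a counter.

```python
class GraphHandle:
    """Local proxy for a graph running on the device (hypothetical API)."""

    def __init__(self, name):
        self.name = name
        self.state = "unloaded"
        self.params = {}
        self.iterations_done = 0

    def load(self, **initial_params):
        """Load and initialize the graph with its run-time parameters."""
        self.params.update(initial_params)
        self.state = "ready"

    def run(self, iterations):
        if self.state != "ready":
            raise RuntimeError(f"cannot run graph in state {self.state!r}")
        # A real system would start remote execution; here we only count.
        self.iterations_done += iterations

    def update(self, name, value):
        """Adjust a parameter between runs (adaptive control)."""
        if name not in self.params:
            raise KeyError(f"unknown parameter: {name}")
        self.params[name] = value

    def read(self, name):
        """Monitor a parameter's current value."""
        return self.params[name]

    def end(self):
        """Finish the graph's execution life-cycle."""
        self.state = "ended"


g = GraphHandle("radio")
g.load(gain=1.0)
g.run(iterations=4)
g.update("gain", 0.5)   # adjust a parameter, then run again
g.run(iterations=4)
final_gain = g.read("gain")
g.end()
```

The control program never touches device memory directly; every operation goes through the handle, which is the programming model the abstract outlines.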
-
Publication number: US10860766B1
Publication date: 2020-12-08
Application number: US16420881
Filing date: 2019-05-23
Applicant: Xilinx, Inc.
Inventor: Mukund Sivaraman, Shail Aditya Gupta, Akella Sastry, Rishi Surendran, Philip B. James-Roxby, Samuel R. Bayliss, Vinod K. Kathail, Ajit K. Agarwal, Ralph D. Wittig
IPC: G06F30/347, G06F8/41, G06F16/901, G06F30/394, G06F12/1081, G06F115/02
Abstract: An example method of implementing an application for a system-on-chip (SOC) having a data processing engine (DPE) array includes determining a graph representation of the application, the graph representation including nodes representing kernels of the application and edges representing communication between the kernels, mapping, based on the graph, the kernels onto DPEs of the DPE array and data structures of the kernels onto memory in the DPE array, routing communication channels between DPEs and circuitry of the application configured in programmable logic of the SOC, and generating implementation data for programming the SOC to implement the application based on results of the mapping and the routing.
-
Publication number: US10802807B1
Publication date: 2020-10-13
Application number: US16420840
Filing date: 2019-05-23
Applicant: Xilinx, Inc.
Inventor: Chia-Jui Hsu, Shail Aditya Gupta, Samuel R. Bayliss, Philip B. James-Roxby, Ralph D. Wittig, Vinod Kathail
IPC: G06F8/41, G06F16/901, G06F9/54, G06F11/34
Abstract: Embodiments herein use control application programming interfaces (APIs) to control the execution of a dataflow graph in a heterogeneous processing system. That is, embodiments herein describe a programming model along with associated APIs and methods that can control, interact with, and at least partially reconfigure a user application (e.g., the dataflow graph) executing on the heterogeneous processing system through a locally executing control program. Using the control APIs, users can manipulate such remotely executing graphs directly as local objects and perform control operations on them (e.g., for loading and initializing the graphs; dynamically adjusting parameters for adaptive control; monitoring application parameters, system states and events; scheduling operations to read and write data across the distributed memory boundary of the platform; controlling the execution life-cycle of a subsystem; and partially reconfiguring the computing resources for a new subsystem).
-
Publication number: US11281440B1
Publication date: 2022-03-22
Application number: US17065433
Filing date: 2020-10-07
Applicant: XILINX, INC.
Inventor: Chia-Jui Hsu, Shail Aditya Gupta, Samuel R. Bayliss, Philip B. James-Roxby, Ralph D. Wittig, Vinod Kathail
IPC: G06F8/41, G06F11/34, G06F9/54, G06F16/901
Abstract: Embodiments herein use control application programming interfaces (APIs) to control the execution of a dataflow graph in a heterogeneous processing system. That is, embodiments herein describe a programming model along with associated APIs and methods that can control, interact with, and at least partially reconfigure a user application (e.g., the dataflow graph) executing on the heterogeneous processing system through a locally executing control program. Using the control APIs, users can manipulate such remotely executing graphs directly as local objects and perform control operations on them (e.g., for loading and initializing the graphs; dynamically adjusting parameters for adaptive control; monitoring application parameters, system states and events; scheduling operations to read and write data across the distributed memory boundary of the platform; controlling the execution life-cycle of a subsystem; and partially reconfiguring the computing resources for a new subsystem).
-
Publication number: US11113030B1
Publication date: 2021-09-07
Application number: US16420905
Filing date: 2019-05-23
Applicant: Xilinx, Inc.
Inventor: Dinesh K. Monga, Shail Aditya Gupta, Samuel R. Bayliss, Kaushik Barman
Abstract: Examples herein describe techniques for generating dataflow graphs using source code for defining kernels and communication links between those kernels. In one embodiment, the graph is formed using nodes (e.g., kernels) which are communicatively coupled by edges (e.g., the communication links between the kernels). A compiler converts the source code into a bitstream and/or binary code which configures programmable and non-programmable logic in a heterogeneous processing environment of a SoC to execute the graph. The compiler can also consider user-defined constraints when compiling the source code. The constraints can dictate where the kernels and buffers should be placed in the heterogeneous processing environment, performance requirements, data communication routes through the SoC, type of data path, delays, and the like.
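How user-defined placement constraints might feed into compilation can be sketched as below. This covers only one kind of constraint from the abstract (fixed kernel locations); the function name `place_with_constraints`, the tile coordinates, and the first-free-tile fallback are assumptions made for illustration, not the patent's procedure.

```python
def place_with_constraints(kernels, tiles, constraints):
    """Assign each kernel a tile, honoring pinned locations first."""
    unknown = set(constraints) - set(kernels)
    if unknown:
        raise ValueError(f"constraints name unknown kernels: {sorted(unknown)}")
    pinned_tiles = list(constraints.values())
    if any(tile not in tiles for tile in pinned_tiles):
        raise ValueError("constraint refers to a tile that does not exist")
    if len(set(pinned_tiles)) != len(pinned_tiles):
        raise ValueError("two kernels pinned to the same tile")

    placement = dict(constraints)
    free = [tile for tile in tiles if tile not in pinned_tiles]
    for kernel in kernels:
        if kernel not in placement:
            if not free:
                raise ValueError("no free tile left")
            placement[kernel] = free.pop(0)  # first free tile, in given order
    return placement


tiles = [(0, 0), (0, 1), (1, 0)]
constrained = place_with_constraints(
    kernels=["a", "b", "c"],
    tiles=tiles,
    constraints={"b": (0, 0)},  # user pins kernel "b" to tile (0, 0)
)
```

Pinned kernels are placed before anything else, so the remaining kernels fill whatever tiles the constraints leave free.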
-
Publication number: US20200371761A1
Publication date: 2020-11-26
Application number: US16420831
Filing date: 2019-05-23
Applicant: Xilinx, Inc.
Inventor: Shail Aditya Gupta, Samuel R. Bayliss, Vinod K. Kathail, Ralph D. Wittig, Philip B. James-Roxby, Akella Sastry
IPC: G06F8/41, G06F9/54, G06F16/901, G06F15/78
Abstract: Examples herein describe techniques for generating dataflow graphs using source code for defining kernels and communication links between those kernels. In one embodiment, the graph is formed using nodes (e.g., kernels) which are communicatively coupled by edges (e.g., the communication links between the kernels). A compiler converts the source code into a bitstream and/or binary code which configures a heterogeneous processing system of a SoC to execute the graph. The compiler uses the graph expressed in source code to determine where to assign the kernels in the heterogeneous processing system. Further, the compiler can select the specific communication techniques to establish the communication links between the kernels and decide whether synchronization should be used in a communication link. Thus, the programmer can express the dataflow graph at a high level (using source code) without understanding how the operator graph is implemented using the heterogeneous hardware in the SoC.