Abstract:
Implementing an application for a data processing engine (DPE) array can include detecting, using computer hardware, a component of a hardware library package instantiated by an application. The application is specified in source code and is configured to execute on a DPE array. An instance of the component is extracted from the application. The extracted instance specifies values of parameters for the instance of the component. The instance can be partitioned by generating program code defining one or more kernels corresponding to the instance of the component. The partitioning is based on a defined performance metric of the component and a defined performance requirement of the application. The application is transformed by replacing the instance of the component with the program code generated by the partitioning. The application, as transformed, is compiled into program code executable by the DPE array.
Abstract:
Approaches for visualizing data buses in a circuit design include determining ones of the data buses that satisfy selection criteria. For each element connected to a data bus of the ones of the data buses, a method and system determine whether the element is of interest or the element is not of interest. A graphical representation of the ones of the data buses and each element of interest is generated, and data buses of the circuit design determined to not satisfy the selection criteria and elements not of interest are excluded from the graphical representation. The graphical representation is displayed on a display device.
Abstract:
Incremental synthesis for changes to a circuit design can include synthesizing, using computer hardware, a first circuit design resulting in a partitioning of the first circuit design and a plurality of synthesized partitions of the first circuit design and, for a second circuit design that is a modified version of the first circuit design and based upon the partitioning of the first circuit design, determining, using the computer hardware, a partition of the second circuit design that differs from the first circuit design. The partition of the second circuit design can be technology mapped using the computer hardware resulting in a synthesized partition of the second circuit design. A synthesized circuit design corresponding to the second circuit design can be generated using the computer hardware by combining synthesized partitions of the plurality of synthesized partitions of the first circuit design that are unchanged relative to the second circuit design with the synthesized partition of the second circuit design.
Abstract:
Implementing an application for a data processing engine (DPE) array can include detecting, using computer hardware, a component of a hardware library package instantiated by an application. The application is specified in source code and is configured to execute on a DPE array. An instance of the component is extracted from the application. The extracted instance specifies values of parameters for the instance of the component. The instance can be partitioned by generating program code defining one or more kernels corresponding to the instance of the component. The partitioning is based on a defined performance metric of the component and a defined performance requirement of the application. The application is transformed by replacing the instance of the component with the program code generated by the partitioning. The application, as transformed, is compiled into program code executable by the DPE array.
Abstract:
Approaches for logic compaction include inputting an optimization directive that specifies one of area optimization or speed optimization to a synthesis tool executing on a computer processor. The synthesis tool identifies a multiplier and/or an adder specified in a circuit design and synthesizing the multiplier into logic having LUT-to-LUT connections between LUTs on separate slices of a programmable integrated circuit (IC) in response to the optimization directive specifying speed optimization. The synthesis tool synthesizes the multiplier and/or adder into logic having LUT-carry connections between LUTs and carry logic within a single slice of the programmable IC in response to the optimization directive specifying area optimization. The method includes implementing a circuit on the programmable IC from the logic having LUT-carry connections in response to the optimization directive specifying area optimization.
Abstract:
Placement of macros of a circuit design includes mapping the macros to types of sub-circuits of an integrated circuit (IC). The IC includes anchors and instances of each type of the types of sub-circuits. The macros are grouped based on couplings of the macros to the anchors specified in the circuit design. Each group includes one or more macros, and the one or more macros in each group are all coupled to the same set of one or more anchors. A location is selected from alternative locations for each group of macros based on a distance of the location from the same set of anchors. Each location includes one or more instances of one or more types of the types of sub-circuits. The circuit design is placed and routed after selecting the location for each group, and implementation data is generated for making an IC that implements the circuit design.
Abstract:
Range information is determined for each variable of a circuit design. The range information is propagated from inputs to outputs of nodes of a DFG representation of the circuit design. For each multiplexer of the circuit design represented as a multiplexer node in the DFG, whether range information associated with a selector input of the multiplexer node restricts selection of data inputs of the multiplexer node to only one selected data input of the multiplexer node is determined. In response to determining that range information associated with the selector input restricts selection of data inputs to only one data input, the DFG is modified by connecting the selected data input to each load of the multiplexer node, and removing the multiplexer node, a corresponding select logic node of the multiplexer node, and nodes connected to unselected data inputs of the multiplexer node.
Abstract:
Implementing a circuit design can include determining a chain of a plurality of loop elements of a circuit design, wherein each loop element includes a bit select node configured to perform a bit assignment operation and a corresponding address calculation node, wherein the address calculation nodes use a common variable to calculate a starting bit location provided to the corresponding bit select node. In response to the determining, the chain is replicated resulting in one chain for each value of the common variable and transforming each chain into a plurality of wires. A multiplexer is inserted into the circuit design. The plurality of wires for each chain is coupled to inputs of the multiplexer and the common variable is provided to the multiplexer as a select signal.
Abstract:
Physical optimization for timing closure for an integrated circuit includes processing a circuit design at least partially through a design flow to a late stage of the design flow. Using a processor, a baseline delay is calculated for each of a plurality of paths of the circuit design. A slack for each of the plurality of paths is determined. Physical optimization further includes selecting a path of the circuit design that meets a selection criterion according, at least in part, to the slack of the path, applying, using the processor, a physical optimization to the selected path resulting in an optimized path, and calculating a delay of the optimized path. The optimized path is incorporated into the circuit design only responsive to determining that the delay of the optimized path is less than the baseline delay of the selected path.
Abstract:
Control set optimization for a circuit design includes generating, by a processor, Observability Don't Care (ODC) expressions for registers of the circuit design. Redundant reset pins of the registers of the circuit design are determined by the processor by iteratively checking, on a per-cube and a per-literal basis for each ODC expression, whether a value of a literal causes the ODC expression to evaluate to 1. A modified version of the circuit design is generated by the processor by connecting one or more reset pins of the set of redundant reset pins to one or more constants.