Abstract:
Strategies are stored in a memory arrangement, and each strategy includes a set of parameter settings for a design tool. The design tool identifies a set of features of an input circuit design and applies classification models to the input circuit design. Each classification model indicates one of the strategies, and application of each classification model indicates a likelihood that use of the indicated strategy would improve a metric of the input circuit design based on the set of features of the input circuit design. One of the strategies is selected based on the likelihood that use of that strategy would improve the metric of the input circuit design, and the design tool is configured with the set of parameter settings of the selected strategy. The design tool then processes the input circuit design into implementation data that is suitable for making an integrated circuit (IC).
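A minimal sketch of the selection step described above, assuming scikit-learn-style classifiers (one per strategy) that each report the probability that their strategy improves the metric; the Strategy type and the select_strategy name are illustrative, not the tool's API.

```python
# Illustrative sketch of strategy selection (hypothetical names, not the tool's API).
from dataclasses import dataclass

@dataclass
class Strategy:
    name: str
    parameter_settings: dict   # parameter settings to apply to the design tool

def select_strategy(design_features, models, strategies):
    """Apply one classification model per strategy and pick the strategy whose
    model reports the highest likelihood of improving the design metric."""
    best_strategy, best_likelihood = None, float("-inf")
    for model, strategy in zip(models, strategies):
        # Each model scores the likelihood that its strategy improves the metric,
        # given the features extracted from the input circuit design.
        likelihood = model.predict_proba([design_features])[0][1]
        if likelihood > best_likelihood:
            best_strategy, best_likelihood = strategy, likelihood
    return best_strategy

# The design tool would then be configured with best_strategy.parameter_settings
# before processing the design into implementation data.
```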
Abstract:
Scheduling kernels on a system with heterogeneous compute circuits includes receiving, by a hardware processor, a plurality of kernels and a graph including a plurality of nodes corresponding to the plurality of kernels. The graph defines a control flow and a data flow for the plurality of kernels. The kernels are implemented within different ones of a plurality of compute circuits coupled to the hardware processor. A set of buffers for performing a job for the graph is allocated based, at least in part, on the data flow specified by the graph. Different ones of the kernels as implemented in the compute circuits are invoked based on the control flow defined by the graph.
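A simplified sketch of the scheduling step, assuming a hypothetical graph object whose nodes carry kernel handles and whose data-flow edges carry buffer sizes; the invoke interface is an assumption for illustration.

```python
# Illustrative sketch: allocate buffers from the graph's data flow, then invoke
# kernels in an order consistent with the graph's control flow.
from collections import deque

def run_job(graph):
    """Assumed structure: graph.nodes is {node_id: kernel}, graph.data_edges is
    [(src, dst, nbytes)], and graph.control_edges is [(src, dst)]."""
    # Allocate one buffer per data-flow edge, sized from the edge annotation.
    buffers = {(src, dst): bytearray(nbytes) for src, dst, nbytes in graph.data_edges}

    # Order kernel invocations by the control-flow edges (topological order).
    indegree = {n: 0 for n in graph.nodes}
    successors = {n: [] for n in graph.nodes}
    for src, dst in graph.control_edges:
        indegree[dst] += 1
        successors[src].append(dst)
    ready = deque(n for n, d in indegree.items() if d == 0)
    while ready:
        node = ready.popleft()
        inputs = [buf for (s, d), buf in buffers.items() if d == node]
        outputs = [buf for (s, d), buf in buffers.items() if s == node]
        graph.nodes[node].invoke(inputs, outputs)   # dispatch to the compute circuit
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
```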
Abstract:
Scheduling work of a machine learning application includes instantiating kernel objects by a computer processor in response to input of kernel definitions. Each kernel object is of a kernel type indicating a compute circuit. The computer processor generates a graph in a memory. Each node represents a task and specifies an assignment of the task to one or more of the kernel objects, and each edge represents a data dependency. Task queues are created in the memory and assigned to queue tasks represented by the nodes. Kernel objects are assigned to the task queues, and the tasks are enqueued by threads executing the kernel objects, based on assignments of the kernel objects to the task queues and assignments of the tasks to the kernel objects. Tasks are dequeued by the threads, and the compute circuits are activated to initiate processing of the dequeued tasks.
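A condensed sketch of the queue-and-thread arrangement using Python's standard queue and threading modules; the kernel-object interface (a run method) and the task assignment format are assumptions for illustration.

```python
# Illustrative sketch: one task queue per kernel object, with a worker thread per
# kernel object that dequeues tasks and activates the underlying compute circuit.
import queue
import threading

def start_workers(kernel_objects, task_assignments):
    """kernel_objects: objects with a .run(task) method (assumed interface).
    task_assignments: iterable of (task, kernel_index) pairs in dependency order."""
    task_queues = [queue.Queue() for _ in kernel_objects]

    def worker(kernel, q):
        while True:
            task = q.get()           # dequeue a task assigned to this kernel object
            if task is None:         # sentinel: no more work for this queue
                break
            kernel.run(task)         # activate the compute circuit for the task
            q.task_done()

    threads = [threading.Thread(target=worker, args=(k, q), daemon=True)
               for k, q in zip(kernel_objects, task_queues)]
    for t in threads:
        t.start()

    # Enqueue tasks according to their assignment to kernel objects.
    for task, kernel_index in task_assignments:
        task_queues[kernel_index].put(task)
    return task_queues, threads
```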
Abstract:
Core processing and parameterization may include detecting, using a processor, a super parameter within a core and, responsive to the detecting, automatically creating, using the processor, a data structure within a memory element, the data structure having a hierarchy and including a parameter of the core. The data structure may be set as a value of the super parameter of the core.
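One way to picture the described data structure is as a nested mapping that is then set as the value of the detected super parameter; the dictionary layout and key names below are hypothetical.

```python
# Hypothetical illustration: build a hierarchical data structure for a detected
# super parameter and set it as that super parameter's value on the core.
def build_super_parameter_value(core):
    """core: a dict with "name" and "parameters" entries (assumed layout)."""
    hierarchy = {
        "core": core["name"],
        "parameters": {                    # nested parameters of the core
            name: {"value": p["value"], "children": p.get("children", {})}
            for name, p in core["parameters"].items()
        },
    }
    core["super_parameter"] = hierarchy    # set the structure as the value
    return hierarchy
```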
Abstract:
A method of processing a circuit design in a circuit design tool includes: identifying selection of a parameterized core to be instantiated in a description of the circuit design managed by the circuit design tool and configured for implementation in target hardware; processing a configuration file for the parameterized core to select a set of parameter values from a plurality of sets of parameter values dynamically based at least in part on the target hardware; creating an instance of the parameterized core in the circuit design having the selected set of parameter values; and implementing the circuit design for the target hardware.
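A minimal sketch of the dynamic selection step, assuming a JSON configuration file whose parameter sets are keyed by target device; the file layout and function names are illustrative, not the tool's format.

```python
# Illustrative sketch: choose a parameter set from the core's configuration file
# based on the target hardware, then create the core instance with those values.
import json

def select_parameter_values(config_path, target_device):
    with open(config_path) as f:
        config = json.load(f)
    # Assumed layout: config["parameter_sets"] maps a target device (or family)
    # to a set of parameter values, with a "default" fallback.
    sets = config["parameter_sets"]
    return sets.get(target_device, sets["default"])

def instantiate_core(design, core_name, config_path, target_device):
    values = select_parameter_values(config_path, target_device)
    design["instances"][core_name] = {"core": core_name, "parameters": values}
    return design["instances"][core_name]
```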
Abstract:
A user interface for a computer-aided design tool includes a display. The display includes a visualization of a processor system of a system-on-a-chip (SOC). The visualization includes a plurality of blocks and each block represents a component of the processor system. Each block visually indicates a configuration status of the component represented by the block.
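A small sketch of a possible display model behind such a visualization, where each block carries the component it represents and a configuration status that drives its visual cue; the types and status values are hypothetical.

```python
# Hypothetical display model: one block per processor-system component, with the
# block's visual state derived from the component's configuration status.
from dataclasses import dataclass
from enum import Enum

class ConfigStatus(Enum):
    UNCONFIGURED = "unconfigured"
    PARTIAL = "partially configured"
    CONFIGURED = "configured"

@dataclass
class Block:
    component: str            # e.g. "memory controller", "application processor"
    status: ConfigStatus

def render(blocks):
    # A real tool would color or badge each block; this sketch just prints status.
    for block in blocks:
        print(f"[{block.status.value:^22}] {block.component}")
```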
Abstract:
A design for a programmable integrated circuit (IC) is synthesized and includes an inference engine and a data transformer. A portion of the design including the data transformer is designated as a dynamic function exchange (DFX) module. The inference engine is excluded from the DFX module. The design is implemented, by placing and routing, such that the DFX module is confined to a defined physical area of the programmable IC. An abstract shell specifying boundary connections of the DFX module, as placed and routed, is generated for the design. A locked version of the design as placed and routed, with the DFX module removed, is generated. A different data transformer is then implemented as a further DFX module for the design using the abstract shell.
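A hypothetical sketch of the artifacts produced by this flow; the classes stand in for design-tool outputs and do not represent a real tool API.

```python
# Hypothetical artifacts: an abstract shell holding only the DFX module's boundary
# connections, and a locked static design with the DFX module removed.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class AbstractShell:
    boundary_connections: List[str]    # boundary of the placed-and-routed DFX module

@dataclass
class ImplementedDesign:
    static_region: str                 # inference engine and other static logic
    dfx_module: Optional[str]          # data transformer, or None once removed
    locked: bool = False

def freeze_static(design: ImplementedDesign, boundary: List[str]):
    """Produce the abstract shell and a locked copy of the design with the DFX
    module removed; a different data transformer can later be implemented against
    the shell without re-implementing the locked static portion."""
    shell = AbstractShell(boundary_connections=boundary)
    locked = ImplementedDesign(design.static_region, dfx_module=None, locked=True)
    return shell, locked
```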
Abstract:
Processing of a neural network specification includes gathering first layers of a neural network graph into groups of layers based on profiled compute times of the layers and equalized compute times between the groups. Each group is a subgraph of one or more of the layers of the neural network. The neural network graph is compiled into instructions for pipelined execution of the neural network graph by compute circuits. The compiling includes designating, for each first subgraph of the subgraphs having output activations that are input activations of a second subgraph of the subgraphs, operations of the first subgraph to be performed by a first compute circuit and operations of the second subgraph to be performed by a second compute circuit. The compute circuits are configured to execute the instructions.
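A rough sketch of the grouping step, assuming per-layer profiled compute times and a simple greedy partition that roughly equalizes the groups' total times; this is illustrative, not the claimed compiler.

```python
# Illustrative greedy partition of profiled layer times into a fixed number of
# groups (subgraphs) with roughly equal total compute time, so the groups can be
# pipelined across compute circuits.
def group_layers(layer_times, num_groups):
    """layer_times: ordered list of (layer_name, profiled_time) pairs."""
    total = sum(t for _, t in layer_times)
    target = total / num_groups            # ideal per-group compute time
    groups, current, current_time = [], [], 0.0
    for name, t in layer_times:
        current.append(name)
        current_time += t
        # Close the group once it reaches the target, keeping layers in graph order
        # so each group's output activations feed the next group (pipeline stages).
        if current_time >= target and len(groups) < num_groups - 1:
            groups.append(current)
            current, current_time = [], 0.0
    if current:
        groups.append(current)
    return groups   # group i is assigned to compute circuit i in the pipeline
```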
Abstract:
Implementing a circuit design may include, responsive to a user input selecting a design, executing an implementation script of the design using a processor. Executing the implementation script may generate instructions for generating a circuit design from the design. Responsive to the instructions and using the processor, cores of the design may be automatically instantiated and connected.
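A minimal sketch, assuming the implementation script yields a stream of simple instantiate/connect instructions that are then applied to build the design; the instruction format and the core names in the example are hypothetical.

```python
# Hypothetical instruction format emitted by an implementation script, and a small
# interpreter that instantiates and connects cores accordingly.
def apply_instructions(instructions):
    design = {"instances": {}, "connections": []}
    for op, *args in instructions:
        if op == "instantiate":
            core, name, params = args
            design["instances"][name] = {"core": core, "parameters": params}
        elif op == "connect":
            src_port, dst_port = args
            design["connections"].append((src_port, dst_port))
    return design

# Example instruction stream a script might emit (illustrative only):
netlist = apply_instructions([
    ("instantiate", "axi_dma", "dma0", {"data_width": 64}),
    ("instantiate", "axis_fifo", "fifo0", {"depth": 512}),
    ("connect", "dma0/m_axis", "fifo0/s_axis"),
])
```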
Abstract:
An exemplary method of implementing a circuit design for a programmable integrated circuit (IC) includes, on at least one programmed processor, performing operations including: generating a description of circuit components of the circuit design including a first portion of a circuit module that is independent of assignment of resources of the programmable IC; assigning a plurality of the resources of the programmable IC to a plurality of the circuit components, including determining at least one resource assignment for the circuit module; and generating a physical implementation of the circuit components for implementation in the programmable IC, including generating a second portion of the circuit module that is dependent on the at least one resource assignment, and combining the second portion of the circuit module with the first portion of the circuit module.
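A schematic sketch of the two-stage generation, assuming the resource-independent portion is emitted first and the resource-dependent portion (here, a placement constraint) is generated only after a resource has been assigned; the module, signal, and site names are made up.

```python
# Illustrative two-stage generation: a resource-independent module body plus a
# resource-dependent constraint produced only after resource assignment.
def generate_independent_portion(module_name):
    # Logic that does not depend on which device resources are assigned.
    return (f"module {module_name}_core(input clk, input d, output q);\n"
            "  // ... device-independent logic ...\n"
            "endmodule\n")

def generate_dependent_portion(module_name, resource_assignment):
    # Text that exists only once a concrete resource is known, e.g. an XDC-style
    # placement constraint for the assigned site (site value is hypothetical).
    site = resource_assignment["site"]
    return f"set_property LOC {site} [get_cells {module_name}_core]\n"

def generate_module(module_name, resource_assignment):
    # Combine the two portions of the circuit module (kept as separate artifacts).
    return {
        "rtl": generate_independent_portion(module_name),
        "constraints": generate_dependent_portion(module_name, resource_assignment),
    }
```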