-
Publication number: US20230205452A1
Publication date: 2023-06-29
Application number: US17646172
Application date: 2021-12-28
Applicant: Xilinx, Inc.
Inventor: Kristof Denolf, Jack S. Lo, Louis Coulon, Kornelis A. Vissers
IPC: G06F3/06
CPC classification number: G06F3/0656, G06F3/0604, G06F3/0679
Abstract: A circular buffer architecture includes a memory coupled to a producer circuit and a consumer circuit. The memory is configured to store objects. The memory can include memory banks. The number of the memory banks is less than a number of the objects. The circular buffer can include hardware locks configured to reserve selected ones of the memory banks for use by the producer circuit or the consumer circuit. The circular buffer can include a buffer controller coupled to the memory and configured to track a plurality of positions. The positions can include a consumer bank position, a consumer object position, a producer bank position, and a producer object position. The buffer controller is configured to allocate selected ones of the objects from the memory banks to the producer circuit and to the consumer circuit according to the tracked positions and using the hardware locks.
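The bookkeeping described in this abstract can be illustrated with a small software model. This is a minimal sketch, not the patented circuit: the class name `BankedCircularBuffer`, the fixed `objects_per_bank` layout, the four lock states, and the policy of handing over a bank only once it is completely filled or drained are all assumptions made for illustration. The abstract states only that there are fewer banks than objects, that locks reserve banks for the producer or consumer, and that four positions are tracked.

```python
class BankedCircularBuffer:
    """Software model of a circular buffer whose objects are spread over banks.

    Assumption: every bank holds the same number of objects, so the number
    of banks is smaller than the number of objects when objects_per_bank > 1.
    """

    def __init__(self, num_banks, objects_per_bank):
        self.num_banks = num_banks
        self.objects_per_bank = objects_per_bank
        self.data = [[None] * objects_per_bank for _ in range(num_banks)]
        # Stand-in for the hardware locks, one state per bank (hypothetical):
        # "empty" -> "producer" -> "full" -> "consumer" -> "empty"
        self.lock = ["empty"] * num_banks
        # The four tracked positions named in the abstract.
        self.producer_bank = 0
        self.producer_object = 0
        self.consumer_bank = 0
        self.consumer_object = 0

    def produce(self, value):
        """Write one object; return False if the producer's bank is unavailable."""
        bank = self.producer_bank
        if self.lock[bank] == "empty":
            self.lock[bank] = "producer"  # reserve the bank for the producer
        if self.lock[bank] != "producer":
            return False  # bank is still full or held by the consumer
        self.data[bank][self.producer_object] = value
        self.producer_object += 1
        if self.producer_object == self.objects_per_bank:
            self.lock[bank] = "full"  # hand the bank over to the consumer
            self.producer_object = 0
            self.producer_bank = (bank + 1) % self.num_banks
        return True

    def consume(self):
        """Read one object; return None if the consumer's bank is unavailable."""
        bank = self.consumer_bank
        if self.lock[bank] == "full":
            self.lock[bank] = "consumer"  # reserve the bank for the consumer
        if self.lock[bank] != "consumer":
            return None  # bank is empty or still being filled
        value = self.data[bank][self.consumer_object]
        self.consumer_object += 1
        if self.consumer_object == self.objects_per_bank:
            self.lock[bank] = "empty"  # hand the bank back to the producer
            self.consumer_object = 0
            self.consumer_bank = (bank + 1) % self.num_banks
        return value
```

With two banks of two objects each, the producer can write four objects before it stalls, and it can resume as soon as the consumer has drained the first bank.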
-
Publication number: US20240176981A1
Publication date: 2024-05-30
Application number: US18072012
Application date: 2022-11-30
Applicant: Xilinx, Inc.
Inventor: Alireza Khodamoradi, Kristof Denolf
IPC: G06N3/04
CPC classification number: G06N3/04
Abstract: In pruning weights from a neural network (NN), a design tool selects a dt-ds pair from a plurality of dt-ds pairs supported by a target device. Each dt-ds pair specifies a data type, dt, and an associated circuit structure, ds, that is configurable to compute d×s operations in parallel on a set of input activations and a matrix of weights of the data type, d is a number of rows in a sub-matrix of the matrix of weights, s is a number of columns in the sub-matrix, and d×s≥1. The design tool selects as pruned weights, one or more subsets of the weights, based at least on each subset of the one or more subsets including d×s weights in the matrix of weights of the layer. If performance of the pruned NN model is satisfactory, the NN is compiled into an execution graph and configuration data.
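Pruning in d×s subsets, as this abstract describes, can be sketched as follows. The function name `prune_blocks`, the non-overlapping block grid, and the selection criterion (zeroing the blocks with the smallest sum of absolute values) are assumptions for illustration. The abstract says only that each pruned subset contains d×s weights. It does not say how the subsets are ranked.

```python
def prune_blocks(weights, d, s, num_blocks_to_prune):
    """Zero out whole d x s blocks of a weight matrix (list of rows).

    Assumption: blocks are ranked by the sum of absolute weight values,
    and the lowest-ranked blocks are pruned first.
    """
    rows, cols = len(weights), len(weights[0])
    if rows % d != 0 or cols % s != 0:
        raise ValueError("matrix shape must be a multiple of the d x s block")

    # Score every non-overlapping d x s block.
    scores = []
    for r in range(0, rows, d):
        for c in range(0, cols, s):
            score = sum(
                abs(weights[i][j])
                for i in range(r, r + d)
                for j in range(c, c + s)
            )
            scores.append((score, r, c))
    scores.sort()  # smallest magnitude first; ties broken by position

    # Copy the matrix and zero the selected blocks.
    pruned = [row[:] for row in weights]
    for _, r, c in scores[:num_blocks_to_prune]:
        for i in range(r, r + d):
            for j in range(c, c + s):
                pruned[i][j] = 0.0
    return pruned
```

Because every pruned group matches the shape of one parallel compute structure, the hardware can skip a whole d×s operation rather than individual scattered weights.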
-
Publication number: US11676004B2
Publication date: 2023-06-13
Application number: US15677311
Application date: 2017-08-15
Applicant: Xilinx, Inc.
Inventor: Kristof Denolf, Kornelis A. Vissers
Abstract: An example method of optimizing a neural network having a plurality of layers includes: obtaining an architecture constraint for circuitry of an inference platform that implements the neural network; training the neural network on a training platform to generate network parameters and feature maps for the plurality of layers; and constraining the network parameters, the feature maps, or both based on the architecture constraint.
-
Publication number: US12271818B1
Publication date: 2025-04-08
Application number: US17330048
Application date: 2021-05-25
Applicant: XILINX, INC.
Inventor: Kristof Denolf, Alireza Khodamoradi, Kornelis A. Vissers
Abstract: Embodiments herein describe a learnable transform block disposed before, or in between, the neural network layers to transform received data into a more computational-friendly domain while preserving discriminative features required for the neural network to generate accurate results. In one embodiment, during a training phase, an AI system learns parameters for the transform block that are then used during the inference phase to transform received data into the computational-friendly domain that has a reduced size input. The transformed data may require less compute resources or less memory usage to process by the underlying hardware device that hosts the neural network.
-
Publication number: US11327677B1
Publication date: 2022-05-10
Application number: US17019454
Application date: 2020-09-14
Applicant: Xilinx, Inc.
Inventor: Kristof Denolf, Jack S. Lo, Kornelis A. Vissers
IPC: G06F3/06
Abstract: An integrated circuit (IC) can include a decomposer data mover circuit configured to read sub-arrays from array data stored in a source memory; generate metadata headers for the sub-arrays, wherein each metadata header includes location information indicating location of a corresponding sub-array within the array data; create data tiles, wherein each data tile includes a sub-array and a corresponding metadata header; and output the data tiles to compute circuitry within the IC. The IC can include a composer data mover circuit configured to receive processed versions of the data tiles from the compute circuitry; extract valid data regions from the processed versions of the data tiles; and write the valid data regions to a destination memory based on the location information from the metadata headers of the processed versions of the data tiles.
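The decomposer and composer roles in this abstract can be modeled in a few lines of software. This is a rough sketch under stated assumptions: the function names `decompose` and `compose`, the dictionary header with `row`, `col`, `valid_h`, and `valid_w` fields, and the padding of edge tiles to a fixed size are hypothetical. The abstract specifies only that each header carries location information and that the composer extracts valid data regions and writes them back using that information.

```python
def decompose(array, tile_h, tile_w, pad=0):
    """Split a 2-D array (list of rows) into fixed-size tiles with headers.

    Assumption: edge tiles are padded to tile_h x tile_w, and the header
    records where the tile came from and how much of it is valid data.
    """
    rows, cols = len(array), len(array[0])
    tiles = []
    for r in range(0, rows, tile_h):
        for c in range(0, cols, tile_w):
            valid_h = min(tile_h, rows - r)
            valid_w = min(tile_w, cols - c)
            data = [
                [
                    array[r + i][c + j] if i < valid_h and j < valid_w else pad
                    for j in range(tile_w)
                ]
                for i in range(tile_h)
            ]
            header = {"row": r, "col": c, "valid_h": valid_h, "valid_w": valid_w}
            tiles.append((header, data))
    return tiles


def compose(tiles, rows, cols):
    """Write the valid region of each tile back to its recorded location."""
    out = [[None] * cols for _ in range(rows)]
    for header, data in tiles:
        for i in range(header["valid_h"]):
            for j in range(header["valid_w"]):
                out[header["row"] + i][header["col"] + j] = data[i][j]
    return out


# Usage: tile a 3 x 3 array, double every element as a stand-in for the
# compute circuitry, then reassemble the result.
source = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
tiles = decompose(source, 2, 2)
processed = [(h, [[2 * v for v in row] for row in d]) for h, d in tiles]
result = compose(processed, 3, 3)
```

Since each tile carries its own location, the compute stage can process tiles in any order, and the composer still places every valid region correctly.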
-
Publication number: US20190057305A1
Publication date: 2019-02-21
Application number: US15677311
Application date: 2017-08-15
Applicant: Xilinx, Inc.
Inventor: Kristof Denolf, Kornelis A. Vissers
Abstract: An example method of optimizing a neural network having a plurality of layers includes: obtaining an architecture constraint for circuitry of an inference platform that implements the neural network; training the neural network on a training platform to generate network parameters and feature maps for the plurality of layers; and constraining the network parameters, the feature maps, or both based on the architecture constraint.
-
Publication number: US12067484B2
Publication date: 2024-08-20
Application number: US16449264
Application date: 2019-06-21
Applicant: Xilinx, Inc.
Inventor: Yaman Umuroglu, Nicholas Fraser, Michaela Blott, Kristof Denolf, Kornelis A. Vissers
Abstract: An example method of training a neural network includes defining hardware building blocks (HBBs), neuron equivalents (NEQs), and conversion procedures from NEQs to HBBs; defining the neural network using the NEQs in a machine learning framework; training the neural network on a training platform; and converting the neural network as trained into a netlist of HBBs using the conversion procedures to convert the NEQs in the neural network to the HBBs of the netlist.
-
Publication number: US11954359B2
Publication date: 2024-04-09
Application number: US17646172
Application date: 2021-12-28
Applicant: Xilinx, Inc.
Inventor: Kristof Denolf, Jack S. Lo, Louis Coulon, Kornelis A. Vissers
IPC: G06F3/06
CPC classification number: G06F3/0656, G06F3/0604, G06F3/0679
Abstract: A circular buffer architecture includes a memory coupled to a producer circuit and a consumer circuit. The memory is configured to store objects. The memory can include memory banks. The number of the memory banks is less than a number of the objects. The circular buffer can include hardware locks configured to reserve selected ones of the memory banks for use by the producer circuit or the consumer circuit. The circular buffer can include a buffer controller coupled to the memory and configured to track a plurality of positions. The positions can include a consumer bank position, a consumer object position, a producer bank position, and a producer object position. The buffer controller is configured to allocate selected ones of the objects from the memory banks to the producer circuit and to the consumer circuit according to the tracked positions and using the hardware locks.