CIRCULAR BUFFER ARCHITECTURE USING LOCAL MEMORIES WITH LIMITED RESOURCES

    Publication No.: US20230205452A1

    Publication Date: 2023-06-29

    Application No.: US17646172

    Application Date: 2021-12-28

    Applicant: Xilinx, Inc.

    CPC classification number: G06F3/0656 G06F3/0604 G06F3/0679

    Abstract: A circular buffer architecture includes a memory coupled to a producer circuit and a consumer circuit. The memory is configured to store objects. The memory can include memory banks. The number of the memory banks is less than a number of the objects. The circular buffer can include hardware locks configured to reserve selected ones of the memory banks for use by the producer circuit or the consumer circuit. The circular buffer can include a buffer controller coupled to the memory and configured to track a plurality of positions. The positions can include a consumer bank position, a consumer object position, a producer bank position, and a producer object position. The buffer controller is configured to allocate selected ones of the objects from the memory banks to the producer circuit and to the consumer circuit according to the tracked positions and using the hardware locks.
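The abstract's mechanism, fewer physical banks than logical objects with per-bank locks and four tracked positions, can be sketched as follows. This is an illustrative model only; the class and field names are assumptions, not taken from the patent claims.

```python
import threading

class CircularBufferController:
    """Sketch: a circular buffer whose number of memory banks is smaller
    than the number of logical objects. Per-bank locks reserve a bank for
    the producer or the consumer; the controller tracks producer/consumer
    bank positions and object positions, as the abstract describes."""

    def __init__(self, num_banks, num_objects):
        assert num_banks < num_objects  # key property from the abstract
        self.num_banks = num_banks
        self.num_objects = num_objects
        self.banks = [None] * num_banks  # physical bank storage
        self.bank_locks = [threading.Lock() for _ in range(num_banks)]
        # The four tracked positions named in the abstract:
        self.producer_bank = 0   # producer bank position
        self.producer_obj = 0    # producer object position
        self.consumer_bank = 0   # consumer bank position
        self.consumer_obj = 0    # consumer object position

    def produce(self, value):
        # Reserve the producer's current bank, then advance both positions.
        with self.bank_locks[self.producer_bank]:
            self.banks[self.producer_bank] = (self.producer_obj, value)
        self.producer_bank = (self.producer_bank + 1) % self.num_banks
        self.producer_obj = (self.producer_obj + 1) % self.num_objects

    def consume(self):
        # Reserve the consumer's current bank, read, then advance.
        with self.bank_locks[self.consumer_bank]:
            obj_id, value = self.banks[self.consumer_bank]
        self.consumer_bank = (self.consumer_bank + 1) % self.num_banks
        self.consumer_obj = (self.consumer_obj + 1) % self.num_objects
        return obj_id, value
```

Because object positions wrap modulo the (larger) object count while bank positions wrap modulo the bank count, the same physical bank serves multiple logical objects over time.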

    INSTRUCTION PRUNING FOR NEURAL NETWORKS
    Publication No.: US20240176981A1

    Publication Date: 2024-05-30

    Application No.: US18072012

    Application Date: 2022-11-30

    Applicant: Xilinx, Inc.

    CPC classification number: G06N3/04

    Abstract: In pruning weights from a neural network (NN), a design tool selects a dt-ds pair from a plurality of dt-ds pairs supported by a target device. Each dt-ds pair specifies a data type, dt, and an associated circuit structure, ds, that is configurable to compute d×s operations in parallel on a set of input activations and a matrix of weights of the data type, d is a number of rows in a sub-matrix of the matrix of weights, s is a number of columns in the sub-matrix, and d×s≥1. The design tool selects as pruned weights, one or more subsets of the weights, based at least on each subset of the one or more subsets including d×s weights in the matrix of weights of the layer. If performance of the pruned NN model is satisfactory, the NN is compiled into an execution graph and configuration data.
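The block-structured selection described above, choosing pruned subsets that each contain d×s weights so they map onto a hardware structure computing d×s operations in parallel, can be sketched as below. The magnitude-based scoring heuristic is an assumption for illustration; the abstract only requires that each pruned subset contain d×s weights.

```python
def prune_dxs_blocks(weights, d, s, num_blocks):
    """Sketch: partition a weight matrix into d-by-s sub-matrices
    (matching a circuit structure that computes d*s operations in
    parallel), score each block by the sum of absolute weights, and
    zero out the `num_blocks` lowest-scoring blocks."""
    rows, cols = len(weights), len(weights[0])
    assert rows % d == 0 and cols % s == 0
    blocks = []
    for r in range(0, rows, d):
        for c in range(0, cols, s):
            score = sum(abs(weights[r + i][c + j])
                        for i in range(d) for j in range(s))
            blocks.append((score, r, c))
    blocks.sort()  # lowest-magnitude blocks are pruned first
    pruned = [row[:] for row in weights]  # leave the input untouched
    for _, r, c in blocks[:num_blocks]:
        for i in range(d):
            for j in range(s):
                pruned[r + i][c + j] = 0
    return pruned
```

Zeroing whole d×s blocks (rather than individual weights) keeps the surviving weights aligned with the parallel compute structure of the target device.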

    Implementation-tuned architecture for neural network processing in a learned transform domain

    Publication No.: US12271818B1

    Publication Date: 2025-04-08

    Application No.: US17330048

    Application Date: 2021-05-25

    Applicant: XILINX, INC.

    Abstract: Embodiments herein describe a learnable transform block disposed before, or in between, the neural network layers to transform received data into a more computational-friendly domain while preserving discriminative features required for the neural network to generate accurate results. In one embodiment, during a training phase, an AI system learns parameters for the transform block that are then used during the inference phase to transform received data into the computational-friendly domain that has a reduced size input. The transformed data may require less compute resources or less memory usage to process by the underlying hardware device that hosts the neural network.
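A minimal sketch of such a transform block, under the assumption that it is a learned linear map from an input of length n to a reduced-size input of length m < n (the abstract does not fix the transform's form; the function name and matrix layout are illustrative):

```python
def apply_transform(x, T):
    """Sketch: a transform block applies a learned m-by-n matrix T to an
    input vector x of length n, producing a reduced-size vector of
    length m < n for the downstream neural network layers. T's entries
    would be learned during the training phase; here T is supplied
    directly for illustration."""
    n = len(x)
    m = len(T)
    assert m < n and all(len(row) == n for row in T)
    return [sum(T[i][j] * x[j] for j in range(n)) for i in range(m)]
```

The reduced-size output is what lets the underlying hardware spend fewer compute and memory resources per inference.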

    Data mover circuitry for N-dimensional data in an integrated circuit

    Publication No.: US11327677B1

    Publication Date: 2022-05-10

    Application No.: US17019454

    Application Date: 2020-09-14

    Applicant: Xilinx, Inc.

    Abstract: An integrated circuit (IC) can include a decomposer data mover circuit configured to read sub-arrays from array data stored in a source memory; generate metadata headers for the sub-arrays, wherein each metadata header includes location information indicating location of a corresponding sub-array within the array data; create data tiles, wherein each data tile includes a sub-array and a corresponding metadata header; and output the data tiles to compute circuitry within the IC. The IC can include a composer data mover circuit configured to receive processed versions of the data tiles from the compute circuitry; extract valid data regions from the processed versions of the data tiles; and write the valid data regions to a destination memory based on the location information from the metadata headers of the processed versions of the data tiles.
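The decomposer/composer flow above can be modeled in software as follows, for 2-D array data. The dictionary-based tile layout and header field names are assumptions for illustration, not the circuit's actual wire format.

```python
def decompose(array, tile_rows, tile_cols):
    """Sketch of the decomposer data mover: cut a 2-D array into data
    tiles, each pairing a sub-array with a metadata header recording the
    sub-array's location within the source array."""
    tiles = []
    for r in range(0, len(array), tile_rows):
        for c in range(0, len(array[0]), tile_cols):
            sub = [row[c:c + tile_cols] for row in array[r:r + tile_rows]]
            tiles.append({"header": {"row": r, "col": c}, "data": sub})
    return tiles

def compose(tiles, out_rows, out_cols):
    """Sketch of the composer data mover: write each (processed) tile's
    data region back to the destination array at the location carried in
    its metadata header."""
    out = [[None] * out_cols for _ in range(out_rows)]
    for tile in tiles:
        r0, c0 = tile["header"]["row"], tile["header"]["col"]
        for i, row in enumerate(tile["data"]):
            for j, v in enumerate(row):
                out[r0 + i][c0 + j] = v
    return out
```

Because each tile carries its own location header, the composer can reassemble tiles in any order the compute circuitry returns them.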

    ARCHITECTURE OPTIMIZED TRAINING OF NEURAL NETWORKS

    Publication No.: US20190057305A1

    Publication Date: 2019-02-21

    Application No.: US15677311

    Application Date: 2017-08-15

    Applicant: Xilinx, Inc.

    Abstract: An example method of optimizing a neural network having a plurality of layers includes: obtaining an architecture constraint for circuitry of an inference platform that implements the neural network; training the neural network on a training platform to generate network parameters and feature maps for the plurality of layers; and constraining the network parameters, the feature maps, or both based on the architecture constraint.
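The constraining step can be illustrated with one possible architecture constraint, a fixed-point number format on the inference circuitry, so trained parameters are snapped to representable values. The abstract leaves the concrete constraint open; fixed-point quantization here is an assumed example.

```python
def constrain_params(params, frac_bits):
    """Sketch: constrain trained network parameters to an assumed
    architecture constraint of the inference platform, namely a
    fixed-point format with `frac_bits` fractional bits. Each parameter
    is rounded to the nearest value the hardware can represent."""
    scale = 2 ** frac_bits
    return [round(p * scale) / scale for p in params]
```

The same idea extends to feature maps: values produced during training are projected onto whatever numeric formats or layouts the inference circuitry supports.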

    Circular buffer architecture using local memories with limited resources

    Publication No.: US11954359B2

    Publication Date: 2024-04-09

    Application No.: US17646172

    Application Date: 2021-12-28

    Applicant: Xilinx, Inc.

    CPC classification number: G06F3/0656 G06F3/0604 G06F3/0679

    Abstract: A circular buffer architecture includes a memory coupled to a producer circuit and a consumer circuit. The memory is configured to store objects. The memory can include memory banks. The number of the memory banks is less than a number of the objects. The circular buffer can include hardware locks configured to reserve selected ones of the memory banks for use by the producer circuit or the consumer circuit. The circular buffer can include a buffer controller coupled to the memory and configured to track a plurality of positions. The positions can include a consumer bank position, a consumer object position, a producer bank position, and a producer object position. The buffer controller is configured to allocate selected ones of the objects from the memory banks to the producer circuit and to the consumer circuit according to the tracked positions and using the hardware locks.
