Supporting access to accelerators on a programmable integrated circuit by multiple host processes

    Publication Number: US11086815B1

    Publication Date: 2021-08-10

    Application Number: US16384624

    Filing Date: 2019-04-15

    Applicant: Xilinx, Inc.

    Abstract: Supporting multiple clients on a single programmable integrated circuit (IC) can include implementing a first image within the programmable IC in response to a first request for processing to be performed by the programmable IC, where the first request is from a first process executing in a host data processing system coupled to the programmable IC. While the programmable IC still implements the first image, a processor of the host data processing system receives a second request for processing to be performed on the programmable IC from a second, different process executing in the host data processing system. The processor compares a second image specified by the second request to the first image and, in response to determining from the comparison that the second image matches the first image, grants the second request for processing to be performed by the programmable IC.
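
    The Python sketch below illustrates, using invented names, the kind of host-side arbitration the abstract describes: a second process is granted the accelerator only when the image it requests matches the image already implemented on the programmable IC. The class, its methods, and the digest-based comparison are illustrative assumptions, not the patented implementation.

        # Hypothetical host-side arbiter: grant a request only when the requested
        # image matches the image the programmable IC currently implements.
        import hashlib


        class AcceleratorArbiter:
            def __init__(self):
                self.loaded_image_digest = None  # digest of the image on the IC

            def request(self, process_id: str, image_bytes: bytes) -> bool:
                digest = hashlib.sha256(image_bytes).hexdigest()
                if self.loaded_image_digest is None:
                    # First request: program the IC with the requested image.
                    self.loaded_image_digest = digest
                    print(f"{process_id}: image loaded, request granted")
                    return True
                if digest == self.loaded_image_digest:
                    # Same image already implemented: share the accelerator.
                    print(f"{process_id}: image matches, request granted")
                    return True
                # Different image while another client is active: deny.
                print(f"{process_id}: image mismatch, request denied")
                return False


        arbiter = AcceleratorArbiter()
        arbiter.request("proc-A", b"bitstream-v1")  # granted, programs the IC
        arbiter.request("proc-B", b"bitstream-v1")  # granted, same image
        arbiter.request("proc-C", b"bitstream-v2")  # denied, different image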

    MACHINE LEARNING RUNTIME LIBRARY FOR NEURAL NETWORK ACCELERATION

    Publication Number: US20190114533A1

    Publication Date: 2019-04-18

    Application Number: US15785679

    Filing Date: 2017-10-17

    Applicant: Xilinx, Inc.

    Abstract: Embodiments herein describe techniques for interfacing a neural network application with a neural network accelerator using a library. The neural network application may execute on a host computing system while the neural network accelerator executes on a massively parallel hardware system, e.g., an FPGA. The library operates a pipeline for submitting the tasks received from the neural network application to the neural network accelerator. In one embodiment, the pipeline includes a pre-processing stage, an FPGA execution stage, and a post-processing stage, each of which corresponds to a different thread. When receiving a task from the neural network application, the library generates a packet that includes the information required for the different stages in the pipeline to perform the task. Because the stages correspond to different threads, the library can process multiple packets in parallel, which can increase the utilization of the neural network accelerator on the hardware system.
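
    A minimal Python sketch of the pipelining idea, with all names invented: three stages run in separate threads connected by queues, and each task travels through them as a packet, so several packets can be in flight at once. The stage bodies are placeholders, not the library's actual API.

        # Illustrative three-stage pipeline: pre-process, "FPGA" execute, post-process.
        import queue
        import threading

        pre_q, exec_q, post_q = queue.Queue(), queue.Queue(), queue.Queue()
        STOP = object()  # sentinel used to shut the pipeline down


        def pre_process():
            while (packet := pre_q.get()) is not STOP:
                packet["prepared"] = True               # e.g., format the input
                exec_q.put(packet)
            exec_q.put(STOP)


        def fpga_execute():
            while (packet := exec_q.get()) is not STOP:
                packet["result"] = sum(packet["data"])  # stand-in for the accelerator call
                post_q.put(packet)
            post_q.put(STOP)


        def post_process():
            while (packet := post_q.get()) is not STOP:
                print("task", packet["task_id"], "->", packet["result"])


        threads = [threading.Thread(target=f) for f in (pre_process, fpga_execute, post_process)]
        for t in threads:
            t.start()

        for i in range(3):
            pre_q.put({"task_id": i, "data": [i, i + 1, i + 2]})
        pre_q.put(STOP)

        for t in threads:
            t.join()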

    PARALLEL COMPUTE OFFLOAD TO DATABASE ACCELERATOR

    Publication Number: US20180373760A1

    Publication Date: 2018-12-27

    Application Number: US15632082

    Filing Date: 2017-06-23

    Applicant: Xilinx, Inc.

    Abstract: Embodiments herein describe techniques for preparing and executing tasks related to a database query in a database accelerator. In one embodiment, the database accelerator is separate from a host CPU. A database management system (DBMS) can offload tasks corresponding to a database query to the database accelerator. The DBMS can request data from the database relevant to the query and then convert that data into one or more data blocks that are suitable for processing by the database accelerator. In one embodiment, the database accelerator contains individual hardware processing units (PUs) that can process data in parallel or concurrently. In order to process the data concurrently, the data block includes individual PU data blocks that are each intended for a respective PU in the database accelerator.
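
    The sketch below models, with invented field names, the data layout the abstract describes: a data block that carries one sub-block per processing unit so that the PUs can work on their slices of the query data concurrently.

        # Hypothetical per-PU partitioning of query data into a single data block.
        from dataclasses import dataclass
        from typing import List


        @dataclass
        class PUDataBlock:
            pu_id: int
            rows: List[tuple]      # rows formatted for one processing unit


        @dataclass
        class DataBlock:
            query_id: int
            pu_blocks: List[PUDataBlock]


        def build_data_block(query_id: int, rows: List[tuple], num_pus: int) -> DataBlock:
            """Round-robin the query's rows into one sub-block per PU."""
            buckets = [[] for _ in range(num_pus)]
            for i, row in enumerate(rows):
                buckets[i % num_pus].append(row)
            return DataBlock(query_id,
                             [PUDataBlock(i, b) for i, b in enumerate(buckets)])


        block = build_data_block(7, [(1, "a"), (2, "b"), (3, "c"), (4, "d")], num_pus=2)
        for pu_block in block.pu_blocks:
            print("PU", pu_block.pu_id, "processes", pu_block.rows)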

    HETEROGENEOUS MULTIPROCESSOR PLATFORM TARGETING PROGRAMMABLE INTEGRATED CIRCUITS

    Publication Number: US20160132441A1

    Publication Date: 2016-05-12

    Application Number: US14539985

    Filing Date: 2014-11-12

    Applicant: Xilinx, Inc.

    CPC classification number: G06F13/1689 G06F13/28

    Abstract: An integrated circuit (IC) includes a first region that is static and provides an interface between the IC and a host processor. The first region includes a first interconnect circuit block having a first master interface and a second interconnect circuit block having a first slave interface. The IC includes a second region coupled to the first region. The second region implements a kernel of a heterogeneous, multiprocessor design and includes a slave interface coupled to the first master interface of the first interconnect circuit block and configured to receive commands from the host processor. The second region also includes a master interface coupled to the first slave interface of the second interconnect circuit block, wherein the master interface of the second region is a master for a memory controller.
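
    The connection topology in the abstract can be pictured with the small, purely illustrative Python model below: the static region's first interconnect drives the kernel's command (slave) interface, while the kernel's master interface reaches the memory controller through the static region's second interconnect. All endpoint names are invented.

        # Each pair is (master endpoint, slave endpoint it drives); names are invented.
        CONNECTIONS = [
            # Host commands reach the kernel through the first interconnect.
            ("host_processor", "static.interconnect1.slave"),
            ("static.interconnect1.master", "kernel.slave_if"),
            # The kernel masters memory traffic through the second interconnect,
            # which fronts the memory controller.
            ("kernel.master_if", "static.interconnect2.slave"),
            ("static.interconnect2.master", "memory_controller"),
        ]


        def downstream(endpoint: str) -> list:
            """Return the slave endpoints driven directly by a master endpoint."""
            return [slave for master, slave in CONNECTIONS if master == endpoint]


        print(downstream("static.interconnect1.master"))  # ['kernel.slave_if']
        print(downstream("kernel.master_if"))             # ['static.interconnect2.slave']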

    Unified container for hardware and software binaries

    Publication Number: US11720422B1

    Publication Date: 2023-08-08

    Application Number: US17198887

    Filing Date: 2021-03-11

    Applicant: Xilinx, Inc.

    CPC classification number: G06F9/545 G06F8/44 G06F21/53 G06F21/572 G06F8/65

    Abstract: A unified container file can be selected using computer hardware. The unified container file can include a plurality of files embedded therein used to configure a programmable integrated circuit (IC). The plurality of files can include a first partial configuration bitstream and a second partial configuration bitstream. The unified container file also includes metadata specifying a defined relationship between the first partial configuration bitstream and the second partial configuration bitstream for programming the programmable IC. The defined relationship can be determined using computer hardware by reading the metadata from the unified container file. The programmable IC can be configured, using the computer hardware, based on the defined relationship specified by the metadata using the first partial configuration bitstream and the second partial configuration bitstream.
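
    A minimal sketch, assuming a zip-based layout with invented file and field names, of how such a container might be consumed: the metadata is read first, and the partial configuration bitstreams are then applied in the relationship (here, an ordering) that the metadata defines. This is not the actual container format.

        # Build a demo container, then read its metadata and "program" the bitstreams.
        import json
        import zipfile


        def make_demo_container(path: str):
            with zipfile.ZipFile(path, "w") as container:
                container.writestr("base.bit", b"\x00" * 16)     # first partial bitstream
                container.writestr("kernel.bit", b"\x01" * 16)   # second partial bitstream
                container.writestr("metadata.json",
                                   json.dumps({"program_order": ["base.bit", "kernel.bit"]}))


        def program_partial(name: str, bitstream: bytes):
            # Placeholder for the device-programming step.
            print(f"programming {name} ({len(bitstream)} bytes)")


        def configure_from_container(path: str):
            with zipfile.ZipFile(path) as container:
                metadata = json.loads(container.read("metadata.json"))
                for name in metadata["program_order"]:
                    program_partial(name, container.read(name))


        make_demo_container("demo.container")
        configure_from_container("demo.container")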

    TRANSPARENT AND REMOTE KERNEL EXECUTION IN A HETEROGENEOUS COMPUTING SYSTEM

    Publication Number: US20230229497A1

    Publication Date: 2023-07-20

    Application Number: US17648172

    Filing Date: 2022-01-17

    Applicant: Xilinx, Inc.

    Abstract: Remote kernel execution in a heterogeneous computing system can include executing, using a device processor of a device communicatively linked to a host processor, a device runtime, and receiving, from the host processor within a hardware submission queue of the device, a command. The command requests execution of a software kernel and specifies a descriptor stored in a region of a memory of the device shared with the host processor. In response to receiving the command, the device runtime, as executed by the device processor, invokes a runner program associated with the software kernel. The runner program can map a physical address of the descriptor to a virtual memory address corresponding to the descriptor that is usable by the software kernel. The runner program can execute the software kernel. The software kernel can access data specified by the descriptor using the virtual memory address as provided by the runner program.
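
    A conceptual Python sketch of the device-side flow, with the runner table, address map, and every name invented for illustration: the device runtime pops a command from a submission queue, invokes the runner associated with the named software kernel, and the runner translates the descriptor's physical address to a virtual address before executing the kernel against the descriptor.

        # Toy model of command handling by a device runtime; not the actual runtime.
        import queue

        SHARED_MEMORY = {0x1000: {"input": [1, 2, 3], "output": None}}  # paddr -> descriptor
        PHYS_TO_VIRT = {0x1000: "desc_0"}                               # stand-in MMU mapping
        VIRT_SPACE = {"desc_0": SHARED_MEMORY[0x1000]}


        def vector_sum_kernel(descriptor):
            descriptor["output"] = sum(descriptor["input"])


        def make_runner(kernel):
            """A 'runner program' for one software kernel: translate the descriptor's
            physical address to a virtual address, then execute the kernel on it."""
            def runner(descriptor_paddr):
                virt_addr = PHYS_TO_VIRT[descriptor_paddr]   # address translation
                kernel(VIRT_SPACE[virt_addr])                # execute the software kernel
                return VIRT_SPACE[virt_addr]["output"]
            return runner


        RUNNERS = {"vector_sum": make_runner(vector_sum_kernel)}   # one runner per kernel


        def device_runtime(submission_queue: queue.Queue):
            command = submission_queue.get()                 # popped from the submission queue
            runner = RUNNERS[command["kernel"]]              # runner associated with the kernel
            print("result:", runner(command["descriptor_paddr"]))


        hsq = queue.Queue()
        hsq.put({"kernel": "vector_sum", "descriptor_paddr": 0x1000})
        device_runtime(hsq)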
