-
Publication No.: US11379389B1
Publication Date: 2022-07-05
Application No.: US15944179
Filing Date: 2018-04-03
Applicant: Xilinx, Inc.
Inventor: Juan J. Noguera Serra , Goran H K Bilski , Baris Ozgul , Jan Langer
IPC: G06F13/16 , G06F12/084 , G06F9/54 , G11C8/16 , G06F15/167
Abstract: Examples herein describe techniques for transferring data between data processing engines in an array using shared memory. In one embodiment, certain engines in the array have connections to the memory in neighboring engines. For example, each engine may have its own assigned memory module which can be accessed directly (e.g., without using a streaming or memory mapped interconnect). In addition, the surrounding engines (referred to herein as the neighboring engines) may also include direct connections to the memory module. Using these direct connections, the cores can load and/or store data in the neighboring memory modules.
-
Publication No.: US11296707B1
Publication Date: 2022-04-05
Application No.: US17196574
Filing Date: 2021-03-09
Applicant: Xilinx, Inc.
Inventor: Javier Cabezas Rodriguez , Juan J. Noguera Serra , David Clarke , Sneha Bhalchandra Date , Tim Tuan , Peter McColgan , Jan Langer , Baris Ozgul
IPC: H03K19/1776 , H03K19/17704 , H03K19/17768 , H03K19/17758 , H03K19/17796
Abstract: An integrated circuit can include a data processing engine (DPE) array having a plurality of tiles. The plurality of tiles can include a plurality of DPE tiles, wherein each DPE tile includes a stream switch, a core configured to perform operations, and a memory module. The plurality of tiles can include a plurality of memory tiles, wherein each memory tile includes a stream switch, a direct memory access (DMA) engine, and a random-access memory. The DMA engine of each memory tile may be configured to access the random-access memory within the same memory tile and the random-access memory of at least one other memory tile. Selected ones of the plurality of DPE tiles may be configured to access selected ones of the plurality of memory tiles via the stream switches.
-
Publication No.: US10747531B1
Publication Date: 2020-08-18
Application No.: US15944315
Filing Date: 2018-04-03
Applicant: Xilinx, Inc.
Inventor: Jan Langer , Baris Ozgul , Juan J. Noguera Serra , Goran HK Bilski , Tim Tuan
Abstract: An example core for a data processing engine (DPE) includes a register file, a processor, coupled to the register file. The processor includes a multiply-accumulate (MAC) circuit, and permute circuitry coupled between the register file and the MAC circuit, the permute circuitry configured to concatenate at least one pair of outputs of the register file to provide at least one input to the MAC circuit. The core further includes an instruction decoder, coupled to the processor, configured to decode a very large instruction word (VLIW) to set a plurality of parameters of the processor, the plurality of parameters including first parameters of the permute circuitry and second parameters of the MAC circuit.
-
Publication No.: US20190303033A1
Publication Date: 2019-10-03
Application No.: US15944160
Filing Date: 2018-04-03
Applicant: Xilinx, Inc.
Inventor: Juan J. Noguera Serra , Goran HK Bilski , Jan Langer , Baris Ozgul , Tim Tuan , Richard L. Walke , Ralph D. Wittig , Kornelis A. Vissers , Christopher H. Dick
IPC: G06F3/06
Abstract: A device may include a plurality of data processing engines. Each of the data processing engines may include a core and a memory module. The plurality of data processing engines may be organized in a plurality of rows. Each core may be configured to communicate with other neighboring data processing engines of the plurality of data processing engines by shared access to the memory modules of the neighboring data processing engines.
-
Publication No.: US20230131698A1
Publication Date: 2023-04-27
Application No.: US18145810
Filing Date: 2022-12-22
Applicant: Xilinx, Inc.
Inventor: Juan J. Noguera Serra , Goran HK Bilski , Jan Langer , Baris Ozgul , Richard L. Walke , Ralph D. Wittig , Kornelis A. Vissers , Tim Tuan , David Clarke
IPC: G06F3/06 , G06F15/78 , G06F15/173
Abstract: A device includes a data processing engine array having a plurality of data processing engines organized in a grid having a plurality of rows and a plurality of columns. Each data processing engine includes a core, a memory module including a memory and a direct memory access engine. Each data processing engine includes a stream switch connected to the core, the direct memory access engine, and the stream switch of one or more adjacent data processing engines. Each memory module includes a first memory interface directly coupled to the core in the same data processing engine and one or more second memory interfaces directly coupled to the core of each of the one or more adjacent data processing engines.
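The direct memory interfaces described in this abstract can be pictured with a small Python model. This is only an illustrative sketch, not the patented design: the names `DpeArray`, `can_access`, `store`, and `load` are hypothetical, and the adjacency rule (a core reaches its own memory module plus those of the engines directly above, below, left, and right) is an assumption, since the abstract only says "one or more adjacent data processing engines".

```python
class DpeArray:
    """Toy model of a grid of data processing engines (DPEs).

    Each DPE at (row, col) owns one memory module, modeled as a dict
    mapping address -> value. All names here are hypothetical.
    """

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self.memory = {(r, c): {} for r in range(rows) for c in range(cols)}

    def can_access(self, core, target):
        # Assumption: a core has a direct memory interface to its own
        # module and to modules of engines at Manhattan distance 1.
        if core not in self.memory or target not in self.memory:
            return False
        (r0, c0), (r1, c1) = core, target
        return abs(r0 - r1) + abs(c0 - c1) <= 1

    def store(self, core, target, addr, value):
        if not self.can_access(core, target):
            raise PermissionError(f"core {core} has no direct link to memory {target}")
        self.memory[target][addr] = value

    def load(self, core, target, addr):
        if not self.can_access(core, target):
            raise PermissionError(f"core {core} has no direct link to memory {target}")
        return self.memory[target][addr]


array = DpeArray(rows=2, cols=3)
# Core (0, 0) writes into the memory module of its right-hand neighbor.
array.store(core=(0, 0), target=(0, 1), addr=0x10, value=42)
# Core (1, 1), sitting below (0, 1), reads the same location directly.
value = array.load(core=(1, 1), target=(0, 1), addr=0x10)
```

In this model two cores exchange data through a memory module that both can reach, without any stream-switch or DMA transfer. A core at (0, 0) trying to reach the module at (1, 1) is rejected, because the two engines are not adjacent under the assumed rule.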
-
Publication No.: US11599498B1
Publication Date: 2023-03-07
Application No.: US17068697
Filing Date: 2020-10-12
Applicant: XILINX, INC.
Inventor: Juan J. Noguera Serra , Sneha Bhalchandra Date , Jan Langer , Baris Ozgul , Goran Hk Bilski
IPC: G06F9/00 , G06F15/177 , G06F15/80 , G06F15/173 , G06F9/4401
Abstract: A device may include a processor system and an array of data processing engines (DPEs) communicatively coupled to the processor system. Each of the DPEs includes a core and a DPE interconnect. The processor system is configured to transmit configuration data to the array of DPEs, and each of the DPEs is independently configurable based on the configuration data received at the respective DPE via the DPE interconnect of the respective DPE. The array of DPEs enable, without modifying operation of a first kernel of a first subset of the DPEs of the array of DPEs, reconfiguration of a second subset of the DPEs of the array of DPEs.
-
Publication No.: US11573726B1
Publication Date: 2023-02-07
Application No.: US17097917
Filing Date: 2020-11-13
Applicant: Xilinx, Inc.
Inventor: Juan J. Noguera Serra , Goran H K Bilski , Jan Langer , Baris Ozgul , Richard L. Walke , Ralph D. Wittig , Kornelis A. Vissers , Christopher H. Dick , Philip B. James-Roxby
IPC: G06F3/06 , G06F15/173 , G06F15/78
Abstract: A device may include a plurality of data processing engines. Each of the data processing engines may include a memory pool having a plurality of memory banks, a plurality of cores each coupled to the memory pool and configured to access the plurality of memory banks, a memory mapped switch coupled to the memory pool and a memory mapped switch of at least one neighboring data processing engine, and a stream switch coupled to each of the plurality of cores and to a stream switch of the at least one neighboring data processing engine.
-
Publication No.: US11323391B1
Publication Date: 2022-05-03
Application No.: US16833029
Filing Date: 2020-03-27
Applicant: XILINX, INC.
Inventor: Peter McColgan , David Clarke , Goran Hk Bilski , Juan J. Noguera Serra , Baris Ozgul , Jan Langer , Tim Tuan
IPC: H04L12/935 , G06F13/28 , H04L49/00
Abstract: Some examples described herein relate to multi-port stream switches of data processing engines (DPEs) of an electronic device, such as a programmable device. In an example, a programmable device includes a plurality of DPEs. Each DPE of the DPEs includes a hardened processor core and a stream switch. The stream switch is connected to respective stream switches of ones of the DPEs that neighbor the respective DPE in respective ones of directions. The stream switch has input ports associated with each direction of the directions and has output ports associated with each direction of the directions. For each direction of the directions, each input port of the input ports associated with the respective direction is selectively connectable to one of the output ports associated with the respective direction.
-
Publication No.: US11061673B1
Publication Date: 2021-07-13
Application No.: US15944393
Filing Date: 2018-04-03
Applicant: Xilinx, Inc.
Inventor: Baris Ozgul , Jan Langer , Juan J. Noguera Serra , Goran H. K. Bilski , Richard L. Walke
Abstract: An example core for data processing engine (DPE) includes a first register file configured to provide a first plurality of output lanes, a processor, coupled to the register file, including: a multiply-accumulate (MAC) circuit, and a first permute circuit coupled between the first register file and the MAC circuit. The first permute circuit is configured to generate a first vector by selecting a first set of output lanes from the first plurality of output lanes, and a second permute circuit coupled between the first register file and the MAC circuit. The second permute circuit is configured to generate a second vector by selecting a second set of output lanes from the first plurality of output lanes.
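The data path in this abstract (two permute circuits that each select lanes from one register file and feed a MAC circuit) can be sketched in a few lines of Python. This is a behavioral illustration under stated assumptions, not the claimed hardware: the function names `permute` and `mac`, the eight-lane register file, the four-lane vectors, and the chosen lane indices are all hypothetical.

```python
def permute(lanes, select):
    """Build a vector by picking register-file output lanes by index."""
    return [lanes[i] for i in select]


def mac(acc, a, b):
    """Lane-wise multiply-accumulate: acc[i] + a[i] * b[i]."""
    return [c + x * y for c, x, y in zip(acc, a, b)]


# Hypothetical register file exposing eight output lanes.
register_file = [1, 2, 3, 4, 5, 6, 7, 8]

# The first permute circuit selects one set of lanes, the second a different set.
first_vector = permute(register_file, [0, 1, 2, 3])   # [1, 2, 3, 4]
second_vector = permute(register_file, [7, 6, 5, 4])  # [8, 7, 6, 5]

# Both vectors feed the MAC circuit, starting from a zeroed accumulator.
result = mac([0, 0, 0, 0], first_vector, second_vector)  # [8, 14, 18, 20]
```

Because both vectors are drawn from the same register file, changing only the selection indices lets the same stored data serve different operand layouts without moving it first.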
-
Publication No.: US11016822B1
Publication Date: 2021-05-25
Application No.: US15944578
Filing Date: 2018-04-03
Applicant: Xilinx, Inc.
Inventor: Goran H. K. Bilski , Juan J. Noguera Serra , Jan Langer , Baris Ozgul , Richard L. Walke
Abstract: Examples herein describe techniques for communicating directly between cores in an array of data processing engines. In one embodiment, the array is a 2D array where each of the data processing engines includes one or more cores. In addition to the cores, the data processing engines can include a memory module (with memory banks for storing data) and an interconnect which provides connectivity between the cores. Using the interconnect, however, can add latency when transmitting data between the cores. In the embodiments herein, the array includes core-to-core communication links that directly connect one core in the array to another core. The cores can use these communication links to bypass the interconnect and the memory module to transmit data directly.