Abstract:
An embodiment of a massively parallel computing system is disclosed, comprising a plurality of processors, which may be arranged into clusters of processors, interconnected by means of a configurable directional 2D router for Networks on Chips (NOCs). The system further comprises diverse high-bandwidth external I/O devices and interfaces, which may include, without limitation, Ethernet interfaces and dynamic RAM (DRAM) memories. The system is designed for implementation in programmable logic in FPGAs, but may also be implemented in other integrated circuit technologies, such as non-programmable circuitry, and in integrated circuits such as application-specific integrated circuits (ASICs). The system enables the practical implementation of diverse FPGA computing accelerators to speed up computation, for example in data centers or telecom networking infrastructure. The system uses the NOC to interconnect processors, clusters, accelerators, and/or external interfaces. A great diversity of NOC client cores, for communication amongst various external interfaces and devices, and on-chip interfaces and resources, may be coupled to a router in order to efficiently communicate with other NOC client cores. The system, router, and NOC enable feasible FPGA implementation of large integrated systems on chips, interconnecting hundreds of client cores over high-bandwidth links, including compute and accelerator cores, industry-standard IP cores, DRAM/HBM/HMC channels, PCI Express channels, and 10G/25G/40G/100G/400G networks.
Abstract:
Methods for implementing mini-mezzanine Open Compute Project (OCP) plug-and-play Network PHY Cards and associated apparatus. In accordance with one aspect, the MAC (Media Access Control) and PHY (Physical) layer functions in one or more communication protocol stacks are split between a MAC block in a Platform Controller Hub (PCH) or processor SoC and a PHY card that is installed in a mezzanine slot of a platform and includes one or more ports. During platform initialization operations, configuration parameters, including a PHY card ID, are read from the PHY card, and a corresponding configuration script is selected and executed to configure the PHY card for use in the platform. The configuration parameters are also used to enumerate PCIe devices associated with physical functions and ports supported by the PHY card.
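The initialization flow described above (read the card's parameters, then select a configuration script by card ID) can be sketched as follows. This is a minimal illustration only, not the claimed method: the field names, the card IDs, the script names, and the `CONFIG_SCRIPTS` table are all hypothetical, and actual PCIe enumeration is not modeled beyond listing port indices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhyCardParams:
    """Configuration parameters read from the PHY card (field names are hypothetical)."""
    card_id: str
    num_ports: int


# Hypothetical table mapping a PHY card ID to the name of its configuration script.
CONFIG_SCRIPTS = {
    "PHY_1x10G": "configure_1x10g",
    "PHY_4x10G": "configure_4x10g",
}


def select_config_script(params, scripts=CONFIG_SCRIPTS):
    """Pick the configuration script that corresponds to the card ID."""
    if params.card_id not in scripts:
        raise LookupError(f"no configuration script for PHY card {params.card_id!r}")
    return scripts[params.card_id]


def ports_to_enumerate(params):
    """List the port indices the platform would enumerate for this card."""
    return list(range(params.num_ports))
```

Keeping the lookup table separate from the selection logic means a new card type would only need a new table entry, which is one plausible reading of "plug-and-play" here.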
Abstract:
Supply of a first clock signal, used in an interface part of each of a plurality of slave devices on a ring bus, and of a second clock signal, used in a core part of each of the plurality of slave devices, is controlled. The slave device that is the target of a request issued from a master device is specified. The first clock signal is supplied to each of the plurality of slave devices, and the second clock signal is supplied to the specified slave device.
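The clock-supply rule in this abstract can be modeled in a few lines. The sketch below assumes slaves are identified by their index on the ring and represents each clock as a per-slave enable flag; the function name `clock_enables` and this representation are illustrative assumptions, not taken from the source.

```python
def clock_enables(num_slaves, target):
    """Return (interface_enables, core_enables) for the slaves on the ring.

    The interface (first) clock is enabled for every slave so that requests
    can travel around the ring. The core (second) clock is enabled only for
    the slave specified as the target of the request.
    """
    if not 0 <= target < num_slaves:
        raise ValueError("target must be a valid slave index")
    interface_enables = [True] * num_slaves
    core_enables = [index == target for index in range(num_slaves)]
    return interface_enables, core_enables
```

For example, with four slaves and slave 2 as the target, all four interface clocks are enabled while only the core clock of slave 2 is enabled.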
Abstract:
An apparatus includes first and second reservation stations. The first reservation station (421.L) dispatches a load micro instruction, and indicates on a hold bus (444) if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory. The second reservation station (421.1-421.N) is coupled to the hold bus (444), and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the load micro instruction, and if it is indicated on the hold bus (444) that the load micro instruction is the specified load micro instruction, the second reservation station (421.1-421.N) is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand. The plurality of non-core resources includes a control element, coupled to the out-of-order processor via a control bus.
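The dispatch rule for the dependent younger micro instructions can be illustrated with a small timing model. This is a sketch under stated assumptions: cycles are plain integers, `normal_latency` stands in for the "number of clock cycles following dispatch", and the function and parameter names are hypothetical rather than drawn from the source.

```python
def dependent_dispatch_cycle(load_dispatch_cycle, normal_latency,
                             hold_asserted, operand_ready_cycle):
    """Return the cycle at which dependent younger micro instructions dispatch.

    Without a hold indication, dependents dispatch a fixed number of cycles
    after the load dispatches. If the hold bus indicates a specified load
    (one that reads a resource other than on-core cache), dispatch is stalled
    until the load has retrieved its operand.
    """
    scheduled = load_dispatch_cycle + normal_latency
    if hold_asserted:
        # Stall: never dispatch before the operand has actually arrived.
        return max(scheduled, operand_ready_cycle)
    return scheduled
```

The `max` keeps the model from dispatching earlier than the normal schedule in the case where a specified load happens to complete quickly.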
Abstract:
A graph-based program specification (110) includes components corresponding to tasks and directed links between ports of the components, including: a first type of link configuration defined by respective output and input ports of linked components, and a second type of link configuration defined by respective output and input ports of linked components. A compiler (120) recognizes different types of link configurations and provides in a target program specification occurrences of a target primitive for executing a function for each occurrence of a data element flowing over a link of the second type. A computing node (152) initiates execution of the target program specification, and determines at runtime, for components associated with the occurrences of the target primitive, an order in which instances of tasks corresponding to the components are to be invoked, and/or a computing node on which instances of tasks corresponding to the components are to be executed.
Abstract:
The invention relates to a modular computer system comprising a chassis with a plurality of receiving bays, arranged in the region of a first housing side, for receiving corresponding functional modules, in particular server modules (22). The modular computer system further comprises at least one first control panel (4a, 4b), arranged on a second housing side, with a plurality of control elements. The at least one control panel is coupled via at least one first serial bus system (45) to connections of a first receiving bay and of a second receiving bay. At least a first subgroup of the control elements is assigned to the first receiving bay, and a second subgroup of the control elements is assigned to the second receiving bay. The modular computer system is configured to transmit module-specific control data via the first serial bus system between the first subgroup of the control elements and a functional module received in the first receiving bay, and between the second subgroup of the control elements and a functional module received in the second receiving bay.