摘要:
An apparatus comprises a plurality of processor cores, each comprising a computation unit and a memory. The apparatus further comprises an interconnection network to transmit data among the processor cores. At least some of the memories are configured as a cache for memory external to the processor cores, and at least some of the processor cores are configured to transmit a message over the interconnection network to access a cache of another processor core.
摘要:
Programming in a multiprocessor environment includes accepting a program specification that defines a plurality of processing modules and one or more channels for sending data between ports of the modules, mapping each of the processing modules to run on a set of one or more processing engines of a network of interconnected processing engines, and for at least some of the channels, assigning one or more elements of one or more processing engines in the network to the channel for sending data between respective processing modules.
摘要:
Managing processes in a computing system comprising one or more cores includes generating an object in an operating system running on at least one core. A reference to the object is distributed to each of at least one and fewer than all of a plurality of processes to be executed on the at least one core. The operating system controls access to a resource such that processes to which the reference to the object was distributed have access to the resource and processes to which the reference to the object was not distributed do not have access to the resource.
摘要:
A flexible, scalable server is described. The server includes plural server nodes each server node including processor cores and switching circuitry configured to couple the processor to a network among the cores with the plurality of cores implementing networking functions within the compute nodes wherein the plurality of cores networking capabilities allow the cores to connect to each other, and to offer a single interface to a network coupled to the server.
摘要:
Receiving data at a first device transferred from a second device includes: storing a starting address with respect to a memory address space for a memory of the first device in a storage location within the first device. A request is received at the first device to transfer one or more data values from the second device, the request including a target address with respect to a communication channel address space for a communication channel between the first device and the second device. The second device determines whether the target address corresponds to a reserved address value designated as an indicator of a transfer to a memory address beyond the communication channel address space. The one or more data values are transferred from the second device to the first device according to the stored starting address if the target address does correspond to the reserved address value, or the one or more data values are transferred from the second device to the first device according to the target address if the target address does not correspond to the reserved address value.
摘要:
Processing packets in a system that comprises a plurality of interconnected processing cores includes: receiving packets into one or more queues; associating at least some nodes in a hierarchy of nodes with at least one of the queues, and at least some of the nodes with a rate; mapping a set of one or more nodes to a processor core based on a level in the hierarchy of the nodes in the set and at least one rate associated with a node not in the set; and processing the packets in the mapped processor cores according to the hierarchy.
摘要:
A system comprises a plurality of computation units interconnected by an interconnection network. A method for configuring the system comprises forming subsets of instructions corresponding to different portions of a program, the subsets of instructions being related according to a control flow graph; forming one or more memory analysis regions that include one or more of the subsets of instructions, where each subset of instructions is included in a single memory analysis region; analyzing each memory analysis region to partition memory objects and instructions that access the memory objects into equivalence classes such that instructions within an equivalence class only access objects in the same equivalence class; and assigning memory access instructions a given equivalence class to one of the computation units for execution on the assigned computation unit.
摘要:
An integrated circuit comprises a plurality of tiles. Each tile comprises a processor including a storage module, wherein the processor is configured to process multiple streams of instructions, a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles, and coupling circuitry configured to couple data resulting from processing an instruction from at least one of the streams of instructions to the storage module and to the switch.
摘要:
A multicore processor comprises a plurality of cache memories; a plurality of processor cores, each associated with one of the cache memories; one or more memory interfaces providing memory access paths from the cache memories to a main memory; and one or more directory controllers for respective portions of the main memory, each associated with corresponding storage for directory state. Each corresponding storage provides space for maintaining directory state for each memory line that is indicated as stored in at least one of the cache memories such that the space for maintaining directory state is independent of the size of the main memory.
摘要:
A multicore processor comprises a plurality of cache memories, and a plurality of processor cores, each associated with one of the cache memories. Each of at least some of the cache memories is configured to maintain at least a portion of the cache memory in which each cache line is dynamically managed as either local to the associated processor core or shared among multiple processor cores.