摘要:
An external Auxiliary Execution Unit (AXU) interface is provided between a processing core disposed in a first programmable chip and an off-chip AXU disposed in a second programmable chip to integrate the AXU with an issue unit, a fixed point execution unit, and optionally other functional units in the processing core. The external AXU interface enables the issue unit to issue instructions to the AXU in much the same manner as the issue unit would be able to issue instructions to an AXU that was disposed on the same chip. By doing so, the AXU on the second programmable chip can be designed, tested and verified independent of the processing core on the first programmable chip, thereby enabling a common processing core, which has been designed, tested, and verified, to be used in connection with multiple different AXU designs.
摘要:
An external Auxiliary Execution Unit (AXU) interface is provided between a processing core disposed in a first programmable chip and an off-chip AXU disposed in a second programmable chip to integrate the AXU with an issue unit, a fixed point execution unit, and optionally other functional units in the processing core. The external AXU interface enables the issue unit to issue instructions to the AXU in much the same manner as the issue unit would be able to issue instructions to an AXU that was disposed on the same chip. By doing so, the AXU on the second programmable chip can be designed, tested and verified independent of the processing core on the first programmable chip, thereby enabling a common processing core, which has been designed, tested, and verified, to be used in connection with multiple different AXU designs.
摘要:
Emulating a computer run time environment including: storing translated code in blocks of a translated code cache, each block of the translated code cache designated for storage of translated code for a separate one of the target executable processes, including identifying each block in dependence upon an identifier of the process for which the block is designated as storage; executing by the emulation environment a particular one of the target executable processes, using for target code translation the translated code in the block of the translated code cache designated as storage for the particular process; and upon encountering a context switch by the target operating system to execution of a new target executable process, changing from the block designated for the particular process to using for target code translation the translated code in the block of the translated code cache designated as storage for the new target executable process.
摘要:
Emulating a computer run time environment including: storing translated code in blocks of a translated code cache, each block of the translated code cache designated for storage of translated code for a separate one of the target executable processes, including identifying each block in dependence upon an identifier of the process for which the block is designated as storage; executing by the emulation environment a particular one of the target executable processes, using for target code translation the translated code in the block of the translated code cache designated as storage for the particular process; and upon encountering a context switch by the target operating system to execution of a new target executable process, changing from the block designated for the particular process to using for target code translation the translated code in the block of the translated code cache designated as storage for the new target executable process.
摘要:
Emulating a computer run time environment as a component of a dynamic binary translation loop that translates target executable code compiled for execution on a target computer to code executable on a host computer of a kind other than the target computer, the target executable code including function calls to functions to be translated. Embodiments of the present invention include: determining, upon encountering in the binary translation loop a function call to a function to be translated, that the function call is a call to a host library function in a host native library; hashing a target executable image of the function to be translated from the target executable code, thereby producing a hash value; and using the hash value as an index to retrieve from a thunk table a host native address of the host library function in the host native library.
摘要:
Memory sharing in a software pipeline on a network on chip (‘NOC’), the NOC including integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controllers, with each IP block adapted to a router through a memory communications controller and a network interface controller, where each memory communications controller controlling communications between an IP block and memory, and each network interface controller controlling inter-IP block communications through routers, including segmenting a computer software application into stages of a software pipeline, the software pipeline comprising one or more paths of execution; allocating memory to be shared among at least two stages including creating a smart pointer, the smart pointer including data elements for determining when the shared memory can be deallocated; determining, in dependence upon the data elements for determining when the shared memory can be deallocated, that the shared memory can be deallocated; and deallocating the shared memory.
摘要:
Direct Memory Access (DMA) is used in connection with passing commands between a host device and a target device coupled via a push buffer. Commands passed to a push buffer by a host device may be accumulated by the host device prior to forwarding the commands to the push buffer, such that DMA may be used to collectively pass a block of commands to the push buffer. In addition, a host device may utilize DMA to pass command parameters for commands to a command buffer that is accessible by the target device but is separate from the push buffer, with the commands that are passed to the push buffer including pointers to the associated command parameters in the command buffer.
摘要:
A method includes receiving at a master processing element primitive data that includes properties of a primitive. The method includes partially traversing a spatial data structure that represents a three-dimensional image to identify an internal node of the spatial data structure. The internal node represents a portion of the three-dimensional image. The method also includes selecting a slave processing element from a plurality of slave processing elements. The selected processing element is associated with the internal node. The method further includes sending the primitive data to the selected slave processing element to traverse a portion of the spatial data structure to identify a leaf node of the spatial data structure.
摘要:
A circuit arrangement and method make state changes to shared state data in a highly multithreaded environment by propagating or streaming the changes to multiple parallel hardware threads of execution in the multithreaded environment using an on-chip communications network and without attempting to access any copy of the shared state data in a shared memory to which the parallel threads of execution are also coupled.
摘要:
A network on chip (‘NOC’), and methods of operation of a NOC, that maintains cache coherency with invalidation messages, the NOC including integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controller, each IP block adapted to a router through a memory communications controller and a network interface controller, each memory communications controller controlling communication between an IP block and memory, and each network interface controller controlling inter-IP block communications through routers, the NOC also including an invalidating module configured to send, to selected IP blocks, an invalidation message, the invalidation message representing an instruction to invalidate cached memory and the selected IP blocks, each selected IP block configured to invalidate the contents of the cached memory responsive to receiving the invalidation message.