摘要:
A task queue manager manages the task queues corresponding to virtual devices. When a virtual device function is requested, the task queue manager determines whether an SPU is currently assigned to the virtual device task. If an SPU is already assigned, the request is queued in a task queue being read by the SPU. If an SPU has not been assigned, the task queue manager assigns one of the SPUs to the task queue. The queue manager assigns the task based upon which SPU is least busy as well as whether one of the SPUs recently performed the virtual device function. If an SPU recently performed the virtual device function, it is more likely that the code used to perform the function is still in the SPU's local memory and will not have to be retrieved from shared common memory using DMA operations.
摘要:
A task queue manager manages the task queues corresponding to virtual devices. When a virtual device function is requested, the task queue manager determines whether an SPU is currently assigned to the virtual device task. If an SPU is already assigned, the request is queued in a task queue being read by the SPU. If an SPU has not been assigned, the task queue manager assigns one of the SPUs to the task queue. The queue manager assigns the task based upon which SPU is least busy as well as whether one of the SPUs recently performed the virtual device function. If an SPU recently performed the virtual device function, it is more likely that the code used to perform the function is still in the SPU's local memory and will not have to be retrieved from shared common memory using DMA operations.
摘要:
Computational load is balanced across a plurality of processors. Source code subtasks are compiled into byte code subtasks whereby the byte code subtasks are translated into processor-specific object code subtasks at runtime. The processor-type selection is based upon one of three approaches which are 1) a brute force approach, 2) higher-level approach, or 3) processor availability approach. Each object code subtask is loaded in a corresponding processor type for execution. In one embodiment, a compiler stores a pointer in a byte code file that references the location of a byte code subtask. In this embodiment, the byte code subtask is stored in a shared library and, at runtime, a runtime loader uses the pointer to identify the location of the byte code subtask in order to translate the byte code subtask.
摘要:
Computational load is balanced across a plurality of processors. Source code subtasks are compiled into byte code subtasks whereby the byte code subtasks are translated into processor-specific object code subtasks at runtime. The processor-type selection is based upon one of three approaches which are 1) a brute force approach, 2) higher-level approach, or 3) processor availability approach. Each object code subtask is loaded in a corresponding processor type for execution. In one embodiment, a compiler stores a pointer in a byte code file that references the location of a byte code subtask. In this embodiment, the byte code subtask is stored in a shared library and, at runtime, a runtime loader uses the pointer to identify the location of the byte code subtask in order to translate the byte code subtask.
摘要:
Source code subtasks are compiled into byte code subtasks whereby the byte code subtasks are translated into processor-specific object code subtasks at runtime. The processor-type selection is based upon one of three approaches which are 1) a brute force approach, 2) higher-level approach, or 3) processor availability approach. Each object code subtask is loaded in a corresponding processor type for execution. In one embodiment, a compiler stores a pointer in a byte code file that references the location of a byte code subtask. In this embodiment, the byte code subtask is stored in a shared library and, at runtime, a runtime loader uses the pointer to identify the location of the byte code subtask in order to translate the byte code subtask.
摘要:
A system and method for providing a persistent function server is provided. A multi-processor environment uses an interface definition language (idl) file to describe a particular function, such as an “add” function. A compiler uses the idl file to generate source code for use in marshalling and de-marshalling data between a main processor and a support processor. A header file is also created that corresponds to the particular function. The main processor includes parameters in the header file and sends the header file to the support processor. For example, a main processor may include two numbers in an “add” header file and send the “add” header file to a support processor that is responsible for performing math functions. In addition, the persistent function server capability of the support processor is programmable such that the support processor may be assigned to execute unique and complex functions.
摘要:
An image that includes ray traced pixel data and rasterized pixel data is generated. A synergistic processing unit (SPU) uses a rendering algorithm to generate ray traced data for objects that require high-quality image rendering. The ray traced data is fragmented, whereby each fragment includes a ray traced pixel depth value and a ray traced pixel color value. A rasterizer compares ray traced pixel depth values to corresponding rasterized pixel depth values, and overwrites ray traced pixel data with rasterized pixel data when the corresponding rasterized fragment is “closer” to a viewing point, which results in composite data. A display subsystem uses the resultant composite data to generate an image on a user's display.
摘要:
A method and system for solving a large system of dense linear equations using a system having a processing unit and one or more secondary processing units that can access a common memory for sharing data. A set of coefficients corresponding to a system of linear equations is received, and the coefficients, after being placed in matrix form, are divided into blocks and loaded into the common memory. Each of the processors is programmed to perform matrix operations on individual blocks to solve the linear equations. A table containing a list of the matrix operations is created in the common memory to keep track of the operations that have been performed and the operations that are still pending. SPUs determine whether tasks are pending, access the coefficients by accessing the common memory, perform the required tasks, and store the result back in the common memory for the result to be accessible by the PU and the other SPUs.
摘要:
An image is generated that includes ray traced pixel data and rasterized pixel data. A synergistic processing unit (SPU) uses a rendering algorithm to generate ray traced data for objects that require high-quality image rendering. The ray traced data is fragmented, whereby each fragment includes a ray traced pixel depth value and a ray traced pixel color value. A rasterizer compares ray traced pixel depth values to corresponding rasterized pixel depth values, and overwrites ray traced pixel data with rasterized pixel data when the corresponding rasterized fragment is “closer” to a viewing point, which results in composite data. A display subsystem uses the resultant composite data to generate an image on a user's display.
摘要:
An image is generated that includes ray traced pixel data and rasterized pixel data. A synergistic processing unit (SPU) uses a rendering algorithm to generate ray traced data for objects that require high-quality image rendering. The ray traced data is fragmented, whereby each fragment includes a ray traced pixel depth value and a ray traced pixel color value. A rasterizer compares ray traced pixel depth values to corresponding rasterized pixel depth values, and overwrites ray traced pixel data with rasterized pixel data when the corresponding rasterized fragment is “closer” to a viewing point, which results in composite data. A display subsystem uses the resultant composite data to generate an image on a user's display.