Abstract:
A computer graphics method and apparatus allows designer control over the rendering of objects and scenes, in a rendering system using ray tracing for example. A modeling system is adapted to accept rules for controlling how certain objects affect the appearance of certain other objects. In a ray tracing implementation, rules are specified by ray type and can be specified as either “including” all but certain objects or “excluding” specific objects for any given object. A rendering system extracts these rules from a bytestream or other input including other graphics data and instructions, and populates lists for internal use by other components of the rendering system. A ray tracer in the rendering system is adapted to consult the list when performing ray tracing, so as to enforce the rendering control specified by the content creator when the objects and scene are rendered.
Abstract:
A method and an apparatus that determine a total number of threads to concurrently execute executable codes compiled from a single source for target processing units in response to an API (Application Programming Interface) request from an application running in a host processing unit are described. The target processing units include GPUs (Graphics Processing Unit) and CPUs (Central Processing Unit). Thread group sizes for the target processing units are determined to partition the total number of threads according to a multi-dimensional global thread number included in the API request. The executable codes are loaded to be executed in thread groups with the determined thread group sizes concurrently in the target processing units.
Abstract:
A method and an apparatus for a parallel computing program calling APIs (application programming interfaces) in a host processor to perform a data processing task in parallel among compute units are described. The compute units are coupled to the host processor including central processing units (CPUs) and graphic processing units (GPUs). A program object corresponding to a source code for the data processing task is generated in a memory coupled to the host processor according to the API calls. Executable codes for the compute units are generated from the program object according to the API calls to be loaded for concurrent execution among the compute units to perform the data processing task.
Abstract:
Graphics display performance is significantly improved by compressing pixel information, by aligning the 8, 16 or 32 bit pixels transferred over a 32-bit Peripheral Component Interface (PCI) bus with the pixels in the display memory, and by avoiding moves of pixel data within display memory. Compression is achieved by not transferring data for pixels that are not modified by the transfer. Rather, a count of unmodified pixel bytes to skip precedes each set of pixel data for contiguous pixels that are modified. Alignment is achieved by ensuring that the boundaries between words within the pixel set transferred matches those within the corresponding target pixels in the display memory. This alignment significantly speeds up modifying pixel data within the display memory. The burden of ensuring this alignment is placed on the applications software that initiates the transfer. For a static image, such as a cockpit, this alignment can be achieved at the time that the image information used by the software is compiled into a bitmap. For a dynamic image, such as a sprite, this alignment can be achieved by compiling all possible word alignments of the sprite's pixel data into different bitmap versions. At run time, the applications software uses the sprite's current location to dynamically select which bitmap version to transfer. In one embodiment, a graphics accelerator interprets the bitmap transferred and updates display memory accordingly. In another embodiment, software executing on the host CPU directly writes pre-aligned pixel data into the display memory.
Abstract:
A method and an apparatus for a parallel computing program calling APIs (application programming interfaces) in a host processor to perform a data processing task in parallel among compute units are described. The compute units are coupled to the host processor including central processing units (CPUs) and graphic processing units (GPUs). A program object corresponding to a source code for the data processing task is generated in a memory coupled to the host processor according to the API calls. Executable codes for the compute units are generated from the program object according to the API calls to be loaded for concurrent execution among the compute units to perform the data processing task.
Abstract:
A method and an apparatus for a parallel computing program using subbuffers to perform a data processing task in parallel among heterogeneous compute units are described. The compute units can include a heterogeneous mix of central processing units (CPUs) and graphic processing units (GPUs). A system creates a subbuffer from a parent buffer for each of a plurality of heterogeneous compute units. If a subbuffer is not associated with the same compute unit as the parent buffer, the system copies data from the subbuffer to memory of that compute unit. The system further tracks updates to the data and transfers those updates back to the subbuffer.
Abstract:
Digital rendering over a network is described. Rendering resources associated with a project are stored in a project resource pool at a rendering service site, and for each rendering request received from a client site the project resource pool is compared to current rendering resources at the client site. A given rendering resource is uploaded from the client site to the rendering service only if the project resource pool does not contain the current version, thereby conserving bandwidth. In one embodiment, redundant generation of raw rendering resource files is avoided by only generating those raw rendering resource files not mated with generated rendering resource files. Reducing redundant generation of raw resources is also described, as well as statistically reducing the number of raw resource files required to be uploaded to the rendering service for multi-frame sessions.
Abstract:
The present invention relates to computer graphics applications involving scene rendering using objects modeled at multiple levels of detail. In accordance with an aspect of the invention, a ray tracer implementation allows users to specify multiple versions of a particular object, categorized by LOD ID's. A scene server selects the version appropriate for the particular scene, based on the size of the object on the screen for example, and provides a smooth transition between multiple versions of an object model. In one example, the scene server will select two LOD representations associated with a given object and assign relative weights to each representation. The LOD weights are specified to indicate how to blend these representations together. A ray tracer computes the objects hit by camera rays associated with pixels in the camera window, as well as secondary rays in recursive implementations, and rays striking LOD objects are detected and shaded in accordance with the weights assigned to the different representations. Embodiments are disclosed for level of detail control using both forward ray tracing and backward ray tracing, including handling of camera rays, reflected rays, refracted rays and shadow rays.