Abstract:
A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include opaque and alpha triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to determine primitives intersected by the ray, and return intersection information to a streaming multiprocessor for further processing. The hardware-based traversal coprocessor is configured to provide a deterministic result of intersected triangles regardless of the order that the memory subsystem returns triangle range blocks for processing, while opportunistically eliminating alpha intersections that lie further along the length of the ray than closer opaque intersections.
Abstract:
A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include opaque and alpha triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to determine primitives intersected by the ray, and return intersection information to a streaming multiprocessor for further processing. The hardware-based traversal coprocessor is configured to omit reporting of one or more primitives the ray is determined to intersect. The omitted primitives include primitives which are provably capable of being omitted without a functional impact on visualizing the virtual scene.
Abstract:
A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to properly handle numerically challenging computations at or near edges and/or vertices of primitives and/or ensure that a single intersection is reported when a ray intersects a surface formed by primitives at or near edges and/or vertices of the primitives.
Abstract:
Methods and systems are described in some examples for changing the traversal of an acceleration data structure in a highly dynamic query-specific manner, with each query specifying test parameters, a test opcode and a mapping of test results to actions. In an example ray tracing implementation, traversal of a bounding volume hierarchy by a ray is performed with the default behavior of the traversal being changed in accordance with results of a test performed using the test opcode and test parameters specified in the ray data structure and another test parameter specified in a node of the bounding volume hierarchy. In an example implementation a traversal coprocessor is configured to perform the traversal of the bounding volume hierarchy.
Abstract:
In a ray tracer, a cache for streaming workloads groups ray requests for coherent successive bounding volume hierarchy traversal operations by sending common data down an attached data path to all ray requests in the group at the same time or about the same time. Grouping the requests provides good performance with a smaller number of cache lines.
Abstract:
Systems and methods for an efficient and robust multiprocessor-coprocessor interface that may be used between a streaming multiprocessor and an acceleration coprocessor in a GPU are provided. According to an example implementation, in order to perform an acceleration of a particular operation using the coprocessor, the multiprocessor: issues a series of write instructions to write input data for the operation into coprocessor-accessible storage locations, issues an operation instruction to cause the coprocessor to execute the particular operation; and then issues a series of read instructions to read result data of the operation from coprocessor-accessible storage locations to multiprocessor-accessible storage locations.
Abstract:
A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to properly handle numerically challenging computations at or near edges and/or vertices of primitives and/or ensure that a single intersection is reported when a ray intersects a surface formed by primitives at or near edges and/or vertices of the primitives.
Abstract:
A system and method for constructing binary radix trees in parallel, which are used for as a building block for constructing secondary trees. A non-transitory computer-readable storage medium having computer-executable instructions for causing a computer system to perform a method is disclosed. The method includes determining a plurality of primitives comprising a total number of primitive nodes that are indexed, wherein the plurality of primitives correspond to leaf nodes of a hierarchical tree. The method includes sorting the plurality of primitives. The method includes building the hierarchical tree in a manner requiring at most a linear amount of temporary storage with respect to the total number of primitive nodes. The method includes building an internal node of the hierarchical tree in parallel with one or more of its ancestor nodes.
Abstract:
A non-transitory computer-readable storage medium having computer-executable instructions for causing a computer system to perform a method for constructing k-d trees, octrees, and quadtrees from radix trees is disclosed. The method includes assigning a Morton code for each of a plurality of primitives corresponding to leaf nodes of a binary radix tree, and sorting the plurality of Morton codes. The method includes building a radix tree requiring at most a linear amount of temporary storage with respect to the leaf nodes, wherein an internal node is built in parallel with one or more of its ancestor nodes. The method includes, partitioning the plurality of Morton codes for each node of the radix tree into categories based on a corresponding highest differing bit to build a k-d tree. A number of octree or quadtree nodes is determined for each node of the k-d tree. A total number of nodes in the octree or quadtree is determined, allocated and output.
Abstract:
A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include opaque and alpha triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to determine primitives intersected by the ray, and return intersection information to a streaming multiprocessor for further processing. The hardware-based traversal coprocessor is configured to omit reporting of one or more primitives the ray is determined to intersect. The omitted primitives include primitives which are provably capable of being omitted without a functional impact on visualizing the virtual scene.