-
Publication No.: US20220027280A1
Publication Date: 2022-01-27
Application No.: US17483133
Filing Date: 2021-09-23
Applicant: NVIDIA Corporation
Inventor: Greg MUTHLER , Timo AILA , Tero KARRAS , Samuli LAINE , William Parsons NEWHALL, JR. , Ronald Charles BABICH, JR. , John BURGESS , Ignacio LLAMAS
IPC: G06F12/0875 , G06T15/06 , G06F16/901
Abstract: In a ray tracer, a cache for streaming workloads groups ray requests for coherent successive bounding volume hierarchy traversal operations by sending common data down an attached data path to all ray requests in the group at the same time or about the same time. Grouping the requests provides good performance with a smaller number of cache lines.
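A minimal sketch of the grouping idea in this abstract, assuming each request is a hypothetical `(ray_id, address)` pair and that `fetch` stands in for a memory access. It only illustrates fetching shared data once per group and fanning it out to every waiting ray; it is not the patented cache design.

```python
from collections import defaultdict


def group_requests_by_address(requests):
    """Group (ray_id, address) requests so that each address appears once."""
    groups = defaultdict(list)
    for ray_id, address in requests:
        groups[address].append(ray_id)
    return dict(groups)


def serve(requests, fetch):
    """Fetch each distinct address once and deliver the data to every ray in its group."""
    delivered = []
    for address, ray_ids in group_requests_by_address(requests).items():
        data = fetch(address)  # one memory access per group, not per ray
        for ray_id in ray_ids:
            delivered.append((ray_id, data))
    return delivered
```

With three requests that touch two distinct addresses, `fetch` is called twice rather than three times, which is the sense in which grouping reduces the number of cache lines needed.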
-
Publication No.: US20220020202A1
Publication Date: 2022-01-20
Application No.: US17490024
Filing Date: 2021-09-30
Applicant: NVIDIA Corporation
Inventor: Samuli LAINE , Tero KARRAS , Greg MUTHLER , William Parsons NEWHALL , Ronald Charles BABICH , Ignacio LLAMAS , John BURGESS
Abstract: A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include opaque and alpha triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to determine primitives intersected by the ray, and return intersection information to a streaming multiprocessor for further processing. The hardware-based traversal coprocessor is configured to provide a deterministic result of intersected triangles regardless of the order that the memory subsystem returns triangle range blocks for processing, while opportunistically eliminating alpha intersections that lie further along the length of the ray than closer opaque intersections.
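The order-independence property described in this abstract can be illustrated with a small batch sketch. It assumes each hit is a hypothetical `(t, primitive_id, is_alpha)` tuple, where `t` is the distance along the ray. The hardware works on streamed triangle range blocks and culls opportunistically; this sketch only shows why the final answer need not depend on arrival order.

```python
def resolve_hits(hits):
    """Return (closest_opaque, alpha_hits) independent of the order of `hits`.

    closest_opaque is the (t, primitive_id) of the nearest opaque hit, or None.
    alpha_hits are the alpha hits strictly in front of it, sorted by (t, primitive_id).
    """
    opaque = [(t, pid) for t, pid, is_alpha in hits if not is_alpha]
    closest_opaque = min(opaque, default=None)  # ties broken by primitive id
    limit = closest_opaque[0] if closest_opaque else float("inf")
    # Alpha hits behind the closest opaque hit can never be visible, so drop them.
    alpha_hits = sorted((t, pid) for t, pid, is_alpha in hits if is_alpha and t < limit)
    return closest_opaque, alpha_hits
```

Because the result is built from a minimum and a sort, shuffling the input list leaves the output unchanged.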
-
Publication No.: US20210390758A1
Publication Date: 2021-12-16
Application No.: US16898980
Filing Date: 2020-06-11
Applicant: NVIDIA Corporation
Inventor: Gregory MUTHLER , John BURGESS
Abstract: Enhanced techniques applicable to a ray tracing hardware accelerator for traversing a hierarchical acceleration structure are disclosed. For example, traversal efficiency is improved by combining programmable traversals based on ray operations with per-node static configurations that modify traversal behavior. The per-node static configurations enable creators of acceleration data structures to optimize for potential traversals without necessarily requiring detailed information about ray characteristics and ray operations used when traversing the acceleration structure. Moreover, by providing for selective exclusion of certain nodes using per-node static configurations, less memory is needed to express an acceleration structure that includes, for example, different geometric levels of details corresponding to a single object.
-
Publication No.: US20200051316A1
Publication Date: 2020-02-13
Application No.: US16101196
Filing Date: 2018-08-10
Applicant: NVIDIA Corporation
Inventor: Samuli LAINE , Tero KARRAS , Greg MUTHLER , William Parsons NEWHALL, JR. , Ronald Charles BABICH , Ignacio LLAMAS , John BURGESS
Abstract: A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include opaque and alpha triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to determine primitives intersected by the ray, and return intersection information to a streaming multiprocessor for further processing. The hardware-based traversal coprocessor is configured to provide a deterministic result of intersected triangles regardless of the order that the memory subsystem returns triangle range blocks for processing, while opportunistically eliminating alpha intersections that lie further along the length of the ray than closer opaque intersections.
-
Publication No.: US20250046003A1
Publication Date: 2025-02-06
Application No.: US18921368
Filing Date: 2024-10-21
Applicant: NVIDIA Corporation
Inventor: Gregory MUTHLER , John BURGESS , Magnus ANDERSSON , Timo VIITANEN , Levi OLIVER
Abstract: An alternate root tree or graph structure for ray and path tracing enables dynamic instancing build time decisions to split any number of geometry acceleration structures in a manner that is developer transparent, nearly memory storage neutral, and traversal efficient. The resulting traversals only need to partially traverse the acceleration structure, which improves efficiency. One example use reduces the number of false positive instance acceleration structure to geometry acceleration structure transitions for many spatially separated instances of the same geometry.
-
Publication No.: US20240362851A1
Publication Date: 2024-10-31
Application No.: US18769000
Filing Date: 2024-07-10
Applicant: NVIDIA Corporation
Inventor: Gregory MUTHLER , John BURGESS
CPC classification number: G06T15/06 , G06F9/5027 , G06T15/08 , G06T17/005 , G06T17/10 , G06T2210/12
Abstract: A bounding volume is used to approximate the space an object occupies. If a more precise understanding beyond an approximation is required, the object itself is then inspected to determine what space it occupies. Often, a simple volume (such as an axis-aligned box) is used as a bounding volume to approximate the space occupied by an object. But objects can be arbitrary, complicated shapes, so a simple volume often does not fit the object very well. That causes a lot of space that is not occupied by the object to be included in the approximation of the space being occupied by the object. Hardware-based techniques are disclosed herein, for example, for efficiently using multiple bounding volumes (such as axis-aligned bounding boxes) to represent, in effect, an arbitrarily shaped bounding volume to better fit the object, and for using such arbitrary bounding volumes to improve performance in applications such as ray tracing.
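A minimal sketch of the multi-box idea, assuming the arbitrary bounding volume is modeled as the union of several axis-aligned boxes and that each box is tested with the standard slab method. The function names and the L-shaped example are illustrative assumptions, not taken from the patent.

```python
def ray_hits_aabb(origin, direction, box_min, box_max):
    """Standard slab test: does the ray origin + t * direction (t >= 0) hit the box?"""
    t_min, t_max = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of slabs: it must already lie between them.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_min, t_max = max(t_min, t0), min(t_max, t1)
        if t_min > t_max:
            return False
    return True


def ray_hits_multibox(origin, direction, boxes):
    """The ray hits the composite volume if it hits any of the member boxes."""
    return any(ray_hits_aabb(origin, direction, lo, hi) for lo, hi in boxes)


# An L-shaped object covered by two boxes instead of one 3 x 3 x 1 box.
L_SHAPE = [((0.0, 0.0, 0.0), (3.0, 1.0, 1.0)), ((0.0, 0.0, 0.0), (1.0, 3.0, 1.0))]
ENCLOSING = ((0.0, 0.0, 0.0), (3.0, 3.0, 1.0))
```

A ray aimed at the empty corner of the L, for example from `(2.5, 2.5, 5.0)` straight down the z axis, hits `ENCLOSING` but misses `L_SHAPE`, so the tighter fit avoids a needless inspection of the object.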
-
Publication No.: US20240211255A1
Publication Date: 2024-06-27
Application No.: US18596106
Filing Date: 2024-03-05
Applicant: NVIDIA Corporation
Inventor: Ronald Charles BABICH, JR. , John BURGESS , Jack CHOQUETTE , Tero KARRAS , Samuli LAINE , Ignacio LLAMAS , Gregory MUTHLER , William Parsons NEWHALL, JR.
CPC classification number: G06F9/3004 , G06F9/3877 , G06F9/4843 , G06F15/163 , G06T1/20 , G06T1/60 , G06T2200/28
Abstract: Systems and methods for an efficient and robust multiprocessor-coprocessor interface that may be used between a streaming multiprocessor and an acceleration coprocessor in a GPU are provided. According to an example implementation, in order to perform an acceleration of a particular operation using the coprocessor, the multiprocessor issues a series of write instructions to write input data for the operation into coprocessor-accessible storage locations, issues an operation instruction to cause the coprocessor to execute the particular operation, and then issues a series of read instructions to read result data of the operation from coprocessor-accessible storage locations to multiprocessor-accessible storage locations.
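The write, operate, read sequence in this abstract can be modeled in a few lines. The `Coprocessor` class, its slot names, and the `accelerate` helper below are hypothetical stand-ins used only to show the three phases; they do not correspond to any real instruction set.

```python
class Coprocessor:
    """Toy model of a coprocessor with input and result storage slots."""

    def __init__(self):
        self.inputs = {}
        self.results = {}

    def write(self, slot, value):
        """Write instruction: place one input value in coprocessor-accessible storage."""
        self.inputs[slot] = value

    def execute(self, operation):
        """Operation instruction: run the operation over everything written so far."""
        self.results = operation(self.inputs)

    def read(self, slot):
        """Read instruction: copy one result value back to the caller."""
        return self.results[slot]


def accelerate(coprocessor, operation, inputs, result_slots):
    """Issue the three phases in order: a series of writes, one operation, a series of reads."""
    for slot, value in inputs.items():
        coprocessor.write(slot, value)
    coprocessor.execute(operation)
    return {slot: coprocessor.read(slot) for slot in result_slots}
```

Splitting the transfer into many small writes and reads around a single operation instruction keeps each instruction simple, which is the property the abstract emphasizes.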
-
Publication No.: US20240169655A1
Publication Date: 2024-05-23
Application No.: US18420449
Filing Date: 2024-01-23
Applicant: NVIDIA Corporation
Inventor: Greg MUTHLER , Ronald Charles BABICH, JR. , William Parsons NEWHALL, JR. , Peter NELSON , James ROBERTSON , John BURGESS
CPC classification number: G06T15/06 , G06F9/3877 , G06N5/046 , G06T1/20 , G06T1/60 , G06T17/005
Abstract: In a ray tracer, to prevent any long-running query from hanging the graphics processing unit, a traversal coprocessor provides a preemption mechanism that will allow rays to stop processing or time out early. The example non-limiting implementations described herein provide such a preemption mechanism, including a forward progress guarantee, and additional programmable timeout options that can be time or cycle based. Those programmable options provide a means for quality of service timing guarantees for applications such as virtual reality (VR) that have strict timing requirements.
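A minimal sketch of a cycle-based timeout with a forward progress guarantee, assuming traversal is modeled as a list of hypothetical per-step cycle costs. The guarantee is implemented here by always running the first step of a launch, so a preempted ray that is relaunched repeatedly still finishes.

```python
def run_with_timeout(step_costs, cycle_budget):
    """Run traversal steps until finished or until the cycle budget would be exceeded.

    Returns (completed_steps, timed_out).
    """
    cycles_used = 0
    completed = 0
    for cost in step_costs:
        # Forward-progress guarantee: the first step always runs, even with a zero budget.
        if completed > 0 and cycles_used + cost > cycle_budget:
            return completed, True
        cycles_used += cost
        completed += 1
    return completed, False


def launches_needed(step_costs, cycle_budget):
    """Relaunch the remaining work after each timeout and count the launches."""
    remaining = list(step_costs)
    launches = 0
    while remaining:
        completed, _ = run_with_timeout(remaining, cycle_budget)
        remaining = remaining[completed:]
        launches += 1
    return launches
```

Even with a budget of zero, each launch retires one step, so `launches_needed` always terminates.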
-
Publication No.: US20240104826A1
Publication Date: 2024-03-28
Application No.: US18509038
Filing Date: 2023-11-14
Applicant: NVIDIA Corporation
Inventor: Gregory MUTHLER , John BURGESS , Ronald Charles BABICH, JR. , William Parsons NEWHALL, JR.
CPC classification number: G06T15/06 , G06F9/48 , G06F9/5027 , G06T17/10 , G06T2210/21
Abstract: Techniques are disclosed for improving the throughput of ray intersection or visibility queries performed by a ray tracing hardware accelerator. Throughput is improved, for example, by releasing allocated resources before ray visibility query results are reported by the hardware accelerator. The allocated resources are released when the ray visibility query results can be stored in a compressed format outside of the allocated resources. When reporting the ray visibility query results, the results are reconstructed based on the results stored in the compressed format. The compressed format storage can be used for ray visibility queries that return no intersections or for terminate-on-any-hit ray visibility queries. One or more individual components of allocated resources can also be independently deallocated based on the type of data to be returned and/or results of the ray visibility query.
-
Publication No.: US20240095996A1
Publication Date: 2024-03-21
Application No.: US17946509
Filing Date: 2022-09-16
Applicant: NVIDIA Corporation
Inventor: Gregory MUTHLER , John BURGESS , Eric ENDERTON , Nikhil DIXIT , Josh NOEL
IPC: G06T15/06
CPC classification number: G06T15/06
Abstract: To improve the efficiency of bounding volumes in a hardware-based ray tracer, we employ a sheared axis-aligned bounding box to approximate an oriented bounding box typically defined by rotations. To achieve this, the bounding volume hierarchy builder shears an axis-aligned box to fit tightly around its enclosed oriented geometry in top-level or bottom-level space, then computes the inverse shear transform. The bounds are still stored as axis-aligned boxes in memory, now defined in the new sheared coordinate system, along with the derived parameters to transform a ray into the sheared coordinate system before testing intersection with the boxes. The ray-bounding volume intersection test is performed as usual, just in the new sheared coordinate system. Additional efficiencies are gained by constraining the number of shear dimensions, constraining the shear transform coefficients to a quantized list, sharing a shear transform across a collection of bounds, performing a shear transform only for ray-bounds testing and not for ray-geometry intersection testing, and adding a specialized shear transform calculator/accelerator to the hardware.
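A minimal sketch of the ray-bounds test in a sheared coordinate system, assuming a single shear dimension in which x is sheared by y with a hypothetical coefficient `k`. Because a shear is linear, the ray origin and direction are transformed the same way and the ray parameter is unchanged, so the usual slab test applies to the stored axis-aligned box.

```python
def shear_x_by_y(v, k):
    """Map a point or direction into the sheared system: x' = x + k * y."""
    x, y, z = v
    return (x + k * y, y, z)


def slab_test(origin, direction, box_min, box_max):
    """Standard ray versus axis-aligned box slab test for t >= 0."""
    t_min, t_max = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_min, t_max = max(t_min, t0), min(t_max, t1)
        if t_min > t_max:
            return False
    return True


def ray_hits_sheared_box(origin, direction, k, box_min, box_max):
    """Transform the ray into sheared coordinates, then test the box as usual."""
    return slab_test(shear_x_by_y(origin, k), shear_x_by_y(direction, k), box_min, box_max)


# A thin strip along the diagonal y = x becomes a tight box when k = -1.
K = -1.0
STRIP_MIN, STRIP_MAX = (-0.1, 0.0, 0.0), (0.1, 4.0, 1.0)
```

For the diagonal strip, an unsheared axis-aligned box would span roughly 4 units in both x and y, while the sheared box is only 0.2 units wide. A ray dropped at `(3.0, 0.5, 5.0)` lands inside the loose box but is rejected by the sheared one.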