-
公开(公告)号:US11189075B2
公开(公告)日:2021-11-30
申请号:US16893107
申请日:2020-06-04
Applicant: NVIDIA CORPORATION
Inventor: Samuli Laine , Timo Aila , Tero Karras , Gregory Muthler , William P. Newhall, Jr. , Ronald C. Babich, Jr. , Craig Kolb , Ignacio Llamas , John Burgess
Abstract: Methods and systems are described in some examples for changing the traversal of an acceleration data structure in a highly dynamic query-specific manner, with each query specifying test parameters, a test opcode and a mapping of test results to actions. In an example ray tracing implementation, traversal of a bounding volume hierarchy by a ray is performed with the default behavior of the traversal being changed in accordance with results of a test performed using the test opcode and test parameters specified in the ray data structure and another test parameter specified in a node of the bounding volume hierarchy. In an example implementation a traversal coprocessor is configured to perform the traversal of the bounding volume hierarchy.
-
公开(公告)号:US20210349763A1
公开(公告)日:2021-11-11
申请号:US17169283
申请日:2021-02-05
Applicant: NVIDIA Corporation
Inventor: Stephen Jones , Philip Alexander Cuadra , Daniel Elliot Wexler , Ignacio Llamas , Lacky V. Shah , Jerome F. Duluk, JR. , Christopher Lamb
Abstract: One embodiment of the present invention sets forth a technique for performing nested kernel execution within a parallel processing subsystem. The technique involves enabling a parent thread to launch a nested child grid on the parallel processing subsystem, and enabling the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid. This technique advantageously enables the parallel processing subsystem to perform a richer set of programming constructs, such as conditionally executed and nested operations and externally defined library functions without the additional complexity of CPU involvement.
-
公开(公告)号:US11157414B2
公开(公告)日:2021-10-26
申请号:US16101109
申请日:2018-08-10
Applicant: NVIDIA Corporation
Inventor: Greg Muthler , Timo Aila , Tero Karras , Samuli Laine , William Parsons Newhall, Jr. , Ronald Charles Babich, Jr. , John Burgess , Ignacio Llamas
IPC: G06F12/00 , G06F12/0875 , G06T15/06 , G06F16/901
Abstract: In a ray tracer, a cache for streaming workloads groups ray requests for coherent successive bounding volume hierarchy traversal operations by sending common data down an attached data path to all ray requests in the group at the same time or about the same time. Grouping the requests provides good performance with a smaller number of cache lines.
-
公开(公告)号:US11138009B2
公开(公告)日:2021-10-05
申请号:US16101247
申请日:2018-08-10
Applicant: NVIDIA Corporation
Inventor: Ronald Charles Babich, Jr. , John Burgess , Jack Choquette , Tero Karras , Samuli Laine , Ignacio Llamas , Gregory Muthler , William Parsons Newhall, Jr.
Abstract: Systems and methods for an efficient and robust multiprocessor-coprocessor interface that may be used between a streaming multiprocessor and an acceleration coprocessor in a GPU are provided. According to an example implementation, in order to perform an acceleration of a particular operation using the coprocessor, the multiprocessor: issues a series of write instructions to write input data for the operation into coprocessor-accessible storage locations, issues an operation instruction to cause the coprocessor to execute the particular operation; and then issues a series of read instructions to read result data of the operation from coprocessor-accessible storage locations to multiprocessor-accessible storage locations.
-
公开(公告)号:US20200151016A1
公开(公告)日:2020-05-14
申请号:US16746714
申请日:2020-01-17
Applicant: NVIDIA Corporation
Inventor: Stephen Jones , Philip Alexander Cuadra , Daniel Elliot Wexler , Ignacio Llamas , Lacky V. Shah , Jerome F. Duluk, Jr. , Christopher Lamb
Abstract: One embodiment of the present invention sets forth a technique for performing nested kernel execution within a parallel processing subsystem. The technique involves enabling a parent thread to launch a nested child grid on the parallel processing subsystem, and enabling the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid. This technique advantageously enables the parallel processing subsystem to perform a richer set of programming constructs, such as conditionally executed and nested operations and externally defined library functions without the additional complexity of CPU involvement.
-
16.
公开(公告)号:US09697044B2
公开(公告)日:2017-07-04
申请号:US13899343
申请日:2013-05-21
Applicant: NVIDIA CORPORATION
Inventor: Ignacio Llamas
CPC classification number: G06F9/4881
Abstract: An application programming interface (API) provides various software constructs that allow a developer to assemble a processing pipeline having arbitrary structure and complexity. Once assembled, the processing pipeline is configured to include a set of interconnected pipestages. Those pipestages are associated with one or more different CTAs that may execute in parallel with one another on a parallel processing unit. The developer specifies the configuration of the pipestages, including the configuration of the different CTAs across all pipestages, as well as the different processing operations performed by each different CTA.
-
公开(公告)号:US20230360321A1
公开(公告)日:2023-11-09
申请号:US18353809
申请日:2023-07-17
Applicant: NVIDIA Corporation
Inventor: Martin Stich , Ignacio Llamas , Steven Parker
CPC classification number: G06T15/83 , G06F9/54 , G06T15/005 , G06T15/06
Abstract: In various examples, shader bindings may be recorded in a shader binding table that includes shader records. Geometry of a 3D scene may be instantiated using object instances, and each may be associated with a respective set of the shader records using a location identifier of the set of shader records in memory. The set of shader records may represent shader bindings for an object instance under various predefined conditions. One or more of these predefined conditions may be implicit in the way the shader records are arranged in memory (e.g., indexed by ray type, by sub-geometry, etc.). For example, a section selector value (e.g., a section index) may be computed to locate and select a shader record based at least in part on a result of a ray tracing query (e.g., what sub-geometry was hit, what ray type was traced, etc.).
-
18.
公开(公告)号:US20220058856A1
公开(公告)日:2022-02-24
申请号:US17520100
申请日:2021-11-05
Applicant: NVIDIA Corporation
Inventor: Greg Muthler , Tero Karras , Samuli Laine , William Parsons Newhall, JR. , Ronald Charles Babich, JR. , John Burgess , Ignacio Llamas
IPC: G06T15/06
Abstract: A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include opaque and alpha triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to determine primitives intersected by the ray, and return intersection information to a streaming multiprocessor for further processing. The hardware-based traversal coprocessor is configured to omit reporting of one or more primitives the ray is determined to intersect. The omitted primitives include primitives which are provably capable of being omitted without a functional impact on visualizing the virtual scene.
-
公开(公告)号:US20210343072A1
公开(公告)日:2021-11-04
申请号:US17376866
申请日:2021-07-15
Applicant: NVIDIA CORPORATION
Inventor: Martin Stich , Ignacio Llamas , Steven Parker
Abstract: In various examples, shader bindings may be recorded in a shader binding table that includes shader records. Geometry of a 3D scene may be instantiated using object instances, and each may be associated with a respective set of the shader records using a location identifier of the set of shader records in memory. The set of shader records may represent shader bindings for an object instance under various predefined conditions. One or more of these predefined conditions may be implicit in the way the shader records are arranged in memory (e.g., indexed by ray type, by sub-geometry, etc.). For example, a section selector value (e.g., a section index) may be computed to locate and select a shader record based at least in part on a result of a ray tracing query (e.g., what sub-geometry was hit, what ray type was traced, etc.).
-
公开(公告)号:US11069129B2
公开(公告)日:2021-07-20
申请号:US16376943
申请日:2019-04-05
Applicant: NVIDIA Corporation
Inventor: Martin Stich , Ignacio Llamas , Steven Parker
Abstract: In various examples, shader bindings may be recorded in a shader binding table that includes shader records. Geometry of a 3D scene may be instantiated using object instances, and each may be associated with a respective set of the shader records using a location identifier of the set of shader records in memory. The set of shader records may represent shader bindings for an object instance under various predefined conditions. One or more of these predefined conditions may be implicit in the way the shader records are arranged in memory (e.g., indexed by ray type, by sub-geometry, etc.). For example, a section selector value (e.g., a section index) may be computed to locate and select a shader record based at least in part on a result of a ray tracing query (e.g., what sub-geometry was hit, what ray type was traced, etc.).
-
-
-
-
-
-
-
-
-