-
Publication No.: US11966737B2
Publication Date: 2024-04-23
Application No.: US17465234
Filing Date: 2021-09-02
Applicant: NVIDIA Corporation
Inventors: Ronald Charles Babich, Jr.; John Burgess; Jack Choquette; Tero Karras; Samuli Laine; Ignacio Llamas; Gregory Muthler; William Parsons Newhall, Jr.
CPC Classification: G06F9/3004, G06F9/3877, G06F9/4843, G06F15/163, G06T1/20, G06T1/60, G06T2200/28
Abstract: Systems and methods for an efficient and robust multiprocessor-coprocessor interface that may be used between a streaming multiprocessor and an acceleration coprocessor in a GPU are provided. According to an example implementation, in order to perform an acceleration of a particular operation using the coprocessor, the multiprocessor: issues a series of write instructions to write input data for the operation into coprocessor-accessible storage locations; issues an operation instruction to cause the coprocessor to execute the particular operation; and then issues a series of read instructions to read result data of the operation from coprocessor-accessible storage locations to multiprocessor-accessible storage locations.
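The write, operate, read sequence in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the `Coprocessor` class, its slot-indexed storage, and the `accelerate` helper are hypothetical names invented for illustration, and nothing here reflects the actual hardware instructions.

```python
class Coprocessor:
    """Toy stand-in for an acceleration coprocessor (hypothetical model)."""

    def __init__(self):
        self.inputs = {}   # models coprocessor-accessible input locations
        self.results = {}  # models coprocessor-accessible result locations

    def write(self, slot, value):
        # One "write instruction": place an operand in coprocessor storage.
        self.inputs[slot] = value

    def operate(self, op):
        # The "operation instruction": run `op` over the operands in slot order.
        args = [self.inputs[slot] for slot in sorted(self.inputs)]
        self.results[0] = op(*args)

    def read(self, slot):
        # One "read instruction": copy a result back to the caller.
        return self.results[slot]


def accelerate(cop, op, operands):
    """Issue the writes, then one operation, then the read, in that order."""
    for slot, value in enumerate(operands):
        cop.write(slot, value)
    cop.operate(op)
    return cop.read(0)


result = accelerate(Coprocessor(), lambda a, b: a + b, [2, 3])
```

Keeping the three phases as separate calls mirrors the abstract's ordering; a real interface would also have to deal with concurrency and preemption, which this model ignores.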
-
Publication No.: US11675704B2
Publication Date: 2023-06-13
Application No.: US17483133
Filing Date: 2021-09-23
Applicant: NVIDIA Corporation
Inventors: Greg Muthler; Timo Aila; Tero Karras; Samuli Laine; William Parsons Newhall, Jr.; Ronald Charles Babich, Jr.; John Burgess; Ignacio Llamas
IPC Classification: G06F12/00, G06F12/0875, G06T15/06, G06F16/901
CPC Classification: G06F12/0875, G06F16/9027, G06T15/06, G06T2207/20021
Abstract: In a ray tracer, a cache for streaming workloads groups ray requests for coherent successive bounding volume hierarchy traversal operations by sending common data down an attached data path to all ray requests in the group at the same time or about the same time. Grouping the requests provides good performance with a smaller number of cache lines.
-
Publication No.: US11164360B2
Publication Date: 2021-11-02
Application No.: US16919700
Filing Date: 2020-07-02
Applicant: NVIDIA Corporation
Inventors: Samuli Laine; Tero Karras; Greg Muthler; William Parsons Newhall; Ronald Charles Babich; Ignacio Llamas; John Burgess
Abstract: A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include opaque and alpha triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to determine primitives intersected by the ray, and return intersection information to a streaming multiprocessor for further processing. The hardware-based traversal coprocessor is configured to provide a deterministic result of intersected triangles regardless of the order that the memory subsystem returns triangle range blocks for processing, while opportunistically eliminating alpha intersections that lie further along the length of the ray than closer opaque intersections.
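One way to picture the order-independence described in this abstract is a software reduction over hit records. The sketch below is an assumption-laden illustration, not the patented mechanism: the `(t, triangle_id, is_alpha)` tuple layout, the `resolve_hits` name, and the tie-break on triangle id are all hypothetical.

```python
def resolve_hits(blocks):
    """Return (closest opaque hit or None, alpha hits in front of it).

    Each hit is a hypothetical (t, triangle_id, is_alpha) tuple, where t is
    the distance along the ray. Blocks may arrive in any order.
    """
    closest_opaque = None
    alpha_hits = []
    for block in blocks:
        for hit in block:
            t, tri, is_alpha = hit
            if is_alpha:
                # Skip alpha hits already known to lie behind an opaque hit.
                if closest_opaque is None or t < closest_opaque[0]:
                    alpha_hits.append(hit)
            elif closest_opaque is None or (t, tri) < closest_opaque[:2]:
                # Ties on t are broken by triangle id so the winner does not
                # depend on arrival order.
                closest_opaque = hit
                # Discard alpha hits that the new opaque hit now occludes.
                alpha_hits = [h for h in alpha_hits if h[0] < t]
    alpha_hits.sort()  # canonical order for the surviving alpha hits
    return closest_opaque, alpha_hits


blocks = [
    [(5.0, 1, False), (2.0, 7, True)],
    [(3.0, 2, False), (4.0, 8, True)],
    [(1.0, 9, True)],
]
closest, alphas = resolve_hits(blocks)
```

Because the final answer depends only on the minimum opaque hit and the alpha hits strictly in front of it, any permutation of `blocks` yields the same pair.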
-
Publication No.: US10915364B2
Publication Date: 2021-02-09
Application No.: US15368434
Filing Date: 2016-12-02
Applicant: Nvidia Corporation
Inventors: Stephen Jones; Philip Alexander Cuadra; Daniel Elliot Wexler; Ignacio Llamas; Lacky V. Shah; Jerome F. Duluk; Christopher Lamb
Abstract: Apparatuses, systems, and techniques for performing nested kernel execution within a parallel processing subsystem. In at least one embodiment, a parent thread launches a nested child grid on the parallel processing subsystem, and enables the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid.
-
Publication No.: US20200349755A1
Publication Date: 2020-11-05
Application No.: US16935431
Filing Date: 2020-07-22
Applicant: NVIDIA Corporation
Abstract: Disclosed approaches may leverage the actual spatial and reflective properties of a virtual environment (such as the size, shape, and orientation of a bidirectional reflectance distribution function (BRDF) lobe of a light path and its position relative to a reflection surface, a virtual screen, and a virtual camera) to produce, for a pixel, an anisotropic kernel filter having dimensions and weights that accurately reflect the spatial characteristics of the virtual environment as well as the reflective properties of the surface. In order to accomplish this, geometry may be computed that corresponds to a projection of a reflection of the BRDF lobe below the surface along a view vector to the pixel. Using this approach, the dimensions of the anisotropic filter kernel may correspond to the BRDF lobe to accurately reflect the spatial characteristics of the virtual environment as well as the reflective properties of the surface.
-
Publication No.: US20170083373A1
Publication Date: 2017-03-23
Application No.: US15368434
Filing Date: 2016-12-02
Applicant: Nvidia Corporation
Inventors: Stephen Jones; Philip Alexander Cuadra; Daniel Elliot Wexler; Ignacio Llamas; Lacky V. Shah; Jerome F. Duluk; Christopher Lamb
CPC Classification: G06F9/5027, G06F9/522, G06F2209/483, G06T1/20
Abstract: One embodiment of the present invention sets forth a technique for performing nested kernel execution within a parallel processing subsystem. The technique involves enabling a parent thread to launch a nested child grid on the parallel processing subsystem, and enabling the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid. This technique advantageously enables the parallel processing subsystem to perform a richer set of programming constructs, such as conditionally executed and nested operations and externally defined library functions without the additional complexity of CPU involvement.
-
Publication No.: US12124378B1
Publication Date: 2024-10-22
Application No.: US18137421
Filing Date: 2023-04-20
Applicant: NVIDIA Corporation
Inventors: Gregory A. Muthler; Timo Aila; Tero Karras; Samuli Laine; William Parsons Newhall, Jr.; Ronald Charles Babich, Jr.; John Burgess; Ignacio Llamas
IPC Classification: G06F12/00, G06F12/0875, G06F16/901, G06T15/06
CPC Classification: G06F12/0875, G06F16/9027, G06T15/06, G06T2207/20021
Abstract: In a ray tracer, a cache for streaming workloads groups ray requests for coherent successive bounding volume hierarchy traversal operations by sending common data down an attached data path to all ray requests in the group at the same time or about the same time. Grouping the requests provides good performance with a smaller number of cache lines.
-
Publication No.: US20240257439A1
Publication Date: 2024-08-01
Application No.: US18612293
Filing Date: 2024-03-21
Applicant: NVIDIA Corporation
CPC Classification: G06T15/06, G06T5/20, G06T5/70, G06T15/506, G06T15/60, G06T2210/21
Abstract: Disclosed approaches may leverage the actual spatial and reflective properties of a virtual environment (such as the size, shape, and orientation of a bidirectional reflectance distribution function (BRDF) lobe of a light path and its position relative to a reflection surface, a virtual screen, and a virtual camera) to produce, for a pixel, an anisotropic kernel filter having dimensions and weights that accurately reflect the spatial characteristics of the virtual environment as well as the reflective properties of the surface. In order to accomplish this, geometry may be computed that corresponds to a projection of a reflection of the BRDF lobe below the surface along a view vector to the pixel. Using this approach, the dimensions of the anisotropic filter kernel may correspond to the BRDF lobe to accurately reflect the spatial characteristics of the virtual environment as well as the reflective properties of the surface.
-
Publication No.: US11804000B2
Publication Date: 2023-10-31
Application No.: US17513023
Filing Date: 2021-10-28
Applicant: NVIDIA CORPORATION
Inventors: Samuli Laine; Timo Aila; Tero Karras; Gregory Muthler; William P. Newhall, Jr.; Ronald C. Babich, Jr.; Craig Kolb; Ignacio Llamas; John Burgess
CPC Classification: G06T15/06, G06T15/005, G06T17/005
Abstract: Methods and systems are described in some examples for changing the traversal of an acceleration data structure in a highly dynamic query-specific manner, with each query specifying test parameters, a test opcode and a mapping of test results to actions. In an example ray tracing implementation, traversal of a bounding volume hierarchy by a ray is performed with the default behavior of the traversal being changed in accordance with results of a test performed using the test opcode and test parameters specified in the ray data structure and another test parameter specified in a node of the bounding volume hierarchy. In an example implementation, a traversal coprocessor is configured to perform the traversal of the bounding volume hierarchy.
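The query-specific test in this abstract can be sketched as a per-node comparison whose opcode, parameter, and result-to-action mapping travel with the ray. Everything concrete below is an assumption made for illustration: the opcode names, the `"cull"` and `"descend"` actions, and the dictionary layout of rays and nodes are hypothetical, since the abstract does not enumerate them.

```python
import operator

# Hypothetical opcode table; the abstract does not list the real opcodes.
OPCODES = {
    "LESS": operator.lt,
    "EQUAL": operator.eq,
    "GREATER": operator.gt,
}


def node_action(ray, node):
    """Choose the action for one node.

    The ray supplies the opcode, one test parameter, and the mapping from
    test result to action; the node supplies the other test parameter.
    """
    passed = OPCODES[ray["opcode"]](node["param"], ray["param"])
    return ray["actions"][passed]


def traverse(ray, node, visited):
    """Depth-first walk that skips any subtree whose action is "cull"."""
    if node_action(ray, node) == "cull":
        return visited
    visited.append(node["name"])
    for child in node.get("children", []):
        traverse(ray, child, visited)
    return visited


tree = {
    "name": "root", "param": 0,
    "children": [
        {"name": "a", "param": 5},
        {"name": "b", "param": 1, "children": [{"name": "c", "param": 0}]},
    ],
}
# This query culls every node whose parameter is greater than 3.
ray = {"opcode": "GREATER", "param": 3,
       "actions": {True: "cull", False: "descend"}}
visited = traverse(ray, tree, [])
```

A second ray with a different opcode or action mapping would walk the same tree differently, which is the per-query flexibility the abstract describes.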
-
Publication No.: US11790595B2
Publication Date: 2023-10-17
Application No.: US17490024
Filing Date: 2021-09-30
Applicant: NVIDIA Corporation
Inventors: Samuli Laine; Tero Karras; Greg Muthler; William Parsons Newhall, Jr.; Ronald Charles Babich, Jr.; Ignacio Llamas; John Burgess
CPC Classification: G06T15/06, G06T1/20, G06T15/005, G06T2210/21
Abstract: A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include opaque and alpha triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to determine primitives intersected by the ray, and return intersection information to a streaming multiprocessor for further processing. The hardware-based traversal coprocessor is configured to provide a deterministic result of intersected triangles regardless of the order that the memory subsystem returns triangle range blocks for processing, while opportunistically eliminating alpha intersections that lie further along the length of the ray than closer opaque intersections.