-
Publication number: US20210294660A1
Publication date: 2021-09-23
Application number: US17204508
Filing date: 2021-03-17
Applicant: NVIDIA CORPORATION
Inventor: Yury URALSKY , Henry MORETON , Matthijs de SMEDT , Lei YANG
Abstract: The present technology augments the GPU compute model to provide system-provided data marshalling characteristics of graphics pipelining to increase efficiency and reduce overhead. A simple scheduling model based on scalar counters (e.g., semaphores) abstract the availability of hardware resources. Resource releases can be done programmatically, and a system scheduler only needs to track the states of such counters/semaphores to make work launch decisions. Semantics of the counters/semaphores are defined by an application, which can use the counters/semaphores to represent the availability of free space in a memory buffer, the amount of cache pressure induced by the data flow in the network, or the presence of work items to be processed.
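Illustrative sketch (not taken from the patent text): the counter-based launch decision described in the abstract above can be modelled roughly as follows. The `Task` class, the counter names `free_slots` and `work_items`, and the acquire/release split are assumptions made for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    acquires: dict  # counter name -> amount that must be available before launch
    releases: dict  # counter name -> amount returned when the task completes

def try_launch(task, counters):
    """Launch only if every counter the task depends on is large enough."""
    if all(counters[c] >= n for c, n in task.acquires.items()):
        for c, n in task.acquires.items():
            counters[c] -= n
        return True
    return False

def complete(task, counters):
    """Programmatic release: the finished task bumps the counters it feeds."""
    for c, n in task.releases.items():
        counters[c] += n

# Application-defined semantics: free buffer space and pending work items.
counters = {"free_slots": 2, "work_items": 0}
producer = Task("producer", {"free_slots": 1}, {"work_items": 1})
consumer = Task("consumer", {"work_items": 1}, {"free_slots": 1})
```

In this sketch the scheduler never inspects the buffer itself; it looks only at the counter values, which is the property the abstract emphasises.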
-
Publication number: US20230078932A1
Publication date: 2023-03-16
Application number: US17946828
Filing date: 2022-09-16
Applicant: NVIDIA Corporation
Inventor: John BURGESS , Gregory MUTHLER , Nikhil DIXIT , Henry MORETON , Yury URALSKY , Magnus ANDERSSON , Marco SALVI , Christoph KUBISCH
Abstract: A Displaced Micro-mesh (DMM) primitive enables high complexity geometry for ray and path tracing while minimizing the associated builder costs and preserving high efficiency. A structured, hierarchical representation implicitly encodes vertex positions of a triangle micro-mesh based on a barycentric grid, and enables microvertex displacements to be encoded efficiently (e.g., as scalars linearly interpolated between minimum and maximum triangle surfaces). The resulting displaced micro-mesh primitive provides a highly compressed representation of a potentially vast number of displaced microtriangles that can be stored in a small amount of space. Improvements in ray tracing hardware permit automatic processing of such primitive for ray-geometry intersection testing by ray tracing circuits without requiring intermediate reporting to a shader.
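Illustrative sketch (not taken from the patent text): the two ideas in the abstract above, implicit microvertex positions on a barycentric grid and scalar displacements interpolated between a minimum and a maximum triangle, might look like this. The power-of-two subdivision and both function names are assumptions for illustration.

```python
def microvertex_barycentrics(level):
    """Implicit barycentric grid: nothing is stored per vertex, only the level.

    Assumes each edge is split into 2**level segments.
    """
    n = 2 ** level
    return [(i / n, j / n, (n - i - j) / n)
            for i in range(n + 1)
            for j in range(n + 1 - i)]

def displaced_position(bary, tri_min, tri_max, d):
    """Place a microvertex between the minimum and maximum triangle surfaces.

    bary    : (u, v, w) barycentric coordinates of the microvertex
    tri_min : three 3D vertices of the minimum surface triangle
    tri_max : three 3D vertices of the maximum surface triangle
    d       : scalar displacement in [0, 1]
    """
    def interpolate(tri):
        return tuple(sum(b * vertex[k] for b, vertex in zip(bary, tri))
                     for k in range(3))
    lo = interpolate(tri_min)
    hi = interpolate(tri_max)
    return tuple((1 - d) * a + d * b for a, b in zip(lo, hi))

tri_min = ((0, 0, 0), (1, 0, 0), (0, 1, 0))
tri_max = ((0, 0, 2), (1, 0, 2), (0, 1, 2))
midpoint = displaced_position((1, 0, 0), tri_min, tri_max, 0.5)
```

Only one scalar per microvertex has to be stored here, which hints at why the representation can be compact.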
-
Publication number: US20190236829A1
Publication date: 2019-08-01
Application number: US15881572
Filing date: 2018-01-26
Applicant: NVIDIA Corporation
Inventor: Ziyad HAKURA , Yury URALSKY , Christoph KUBISCH , Pierre BOUDIER , Henry MORETON
CPC classification number: G06T15/005 , G06F9/4881 , G06T15/80 , G06T17/10 , G06T17/20
Abstract: In various embodiments, a deduplication application pre-processes index buffers for a graphics processing pipeline that generates rendered images via a shading program. In operation, the deduplication application causes execution threads to identify a set of unique vertices specified in an index buffer based on an instruction. The deduplication application then generates a vertex buffer and an indirect index buffer based on the set of unique vertices. The vertex buffer and the indirect index buffer are associated with a portion of an input mesh. The graphics processing pipeline then renders a first frame and a second frame based on the vertex buffer, the indirect index buffer, and the shading program. Advantageously, the graphics processing pipeline may re-use the vertex buffer and indirect index buffer until the topology of the input mesh changes.
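Illustrative sketch (not taken from the patent text): the abstract above describes execution threads finding unique vertices in parallel via an instruction; the sequential version below only shows what the two output buffers could contain. The function name `deduplicate` is hypothetical.

```python
def deduplicate(index_buffer):
    """Split an index buffer into unique vertices plus local indices.

    Returns (vertex_buffer, indirect_index_buffer), where vertex_buffer lists
    each referenced vertex index once, in first-use order, and
    indirect_index_buffer replaces each original index with its position
    in vertex_buffer.
    """
    remap = {}
    vertex_buffer = []
    indirect_index_buffer = []
    for index in index_buffer:
        if index not in remap:
            remap[index] = len(vertex_buffer)
            vertex_buffer.append(index)
        indirect_index_buffer.append(remap[index])
    return vertex_buffer, indirect_index_buffer

vertices, indirect = deduplicate([7, 3, 9, 9, 3, 12])
```

Because the result depends only on the index buffer, it can be reused across frames until the mesh topology changes, as the abstract notes.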
-
Publication number: US20170148203A1
Publication date: 2017-05-25
Application number: US14952390
Filing date: 2015-11-25
Applicant: NVIDIA CORPORATION
Inventor: Ziyad HAKURA , Cynthia ALLISON , Dale KIRKLAND , Jeffrey BOLZ , Yury URALSKY , Jonah ALBEN
CPC classification number: G06T15/005 , G06T1/20 , G06T15/405 , G06T15/503 , G06T15/80 , G06T17/20
Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.
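Illustrative sketch (not taken from the patent text): the mask-driven replay described in the abstract above can be pictured as one interleaved stream that is filtered per pass. The tuple layout, the `depth_write` setting, and the function name `run_pass` are assumptions for illustration.

```python
def run_pass(stream, pass_index):
    """Replay one pass over an interleaved stream of state and primitives.

    Each entry is (kind, payload, mask); bit `pass_index` of the mask says
    whether the entry takes part in this pass.
    """
    bit = 1 << pass_index
    state = {}
    processed = []
    for kind, payload, mask in stream:
        if not mask & bit:
            continue  # entry is not enabled for this pass
        if kind == "state":
            state.update(payload)  # configure the pipeline for this pass
        else:
            processed.append((payload, dict(state)))  # draw with current state
    return processed

stream = [
    ("state", {"depth_write": True}, 0b01),   # pass 0 only
    ("state", {"depth_write": False}, 0b10),  # pass 1 only
    ("prim", "tri0", 0b11),                   # both passes
    ("prim", "tri1", 0b10),                   # pass 1 only
]
```

The same buffered stream is walked once per pass, so the primitives do not have to be resubmitted by the driver in this model.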
-
Publication number: US20250094232A1
Publication date: 2025-03-20
Application number: US18961452
Filing date: 2024-11-27
Applicant: NVIDIA CORPORATION
Inventor: Yury URALSKY , Henry MORETON , Matthijs de SMEDT , Lei YANG
Abstract: The present technology augments the GPU compute model to provide system-provided data marshalling characteristics of graphics pipelining to increase efficiency and reduce overhead. A simple scheduling model based on scalar counters (e.g., semaphores) abstract the availability of hardware resources. Resource releases can be done programmatically, and a system scheduler only needs to track the states of such counters/semaphores to make work launch decisions. Semantics of the counters/semaphores are defined by an application, which can use the counters/semaphores to represent the availability of free space in a memory buffer, the amount of cache pressure induced by the data flow in the network, or the presence of work items to be processed.
-
Publication number: US20190236828A1
Publication date: 2019-08-01
Application number: US15881566
Filing date: 2018-01-26
Applicant: NVIDIA Corporation
Inventor: Ziyad HAKURA , Yury URALSKY , Christoph KUBISCH , Pierre BOUDIER , Henry MORETON
CPC classification number: G06T15/005 , G06F9/4881 , G06T15/80 , G06T17/10 , G06T17/20
Abstract: In various embodiments, a parallel processor implements a graphics processing pipeline that generates rendered images. In operation, the parallel processor causes execution threads to execute a task shading program on an input mesh to generate a task shader output specifying a mesh shader count. The parallel processor then generates mesh shader identifiers, where the total number of the mesh shader identifiers equals the mesh shader count. For each mesh shader identifier, the parallel processor invokes a mesh shader based on the mesh shader identifier and the task shader output to generate geometry associated with the mesh shader identifier. Subsequently, the parallel processor performs operations on the geometries associated with the mesh shader identifiers to generate a rendered image. Advantageously, unlike conventional graphics processing pipelines, the performance of the graphics processing pipeline is not limited by a primitive distributor.
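Illustrative sketch (not taken from the patent text): the task-shader-then-mesh-shader flow in the abstract above, reduced to plain functions. Splitting the input into fixed-size chunks and the names `task_shader`, `mesh_shader`, and `meshlet_size` are assumptions for illustration.

```python
def task_shader(input_mesh, meshlet_size):
    """Decide how many mesh shader invocations are needed."""
    count = -(-len(input_mesh) // meshlet_size)  # ceiling division
    return {
        "mesh_shader_count": count,
        "meshlet_size": meshlet_size,
        "triangles": input_mesh,
    }

def mesh_shader(mesh_shader_id, task_output):
    """Generate the geometry belonging to one mesh shader identifier."""
    size = task_output["meshlet_size"]
    start = mesh_shader_id * size
    return task_output["triangles"][start:start + size]

def run_pipeline(input_mesh, meshlet_size):
    task_output = task_shader(input_mesh, meshlet_size)
    # One identifier per requested mesh shader: 0 .. count - 1.
    identifiers = range(task_output["mesh_shader_count"])
    return [mesh_shader(i, task_output) for i in identifiers]

geometry = run_pipeline(["a", "b", "c", "d", "e"], 2)
```

Each invocation depends only on its identifier and the shared task output, so the invocations in this sketch could run independently of one another.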
-
Publication number: US20190236827A1
Publication date: 2019-08-01
Application number: US15881564
Filing date: 2018-01-26
Applicant: NVIDIA Corporation
Inventor: Ziyad HAKURA , Yury URALSKY , Christoph KUBISCH , Pierre BOUDIER , Henry MORETON
Abstract: In various embodiments, a parallel processor implements a graphics processing pipeline that generates rendered images via a shading program. In operation, the parallel processor causes a first set of execution threads to execute the shading program on a first portion of the input mesh to generate first geometry stored in an on-chip memory. The parallel processor also causes a second set of execution threads to execute the mesh shading program on a second portion of the input mesh to generate second geometry stored in the on-chip memory. Subsequently, the parallel processor reads the first geometry and the second geometry from the on-chip memory, and performs operations on the first geometry and the second geometry to generate a rendered image derived from the input mesh. Advantageously, unlike conventional graphics processing pipelines, the performance of the graphics processing pipeline is not limited by a primitive distributor.
-
Publication number: US20230084570A1
Publication date: 2023-03-16
Application number: US17946221
Filing date: 2022-09-16
Applicant: NVIDIA Corporation
Inventor: Gregory MUTHLER , John BURGESS , Henry Packard MORETON , Yury URALSKY , Levi OLIVER , Magnus ANDERSSON , Johannes DELIGIANNIS
Abstract: Techniques applicable to a ray tracing hardware accelerator for traversing a hierarchical acceleration structure with reduced round-trip communications with a processor are disclosed. The reduction of round-trip communications with a processor during traversal is achieved by having a visibility mask that defines visibility states for regions within a geometric primitive available to be accessed in the ray tracing hardware accelerator when a ray intersection is detected for the geometric primitive.
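Illustrative sketch (not taken from the patent text): one way to picture a per-region visibility mask being consulted at intersection time. The three states, the row-major n-by-n grid over the (u, v) hit coordinates, and all names below are assumptions for illustration; the abstract does not specify the region layout.

```python
OPAQUE, TRANSPARENT, UNKNOWN = "opaque", "transparent", "unknown"

def region_index(u, v, n):
    """Map hit coordinates (u, v) in [0, 1] to a cell of an n-by-n grid.

    Hypothetical layout: cells with u + v > 1 lie outside the triangle
    and are simply never addressed by a valid hit.
    """
    i = min(int(u * n), n - 1)
    j = min(int(v * n), n - 1)
    return i * n + j

def resolve_hit(u, v, mask, n):
    """Decide what to do with a hit without leaving the traversal unit."""
    state = mask[region_index(u, v, n)]
    if state == OPAQUE:
        return "accept"
    if state == TRANSPARENT:
        return "ignore"
    return "ask_shader"  # only ambiguous regions need a round trip

mask = [OPAQUE, TRANSPARENT, UNKNOWN, TRANSPARENT]
```

Under these assumptions, only hits that land in an `UNKNOWN` region would require communication with the processor.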
-
Publication number: US20230081791A1
Publication date: 2023-03-16
Application number: US17946515
Filing date: 2022-09-16
Applicant: NVIDIA Corporation
Inventor: John BURGESS , Gregory MUTHLER , Nikhil DIXIT , Henry MORETON , Yury URALSKY , Magnus ANDERSSON , Marco SALVI , Christoph KUBISCH
Abstract: A Displaced Micro-mesh (DMM) primitive enables high complexity geometry for ray and path tracing while minimizing the associated builder costs and preserving high efficiency. A structured, hierarchical representation implicitly encodes vertex positions of a triangle micro-mesh based on a barycentric grid, and enables microvertex displacements to be encoded efficiently (e.g., as scalars linearly interpolated between minimum and maximum triangle surfaces). The resulting displaced micro-mesh primitive provides a highly compressed representation of a potentially vast number of displaced microtriangles that can be stored in a small amount of space. Improvements in ray tracing hardware permit automatic processing of such primitive for ray-geometry intersection testing by ray tracing circuits without requiring intermediate reporting to a shader.
-
Publication number: US20170148204A1
Publication date: 2017-05-25
Application number: US14952400
Filing date: 2015-11-25
Applicant: NVIDIA CORPORATION
Inventor: Ziyad HAKURA , Cynthia ALLISON , Dale KIRKLAND , Jeffrey BOLZ , Yury URALSKY , Jonah ALBEN
CPC classification number: G06T15/005 , G06T11/40
Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.
-