-
Publication number: US20190235915A1
Publication date: 2019-08-01
Application number: US15881587
Application date: 2018-01-26
Applicant: NVIDIA Corporation
Inventor: Ziyad HAKURA , Olivier GIROUX , Wishwesh GANDHI
CPC classification number: G06F9/4843 , G06F9/461 , G06F9/52 , G06F12/06 , G06F2209/462 , G06F2209/521 , G06F2212/1008
Abstract: In various embodiments, an ordered atomic operation enables a parallel processing subsystem to execute an atomic operation associated with a memory location in a specified order relative to other ordered atomic operations associated with the memory location. A level 2 (L2) cache slice includes an atomic processing circuit and a content-addressable memory (CAM). The CAM stores an ordered atomic operation specifying at least a memory address, an atomic operation, and an ordering number. In operation, the atomic processing circuit performs a look-up operation on the CAM, where the look-up operation specifies the memory address. After the atomic processing circuit determines that the ordering number is equal to a current ordering number associated with the memory address, the atomic processing circuit executes the atomic operation and returns the result to a processor executing an algorithm. Advantageously, the ordered atomic operation enables the algorithm to achieve a deterministic result while optimizing latency.
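Illustration (not part of the patent record): the ordering rule in this abstract can be sketched as a small software model. Everything named here (`OrderedAtomicSlice`, `submit`, the dictionary standing in for the CAM, ordering numbers starting at 0, and returning the updated value as the result) is a hypothetical assumption made for the sketch, not the claimed hardware design.

```python
from collections import defaultdict


class OrderedAtomicSlice:
    """Toy software model of one cache slice (assumption, not the patented circuit)."""

    def __init__(self):
        self.memory = defaultdict(int)   # address -> stored value
        self.current = defaultdict(int)  # address -> ordering number allowed to run next
        self.pending = {}                # (address, ordering number) -> operation; stands in for the CAM

    def submit(self, address, ordering_number, op):
        """Buffer an operation, then run every buffered operation that is now in order.

        Returns the (ordering_number, updated_value) pairs executed by this call.
        """
        self.pending[(address, ordering_number)] = op
        executed = []
        # Look up the address at its current ordering number until nothing matches.
        while (address, self.current[address]) in self.pending:
            n = self.current[address]
            operation = self.pending.pop((address, n))
            self.memory[address] = operation(self.memory[address])
            executed.append((n, self.memory[address]))
            self.current[address] += 1
        return executed


cache_slice = OrderedAtomicSlice()
# Operation 1 arrives first and is held back until operation 0 has run.
early = cache_slice.submit(0x10, 1, lambda v: v * 2)
late = cache_slice.submit(0x10, 0, lambda v: v + 3)
```

In this model the final value depends only on the ordering numbers, not on arrival order, which is the deterministic behavior the abstract describes.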
-
Publication number: US20170148204A1
Publication date: 2017-05-25
Application number: US14952400
Application date: 2015-11-25
Applicant: NVIDIA CORPORATION
Inventor: Ziyad HAKURA , Cynthia ALLISON , Dale KIRKLAND , Jeffrey BOLZ , Yury URALSKY , Jonah ALBEN
CPC classification number: G06T15/005 , G06T11/40
Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.
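Illustration (not part of the patent record): the per-pass filtering of interleaved primitives and state bundles might be modeled as below. The tuple layout, the bit-per-pass mask encoding, and the name `run_passes` are assumptions chosen for the sketch.

```python
def run_passes(stream, num_passes):
    """Replay a buffered, interleaved stream once per pass.

    stream: list of ("state", mask, settings_dict) or ("prim", mask, primitive).
    Bit p of a mask being set means the entry applies to pass p (assumed encoding).
    Returns one list per pass of (primitive, snapshot of the active state).
    """
    results = []
    for p in range(num_passes):
        state = {}
        processed = []
        for kind, mask, payload in stream:
            if not (mask >> p) & 1:
                continue  # entry is masked out for this pass
            if kind == "state":
                state.update(payload)  # configure the pipeline for this pass
            else:
                processed.append((payload, dict(state)))
        results.append(processed)
    return results


stream = [
    ("state", 0b01, {"depth_write": True}),   # pass 0 only
    ("state", 0b10, {"depth_write": False}),  # pass 1 only
    ("prim", 0b11, "tri0"),                   # both passes
    ("prim", 0b10, "tri1"),                   # pass 1 only
]
passes = run_passes(stream, 2)
```

Here pass 0 processes only `tri0` with depth writes enabled, while pass 1 processes both primitives with depth writes disabled.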
-
Publication number: US20190236829A1
Publication date: 2019-08-01
Application number: US15881572
Application date: 2018-01-26
Applicant: NVIDIA Corporation
Inventor: Ziyad HAKURA , Yury URALSKY , Christoph KUBISCH , Pierre BOUDIER , Henry MORETON
CPC classification number: G06T15/005 , G06F9/4881 , G06T15/80 , G06T17/10 , G06T17/20
Abstract: In various embodiments, a deduplication application pre-processes index buffers for a graphics processing pipeline that generates rendered images via a shading program. In operation, the deduplication application causes execution threads to identify a set of unique vertices specified in an index buffer based on an instruction. The deduplication application then generates a vertex buffer and an indirect index buffer based on the set of unique vertices. The vertex buffer and the indirect index buffer are associated with a portion of an input mesh. The graphics processing pipeline then renders a first frame and a second frame based on the vertex buffer, the indirect index buffer, and the shading program. Advantageously, the graphics processing pipeline may re-use the vertex buffer and indirect index buffer until the topology of the input mesh changes.
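Illustration (not part of the patent record): a sequential sketch of index-buffer deduplication. The abstract describes execution threads doing this work in parallel based on an instruction; the single-threaded loop and the name `deduplicate` are simplifying assumptions.

```python
def deduplicate(index_buffer):
    """Return (vertex_buffer, indirect_index_buffer) for one portion of a mesh.

    vertex_buffer holds each unique vertex index once, in first-seen order.
    indirect_index_buffer replaces every original index with its slot in vertex_buffer.
    """
    slot_of = {}
    vertex_buffer = []
    indirect_index_buffer = []
    for index in index_buffer:
        if index not in slot_of:
            slot_of[index] = len(vertex_buffer)
            vertex_buffer.append(index)
        indirect_index_buffer.append(slot_of[index])
    return vertex_buffer, indirect_index_buffer


original = [7, 3, 9, 3, 9, 12]  # two triangles sharing vertices 3 and 9
vertex_buffer, indirect = deduplicate(original)
```

Because both outputs depend only on the index buffer, they can be reused across frames until the mesh topology changes, as the abstract notes.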
-
Publication number: US20170148203A1
Publication date: 2017-05-25
Application number: US14952390
Application date: 2015-11-25
Applicant: NVIDIA CORPORATION
Inventor: Ziyad HAKURA , Cynthia ALLISON , Dale KIRKLAND , Jeffrey BOLZ , Yury URALSKY , Jonah ALBEN
CPC classification number: G06T15/005 , G06T1/20 , G06T15/405 , G06T15/503 , G06T15/80 , G06T17/20
Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.
-
Publication number: US20190236828A1
Publication date: 2019-08-01
Application number: US15881566
Application date: 2018-01-26
Applicant: NVIDIA Corporation
Inventor: Ziyad HAKURA , Yury URALSKY , Christoph KUBISCH , Pierre BOUDIER , Henry MORETON
CPC classification number: G06T15/005 , G06F9/4881 , G06T15/80 , G06T17/10 , G06T17/20
Abstract: In various embodiments, a parallel processor implements a graphics processing pipeline that generates rendered images. In operation, the parallel processor causes execution threads to execute a task shading program on an input mesh to generate a task shader output specifying a mesh shader count. The parallel processor then generates mesh shader identifiers, where the total number of the mesh shader identifiers equals the mesh shader count. For each mesh shader identifier, the parallel processor invokes a mesh shader based on the mesh shader identifier and the task shader output to generate geometry associated with the mesh shader identifier. Subsequently, the parallel processor performs operations on the geometries associated with the mesh shader identifiers to generate a rendered image. Advantageously, unlike conventional graphics processing pipelines, the performance of the graphics processing pipeline is not limited by a primitive distributor.
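Illustration (not part of the patent record): the control flow of a task shader that chooses a mesh shader count, followed by one mesh shader invocation per identifier. The function names, the dictionary-shaped task output, and the chunk size of two triangles are hypothetical; real shaders run on the parallel processor, not as Python functions.

```python
def run_pipeline(input_mesh, task_shader, mesh_shader):
    """Run the task shader once, then one mesh shader per generated identifier."""
    task_output = task_shader(input_mesh)
    count = task_output["mesh_shader_count"]
    # One identifier per mesh shader invocation: 0, 1, ..., count - 1.
    return [mesh_shader(identifier, task_output) for identifier in range(count)]


def example_task_shader(mesh, chunk=2):
    """Assumed policy: split the mesh into chunks of `chunk` triangles."""
    count = (len(mesh) + chunk - 1) // chunk
    return {"mesh_shader_count": count, "chunk": chunk, "mesh": mesh}


def example_mesh_shader(identifier, task_output):
    """Emit the geometry for the chunk selected by the identifier."""
    chunk = task_output["chunk"]
    return task_output["mesh"][identifier * chunk:(identifier + 1) * chunk]


geometries = run_pipeline(
    ["t0", "t1", "t2", "t3", "t4"], example_task_shader, example_mesh_shader
)
```

The number of mesh shader invocations is decided by the task shader output rather than by a fixed-function stage, which is the point the abstract makes about the primitive distributor.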
-
Publication number: US20190236827A1
Publication date: 2019-08-01
Application number: US15881564
Application date: 2018-01-26
Applicant: NVIDIA Corporation
Inventor: Ziyad HAKURA , Yury URALSKY , Christoph KUBISCH , Pierre BOUDIER , Henry MORETON
Abstract: In various embodiments, a parallel processor implements a graphics processing pipeline that generates rendered images via a shading program. In operation, the parallel processor causes a first set of execution threads to execute the shading program on a first portion of an input mesh to generate first geometry stored in an on-chip memory. The parallel processor also causes a second set of execution threads to execute the shading program on a second portion of the input mesh to generate second geometry stored in the on-chip memory. Subsequently, the parallel processor reads the first geometry and the second geometry from the on-chip memory, and performs operations on the first geometry and the second geometry to generate a rendered image derived from the input mesh. Advantageously, unlike conventional graphics processing pipelines, the performance of the graphics processing pipeline is not limited by a primitive distributor.