-
Publication No.: US20170178401A1
Publication date: 2017-06-22
Application No.: US14979342
Filing date: 2015-12-22
Applicant: NVIDIA Corporation
Inventor: Niket Agrawal, Amit Jain, Dale Kirkland, Karim Abdalla, Ziyad Hakura, Haren Kethareswaran
CPC classification number: G06T17/10, G06T15/005, G06T17/20
Abstract: One embodiment of the present invention includes a technique for distributing work slices associated with a graphics processing unit for processing. A primitive distribution system receives a draw command related to a graphics object associated with a plurality of indices. The primitive distribution system creates a plurality of work slices, where each work slice is associated with a different subset of the indices included in the plurality of indices. The primitive distribution system scans a first subset of indices to identify a first set of characteristics that is needed to process a second subset of indices. The primitive distribution system processes the second subset of indices based at least in part on the one or more characteristics. Advantageously, because multiple work slices are analyzed in parallel for duplicate indices, the time required to analyze work slices is more in balance with the time required to process the work slices, leading to greater utilization of GPU resources and improved overall performance.
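Illustrative sketch (not taken from the patent): the abstract describes splitting a draw call's indices into work slices and scanning one slice to obtain characteristics needed to process another. The abstract does not say what those characteristics are. The Python sketch below assumes, purely for illustration, that the characteristic is the set of distinct indices in the immediately preceding slice, and that processing a slice means listing the indices not already seen there. The names `make_work_slices`, `scan_slice`, `process_slice`, and `distribute` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_work_slices(indices, slice_size):
    """Split the index list into consecutive, non-overlapping work slices."""
    return [indices[i:i + slice_size] for i in range(0, len(indices), slice_size)]


def scan_slice(work_slice):
    """Assumed 'characteristic' of a slice: the set of distinct indices in it."""
    return set(work_slice)


def process_slice(work_slice, seen_in_previous):
    """Return the indices of this slice that are new, in first-occurrence order.

    An index is new if it is neither in the previous slice's scan result
    nor repeated earlier within this slice.
    """
    seen = set(seen_in_previous)
    fresh = []
    for idx in work_slice:
        if idx not in seen:
            seen.add(idx)
            fresh.append(idx)
    return fresh


def distribute(indices, slice_size):
    """Scan all slices concurrently, then process each with its predecessor's scan."""
    slices = make_work_slices(indices, slice_size)
    with ThreadPoolExecutor() as pool:
        # The scans do not depend on each other, so they can run concurrently.
        scans = list(pool.map(scan_slice, slices))
        # The first slice has no predecessor, so it gets an empty scan result.
        previous = [set()] + scans[:-1]
        return list(pool.map(process_slice, slices, previous))
```

For example, `distribute([0, 1, 2, 2, 1, 3, 3, 4, 5], 3)` builds the slices `[0, 1, 2]`, `[2, 1, 3]`, and `[3, 4, 5]`; the second slice reports only index 3 as new because 2 and 1 appear in the scan of the first slice.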
-
Publication No.: US10332310B2
Publication date: 2019-06-25
Application No.: US14979342
Filing date: 2015-12-22
Applicant: NVIDIA Corporation
Inventor: Niket Agrawal, Amit Jain, Dale Kirkland, Karim Abdalla, Ziyad Hakura, Haren Kethareswaran
Abstract: One embodiment of the present invention includes a technique for distributing work slices associated with a graphics processing unit for processing. A primitive distribution system receives a draw command related to a graphics object associated with a plurality of indices. The primitive distribution system creates a plurality of work slices, where each work slice is associated with a different subset of the indices included in the plurality of indices. The primitive distribution system scans a first subset of indices to identify a first set of characteristics that is needed to process a second subset of indices. The primitive distribution system processes the second subset of indices based at least in part on the one or more characteristics. Advantageously, because multiple work slices are analyzed in parallel for duplicate indices, the time required to analyze work slices is more in balance with the time required to process the work slices, leading to greater utilization of GPU resources and improved overall performance.
-
Publication No.: US10019776B2
Publication date: 2018-07-10
Application No.: US14924624
Filing date: 2015-10-27
Applicant: NVIDIA CORPORATION
Inventor: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.
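Illustrative sketch (not taken from the patent): the abstract states that a tile holds coverage data only once per XY position so that API order is preserved. It does not say what happens when a later primitive covers a position already present in the tile. The Python sketch below assumes that the current tile is emitted and a new one is started in that case. The function name `coalesce` and the data layout (a dict from position to primitive id) are hypothetical.

```python
def coalesce(coverage_stream):
    """Group per-primitive coverage into tiles with at most one entry per XY position.

    coverage_stream: iterable of (primitive_id, positions) in API order,
    where positions is a set of (x, y) tuples covered by that primitive.
    Returns a list of tiles; each tile maps (x, y) -> primitive_id.
    """
    tiles = []
    current = {}
    for prim_id, positions in coverage_stream:
        # Assumed conflict policy: if any position is already covered in the
        # current tile, emit that tile and start a fresh one.
        if any(pos in current for pos in positions):
            tiles.append(current)
            current = {}
        for pos in positions:
            current[pos] = prim_id
    if current:
        tiles.append(current)
    return tiles
```

Because tiles are emitted in the order they fill, and no tile holds two primitives at the same position, processing the tiles in list order touches each position in the primitives' API order.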
-
Publication No.: US10453168B2
Publication date: 2019-10-22
Application No.: US15999185
Filing date: 2018-08-17
Applicant: NVIDIA Corporation
Inventor: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.
-
Publication No.: US10055806B2
Publication date: 2018-08-21
Application No.: US14924618
Filing date: 2015-10-27
Applicant: NVIDIA CORPORATION
Inventor: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.
-
Publication No.: US10430989B2
Publication date: 2019-10-01
Application No.: US14952400
Filing date: 2015-11-25
Applicant: NVIDIA CORPORATION
Inventor: Ziyad Hakura, Cynthia Allison, Dale Kirkland, Jeffrey Bolz, Yury Uralsky, Jonah Alben
Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.
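Illustrative sketch (not taken from the patent): the abstract describes a buffer of interleaved state bundles and primitives, each carrying a mask that selects the passes it participates in. The Python sketch below assumes the masks are bit masks (bit p set means "active in pass p") and models the screen space pipeline as a simple dictionary of settings. The names `StateBundle`, `Primitive`, `run_pass`, and `run_all_passes` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StateBundle:
    settings: dict
    state_mask: int  # assumed encoding: bit p set -> apply in pass p


@dataclass
class Primitive:
    name: str
    primitive_mask: int  # assumed encoding: bit p set -> process in pass p


def run_pass(buffer, pass_index):
    """Replay the interleaved buffer for one pass.

    Returns a list of (primitive name, state snapshot) pairs for the
    primitives selected by this pass, in buffer order.
    """
    bit = 1 << pass_index
    state = {}
    processed = []
    for item in buffer:
        if isinstance(item, StateBundle):
            if item.state_mask & bit:
                state.update(item.settings)
        elif item.primitive_mask & bit:
            # Snapshot the state in effect when the primitive is reached.
            processed.append((item.name, dict(state)))
    return processed


def run_all_passes(buffer, num_passes):
    """Replay the same buffered data once per pass."""
    return [run_pass(buffer, p) for p in range(num_passes)]
```

Walking the buffer in order for every pass keeps the interleaving meaningful: a state setting only affects the primitives that follow it in the buffer, and only in the passes its mask selects.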
-
Publication No.: US20180374185A1
Publication date: 2018-12-27
Application No.: US15999185
Filing date: 2018-08-17
Applicant: NVIDIA Corporation
Inventor: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.
-
Publication No.: US10147222B2
Publication date: 2018-12-04
Application No.: US14952390
Filing date: 2015-11-25
Applicant: NVIDIA CORPORATION
Inventor: Ziyad Hakura, Cynthia Allison, Dale Kirkland, Jeffrey Bolz, Yury Uralsky, Jonah Alben
Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.
-
Publication No.: US10032245B2
Publication date: 2018-07-24
Application No.: US14924628
Filing date: 2015-10-27
Applicant: NVIDIA CORPORATION
Inventor: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.