DISTRIBUTED INDEX FETCH, PRIMITIVE ASSEMBLY, AND PRIMITIVE BATCHING

    公开(公告)号:US20170178401A1

    公开(公告)日:2017-06-22

    申请号:US14979342

    申请日:2015-12-22

    CPC classification number: G06T17/10 G06T15/005 G06T17/20

    Abstract: One embodiment of the present invention includes a technique for distributing work slices associated with a graphics processing unit for processing. A primitive distribution system receives a draw command related to a graphics object associated with a plurality of indices. The primitive distribution system creates a plurality of work slices, where each work slice is associated with a different subset of the indices included in the plurality of indices. The primitive distribution system scans a first subset of indices to identify a first set of characteristics that is needed to process a second subset of indices. The primitive distribution system processes the second subset of indices based at least in part on the one or more characteristics. Advantageously, because multiple work slices are analyzed in parallel for duplicate indices, the time required to analyze work slices is more in balance with the time required to process the work slices, leading to greater utilization of GPU resources and improved overall performance.

    Distributed index fetch, primitive assembly, and primitive batching

    公开(公告)号:US10332310B2

    公开(公告)日:2019-06-25

    申请号:US14979342

    申请日:2015-12-22

    Abstract: One embodiment of the present invention includes a technique for distributing work slices associated with a graphics processing unit for processing. A primitive distribution system receives a draw command related to a graphics object associated with a plurality of indices. The primitive distribution system creates a plurality of work slices, where each work slice is associated with a different subset of the indices included in the plurality of indices. The primitive distribution system scans a first subset of indices to identify a first set of characteristics that is needed to process a second subset of indices. The primitive distribution system processes the second subset of indices based at least in part on the one or more characteristics. Advantageously, because multiple work slices are analyzed in parallel for duplicate indices, the time required to analyze work slices is more in balance with the time required to process the work slices, leading to greater utilization of GPU resources and improved overall performance.

    Techniques for maintaining atomicity and ordering for pixel shader operations

    公开(公告)号:US10019776B2

    公开(公告)日:2018-07-10

    申请号:US14924624

    申请日:2015-10-27

    CPC classification number: G06T1/20 G06T1/60 G06T11/40

    Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.

    Techniques for maintaining atomicity and ordering for pixel shader operations

    公开(公告)号:US10453168B2

    公开(公告)日:2019-10-22

    申请号:US15999185

    申请日:2018-08-17

    Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.

    Techniques for maintaining atomicity and ordering for pixel shader operations

    公开(公告)号:US10055806B2

    公开(公告)日:2018-08-21

    申请号:US14924618

    申请日:2015-10-27

    CPC classification number: G06T1/20 G06T1/60 G06T11/40

    Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.

    Multi-pass rendering in a screen space pipeline

    公开(公告)号:US10430989B2

    公开(公告)日:2019-10-01

    申请号:US14952400

    申请日:2015-11-25

    Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.

    Techniques for maintaining atomicity and ordering for pixel shader operations

    公开(公告)号:US20180374185A1

    公开(公告)日:2018-12-27

    申请号:US15999185

    申请日:2018-08-17

    CPC classification number: G06T1/20 G06T1/60 G06T11/40

    Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.

    Multi-pass rendering in a screen space pipeline

    公开(公告)号:US10147222B2

    公开(公告)日:2018-12-04

    申请号:US14952390

    申请日:2015-11-25

    Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.

    Techniques for maintaining atomicity and ordering for pixel shader operations

    公开(公告)号:US10032245B2

    公开(公告)日:2018-07-24

    申请号:US14924628

    申请日:2015-10-27

    Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.

Patent Agency Ranking