TECHNIQUES FOR COMPREHENSIVELY SYNCHRONIZING EXECUTION THREADS

    Publication number: US20180314520A1

    Publication date: 2018-11-01

    Application number: US15499843

    Filing date: 2017-04-27

    CPC classification number: G06F9/3009 G06F9/30087 G06F9/3851 G06F9/46

    Abstract: In one embodiment, a synchronization instruction causes a processor to ensure that specified threads included within a warp concurrently execute a single subsequent instruction. The specified threads include at least a first thread and a second thread. In operation, the first thread arrives at the synchronization instruction. The processor determines that the second thread has not yet arrived at the synchronization instruction and configures the first thread to stop executing instructions. After issuing at least one instruction for the second thread, the processor determines that all the specified threads have arrived at the synchronization instruction. The processor then causes all the specified threads to execute the subsequent instruction. Advantageously, unlike conventional approaches to synchronizing threads, the synchronization instruction enables the processor to reliably and properly execute code that includes complex control flows and/or instructions that presuppose that threads are converged.
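
    The arrive, stall, and release sequence described in the abstract can be illustrated with a small software model. The sketch below is not the patented hardware mechanism: the function name `run_warp`, the `"SYNC"` marker, and the lowest-thread-id-first issue policy are all assumptions made for illustration, and the model assumes every specified thread has the same instruction immediately after the synchronization point.

```python
def run_warp(programs, sync_mask):
    """Toy scheduler: issue one instruction at a time until every thread finishes.

    programs  -- dict mapping thread id to its list of instruction names
    sync_mask -- set of thread ids that must converge at "SYNC"
    Returns a trace of (thread_ids, instruction) tuples in issue order.
    """
    pc = {t: 0 for t in programs}   # per-thread program counter
    waiting = set()                 # specified threads stopped at SYNC
    trace = []
    while any(pc[t] < len(programs[t]) for t in programs):
        # Hypothetical issue policy: lowest-numbered runnable thread goes first.
        runnable = [t for t in sorted(programs)
                    if t not in waiting and pc[t] < len(programs[t])]
        if not runnable:
            raise RuntimeError("a specified thread never reached SYNC")
        t = runnable[0]
        instr = programs[t][pc[t]]
        if instr == "SYNC" and t in sync_mask:
            waiting.add(t)          # stop issuing instructions for this thread
            if waiting == set(sync_mask):
                group = tuple(sorted(waiting))
                waiting.clear()
                for w in group:
                    pc[w] += 1      # step every released thread past SYNC
                # Assumption: all specified threads share the same next instruction.
                nxt = programs[group[0]][pc[group[0]]]
                trace.append((group, nxt))   # issued once for the whole group
                for w in group:
                    pc[w] += 1
            continue
        trace.append(((t,), instr))
        pc[t] += 1
    return trace


if __name__ == "__main__":
    programs = {0: ["A", "SYNC", "C"], 1: ["B1", "B2", "SYNC", "C"]}
    for entry in run_warp(programs, {0, 1}):
        print(entry)
```

    In this model, thread 0 reaches `"SYNC"` first and is stalled while the scheduler issues `B1` and `B2` for thread 1. Once thread 1 also arrives, the trace ends with the single entry `((0, 1), 'C')`, meaning both threads execute the subsequent instruction together.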

    Techniques for maintaining atomicity and ordering for pixel shader operations

    Publication number: US20180374185A1

    Publication date: 2018-12-27

    Application number: US15999185

    Filing date: 2018-08-17

    CPC classification number: G06T1/20 G06T1/60 G06T11/40

    Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.
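
    The "only once for each XY position" rule can be sketched in software as follows. This is a simplified model, not the patented pipeline: the function name `coalesce`, the dictionary representation of a tile, and the flush-on-conflict policy (closing the current tile as soon as an incoming primitive touches a position the tile already holds) are assumptions made for illustration. The abstract states only that each XY position appears at most once per tile.

```python
def coalesce(coverage_stream, tile_positions):
    """Group per-primitive coverage into tiles, each XY position at most once per tile.

    coverage_stream -- iterable of (primitive_id, set of (x, y)) in API order
    tile_positions  -- set of (x, y) positions that belong to the tile footprint
    Returns a list of tiles in emission order; each tile maps (x, y) -> primitive_id.
    """
    tiles = []
    current = {}
    for prim_id, covered in coverage_stream:
        covered = covered & tile_positions      # ignore positions outside the footprint
        # Assumed policy: a repeated position closes the current tile, so a later
        # primitive at that position always lands in a later tile.
        if any(pos in current for pos in covered):
            tiles.append(current)
            current = {}
        for pos in covered:
            current[pos] = prim_id
    if current:
        tiles.append(current)
    return tiles


if __name__ == "__main__":
    footprint = {(x, 0) for x in range(4)}
    stream = [
        (0, {(0, 0), (1, 0)}),
        (1, {(2, 0)}),
        (2, {(1, 0), (3, 0)}),   # overlaps primitive 0 at (1, 0)
    ]
    for tile in coalesce(stream, footprint):
        print(sorted(tile.items()))
```

    With the sample stream, primitives 0 and 1 share the first tile because they touch disjoint positions, while primitive 2 overlaps primitive 0 at `(1, 0)` and therefore starts a second tile. Processing the tiles in emission order then keeps the per-position API order intact.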
