Abstract:
Apparatuses, systems, and techniques to transform information corresponding to one or more memory transactions. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to cause information corresponding to one or more memory transactions resulting from performance of the API to be transformed.
Abstract:
Apparatuses, systems, and techniques to cause a first tensor to be translated into a second tensor according to a tensor map. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to cause a first tensor to be translated into a second tensor according to a tensor map.
Abstract:
A system, method, and computer program product are provided for accessing multi-sample surfaces. A multi-sample store instruction that specifies data for a single sample of a multi-sample pixel and a sample mask is received, and the data for the single sample is stored to each sample of the multi-sample pixel that is enabled according to the sample mask. A multi-sample load instruction that specifies a multi-sample pixel is received, and, in response to executing the multi-sample load instruction, data for one sample of the multi-sample pixel is received. A determination is made that the data for the one sample of the multi-sample pixel represents multi-sample pixel data for at least one additional sample of the multi-sample pixel.
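The store behavior described in this abstract can be illustrated with a minimal sketch. The abstract does not give an instruction encoding, so the function name `multisample_store`, the list-per-pixel layout, and the convention that bit i of the mask enables sample i are all assumptions made for illustration only.

```python
def multisample_store(pixel, value, sample_mask):
    """Store `value` into every sample of `pixel` enabled by `sample_mask`.

    Assumptions of this sketch (not taken from the abstract):
    `pixel` is a list with one entry per sample, and bit i of
    `sample_mask` enables sample i.
    """
    for i in range(len(pixel)):
        if (sample_mask >> i) & 1:
            pixel[i] = value  # one source value fans out to each enabled sample
    return pixel


pixel = [0, 0, 0, 0]                 # a 4-sample pixel, all samples initially 0
multisample_store(pixel, 7, 0b0101)  # mask enables samples 0 and 2
print(pixel)                         # prints [7, 0, 7, 0]
```

A single store call thus writes the same data to several samples, which is the fan-out the sample mask controls.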
Abstract:
Apparatuses, systems, and techniques to perform a tensor prefetch instruction to cause one or more tensors to be stored into one or more caches. In at least one embodiment, one or more circuits of a GPU are to perform a tensor prefetch instruction to cause one or more tensors to be stored into one or more GPU caches.
Abstract:
A system, method, and computer program product are provided for multi-sample processing. Multi-sample pixel data is received and analyzed to identify subsets of samples of a multi-sample pixel that have equal data, such that data for one sample in a subset represents multi-sample pixel data for all samples in the subset. An encoding state is generated that indicates which samples of the multi-sample pixel are included in each one of the subsets.
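As a rough illustration of the analysis step, the sketch below groups the samples of one pixel by value and records which sample indices fall into each group. The name `encode_pixel` and the representation of the encoding state as a list of index tuples are hypothetical; the abstract does not specify how the encoding state is stored.

```python
def encode_pixel(samples):
    """Group samples with equal data and describe the grouping.

    Returns (representatives, encoding_state), where representatives[k]
    is the single value kept for subset k and encoding_state[k] is the
    tuple of sample indices in that subset. This layout is an
    assumption of the sketch, not the format used by the patent.
    """
    groups = {}  # value -> list of sample indices (insertion-ordered)
    for index, value in enumerate(samples):
        groups.setdefault(value, []).append(index)
    representatives = list(groups.keys())
    encoding_state = [tuple(indices) for indices in groups.values()]
    return representatives, encoding_state


reps, state = encode_pixel([5, 5, 9, 5])
print(reps)   # prints [5, 9]
print(state)  # prints [(0, 1, 3), (2,)]
```

Here four samples reduce to two stored values, and the encoding state is enough to reconstruct the full pixel.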
Abstract:
Apparatuses, systems, and techniques to store information in a plurality of storage locations allocated to a graphics processing unit (GPU). In at least one embodiment, one or more circuits are to perform an application programming interface (API) to cause information to be stored in a plurality of storage locations allocated to a first GPU.
Abstract:
Apparatuses, systems, and techniques to indicate storage locations of information to be mapped from a first tensor to a second tensor. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to indicate one or more storage locations of information to be mapped from a first tensor to a second tensor.
Abstract:
A system, method, and computer program product are provided for redistributing multi-sample processing workloads between threads. A workload for a plurality of multi-sample pixels is received, and each thread in a parallel thread group is associated with a corresponding multi-sample pixel of the plurality of multi-sample pixels. The workload is redistributed between the threads in the parallel thread group based on a characteristic of the workload, and the workload is processed by the parallel thread group. In one embodiment, the characteristic is rasterized coverage information for the plurality of multi-sample pixels.
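One way to picture the redistribution is sketched below, assuming the characteristic is a per-pixel coverage mask and that the unit of work is a single covered sample. The round-robin policy and the name `redistribute` are assumptions chosen for illustration; the abstract does not state which balancing policy is used.

```python
def redistribute(coverage_masks, samples_per_pixel):
    """Spread covered samples evenly over the threads of a group.

    Thread t initially owns pixel t, so its work is the set bits of
    coverage_masks[t]. This sketch flattens all covered (pixel, sample)
    pairs and deals them out round-robin, one list per thread.
    """
    work = [
        (pixel, sample)
        for pixel, mask in enumerate(coverage_masks)
        for sample in range(samples_per_pixel)
        if (mask >> sample) & 1
    ]
    num_threads = len(coverage_masks)
    return [work[t::num_threads] for t in range(num_threads)]


# Pixel 0 is fully covered, pixel 2 has one covered sample, others are empty.
masks = [0b1111, 0b0000, 0b0001, 0b0000]
for thread, items in enumerate(redistribute(masks, 4)):
    print(thread, items)
```

Before redistribution, thread 0 would process four samples while two threads sit idle; afterwards no thread holds more than two work items.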
Abstract:
A system, method, and computer program product are provided for multi-sample processing. Multi-sample pixel data is received and an encoding state associated with the multi-sample pixel data is determined. Data for one sample of a multi-sample pixel and the encoding state are provided to a processing unit. The one sample of the multi-sample pixel is processed by the processing unit to generate processed data for the one sample that represents processed multi-sample pixel data for all samples, or for two or more samples, of the multi-sample pixel.
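The benefit described here is that processing runs once per group of equal samples rather than once per sample. The sketch below illustrates that idea under assumed data layouts: `representatives` holds one value per subset, `encoding_state` holds the sample indices of each subset, and `shade` stands in for whatever per-sample operation the processing unit performs. All of these names are hypothetical.

```python
def process_encoded(representatives, encoding_state, shade, num_samples):
    """Process one sample per subset and reuse the result for the subset.

    `shade` is called once per representative value; its result is
    written to every sample index listed for that subset.
    """
    processed = [None] * num_samples
    for value, subset in zip(representatives, encoding_state):
        result = shade(value)        # processed once for the whole subset
        for index in subset:
            processed[index] = result
    return processed


calls = []

def double(value):
    calls.append(value)  # record each invocation to show the saved work
    return value * 2

out = process_encoded([5, 9], [(0, 1, 3), (2,)], double, 4)
print(out)         # prints [10, 10, 18, 10]
print(len(calls))  # prints 2
```

In this example four samples are produced from two invocations of the processing function.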
Abstract:
Apparatuses, systems, and techniques to perform a graphics processing unit (GPU) prefetch instruction to cause a variable amount of information to be stored into one or more GPU caches. In at least one embodiment, one or more circuits of a GPU are to perform a GPU prefetch instruction to cause a variable amount of information to be stored into one or more GPU caches.