Abstract:
A multi-threaded processing unit includes a hardware pre-processor coupled to one or more processing engines (e.g., copy engines, GPCs, etc.) that implement pre-emption techniques by dividing tasks into smaller subtasks and scheduling the subtasks on the processing engines based on the priority of the tasks. By limiting the size of the subtasks, higher priority tasks may be executed quickly without switching the context state of the processing engine. Tasks may be subdivided based on a threshold size or by taking into account other considerations, such as physical boundaries of the memory system.
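
The subdivision and priority scheduling described above can be illustrated with a minimal software sketch. It is not the hardware design itself: the names `Task`, `split`, and `schedule`, the convention that a lower number means higher priority, the one-subtask-per-step timing model, and the default 256-byte threshold and 1024-byte memory boundary are all assumptions made for illustration.

```python
import heapq
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class Task:
    """Hypothetical task descriptor, e.g. a memory copy request."""
    name: str
    priority: int  # assumption: lower value means higher priority
    start: int     # starting byte address
    size: int      # length in bytes


def split(task: Task, max_bytes: int, boundary: int) -> Iterator[Tuple[int, int]]:
    """Yield (address, length) subtasks no larger than max_bytes that
    never cross a multiple of `boundary` (a stand-in for a physical
    memory boundary)."""
    addr = task.start
    end = task.start + task.size
    while addr < end:
        next_boundary = (addr // boundary + 1) * boundary
        stop = min(end, addr + max_bytes, next_boundary)
        yield addr, stop - addr
        addr = stop


def schedule(arrivals: List[Tuple[int, Task]],
             max_bytes: int = 256,
             boundary: int = 1024) -> List[Tuple[str, int, int]]:
    """Simulate one engine that issues one subtask per step.

    `arrivals` holds (arrival_step, task) pairs. Before each step the
    highest-priority ready task is chosen, so a newly arrived urgent
    task waits for at most one subtask and no engine context is saved.
    """
    pending = sorted(arrivals, key=lambda item: item[0])
    ready: list = []  # heap of (priority, arrival order, name, subtask iterator)
    issued: List[Tuple[str, int, int]] = []
    step = 0
    order = 0
    while pending or ready:
        # Admit every task that has arrived by the current step.
        while pending and pending[0][0] <= step:
            _, task = pending.pop(0)
            heapq.heappush(
                ready,
                (task.priority, order, task.name, split(task, max_bytes, boundary)),
            )
            order += 1
        if ready:
            _, _, name, chunks = ready[0]
            chunk = next(chunks, None)
            if chunk is None:
                heapq.heappop(ready)  # task fully issued; pick again
                continue
            issued.append((name, chunk[0], chunk[1]))
        step += 1
    return issued
```

In this sketch, a 600-byte low-priority task starting at address 900 is cut into three subtasks (the first ends at the 1024-byte boundary). If a higher-priority task arrives while the second subtask is running, its subtask is issued next, and the remaining low-priority subtask follows afterwards.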