Abstract:
A mechanism is described for facilitating inference coordination and processing utilization for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting, at training time, information relating to one or more tasks to be performed according to a training dataset relating to a processor including a graphics processor. The method may further include analyzing the information to determine one or more portions of hardware relating to the processor capable of supporting the one or more tasks, and configuring the hardware to pre-select the one or more portions to perform the one or more tasks, while other portions of the hardware remain available for other tasks.
Abstract:
In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.
Abstract:
Example methods and apparatus to generate anomaly detection datasets are disclosed. An example method to generate an anomaly detection dataset for training a machine learning model to detect real world anomalies includes receiving a user definition of an anomaly generator function, executing, with a processor, the anomaly generator function to generate user-defined anomaly data, and combining the user-defined anomaly data with nominal data to generate the anomaly detection dataset.
Abstract:
A mechanism is described for facilitating inference coordination and processing utilization for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting, at training time, information relating to one or more tasks to be performed according to a training dataset relating to a processor including a graphics processor. The method may further include analyzing the information to determine one or more portions of hardware relating to the processor capable of supporting the one or more tasks, and configuring the hardware to pre-select the one or more portions to perform the one or more tasks, while other portions of the hardware remain available for other tasks.
Abstract:
In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.
Abstract:
Technologies for identification of a potential root cause of a use-after-free memory corruption bug of a program include a computing device to replay execution of the execution of the program based on an execution log of the program. The execution log comprises an ordered set of executed instructions of the program that resulted in the use-after-free memory corruption bug. The computing device compares a use-after-free memory address access of the program to a memory address associated with an occurrence of the use-after-free memory corruption bug in response to detecting the use-after-free memory address access and records the use-after-free memory address access of the program as a candidate for a root cause of the use-after-free memory corruption bug to a candidate list in response to detecting a match between the use-after-free memory address access of the program and the memory address associated with the occurrence of the use-after-free memory corruption bug.
Abstract:
Embodiments may provide a method for performing a replay of a previous execution of a program. The method includes generating an order of recorded chunks of instructions across a plurality of recorded threads based, at least in part, on log files generated from the previous execution of the program. The method includes initiating execution of the program, the executing program having a plurality of threads, each thread having a number of chunks of instructions. The method includes intercepting, by a virtual machine unit executing on a processor, an instruction of a chunk before the instruction is executed. The method includes determining, by a replay module executing on the processor, that the chunk is an active chunk if the chunk is currently in line for execution according to the order of recorded chunks, and responsive to a determination that the chunk is the active chunk, executing the instruction.
Abstract:
One embodiment provides a parallel processor comprising a hardware scheduler to schedule pipeline commands for compute operations to one or more of multiple types of compute units, a plurality of processing resources including a first sparse compute unit configured for input at a first level of sparsity and hybrid memory circuitry including a memory controller, a memory interface, and a second sparse compute unit configured for input at a second level of sparsity that is greater than the first level of sparsity.
Abstract:
One embodiment provides a parallel processor comprising a hardware scheduler to schedule pipeline commands for compute operations to one or more of multiple types of compute units, a plurality of processing resources including a first sparse compute unit configured for input at a first level of sparsity and hybrid memory circuitry including a memory controller, a memory interface, and a second sparse compute unit configured for input at a second level of sparsity that is greater than the first level of sparsity.
Abstract:
An autonomous vehicle is provided that includes one or more processors configured to provide a local compute manager to manage execution of compute workloads associated with the autonomous vehicle. The local compute manager can perform various compute operations, including receiving offload of compute operations from to other compute nodes and offloading compute operations to other compute notes, where the other compute nodes can be other autonomous vehicles. The local compute manager can also facilitate autonomous navigation functionality.