Abstract:
A method and an apparatus for a parallel computing program calling APIs (application programming interfaces) in a host processor to perform a data processing task in parallel among compute units are described. The compute units, which include central processing units (CPUs) and graphics processing units (GPUs), are coupled to the host processor. A program object corresponding to a source code for the data processing task is generated in a memory coupled to the host processor according to the API calls. Executable codes for the compute units are generated from the program object according to the API calls and are loaded for concurrent execution among the compute units to perform the data processing task.
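The flow described above (create a program object from source, build an executable per compute unit, then run the executables concurrently) can be sketched roughly as follows. This is only an illustration under stated assumptions: `ProgramObject`, `create_program`, `build_program`, `run`, and the unit names `cpu0` and `gpu0` are hypothetical, Python's built-in `compile` stands in for a per-device compiler, and threads stand in for real compute units.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class ProgramObject:
    """Hypothetical program object holding the source and the built executables."""
    source: str
    executables: dict = field(default_factory=dict)  # compute unit name -> callable


def create_program(source: str) -> ProgramObject:
    # Step 1: generate a program object in host memory from the source code.
    return ProgramObject(source=source)


def build_program(program: ProgramObject, compute_units: list) -> None:
    # Step 2: generate one executable per compute unit from the program object.
    # Assumption: the source defines a function named `kernel`.
    for unit in compute_units:
        namespace = {}
        exec(compile(program.source, f"<{unit}>", "exec"), namespace)
        program.executables[unit] = namespace["kernel"]


def run(program: ProgramObject, data: list) -> list:
    # Step 3: split the data among the units and execute concurrently.
    units = list(program.executables)
    chunks = [data[i::len(units)] for i in range(len(units))]
    with ThreadPoolExecutor(max_workers=len(units)) as pool:
        futures = [pool.submit(program.executables[u], c)
                   for u, c in zip(units, chunks)]
        return [f.result() for f in futures]


SOURCE = "def kernel(chunk):\n    return [x * x for x in chunk]\n"
program = create_program(SOURCE)
build_program(program, ["cpu0", "gpu0"])
results = run(program, list(range(6)))
```

In this sketch every unit receives the same compiled code; a real system would presumably produce a different binary for each kind of compute unit.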
Abstract:
Methods and apparatuses to dynamically manage a performance state of a data processing system are described. The data processing system includes a plurality of components, one or more buses coupled to the plurality of components, and a dynamic performance state manager unit coupled to the components. The dynamic performance state manager unit is configured to receive information about a first plurality of current states of components of the system, to determine a second plurality of required system performance states for the components, and to determine a current system performance state based on the first plurality and the second plurality.
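One possible reading of the manager's three steps is sketched below. The abstract does not state the policy, so everything specific here is an assumption: the three named levels, the utilization thresholds, and the rule that the system state is the most demanding required level. All names are hypothetical.

```python
from dataclasses import dataclass

LEVELS = ("low", "medium", "high")  # assumed ordering of performance states


@dataclass(frozen=True)
class ComponentState:
    level: int          # current performance level, an index into LEVELS
    utilization: float  # current load, from 0.0 to 1.0


def required_level(state: ComponentState) -> int:
    """Assumed rule: step up when busy, step down when idle, else hold."""
    if state.utilization > 0.8:
        wanted = state.level + 1
    elif state.utilization < 0.3:
        wanted = state.level - 1
    else:
        wanted = state.level
    return max(0, min(len(LEVELS) - 1, wanted))


def manage(current: dict) -> tuple:
    # `current` is the first plurality: the reported component states.
    # `required` is the second plurality: one required level per component.
    required = {name: required_level(s) for name, s in current.items()}
    # Assumed policy: the system state must satisfy the most demanding component.
    system = LEVELS[max(required.values())]
    return required, system


current_states = {
    "cpu": ComponentState(level=1, utilization=0.9),
    "gpu": ComponentState(level=0, utilization=0.1),
}
required_states, system_state = manage(current_states)
```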
Abstract:
A method and an apparatus that determine a total number of threads to concurrently execute executable codes compiled from a single source for target processing units in response to an API (application programming interface) request from an application running in a host processing unit are described. The target processing units include GPUs (graphics processing units) and CPUs (central processing units). Thread group sizes for the target processing units are determined to partition the total number of threads according to a multi-dimensional global thread number included in the API request. The executable codes are loaded to be executed in thread groups with the determined thread group sizes concurrently in the target processing units.
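The partitioning step can be illustrated with a small sketch. The abstract does not say how group sizes are chosen, so the greedy rule below is an assumption: per dimension, take the largest divisor of the global extent that still fits within a per-device thread budget. The function names and the per-device limits are likewise hypothetical.

```python
from math import prod


def total_threads(global_size: tuple) -> int:
    """Total thread count implied by a multi-dimensional global thread number."""
    return prod(global_size)


def choose_group_size(global_size: tuple, max_group_threads: int) -> tuple:
    """Pick a group size per dimension that divides the global extent evenly.

    Assumed heuristic: fill the first dimension first, then spend whatever
    remains of the per-group thread budget on the later dimensions.
    """
    group = []
    budget = max_group_threads
    for extent in global_size:
        limit = min(extent, budget)
        size = max(d for d in range(1, limit + 1) if extent % d == 0)
        group.append(size)
        budget //= size
    return tuple(group)


def group_count(global_size: tuple, group_size: tuple) -> int:
    """Number of thread groups needed to cover the global thread range."""
    return prod(g // s for g, s in zip(global_size, group_size))


global_size = (1024, 768)                        # from the API request
gpu_group = choose_group_size(global_size, 256)  # assumed GPU limit
cpu_group = choose_group_size(global_size, 1)    # assumed CPU limit
```

With these assumed limits, the GPU runs wide groups along the first dimension while the CPU runs one thread per group; both cover the same total number of threads.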
Abstract:
The various methods and devices described herein relate, in at least certain embodiments, to a method of decoding data or a data stream in a file. The method may include checking for a first data portion of a plurality of data portions in the file, the first data portion having a first data value; reading in data from another data portion of the plurality of data portions; decoding or decompressing the data; performing a checksum operation on the decoded data if the first data portion having the first data value is not detected; and skipping the checksum operation on the decoded data if the first data portion having the first data value is detected. In an embodiment, a checksum operation on encoded data may also be skipped. In an embodiment, the first data value may include information or instructions about how a decoder may decode the data and may also include a tag or identifier.
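A minimal sketch of the conditional checksum logic follows. The file layout is an assumption made for illustration: the file is modeled as a list of `Portion` records, the marker tag `b"fast"` and value `b"skip-crc"` are hypothetical, and zlib with CRC-32 stands in for whatever codec and checksum an actual embodiment uses.

```python
import zlib
from typing import NamedTuple

MARKER_TAG = b"fast"        # hypothetical tag of the "first data portion"
MARKER_VALUE = b"skip-crc"  # hypothetical "first data value"


class Portion(NamedTuple):
    tag: bytes
    payload: bytes
    crc: int = 0  # expected CRC-32 of the decoded payload (data portions only)


def decode(portions: list) -> bytes:
    # Check whether the marker portion with the expected value is present.
    skip_checksum = any(
        p.tag == MARKER_TAG and p.payload == MARKER_VALUE for p in portions
    )
    out = bytearray()
    for p in portions:
        if p.tag != b"data":
            continue
        decoded = zlib.decompress(p.payload)
        # Verify the decoded data only when the marker was not detected.
        if not skip_checksum and zlib.crc32(decoded) != p.crc:
            raise ValueError("checksum mismatch in decoded data")
        out += decoded
    return bytes(out)


raw = b"hello" * 10
good = [Portion(b"data", zlib.compress(raw), zlib.crc32(raw))]
unchecked = [Portion(MARKER_TAG, MARKER_VALUE),
             Portion(b"data", zlib.compress(raw), 0)]  # stored CRC is wrong
decoded_good = decode(good)
decoded_unchecked = decode(unchecked)
```

In the second file the stored CRC is deliberately wrong, yet decoding succeeds because the marker portion causes the check to be skipped.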