摘要:
An optimized power saving technique is described for a processor, such as, for example, a graphic processing unit (GPU), which includes one or more processing cores and at least one data link interface. According to the technique, the processor is operable in a low power mode in which power to the at least one processing core is off and power to the at least one data link interface is on. This technique provides reduced exit latencies compared to currently available approaches in which the core power is turned off.
摘要:
A method for compressing framebuffer data is presented. The method includes determining a reduction ratio for framebuffer data in a tile including multiple samples. The reduction ratio determined is independent of the sampling mode, where the sampling mode is the number of samples within each pixel in the tile. The method further includes comparing a first portion of the framebuffer data for each of the multiple samples to determine an equality comparison result and also comparing a second portion of the framebuffer data for each one of the multiple samples to compute per-channel differences for each one of the multiple samples and testing the per-channel differences against a threshold value to determine a threshold comparison result. Finally, the method comprises compressing the framebuffer data for the tile based on the reduction ratio, the equality comparison result and the threshold comparison result to produce output framebuffer data for the tile.
摘要:
Various embodiments include a computer memory system that dynamically adjusts a memory device performance feature, such as dynamic assist control, dynamic turbo mode, and/or the like, to improve the performance of memory devices in the memory system. The memory system enables or disables the memory device performance feature based on the operating voltage relative to a threshold voltage. If the operating voltage crosses the threshold voltage in one direction, then the memory device system enables the memory device performance feature. If the operating voltage crosses the threshold voltage in another direction, then the memory system disables the memory device performance feature. Various techniques enable the memory device performance feature to be employed even with complex integrated circuits that may include tens of thousands of devices that employ the memory device performance feature.
摘要:
Integrated circuits, or computer chips, typically include multiple hardware components (e.g. memory, processors, etc.) operating under a shared power (e.g. thermal) constraint that is sourced by one or more power sources for the chip. Typically, the hardware components can be individually configured to operate at certain states (e.g. to operate at a certain frequency by setting a clock speed for a clock dedicated to the hardware component). Thus, each hardware component can be configured to operate at an operating point that is determined to be optimal, usually in terms of achieving some desired goal for a specific application (e.g. frame rates for gaming, etc.). In the context of chip hardware that operates under a shared power/thermal constraint, a method, computer readable medium, and system are provided for determining the optimal operating point for the chip that takes into consideration both performance of the chip and power consumption by the chip.
摘要:
One embodiment of the invention sets forth a CROP configured to perform both color raster operations and atomic transactions. Upon receiving an atomic transaction, the distribution unit within the CROP transmits a read request to the L2 cache for retrieving the destination operand. The distribution unit also transmits the source operands and the operation code to the latency buffer for storage until the destination operand is retrieved from the L2 cache. The processing pipeline transmits the operation code, the source and destination operands and an atomic flag to the blend unit for processing. The blend unit performs the atomic transaction on the source and destination operands based on the operation code and returns the result of the atomic transaction to the processing pipeline for storage in the internal cache. The processing pipeline writes the result of the atomic transaction to the L2 cache for storage at the memory location associated with the atomic transaction.
摘要:
A method for compressing framebuffer data is presented. The method includes determining a reduction ratio for framebuffer data in a tile including multiple samples. The reduction ratio determined is independent of the sampling mode, where the sampling mode is the number of samples within each pixel in the tile. The method further includes comparing a first portion of the framebuffer data for each of the multiple samples to determine an equality comparison result and also comparing a second portion of the framebuffer data for each one of the multiple samples to compute per-channel differences for each one of the multiple samples and testing the per-channel differences against a threshold value to determine a threshold comparison result. Finally, the method comprises compressing the framebuffer data for the tile based on the reduction ratio, the equality comparison result and the threshold comparison result to produce output framebuffer data for the tile.