摘要:
Methods and apparatus for providing additional storage, in the form of a hardware assisted stack, usable by software running an environment with limited resources. As an example, the hardware assisted stack may provide additional stack space to VBIOS code that is accessible within its limited allocated address space.
摘要:
One embodiment of the present invention sets forth a technique for technique for capturing and storing a level of an input signal using a dual-trigger low-energy flip-flop circuit that is fully-static and insensitive to fabrication process variations. The dual-trigger low-energy flip-flop circuit presents only three transistor gate loads to the clock signal and none of the internal nodes toggle when the input signal remains constant. One of the clock signals may be a low-frequency “keeper clock” that toggles less frequently than the other two clock signal that is input to two transistor gates. The output signal Q is set or reset at the rising clock edge using separate trigger sub-circuits. Either the set or reset may be armed while the clock signal is low, and the set or reset is triggered at the rising edge of the clock.
摘要:
An integrated circuit includes at least two different types of processors, such as a graphics processor and a video processor. At least one operation is commonly by supported by two different types of processors. For each commonly supported operation that is scheduled, a decision is made to determine which type of processor will be selected to implement the operation.
摘要:
In a graphics processor, a rendering object and a post-processing object share access to a host processor with a programmable execution core. The rendering object generates fragment data for an image from geometry data. The post-processing object operates to generate a frame of pixel data from the fragment data and to store the pixel data in a frame buffer. In parallel with operations of the host processor, a scanout engine reads pixel data for a previously generated frame and supplies the pixel data to a display device. The scanout engine periodically triggers the host processor to operate the post-processing object to generate the next frame. Timing between the scanout engine and the post-processing object can be controlled such that the next frame to be displayed is ready in a frame buffer when the scanout engine finishes reading a current frame.
摘要:
Methods and apparatuses for effectively clearing stencil buffers at high speed using surrogate stencil buffer clearing. A hardware register tracks the number of surrogate clears of the stencil buffer since the last actual clear. Bits are reserved in each stencil register for storing the surrogate clear number that cleared other stencil registers the last time the stencil register held an assigned value. A comparison between the contents of the hardware register and the reserved bits in each stencil register determines if each stencil register should be assigned a cleared value. If the numbers do not match the stencil register is assigned a predetermined surrogate clear value. In some applications the number of reserved bits is fixed, while in other applications the number of reserved bits can be set, either by a designer or by software.
摘要:
Circuits, methods, and apparatus for reducing power on a graphics processor integrated circuit by generating two memory clock signals, reducing the frequency of one under certain conditions, and maintaining the frequency of the other. To reduce skew and jitter between these two memory clocks, and to ensure that they remain in phase, a synchronizer circuit is used by an exemplary embodiment of the present invention. The synchronizer circuit is also useful as a general application clock generator.
摘要:
A programmable system for dithering video data. The system is operable in at least two user-selectable modes which can include a small kernel mode and a large kernel mode. In some embodiments, the system is operable in at least one mode in which it applies two or more kernels (each from a different kernel sequence) to each block of video words. Each kernel sequence repeats after a programmable number of the blocks (e.g., a programmable number of frames containing the blocks) have been dithered. The period of repetition is preferably programmable independently for each kernel sequence. The system preferably includes a frame counter for each kernel sequence. Each counter generates an interrupt when the number of frames of data dithered by kernels of the sequence has reached a predetermined value. In response to the interrupt, software can change the kernel sequence being applied. Typically, the system performs both truncation and dithering on words of video data. For example, some embodiments produce dithered 6-bit color components in response to 8-bit input color component words. Preferably, the inventive system is optionally operable in either a normal mode (in which dithering is applied to all pixels in accordance with the invention) or in an anti-flicker mode. Another aspect of the invention is a computer system in which the dithering system is implemented as a subsystem of a pipelined graphics processor or display device. Another aspect of the invention is a display device that includes an embodiment of the dithering system.
摘要:
A system and method are provided for reducing the number of depth clear operations in a hardware graphics pipeline. Initially, a frame count is stored into a frame buffer associated with the hardware graphics pipeline. The stored frame count is associated with a pixel. A depth clear operation is then performed based at least in part on the frame count utilizing the hardware graphics pipeline.
摘要:
Techniques are disclosed for maintaining cache coherency across a serial interface bus such as a Peripheral Component Interconnect Express (PCIe) bus. The techniques include generating a snoop request (SNP) to determine whether first data stored in a local memory is coherent relative to second data stored in a data cache, the snoop request including destination information that identifies the data cache on the serial interface bus and causing the snoop request to be transmitted over the serial interface bus to a second processor. The techniques further include extracting a cache line address from the snoop request, determining whether the second data is coherent, generating a complete message (CPL) indicating that the first data is coherent with the second data, and causing the complete message to be transmitted over the bus to the first processor. The snoop request and complete messages may be vendor defined messages.
摘要:
Circuits, methods, and apparatus for slowing clock circuits on a graphics processor integrated circuit in order to reduce power dissipation. An exemplary embodiment of the present invention provides a graphics processor having two memory clocks, specifically, a switched memory clock and an unswitched memory clock. The switched memory clock frequency is reduced under specific conditions, while the unswitched memory clock frequency remains fixed. In a specific embodiment, the switched memory clock frequency is reduced when related graphics, display, scaler, and frame buffer circuits are not requesting data, or are such data requests can be delayed. Further refinements to the present invention provide circuits, methods, and apparatus for ensuring that the switched and unswitched memory clock signals remain in-phase and aligned with each other.