Abstract:
A computer-implemented method includes initializing a driver associated with an input/output adapter in response to receiving an initialize driver request from a client application. The computer-implemented method includes initializing the input/output adapter to enable adapter capabilities of the input/output adapter to be determined. The computer-implemented method also includes determining the adapter capabilities of the input/output adapter. The computer-implemented method further includes determining slot capabilities of a slot associated with the input/output adapter. The computer-implemented method also includes setting configurable capabilities of the input/output adapter based on the adapter capabilities and the slot capabilities.
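As an illustration of the final step, the sketch below derives configurable settings from the adapter capabilities and the slot capabilities. The field names and the "take the lower limit" rule are assumptions made for this example; the abstract only states that the settings are based on both sets of capabilities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    # Hypothetical capability fields, chosen only for illustration.
    max_link_width: int        # number of lanes
    max_link_speed_gts: float  # transfer rate in GT/s
    max_payload_bytes: int     # largest payload size


def set_configurable_capabilities(adapter: Capabilities, slot: Capabilities) -> Capabilities:
    """Pick, for each field, the most that both the adapter and the slot support.

    Assumption: the usable setting is the smaller of the two limits.
    """
    return Capabilities(
        max_link_width=min(adapter.max_link_width, slot.max_link_width),
        max_link_speed_gts=min(adapter.max_link_speed_gts, slot.max_link_speed_gts),
        max_payload_bytes=min(adapter.max_payload_bytes, slot.max_payload_bytes),
    )


adapter_caps = Capabilities(max_link_width=16, max_link_speed_gts=8.0, max_payload_bytes=512)
slot_caps = Capabilities(max_link_width=8, max_link_speed_gts=16.0, max_payload_bytes=256)
configured = set_configurable_capabilities(adapter_caps, slot_caps)
```

Here the slot limits the link width and payload size, while the adapter limits the link speed.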
Abstract:
A logic arrangement and method to support implied storage operation decode uses redundant target address detection, whereby target addresses of previous instructions are compared with the target address of the current instruction, and if equal, and the target addresses of previous instructions are not used as sources, the current instruction is decoded as a store instruction. This allows a redundant operation in an instruction set architecture to be redefined as a store instruction, freeing up opcodes normally used for store instructions to be used for other instructions.
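The decode rule can be sketched in software as follows. This is a simplified model, not the described logic arrangement: instructions are represented as hypothetical `(target, sources)` tuples, and the set `unread_targets` stands in for the comparison against previous target addresses.

```python
def decode_stream(instructions):
    """Classify each instruction as "normal" or "store".

    Each instruction is a (target, sources) tuple (a hypothetical encoding).
    A target that is written again before any instruction has read it is
    redundant, so the second write is decoded as a store instead.
    """
    unread_targets = set()  # targets written earlier and not yet used as a source
    decoded = []
    for target, sources in instructions:
        # Any register read by this instruction is no longer "unread".
        unread_targets.difference_update(sources)
        if target in unread_targets:
            # Same target as an earlier instruction, never used as a source.
            decoded.append("store")
        else:
            decoded.append("normal")
            unread_targets.add(target)
    return decoded


program = [
    ("r1", ("r2", "r3")),  # writes r1
    ("r1", ("r4", "r5")),  # r1 written again without being read: implied store
    ("r6", ("r1",)),       # reads r1
    ("r1", ("r2",)),       # r1 was read in between, so this is a normal write
]
result = decode_stream(program)
```

How many previous instructions the hardware compares against is not stated in the abstract; this sketch tracks all of them for simplicity.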
Abstract:
A circuit arrangement and method utilize texture data prefetching to prefetch texture data used by an anisotropic filtering algorithm. In particular, stride-based prefetching may be used to prefetch texture data for use in anisotropic filtering, where the value of the stride, or difference between successive accesses, is based upon a distance in a memory address space between sample points taken along the line of anisotropy used in an anisotropic filtering algorithm.
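A minimal sketch of the stride calculation is shown below. It assumes a linear, row-major texture layout and a per-sample step of `(du, dv)` texels along the line of anisotropy; real texture memory is often tiled, and all function and parameter names here are illustrative.

```python
def anisotropic_stride(du, dv, row_pitch_bytes, bytes_per_texel):
    """Distance in the address space between successive sample points.

    Assumes a row-major layout: moving one texel in v skips a full row.
    """
    return dv * row_pitch_bytes + du * bytes_per_texel


def prefetch_addresses(base_address, stride, count):
    """Addresses of the next `count` samples along the line of anisotropy."""
    return [base_address + k * stride for k in range(1, count + 1)]


# Samples step 2 texels in u and 1 texel in v; rows are 1024 bytes, texels 4 bytes.
stride = anisotropic_stride(du=2, dv=1, row_pitch_bytes=1024, bytes_per_texel=4)
addresses = prefetch_addresses(base_address=4096, stride=stride, count=3)
```

Because the sample points are evenly spaced along the line, a single stride value is enough to predict every later access from the first one.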
Abstract:
A multirate execution unit is capable of being operated in a plurality of modes, with the execution unit being capable of being clocked at multiple different rates relative to a multithreaded issue unit. In applications where maximum performance is desired, the execution unit can be clocked at a rate that is faster than the clock rate for the multithreaded issue unit, and in applications where a lower power profile is desired, the execution unit can be throttled back to a slower rate to reduce the power consumption of the execution unit. When the execution unit is clocked at a faster rate than the multithreaded issue unit, the issue unit is permitted to issue more instructions per cycle than when the execution unit is throttled to the slower rate, thereby increasing overall instruction throughput.
Abstract:
The present invention is generally related to integrated circuit devices, and more particularly, to methods, systems and design structures for the field of image processing, and more specifically to vector units for supporting image processing. A dual vector unit implementation is described wherein two vector units are configured to receive data from a common register file. The vector units may independently and simultaneously process instructions. Furthermore, the vector units may be adapted to perform scalar operations, thereby integrating the vector and scalar processing. The vector units may also be configured to share resources to perform an operation, for example, a cross product operation.
Abstract:
A method, program product and system for conducting a ray tracing operation where the rendering compute requirement is reduced by varying the size of bounding volumes into which image data is divided and/or by varying a number of primitives included within nodes of an acceleration data structure that correspond to the bounding volumes.
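The effect of varying the number of primitives per node can be illustrated with a toy tree builder. This is a deliberately simplified sketch: primitives are reduced to one-dimensional centroids and split at the median, which is an assumption of this example rather than the method described.

```python
def build_tree(centroids, max_leaf_size):
    """Recursively split primitives until each leaf holds at most max_leaf_size."""
    if max_leaf_size < 1:
        raise ValueError("max_leaf_size must be at least 1")
    if len(centroids) <= max_leaf_size:
        return {"leaf": True, "primitives": list(centroids)}
    ordered = sorted(centroids)
    mid = len(ordered) // 2
    return {
        "leaf": False,
        "left": build_tree(ordered[:mid], max_leaf_size),
        "right": build_tree(ordered[mid:], max_leaf_size),
    }


def count_nodes(node):
    """Total number of nodes (internal and leaf) in the tree."""
    if node["leaf"]:
        return 1
    return 1 + count_nodes(node["left"]) + count_nodes(node["right"])


centroids = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]
fine = count_nodes(build_tree(centroids, max_leaf_size=1))
coarse = count_nodes(build_tree(centroids, max_leaf_size=4))
```

Allowing more primitives per leaf gives a shallower tree with fewer nodes to build and traverse, at the cost of more primitive intersection tests inside each leaf.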
Abstract:
Persistent vector multiplexer control is used in a vector-based execution unit to control the shuffling of words in operand vectors processed by the execution unit. In addition, a persistent swizzle instruction is defined in an instruction set for the vector-based execution unit and is used to cause state information to be persisted such that the operand vectors processed by subsequent vector instructions executed by the vector-based execution unit will be selectively shuffled using the persisted state information. As a result, when multiple vector instructions require a common custom word ordering for one or more operand vectors, a single persistent swizzle instruction may be used to select the desired custom word ordering for all of the vector instructions.
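The idea of persisted swizzle state can be modeled in a few lines. The class and method names below are hypothetical, and the sketch assumes four-word vectors with the persisted ordering applied only to the first operand of each instruction.

```python
class VectorUnit:
    """Toy model of a vector unit with persistent swizzle state."""

    def __init__(self):
        self.swizzle = (0, 1, 2, 3)  # identity ordering by default

    def set_persistent_swizzle(self, order):
        """Model of the persistent swizzle instruction: store the word ordering."""
        order = tuple(order)
        if len(order) != 4 or any(i not in range(4) for i in order):
            raise ValueError("order must contain four word indices in 0..3")
        self.swizzle = order

    def _shuffle(self, vector):
        # Select source words according to the persisted state.
        return tuple(vector[i] for i in self.swizzle)

    def vadd(self, a, b):
        a = self._shuffle(a)  # assumption: only operand A is swizzled
        return tuple(x + y for x, y in zip(a, b))

    def vmul(self, a, b):
        a = self._shuffle(a)
        return tuple(x * y for x, y in zip(a, b))


unit = VectorUnit()
unit.set_persistent_swizzle((3, 2, 1, 0))  # issued once
added = unit.vadd((1, 2, 3, 4), (10, 20, 30, 40))
multiplied = unit.vmul((1, 2, 3, 4), (10, 20, 30, 40))
```

Both `vadd` and `vmul` see the reversed operand without carrying their own swizzle fields, which is the saving the abstract describes.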
Abstract:
Embodiments of the invention provide methods and apparatus for executing a multiple operand instruction. Executing the multiple operand instruction comprises computing an arithmetic result of a pair of operands in each processing lane of a vector unit. The arithmetic results generated in each processing lane of the vector unit may be transferred to a dot product unit. The dot product unit may compute an arithmetic result using the arithmetic result computed by each processing lane of the vector unit to generate an arithmetic result of more than two operands.
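A software analogy of this two-stage flow, assuming the operation is addition and the vector unit has four lanes (neither is fixed by the abstract), might look like this. The function names are illustrative.

```python
def lane_results(a, b):
    """Stage 1: each processing lane adds its own pair of operands."""
    return [x + y for x, y in zip(a, b)]


def combine_lanes(partials):
    """Stage 2: the reduction hardware of the dot product unit sums the lane results."""
    return sum(partials)


def multi_operand_add(a, b):
    """Add all operands of both vectors: eight operands with four lanes."""
    return combine_lanes(lane_results(a, b))


total = multi_operand_add([1, 2, 3, 4], [5, 6, 7, 8])
```

The reduction stage is the same summation a dot product needs after its per-lane multiplies, which is why it can be reused for a result over more than two operands.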
Abstract:
A method, computer-readable medium, and apparatus for generating a trigonometric value. The method includes receiving a request to calculate a trigonometric value for an angle value and calculating a fractional value from the angle value. The fractional value corresponds to one of a first quadrant value, a second quadrant value, a third quadrant value, and a fourth quadrant value. The method also includes using the fractional value to determine whether to perform at least one of inverting the fractional value and negating the trigonometric value. The method further includes generating the trigonometric value from the fractional value by adding at least a portion of the fractional value with at least one of a shifted fractional value produced by shifting the portion of the fractional value and a constant value and providing the trigonometric value in response to the request.
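The quadrant handling can be sketched for sine as follows. Note the substitution: the first-quadrant value is computed here with `math.sin` as a stand-in, not with the shift-and-add approximation the abstract describes, whose exact constants are not given. The function name is hypothetical.

```python
import math


def sine_by_quadrant(angle):
    """Compute sin(angle) by folding the angle into the first quadrant."""
    turns = angle / (2 * math.pi)
    scaled = (turns - math.floor(turns)) * 4  # position within one turn, in quadrants
    quadrant = int(scaled)
    fraction = scaled - quadrant              # fractional value within the quadrant
    quadrant %= 4                             # guard against rounding up to a full turn

    if quadrant in (1, 3):
        fraction = 1.0 - fraction             # invert in the second and fourth quadrants

    # Stand-in for the shift-and-add approximation of the first-quadrant value.
    value = math.sin(fraction * math.pi / 2)

    if quadrant >= 2:
        value = -value                        # negate in the third and fourth quadrants
    return value


samples = [0.0, 0.3, 1.0, 2.0, 3.5, 5.0, -1.2, 7.9]
errors = [abs(sine_by_quadrant(x) - math.sin(x)) for x in samples]
```

After folding, only the range from 0 to 1 needs an approximation, which is what makes a small shift-and-add datapath feasible.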
Abstract:
Methods for preprocessing pixel data using a Direct Memory Access (DMA) engine during a data transfer of the pixel data from a first memory (e.g., a DRAM) to a second memory (e.g., an SRAM) are described. The pixel data may derive from a color camera or a depth camera in which individual pixel values are not a multiple of eight bits. In some cases, the DMA engine may perform a variety of image processing operations on the pixel data prior to the pixel data being written into the second memory. In one embodiment, the DMA engine may be configured to determine whether one or more pixels corresponding with the pixel data may be invalidated or skipped based on a minimum pixel value threshold and a maximum pixel value threshold and to embed pixel skipping information within unused bits of the pixel data.
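The thresholding and bit-embedding step of the last embodiment can be sketched as below. The layout is an assumption for illustration: 10-bit pixel values stored in 16-bit words, with the top bit of each word used as a skip flag.

```python
PIXEL_BITS = 10                       # assumed pixel depth (not a multiple of eight)
PIXEL_MASK = (1 << PIXEL_BITS) - 1    # low 10 bits hold the pixel value
SKIP_FLAG = 1 << 15                   # assumed unused bit that marks a skipped pixel


def preprocess_pixels(words, min_value, max_value):
    """Flag pixels outside [min_value, max_value] while copying them.

    Models what the DMA engine would do during the transfer; the pixel
    value itself is preserved in the low bits.
    """
    output = []
    for word in words:
        value = word & PIXEL_MASK
        if value < min_value or value > max_value:
            value |= SKIP_FLAG
        output.append(value)
    return output


def is_skipped(word):
    """Check the embedded skip flag on the consumer side."""
    return bool(word & SKIP_FLAG)


processed = preprocess_pixels([5, 100, 1000], min_value=10, max_value=900)
```

Because the flag travels inside the pixel word, later processing stages can skip invalid pixels without a separate mask buffer.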