摘要:
A format conversion apparatus which converts image data of a band interleave format into image data of a band separate format is provided. The apparatus includes a memory which stores image data of a band interleave format; and a converting module which reads the memory by increasing a read address of the memory for each stride, and converts the image data of the band interleave format into image data of a band separate format.
摘要:
A fixed multiply-add (FMA) apparatus and method are provided. The FMA apparatus includes a partial product generator configured to generate a partial sum and a partial carry, a carry save adder configured to generate a partial sum having a first bit size and a partial carry having the first bit size by adding the partial sum and the partial carry to least significant bits (LSBs) of the mantissa of a third floating-point number, a carry select adder configured to generate a mantissa having a second bit size by adding the first bit-size partial sum and the first bit-size partial carry to most significant bits (MSBs) of the third floating-point number, and a selector configured to transmit the first bit-size partial sum and the first bit-size partial carry to the carry save adder or the carry select adder according to whether the mantissa of the third floating-point number is zero.
摘要:
A method and system for an early Z test in a tile-based three-dimensional rendering is provided. In the method and system for an early Z test, a portion which is not displayed to a user is removed prior to performing a rasterization process, and thereby performing the 3D rendering efficiently. The method includes segmenting a scene into tiles for performing a rendering with respect to a triangle; selecting a first tile of the tiles, which has a tile Z value less than a minimum Z value of the triangle; and performing the rendering with respect to the triangle in remaining tiles excluding the selected first tile of the tiles.
摘要:
Provided is a method and apparatus for measuring a performance or a progress state of an application program to perform data processing and execute particular functions in a computing environment using a micro architecture. A thread progress tracking apparatus may include a selector to select at least one thread constituting an application program; a determination unit to determine, based on a predetermined criterion, whether an instruction execution scheme corresponds to a deterministic execution scheme having a regular cycle or a nondeterministic execution scheme having an irregular delay cycle with respect to each of at least one instruction constituting a corresponding thread; and a deterministic progress counter to generate a deterministic progress index with respect to an instruction that is executed by the deterministic execution scheme, excluding an instruction that is executed by the nondeterministic execution scheme.
摘要:
A multiport data cache apparatus and a method of controlling the same are provided. The multiport data cache apparatus includes a plurality of cache banks configured to share a cache line, and a data cache controller configured to receive cache requests for the cache banks, each of which including a cache bank identifier, transfer the received cache requests to the respective cache banks according to the cache bank identifiers, and process the cache requests independently from one another.
摘要:
A data processing system and method. The data processing system includes a processor core that executes a program; a loop accelerator that has an array consisting of a plurality of data processing cells and executes a loop in a program by configuring the array according to a set of configuration bits; and a centralized register file which allows data used in the program execution to be shared by the processor core and the loop accelerator. The loop accelerator divides the configuration of the array into at least three phases according to whether data exchange with the central register file is conducted during the loop execution. Thus, unnecessary occupation of the routing resource, which is used for the data exchange between the loop accelerator and the central register file during the loop execution, can be avoided.
摘要:
Disclosed is a data processing system and method. The data processing method determines the number of static registers and the number of rotating registers for assigning a register to a variable contained in a certain program, assigns the register to the variable based on the number of the static registers and the number of the rotating registers, and compiles the program. Further, the method stores in the special register a value corresponding to the number of the rotating registers in the compiling operation, and obtains a physical address from a logical address of the register based on the value. Accordingly, the present invention provides an aspect of efficiently using register files by dynamically controlling the number of rotating registers and the number of static registers for a software pipelined loop, and has an effect capable of reducing the generations of spill/fill codes unnecessary during program execution to a minimum.
摘要:
Pipeline synchronization device for transferring data between clocked devices having different clock frequencies. The Pipeline synchronization device comprises a mousetrap buffer for exchanging data with one of said external devices said mousetrap buffer having a signalling output for coordinating the data exchange with the external device. The pipeline synchronization device comprises further a synchronizer adapted to synchronizing the change in a signalling output with the clock of the external device.
摘要:
Provided are a method and apparatus for avoiding bank conflict. A first instruction that is one of access instructions that are predicted to cause the bank conflict is replaced with a second instruction by changing an execute timing of the first instruction to a timing prior to the execute timing of the first instruction so as for the access instructions not to cause the bank conflict. Next, a load/store unit that is scheduled to access the bank according to the first instruction accesses the bank and reads out a data from the bank at an execute timing of the second instruction, and after that, the load/store unit is allowed to be inputted the read data at the execute timing of the first instruction. Accordingly, although the access instructions that are predicted to cause the bank conflict are allocated to the load/store units, the bank conflict can be prevented, so that it is possible to avoid deterioration in performance due the occurrence of the bank conflict.
摘要:
Provided is a processor and method of performing speculative load instructions of the processor in which a load instruction is performed only in the case where the load instruction substantially accesses a memory. A load instruction for canceling operations is performed in other cases except the above case, so that problems occurring by accessing an input/output (I/O) mapped memory area and the like at the time of performing speculative load instructions can be prevented using only a software-like method, thereby improving the performance of a processor.