摘要:
Provided is a parallel processor for supporting a floating-point operation. The parallel processor has a flexible structure for easy development of a parallel algorithm involving multimedia computing, requires low hardware cost, and consumes low power. To support floating-point operations, the parallel processor uses floating-point accumulators and a flag for floating-point multiplication. Using the parallel processor, it is possible to process a geometric transformation operation in a 3-dimensional (3D) graphics process at low cost. Also, the cost of a bus width for instructions can be minimized by a partitioned Single-Instruction Multiple-Data (SIMD) method and a method of conditionally executing instructions.
摘要:
An apparatus that calculates a Sum of Absolute Differences (SAD) for motion estimation of a variable block capable of parallelly calculating SAD values with respect to multiple current frame macroblocks at a time is presented. The apparatus includes a PE array unit including at least one Processing Element (PE) that is aligned in the form of a matrix, and parallelly calculating a SAD value of at least one pixel provided in multiple serial current frame macroblocks, a local memory including current frame macroblock data, reference frame macroblock data, and reference frame search area data, and transmitting the data to each PE that is provided in the PE array unit, and a controller for making a command for the data that are provided in the local memory to be transmitted corresponding to at least one pixel, on which each PE provided in the PE array unit performs calculation.
摘要:
Provided is a highly energy-efficient processor architecture. The architecture employs 2-stage dynamic voltage scaling (DVS) and a sleep mode for high energy efficiency, dynamically controls the power supply voltage and activation of an embedded processor with instructions, and thus can prevent performance deterioration while reducing power consumption.A highly energy-efficient processor employing the processor architecture includes: a function unit block for performing an operation according to instructions input from the outside; at least one peripheral unit block for performing data communication with an external device; an instruction decoder for interpreting the input instructions and determining operation modes of the function unit block and peripheral unit block required for executing the interpreted instructions; a function unit block driver for applying a different power supply voltage according to the operation mode of the function unit block to the function unit block; and a peripheral unit block driver for applying a different power supply voltage according to the operation mode of the peripheral unit block to the peripheral unit block.
摘要:
Provided is a highly energy-efficient processor architecture. The architecture employs 2-stage dynamic voltage scaling (DVS) and a sleep mode for high energy efficiency, dynamically controls the power supply voltage and activation of an embedded processor with instructions, and thus can prevent performance deterioration while reducing power consumption.A highly energy-efficient processor employing the processor architecture includes: a function unit block for performing an operation according to instructions input from the outside; at least one peripheral unit block for performing data communication with an external device; an instruction decoder for interpreting the input instructions and determining operation modes of the function unit block and peripheral unit block required for executing the interpreted instructions; a function unit block driver for applying a different power supply voltage according to the operation mode of the function unit block to the function unit block; and a peripheral unit block driver for applying a different power supply voltage according to the operation mode of the peripheral unit block to the peripheral unit block.
摘要:
Provided is a single instruction multiple data (SIMD) parallel processor including a plurality of processing units connected to one another. Each processing unit includes: an instruction register; an instruction decoder; a register files selection circuit; and register files. The SIMD parallel processor can selectively control data of register files required for any one of SIMD, single instruction single data (SISD), row, and column operations in response to an instruction. Since each of the SIMD, SISD, row, and column operations can be effectively performed according to the type of application, the SIMD parallel processor has excellent utility, efficiency, and flexibility.
摘要:
Provided are a reconfigurable arithmetic unit and a processor having the same. The reconfigurable arithmetic unit can perform an addition operation or a multiplication operation according to an instruction by sharing an adder. The reconfigurable arithmetic unit includes a booth encoder for encoding a multiplier, a partial product generator for generating a plurality of partial products using the encoded multiplier and a multiplicand, a Wallace tree circuit for compressing the partial products into a first partial product and a second partial product, a first Multiplexer (MUX) for selecting and outputting one of the first partial product and a first addition input according to a selection signal, a second MUX for selecting and outputting one of the second partial product and a second addition input according to the selection signal, and a Carry Propagation Adder (CPA) for adding an output of the first MUX and an output of the second MUX to output an operation result. The arithmetic unit can operate as an adder or a multiplier according to an instruction, and thus can increase the degree of use of entire hardware.
摘要:
Provided is a low-power clock gating circuit using a Multi-Threshold CMOS (MTCMOS) technique. The low-power clock gating circuit includes a latch circuit of an input stage and an AND gate circuit of an output stage, in which power consumption caused by leakage current in the clock gating circuit is reduced in a sleep mode, and supply of a clock to a unused device of a targeted logic circuit is prevented by the control of a clock enable signal in an active mode, thereby reducing power consumption. The low-power clock gating circuit using an MTCMOS technique uses devices having a low threshold voltage and devices having a high threshold voltage, which makes it possible to implement a high-speed, low-power circuit, unlike a conventional clock gating circuit using a single threshold voltage.
摘要:
Provided is a method of performing three-dimensional (3D) graphics geometric transformation using a parallel processor having a plurality of Processing Elements (PEs). The method includes performing model/view transformation and projection transformation on a first group of vertex vectors using the parallel processor; calculating a value used for quaternion correction of the first group of vertex vectors using a general-use processor, and simultaneously performing model/view transformation and projection transformation on a second group of vertex vectors; performing quaternion correction and screen mapping on the first group of vertex vectors, and simultaneously calculating a value used for quaternion correction of the second group of vertex vectors using the general-use processor; and performing quaternion correction and screen mapping on the second group of vertex vectors.
摘要:
Provided is a multi-threshold complementary metal oxide semiconductor (MTCMOS) latch circuit including: a data inverting circuit for inverting and outputting input data under the control of a sleep control signal; a transmission gate for transferring the data signal output from the data inverting circuit under the control of a clock control signal; a signal control circuit for outputting the data signal output from the transmission gate under the control of a reset control signal and the sleep control signal; and a feedback circuit for feeding back the signal output from the signal control circuit and preserving the data in a sleep mode. The MTCMOS latch circuit can minimize power consumption caused by a leakage current due to elements scaled down to nano scale and also contribute to high-speed operation of a logic circuit by using an element having a low threshold voltage.
摘要:
Provided is a multi-threshold complementary metal oxide semiconductor (MTCMOS) latch circuit including: a data inverting circuit for inverting and outputting input data under the control of a sleep control signal; a transmission gate for transferring the data signal output from the data inverting circuit under the control of a clock control signal; a signal control circuit for outputting the data signal output from the transmission gate under the control of a reset control signal and the sleep control signal; and a feedback circuit for feeding back the signal output from the signal control circuit and preserving the data in a sleep mode. The MTCMOS latch circuit can minimize power consumption caused by a leakage current due to elements scaled down to nano scale and also contribute to high-speed operation of a logic circuit by using an element having a low threshold voltage.