Abstract:
A method of sharing a coarse-grained array and a processor using the method are provided. A processor includes a first processor core including a plurality of first functional units which execute a first instruction set, a second processor core including a plurality of second functional units which execute a second instruction set, and a coarse-grained array including a plurality of third functional units which execute a portion of the instructions of the first instruction set and/or the second instruction set in place of the first processor core and/or the second processor core.
Abstract:
A reconfigurable processor for efficiently performing a vector operation, and a method of controlling the reconfigurable processor are provided. The reconfigurable processor designates at least one of a plurality of processing elements as a vector lane based on vector lane configuration information, and allocates a vector operation to the designated vector lane.
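The lane designation described above can be illustrated with a minimal sketch. The abstract does not say how the vector lane configuration information is encoded or how work is spread over the lanes, so the bitmask encoding, the round-robin allocation, and the names `designate_lanes` and `run_vector_op` are all assumptions made for illustration only.

```python
from typing import Callable, List, Tuple


def designate_lanes(config_mask: int, num_pes: int) -> List[int]:
    """Return the indices of processing elements marked as vector lanes.

    Assumption: bit i of config_mask set means PE i is a vector lane.
    """
    return [pe for pe in range(num_pes) if (config_mask >> pe) & 1]


def run_vector_op(
    op: Callable[[int, int], int],
    xs: List[int],
    ys: List[int],
    lanes: List[int],
) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Apply op element-wise, recording which lane handles each element.

    Assumption: elements are handed to the designated lanes round-robin.
    """
    if not lanes:
        raise ValueError("no processing element is designated as a vector lane")
    results = []
    schedule = []  # (element index, lane index) pairs
    for i, (x, y) in enumerate(zip(xs, ys)):
        lane = lanes[i % len(lanes)]
        results.append(op(x, y))
        schedule.append((i, lane))
    return results, schedule
```

For example, with a hypothetical mask `0b0110` over four processing elements, elements 1 and 2 become the lanes and a three-element addition alternates between them.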
Abstract:
A clamping circuit is provided, which may clamp a voltage at a node of a circuit to a stable level by using a transistor already included in the circuit. The clamping circuit may clamp a voltage at a first node of a circuit inside a semiconductor chip to a more stable level when electrostatic discharge (ESD) occurs. The clamping circuit may include a transistor and a capacitive element that stores a control voltage for turning on the transistor in response to ESD.
Abstract:
A processor and a computing system are provided. A processor includes a processor core, and a buffer memory to read word data from a memory, the read word data including first byte data read by the processor core from the memory, and to store the read word data, wherein the buffer memory determines whether second byte data requested by the processor core is stored in the buffer memory.
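A minimal software model of the buffer behaviour described above might look as follows. It is a sketch under stated assumptions, not the patented design: the 4-byte word width, the single-word capacity, and the names `WordBuffer`, `read_byte` and `memory_reads` are hypothetical.

```python
WORD_SIZE = 4  # assumed word width in bytes


class WordBuffer:
    """Single-word buffer between a core and a byte-addressable memory."""

    def __init__(self, memory: bytes):
        self.memory = memory
        self.base = None       # address of the buffered word, None when empty
        self.word = b""        # the buffered word data
        self.memory_reads = 0  # number of word reads issued to memory

    def read_byte(self, addr: int) -> int:
        base = addr - addr % WORD_SIZE
        if base != self.base:
            # Miss: read the whole word containing the requested byte.
            self.word = self.memory[base:base + WORD_SIZE]
            self.base = base
            self.memory_reads += 1
        # Hit (or just filled): serve the byte from the buffer.
        return self.word[addr - base]
```

In this model, a second byte request that falls inside the already buffered word is served without another memory access, which is the check the abstract attributes to the buffer memory.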
Abstract:
A hardware memory architecture or arrangement suited for multi-processor systems or arrays is disclosed. In one aspect, the memory arrangement includes at least one memory queue between a functional unit (e.g., computation unit) and at least one memory device, which the functional unit accesses (for write and/or read access).
Abstract:
A data processing system and method are provided. The data processing system includes a processor core that executes a program; a loop accelerator that has an array consisting of a plurality of data processing cells and executes a loop in the program by configuring the array according to a set of configuration bits; and a central register file which allows data used in the program execution to be shared by the processor core and the loop accelerator. The loop accelerator divides the configuration of the array into at least three phases according to whether data exchange with the central register file is conducted during the loop execution. Thus, unnecessary occupation of the routing resources used for data exchange between the loop accelerator and the central register file during the loop execution can be avoided.
Abstract:
Disclosed is a mixed-type adder with optimized design costs. The mixed-type adder includes I sub-adders, where I is an integer larger than 1. The overall bit width of the mixed-type adder is divided into I bit groups, which are respectively allocated to the I sub-adders. The I sub-adders have different carry propagation schemes and are connected in series through a carry signal.
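The series connection of sub-adders through a carry signal can be sketched in software. The abstract does not name the carry propagation schemes, so the pairing of a ripple-carry group with a carry-select group, the group widths, and the function names below are assumptions chosen only to make the idea concrete.

```python
def ripple_carry_add(a, b, carry_in, width):
    """Bit-by-bit ripple-carry addition; returns (sum, carry_out)."""
    total = 0
    carry = carry_in
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        total |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return total, carry


def carry_select_add(a, b, carry_in, width):
    """Carry-select: precompute results for both carries, then pick one."""
    result_if_0 = ripple_carry_add(a, b, 0, width)
    result_if_1 = ripple_carry_add(a, b, 1, width)
    return result_if_1 if carry_in else result_if_0


def mixed_add(a, b, groups):
    """Chain sub-adders in series through the carry signal.

    groups: list of (width, adder_function), least-significant group first.
    Returns (sum over the total width, final carry_out).
    """
    result, carry, shift = 0, 0, 0
    for width, adder in groups:
        mask = (1 << width) - 1
        part, carry = adder((a >> shift) & mask, (b >> shift) & mask, carry, width)
        result |= part << shift
        shift += width
    return result, carry
```

A hypothetical 16-bit configuration would be `mixed_add(a, b, [(4, ripple_carry_add), (12, carry_select_add)])`, where the low four bits use the cheaper scheme and the remaining twelve use the faster one.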
Abstract:
A static branch prediction method and code execution method for a pipeline processor, and a code compiling method for static branch prediction, are provided herein. The static branch prediction method includes predicting a conditional branch code as taken or not-taken, adding the prediction information, converting the conditional branch code into a jump target address setting (JTS) code including target address information, branch time information, and a test code, and scheduling codes in a block. The test code may be scheduled into a last slot of the block, and the JTS code may be scheduled into an empty slot after all the other codes in the block are scheduled. When the conditional branch code is predicted as taken in the prediction operation, a target address indicated by the target address information may be fetched at a cycle time indicated by the branch time information.
Abstract:
An apparatus and method capable of reducing idle resources in a multicore device and improving the use of available resources in the multicore device are provided. The apparatus includes a static scheduling unit configured to generate one or more task groups, and to allocate the task groups to virtual cores by dividing or combining the tasks included in the task groups based on the execution time estimates of the task groups. The apparatus also includes a dynamic scheduling unit configured to map the virtual cores to physical cores.
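The two-stage scheme described above can be sketched roughly as follows. This is an illustration under several assumptions, not the disclosed apparatus: the static step here only combines task groups (it does not divide them) using a greedy longest-first packing, the dynamic step simply pairs loaded virtual cores with whichever physical cores are idle, and the names `assign_to_virtual_cores` and `map_to_physical` are hypothetical.

```python
import heapq


def assign_to_virtual_cores(task_groups, num_virtual_cores):
    """Static step: pack task groups onto virtual cores by estimated time.

    task_groups: dict mapping group name -> execution time estimate.
    Returns a list of (total_estimate, virtual_core_id, [group names]),
    ordered by virtual core id.
    """
    # Min-heap keyed on current load, so the least-loaded core is popped first.
    heap = [(0, vc, []) for vc in range(num_virtual_cores)]
    heapq.heapify(heap)
    for name, estimate in sorted(task_groups.items(), key=lambda kv: -kv[1]):
        load, vc, members = heapq.heappop(heap)
        members.append(name)
        heapq.heappush(heap, (load + estimate, vc, members))
    return sorted(heap, key=lambda entry: entry[1])


def map_to_physical(virtual_cores, idle_physical_cores):
    """Dynamic step: map each loaded virtual core to an idle physical core."""
    loaded = [vc for _, vc, members in virtual_cores if members]
    return dict(zip(loaded, idle_physical_cores))
```

With estimates `{"a": 8, "b": 5, "c": 4, "d": 3}` and two virtual cores, this packing places `a` and `d` on one virtual core and `b` and `c` on the other, and the mapping step then binds them to the physical cores that happen to be idle at run time.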