摘要:
Provided is a semiconductor integrated circuit and an exponent calculation method that, when normalizing a plurality of data by a common exponent, speed up exponent calculation and reduce circuit scale and power consumption. When normalizing a plurality of data by a common exponent, a semiconductor integrated circuit calculates the exponent of the plurality of data. Included is a bit string generator that generates a second bit string containing bits having a transition value indicating that values of adjacent bits are different or a non-transition value indicating that values of adjacent bits are not different for each pair of adjacent bits of a first bit string constituting the data, and an exponent calculator that calculates the exponent of the plurality of data based on bit position of the transition value of a plurality of second bit strings generated from a plurality of first bit strings respectively constituting the plurality of data.
摘要:
The processors #0 to #3 execute a plurality of threads whose execution sequence is defined, in parallel. When the processor #1 that executes a thread updates the self-cache memory #1, if the data of the same address exists in the cache memory #2 of the processor #2 that executes a child thread, it updates the cache memory #2 simultaneously, but even if it exists in the cache memory #0 of the processor #0 that executes a parent thread, it doesn't rewrite the cache memory #0 but only records that rewriting has been performed in the cache memory #1. When the processor #0 completes a thread, a cache line with the effect that the data has been rewritten recorded from a child thread may be invalid and a cache line without such record is judged to be effective. Whether a cache line which may be invalid is really invalid or effective is examined during execution of the next thread.
摘要:
It is possible to provide a highly reliable semiconductor device and a communication method in which communication can be performed between circuits with a large degree of freedom of clock frequency which can be set in each of the circuits, a decisive operation, and a small communication latency. The semiconductor device according to the present invention includes a first circuit that performs processing based on a first clock signal, the first clock signal having a frequency M/N times as large as a frequency of a second clock signal (N is a positive integer, and M is a positive integer larger than N); a second circuit that performs processing based on the second clock signal; and a communication timing control circuit that generates a communication timing signal to control a timing at which the first circuit performs communication with the second circuit. The communication timing control circuit generates the communication timing signal determined by a frequency ratio information and a phase relation information, the frequency ratio information setting a frequency ratio of the first clock signal to the second clock signal, the phase relation information indicating a phase relation between the first clock signal and the second clock signal.
摘要:
A clock converting circuit (1) receives and then converts m-phase clocks of a frequency f having a phase difference of 1/(f×m) to n-phase clocks of the frequency f having a phase difference of 1/(f×n). A single-phase clock generating circuit (2) receives the n-phase clocks of the frequency f having a phase difference equivalent time of 1/(f×n) to generate single-phase clocks in synchronism with the rising or falling edges of the n-phase clocks. Since the frequency of the m-phase clocks inputted to the clock converting circuit (1) is ‘f’, if a desired frequency of the single-phase clocks is decided, then ‘n’ can be obtained from the equation: the frequency of the single-phase clocks is equal to (f×n). This value of ‘n’ is set to the clock converting circuit (1), thereby obtaining the n-phase clocks of the frequency f from the m-phase clocks of the frequency f to provide single-phase clocks of a desired frequency.
摘要:
Provided is a pipeline circuit capable of flexibly controlling clock frequencies regardless of whether a pipeline operation by a flow control is stopped or not, without significantly increasing a processing latency even if a clock frequency is decreased, and in response to performance requests for a processing throughput. Among P clocks (P is a positive integer), the phases of which are delayed in the order from a first clock to a P-th clock, for example, among six clocks of P0 to P5, two successive clocks, the phases of which are delayed from each other by a predetermined phase, are allocated to a plurality of stages, for example, five-stage pipeline buffers 32a to 32e, in the order from a previous stage to a subsequent stage, and also are allocated so that one clock signal having an identical phase is shared between two adjacent pipeline buffers.
摘要:
To provide a rational frequency dividing circuit wherein the variations in cycle times of frequency divided clock signals are small, there are many occasions in which the minimum cycle time of frequency divided clock signals and test costs are small. A clock signal frequency dividing circuit, the frequency division ratio of which is specified as N/M where are both N and M are integers, includes an output clock selecting circuit (200) that selects one of three situations: an input clock signal is outputted as it is, the input clock signal is inverted and outputted and the input clock signal is not outputted; and a clock selection control circuit (100) that generates a control signal for controlling the foregoing selection of the output clock selecting circuit. The clock selection control circuit controls the foregoing selection of the output clock selecting circuit at every cycle of the input clock signal.
摘要:
It is possible to provide a highly reliable semiconductor device and a communication method in which communication can be performed between circuits with a large degree of freedom of clock frequency which can be set in each of the circuits, a decisive operation, and a small communication latency. The semiconductor device according to the present invention includes a first circuit that performs processing based on a first clock signal, the first clock signal having a frequency M/N times as large as a frequency of a second clock signal (N is a positive integer, and M is a positive integer larger than N); a second circuit that performs processing based on the second clock signal; and a communication timing control circuit that generates a communication timing signal to control a timing at which the first circuit performs communication with the second circuit. The communication timing control circuit generates the communication timing signal determined by a frequency ratio information and a phase relation information, the frequency ratio information setting a frequency ratio of the first clock signal to the second clock signal, the phase relation information indicating a phase relation between the first clock signal and the second clock signal.
摘要:
A control/data flow analysis unit analyzes the control flow and the data flow of a sequential processing program, and a fork point candidate determination unit determines fork point candidates taking this as the reference. A best fork point candidate combination determination unit determines the best fork point candidate combination by taking as the reference the result from the evaluation of the parallel execution performance of a test fork point candidate combination by a parallel execution performance evaluation unit, and a parallelized program output unit generates and outputs a parallelized program by inserting a fork command based on the best fork point candidate combination.
摘要:
A detector detects at least one kind of dependence in address between instructions executed by at least a processor, the detector being adopted to detect a possibility of presence of the at least one kind of dependence, wherein if the at least one kind of dependence is present in fact, then the detector detects a possibility of presence of the at least one kind of dependence, and if the at least one kind of dependence is not present in fact, then the detector may detect a pseudo presence of the at least one kind of dependence. The detector has an execution history storing unit with a plurality of entries and an address converter for converting an address of a memory access instruction into an entry number, where different addresses may be converted into entry numbers that are the same.
摘要:
In a parallel processor system for executing a plurality of threads in parallel to each other by a plurality of thread execution units, the respective thread execution units allow for forking of a slave thread from an individual thread execution unit into another arbitrary thread execution unit. The respective thread execution units are managed in three states, a free state where fork is possible, a busy state where a thread is being executed, and a term state where a thread being terminated and yet to be settled exists. At the time of forking of a new thread, when there exists no thread execution unit at the free state, a thread that the thread execution unit at the term state has is merged into its immediately succeeding slave thread to bring the thread execution unit in question to the free state and conduct forking of a new thread.