摘要:
The present invention discloses a method for multi-core instruction-set simulation. The proposed method identifies the shared data segment and the dependency relationship between the different cores and thus effectively reduces the number of sync points and lowers the synchronization overhead, allowing multi-core instruction-set simulation to be performed more rapidly while ensuring that the simulation results are accurate. In addition, the present invention also discloses a device for multi-core instruction-set simulation.
摘要:
The present invention discloses a system for generating a software TLM model, comprising a processing unit; a compiler coupled to the processing unit to generate target binary codes of a target software; a decompiler coupled to the processing unit to decompile the target binary codes into high level codes, for example C or C++ codes, to generate a functional model of the target software, wherein the functional model includes a plurality of basic blocks; an execution time calculating module coupled to the processing unit to calculate overall execution time of the plurality of the basic blocks of the functional model; a sync point identifying module coupled to the processing unit to identify sync points of the software transaction-level modeling model; and a time annotating module coupled to the processing unit to annotate the overall execution time of the basic blocks and the sync points into the functional model to obtain the software transaction-level modeling model.
摘要:
The present invention discloses a high-parallelism synchronization method for multi-core instruction-set simulation. The proposed method utilizes a new distributed scheduling mechanism for a parallel compiled MCISS. The proposed method can enhance the parallelism of the MCISS so that the computing power of a multi-core host machine can be effectively utilized. The distributed scheduling with the present invention's prediction method significantly shortens the waiting time which an ISS spends on synchronization
摘要:
The present invention discloses a cycle-count-accurate (CCA) processor modeling, which can achieve high simulation speeds while maintaining timing accuracy of the system simulation. The CCA processor modeling includes a pipeline subsystem model and a cache subsystem model with accurate cycle with accurate cycle count information and guarantees accurate timing and functional behaviors on processor interface. The CCA processor modeling further includes a branch predictor and a bus interface (BIF) to predict the branch of pipeline execution behavior (PEB) and to simulate the data accesses between the processor and the external components via an external bus, respectively. The experimental results show that the CCA processor modeling performs 50 times faster than the corresponding Cycle-accurate (CA) model while providing the same cycle count information as the target RTL model.
摘要:
The present invention discloses a system for generating a software TLM model, comprising a processing unit; a compiler coupled to the processing unit to generate target binary codes of a target software; a decompiler coupled to the processing unit to decompile the target binary codes into high level codes, for example C or C++ codes, to generate a functional model of the target software, wherein the functional model includes a plurality of basic blocks; an execution time calculating module coupled to the processing unit to calculate overall execution time of the plurality of the basic blocks of the functional model; a sync point identifying module coupled to the processing unit to identify sync points of the software transaction-level modeling model; and a time annotating module coupled to the processing unit to annotate the overall execution time of the basic blocks and the sync points into the functional model to obtain the software transaction-level modeling model.
摘要:
The present invention discloses a shared-variable-based (SVB) approach for fast and accurate multi-core cache coherence simulation. While the intuitive, conventional approach, synchronizing at either every cycle or memory access, gives accurate simulation results, it has poor performance due to huge simulation overloads. In the present invention, timing synchronization is only needed before shared variable accesses in order to maintain accuracy while improving the efficiency in the proposed shared-variable-based approach.
摘要:
In the present disclosure, the DOM approach for the simulation of OS preemptive scheduling has presented and demonstrated. By maintaining the data-dependency between the software tasks, and guaranteeing the order of shared variable accesses, it can accurately simulate the preemption effect. Moreover, the proposed DOM OS model is implemented to enable preemptive scheduling in SystemC.
摘要:
The present invention provides a method for simulating processor power consumption, the method comprises: simulating a simulated processor; utilizing a power analysis model to analyze the simulated processor's execution of at least one fragment of a program, for generating power analysis of a plurality of basic blocks of the at least one fragment; computing at least one power correction factor between the plurality of basic block; utilizing a processing apparatus to generate a simulation model with power annotation based on the power analysis and the at least one power correction factor; and predicting power consumption of the simulated processor based on the simulation model with power annotation.
摘要:
The present invention discloses a method for multi-core instruction-set simulation. The proposed method identifies the shared data segment and the dependency relationship between the different cores and thus effectively reduces the number of sync points and lowers the synchronization overhead, allowing multi-core instruction-set simulation to be performed more rapidly while ensuring that the simulation results are accurate. In addition, the present invention also discloses a device for multi-core instruction-set simulation.
摘要:
A deadlock free synchronization synthesizer for must-happen-before relations in at least two parallel programs or at least two threads each having multiple code segments has an input device to specify a synchronization point to involving code segments for each parallel program or thread and must-happen-before relations to the synchronization point, an analyzing module connected to the input device to detect existence of a deadlock in the parallel programs by using the must-happen-before relations, and a synthesizing module connected to the analyzing module to synthesize a practice code corresponding to the parallel programs if the deadlock existence detection is negative.