摘要:
There is provided a system havingan execution core operable to execute internal instructions.A translation buffer is operable to store a plurality of internal instruction blocks of one or more internal instructions where the internal instruction blocks are a dynamic translation of respective external instruction blocks of one or more external instructions.A remapper is responsive to an execution request for an external instruction that is within one of said external instruction blocks to identify a corresponding internal instruction block stored within said translation buffer. Thus one or more internal instructions from said corresponding internal instruction block can be supplied to execution core.
摘要:
An accelerator 120 is tightly coupled to the normal execution unit 110. The operand store, which could be a register file 130, a stack based operand store or other operand store is shared by the execution unit and the accelerator unit. Operands may also be accessed as immediate values within the instructions themselves. The sequences of individual program instructions corresponding to computational subgraphs remain within a program but can be recognized by the accelerator as suitable for acceleration and when encountered are executed by the accelerator instead of by the normal execution unit. Within such tightly coupled arrangement problems can arise due to a lack of register resources within the system. The present technique provides that at least some intermediate operand values which are generated within the accelerator, but are determined not to be referenced outside of the computational subgraph concerned, are not written to the operand store.
摘要:
A performance range of a processor in a data processing apparatus is dynamically varied by recalculating at least one performance-range limit in dependence upon a quality of service value for a give processing task. The processor performance level is varied by selecting from a plurality of possible performance levels of a performance range having the performance-range limit.
摘要:
An accelerator 120 is tightly coupled to the normal execution unit 110. The operand store, which could be a register file 130, a stack based operand store or other operand store is shared by the execution unit and the accelerator unit. Operands may also be accessed as immediate values within the instructions themselves. The sequences of individual program instructions corresponding to computational subgraphs remain within a program but can be recognized by the accelerator as suitable for acceleration and when encountered are executed by the accelerator instead of by the normal execution unit. Within such tightly coupled arrangement problems can arise due to a lack of register resources within the system. The present technique provides that at least some intermediate operand values which are generated within the accelerator, but are determined not to be referenced outside of the computational subgraph concerned, are not written to the operand store.
摘要:
There is provided an information processor for executing a program comprising a plurality of separate program instructions: processing logic operable to individually execute said separate program instructions of said program; an operand store operable to store operand values; and an accelerator having an array comprising a plurality of functional units, said accelerator being operable to execute a combined operation corresponding to a computational subgraph of said separate program instructions by configuring individual ones of said plurality of functional units to perform particular processing operations associated with one or more processing stages of said combined operation; wherein said accelerator executes said combined operation in dependence upon operand mapping data providing a mapping between operands of said combined operation and storage locations within said operand store and in dependence upon separately specified configuration data providing a mapping between said plurality of functional units and said particular processing operations such that said configuration data can be re-used for different operand mappings.
摘要:
Voltage-binning of individual integrated circuits is achieved by operating those integrated circuits at a plurality of required clock frequencies and for each of those frequencies determining the minimum supply voltage level which produces a pass result for a series of applied test vectors.
摘要:
A data processing apparatus and method are provided for processing data under control of a program having program instructions including sequences of individual program instructions corresponding to computational subgraphs identified within the program. Each computational subgraph has a number of input operands and produces one or more output operands. The apparatus comprises an operand store for storing the input and output operands, and processing logic for executing individual program instructions from the program. Also provided is configurable accelerator logic which, in response to reaching an execution point within the program corresponding to a sequence of individual program instructions corresponding to a computational subgraph, evaluates one or more output functions associated with the computational subgraph. The evaluation of each output function generates an output operand for storing in the operand store, and each output operand corresponds to an output that would have been generated had the sequence of individual program instructions corresponding to the computational subgraph have been executed by the processing logic. Configuration storage stores a single look-up table (LUT) configuration for each output function, and for each output function to be evaluated, the accelerator logic is configured dependent on the associated single LUT configuration from the configuration storage, such that on receipt of the input operands of the computational subgraph, the accelerator logic will generate the output operand. This technique has been found to provide a particularly efficient accelerator logic for evaluating output functions associated with computational subgraphs.
摘要:
A data processing system is provided having a processor and analysing circuitry for identifying a SIMD instruction associated with a first SIMD instruction set and replacing it by a functionally-equivalent scalar representation and marking that functionally-equivalent scalar representation. The marked functionally-equivalent scalar representation is dynamically translated using translation circuitry upon execution of the program to generate one or more corresponding translated instructions corresponding to a instruction set architecture different from the first SIMD architecture corresponding to the identified SIMD instruction.
摘要:
Voltage-binning of individual integrated circuits is achieved by operating those integrated circuits at a plurality of required clock frequencies and for each of those frequencies determining the minimum supply voltage level which produces a pass result for a series of applied test vectors.
摘要:
A method for scheduling a plurality of virtual machines includes: determining a resource requirement (Xi) for each virtual machine (VM); determining an interrupt period (Yi) for each VM; and scheduling the plurality of VMs based, at least in part, on each respective Xi and Yi.
摘要翻译:一种用于调度多个虚拟机的方法包括:为每个虚拟机(VM)确定资源需求(X i i i i); 确定每个VM的中断周期(Y SUB); 并且至少部分地基于每个相应的X i和Y i i调度多个VM。