摘要:
A method and apparatus for changing the configuration of a multi-core processor is disclosed. In one embodiment, a throttle module (or throttle logic) may determine the amount of parallelism present in the currently-executing program, and change the execution of the threads of that program on the various cores. If the amount of parallelism is high, then the processor may be configured to run a larger amount of threads on cores configured to consume less power. If the amount of parallelism is low, then the processor may be configured to run a smaller amount of threads on cores configured for greater scalar performance.
摘要:
A method and apparatus for changing the configuration of a multi-core processor is disclosed. In one embodiment, a throttle module (or throttle logic) may determine the amount of parallelism present in the currently-executing program, and change the execution of the threads of that program on the various cores. If the amount of parallelism is high, then the processor may be configured to run a larger amount of threads on cores configured to consume less power. If the amount of parallelism is low, then the processor may be configured to run a smaller amount of threads on cores configured for greater scalar performance.
摘要:
A computer architecture to process move instructions by allowing multiple mappings between logical registers and the same physical register. In one embodiment, a counter is associated with each physical register to indicate when the physical register is free. A register-to-register move instruction is processed by mapping the logical destination register of the move instruction to the same physical register to which the logical source register of the move instruction is mapped. An immediate-to-register move instruction is processed by mapping the logical destination register of the move instruction to a physical register storing the immediate.
摘要:
An advanced register renamer comprises an associative memory having a plurality of entries, each entry storing a representation of a single operation as an expression paired with a corresponding name. The expression and the name are respectively stored in first and second fields of an entry in the memory. Both fields are available for subsequent assembly level operations to use as pattern matches. A means for converting a subsequent operation in the stream to a new operation searches for a match between an expression of the subsequent operation and the first field of a matching entry. Upon finding a match with the expression field in the table, the subsequent operation is renamed to a new operation by replacing the expression with the corresponding name field of the matching entry taken from the associative memory.
摘要:
A method and apparatus for executing instructions in a pipelined microprocessor. The method includes re-ordering the set of instructions prior to loading the instructions into an instruction cache. In one embodiment, a re-ordering unit receives the set of instructions as a trace segment made of a set of basic blocks of instructions in a logical order of execution. After being re-ordered, the instructions are presented to the reordered instruction cache in bundles. When an instruction is unavailable, possibly due to an unresolved data dependency, no operation codes (nops) are inserted into the bundle in place of an in place of an instruction, creating fixed length bundles. In a second embodiment, nops are not used. Variable length bundles are produced by using an additional bit(s) per instruction to mark the end of the bundles.
摘要:
In one embodiment, the present invention includes a multicore processor having first and second cores to independently execute instructions, the first core visible to an operating system (OS) and the second core transparent to the OS and heterogeneous from the first core. A task controller, which may be included in or coupled to the multicore processor, can cause dynamic migration of a first process scheduled by the OS to the first core to the second core transparently to the OS. Other embodiments are described and claimed.
摘要:
An apparatus and method are described for coupling a front end core to an accelerator component (e.g., such as a graphics accelerator). For example, an apparatus is described comprising: an accelerator comprising one or more execution units (EUs) to execute a specified set of instructions; and a front end core comprising a translation lookaside buffer (TLB) communicatively coupled to the accelerator and providing memory access services to the accelerator, the memory access services including performing TLB lookup operations to map virtual to physical addresses on behalf of the accelerator and in response to the accelerator requiring access to a system memory.
摘要:
A computer system may support one or more techniques to allow dynamic pinning of the memory pages accessed by a non-CPU device, such as a graphics processing unit (GPU). The non-CPU may support virtual to physical address mapping and may thus be aware of the memory pages, which may not be pinned but may be accessed by the non-CPU. The non-CPU may notify or send such information to a run-time component such as a device driver associated with the CPU. The device driver may, dynamically, perform pinning of such memory pages, which may be accessed by the non-CPU. The device driver may even unpin the memory pages, which may be no longer accessed by the non-CPU. Such an approach may allow the memory pages, which may be no longer accessed by the non-CPU to be available for allocation to the other CPUs and/or non-CPUs.
摘要:
Some implementations disclosed herein provide techniques and arrangements for an specialized logic engine that includes translation lookaside buffer to support multiple threads executing on multiple cores. The translation lookaside buffer enables the specialized logic engine to directly access a virtual address of a thread executing on one of the plurality of processing cores. For example, an acceleration compute engine may receive one or more instructions from a thread executed by a processing core. The acceleration compute engine may retrieve, based on an address space identifier associated with the one or more instructions, a physical address associated with the one or more instructions from the translation lookaside buffer to execute the one or more instructions using the physical address.
摘要:
Apparatus and methods to add an extended first bit size data item of a first source operand specified by an instruction to a second source operand specified by the instruction. The first source operand and the second source operand are a second bit size, the second bit size being greater than the first bit size. The result of adding the extended data item of the first source operand to the second source operand can be stored.