Abstract:
An apparatus is described having multiple cores, each core having: a) a CPU; b) an accelerator; and, c) a controller and a plurality of order buffers coupled between the CPU and the accelerator. Each of the order buffers is dedicated to a different one of the CPU's threads. Each one of the order buffers is to hold one or more requests issued to the accelerator from its corresponding thread. The controller is to control issuance of the order buffers' respective requests to the accelerator.
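The per-thread order buffers described above can be sketched as a toy software model. This is only an illustration: the class name, the method names, and the round-robin issue policy are assumptions, since the abstract does not say how the controller chooses among buffers.

```python
from collections import deque


class OrderBufferController:
    """Toy model: one FIFO order buffer per CPU thread, one shared accelerator."""

    def __init__(self, num_threads):
        # One dedicated buffer per thread, as the abstract describes.
        self.buffers = [deque() for _ in range(num_threads)]
        # Round-robin pointer. The arbitration policy is an assumption.
        self._next = 0

    def submit(self, thread_id, request):
        # A thread's requests queue only in that thread's own buffer.
        self.buffers[thread_id].append(request)

    def issue(self):
        """Return (thread_id, request) for the next issue, or None if idle."""
        n = len(self.buffers)
        for offset in range(n):
            tid = (self._next + offset) % n
            if self.buffers[tid]:
                self._next = (tid + 1) % n
                return tid, self.buffers[tid].popleft()
        return None


ctrl = OrderBufferController(num_threads=2)
ctrl.submit(0, "a0")
ctrl.submit(0, "a1")
ctrl.submit(1, "b0")

issued = []
while (item := ctrl.issue()) is not None:
    issued.append(item)
```

Requests from the same thread leave in the order they were submitted, while the controller interleaves threads so that neither one monopolizes the accelerator.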
Abstract:
A computer system may support one or more techniques to allow dynamic pinning of the memory pages accessed by a non-CPU device (e.g., a graphics processing unit, or GPU). The non-CPU device may support virtual-to-physical address mapping and may thus be aware of memory pages that are not pinned but that it may access. The non-CPU device may send this information to a run-time component such as a device driver associated with the CPU. In one embodiment, the device driver may dynamically pin the memory pages that the non-CPU device may access. The device driver may also unpin memory pages that the non-CPU device no longer accesses. Such an approach may make those pages available for allocation to other CPUs and/or non-CPU devices.
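The pin and unpin bookkeeping can be illustrated with a small model. This is a sketch under stated assumptions: `PinningDriver` and `on_device_report` are hypothetical names, pages are represented by their addresses, and a real driver would call operating-system pinning services instead of updating a set.

```python
class PinningDriver:
    """Toy driver-side bookkeeping for dynamically pinned pages."""

    def __init__(self):
        self.pinned = set()

    def on_device_report(self, pages_in_use):
        """Handle a device report of the pages it may access.

        Returns (pages_to_pin, pages_to_unpin) for this report.
        """
        in_use = set(pages_in_use)
        to_pin = in_use - self.pinned      # newly accessed, not yet pinned
        to_unpin = self.pinned - in_use    # pinned, but no longer accessed
        self.pinned = in_use
        return sorted(to_pin), sorted(to_unpin)


driver = PinningDriver()
first = driver.on_device_report([0x1000, 0x2000])
second = driver.on_device_report([0x2000, 0x3000])
```

After the second report, page 0x1000 is unpinned and becomes available for allocation elsewhere, while 0x2000 stays pinned without being touched.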
Abstract:
Some implementations disclosed herein provide techniques and arrangements for a synchronous software interface for a specialized logic engine. The synchronous software interface may receive, from a first core of a plurality of cores, a control block including a transaction for execution by the specialized logic engine. The synchronous software interface may send the control block to the specialized logic engine and wait to receive a confirmation from the specialized logic engine that the transaction was successfully executed.
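A minimal sketch of the send-then-wait pattern follows, using a worker thread to stand in for the specialized logic engine. The class name, the dictionary layout of the control block, and the timeout are assumptions made for illustration. They are not taken from the abstract.

```python
import queue
import threading


class SynchronousEngineInterface:
    """Toy synchronous interface: submit a control block, block until confirmed."""

    def __init__(self, engine_fn):
        self._requests = queue.Queue()
        self._engine_fn = engine_fn  # stand-in for the logic engine
        worker = threading.Thread(target=self._engine_loop, daemon=True)
        worker.start()

    def _engine_loop(self):
        while True:
            control_block, done = self._requests.get()
            # The engine executes the transaction and posts a confirmation.
            done.put(self._engine_fn(control_block))

    def execute(self, control_block, timeout=5.0):
        done = queue.Queue(maxsize=1)
        self._requests.put((control_block, done))
        # The caller blocks here until the confirmation arrives.
        return done.get(timeout=timeout)


iface = SynchronousEngineInterface(
    lambda cb: {"ok": True, "result": sum(cb["operands"])}
)
reply = iface.execute({"core": 0, "operands": [1, 2, 3]})
```

The calling core sees a simple blocking call, even though the engine runs independently of it.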
Abstract:
An asymmetric multiprocessor system (ASMP) may comprise computational cores implementing different instruction set architectures and having different power requirements. Program code for execution on the ASMP is analyzed, and a determination is made as to whether to allow the program code, or a code segment thereof, to execute natively on a first core or to apply binary translation to the code and execute the translated code on a second core that consumes less power than the first core during execution.
Abstract:
Page faults arising in a graphics processing unit may be handled by an operating system running on the central processing unit. In some embodiments, this means that unpinned memory can be used for the graphics processing unit. Using unpinned memory in the graphics processing unit may expand the capabilities of the graphics processing unit in some cases.
Abstract:
In a front-end system for a processor, a recording scheme for instruction segments stores the instructions in reverse program order. Instruction segments may be traces, extended blocks or basic blocks. By storing the instructions in reverse program order, the instruction segment is easily extended to include additional instructions. The instruction segments may be extended without having to re-index tag arrays or pointers that associate instruction segments with other instruction segments.
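One way to see why reverse-order storage makes extension cheap is the toy model below. It assumes that a segment is tagged by its last instruction in program order and that extension adds earlier instructions. Both are assumptions for illustration, as are the names `ReverseSegment` and `extend_backward` and the sample instructions.

```python
class ReverseSegment:
    """Toy instruction segment stored in reverse program order."""

    def __init__(self, terminal_addr, terminal_instr):
        # Assumption: the tag is tied to the terminal instruction, in slot 0.
        self.tag = terminal_addr
        self.slots = [terminal_instr]

    def extend_backward(self, earlier_instr):
        # An earlier instruction goes into the next free slot.
        # Slot 0 and the tag are untouched, so nothing is re-indexed.
        self.slots.append(earlier_instr)

    def program_order(self):
        # Read the slots back to front to recover program order.
        return list(reversed(self.slots))


seg = ReverseSegment(0x40C, "jmp L1")
tag_before = seg.tag
seg.extend_backward("add r1, r2")
seg.extend_backward("load r2, [r3]")
```

Because new instructions are appended after the existing slots, anything that refers to the segment by its tag remains valid after the extension.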
Abstract:
Methods and apparatus relating to power management for multiple processor cores are described. In one embodiment, one or more techniques may be utilized locally (e.g., on a per core basis) to manage power consumption in a processor. In another embodiment, power may be distributed among different power planes of a processor based on energy-based considerations. Other embodiments are also disclosed and claimed.
Abstract:
A method and apparatus for changing the configuration of a multi-core processor is disclosed. In one embodiment, a throttle module (or throttle logic) may determine the amount of parallelism present in the currently executing program, and change the execution of the threads of that program on the various cores. If the amount of parallelism is high, then the processor may be configured to run a larger number of threads on cores configured to consume less power. If the amount of parallelism is low, then the processor may be configured to run a smaller number of threads on cores configured for greater scalar performance.
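The throttle decision can be sketched as a simple threshold rule. The parallelism metric (a fraction between 0 and 1), the threshold value, and the thread counts are all hypothetical, because the abstract does not specify how parallelism is measured or where the cut-off lies.

```python
def choose_configuration(parallelism, threshold=0.5,
                         many_threads=8, few_threads=2):
    """Pick a core configuration from a measured parallelism value.

    `parallelism` is assumed to be a fraction in [0, 1]. The threshold
    and thread counts are illustrative defaults only.
    """
    if not 0.0 <= parallelism <= 1.0:
        raise ValueError("parallelism must be between 0 and 1")
    if parallelism >= threshold:
        # High parallelism: more threads on lower-power cores.
        return {"core_type": "low_power", "threads": many_threads}
    # Low parallelism: fewer threads on cores tuned for scalar performance.
    return {"core_type": "high_scalar_performance", "threads": few_threads}


parallel_phase = choose_configuration(0.9)
serial_phase = choose_configuration(0.1)
```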
Abstract:
A system and corresponding method use a PAUSE instruction as a low power hint in a single-threaded or multithreaded environment using “processor slow mode.” One embodiment actually lowers the frequency of the processor clock. Another embodiment virtually lowers the frequency of the processor clock by gating M clock cycles out of every N clock cycles. When all threads have issued a PAUSE instruction, the processor enters slow mode and remains there for a period of time, after which it returns to normal mode. Alternatively, an event, such as an interrupt or an exception, can cause the processor to return to normal mode from slow mode.
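The arithmetic behind gating M of every N clock cycles, together with the entry condition for slow mode, can be sketched as follows. The function names and the example values are assumptions. The abstract gives no concrete M, N, or clock frequency.

```python
def effective_frequency(base_hz, m_gated, n_window):
    """Effective clock rate when m_gated of every n_window cycles are gated."""
    if n_window <= 0 or not 0 <= m_gated <= n_window:
        raise ValueError("require n_window > 0 and 0 <= m_gated <= n_window")
    # Only (N - M) of every N cycles reach the core.
    return base_hz * (n_window - m_gated) / n_window


def should_enter_slow_mode(paused_by_thread):
    """Slow mode is entered only when every thread has issued PAUSE."""
    return bool(paused_by_thread) and all(paused_by_thread)


slow_hz = effective_frequency(2_000_000_000, m_gated=3, n_window=4)
both_paused = should_enter_slow_mode([True, True])
one_running = should_enter_slow_mode([True, False])
```

With these example values, gating 3 of every 4 cycles makes a 2 GHz clock behave like a 500 MHz clock without changing the clock source itself.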
Abstract:
A power aware front-end unit for a processor may include a UOP cache that disables other circuitry within the front-end unit. In an embodiment, a front-end unit may disable instruction synchronization circuitry, instruction decode circuitry and, optionally, instruction fetch circuitry while instruction look-ups are underway in both a block cache and an instruction cache. If the instruction look-up indicates a miss, the disabled circuitry thereafter may be enabled.