摘要:
An apparatus and method are described for coupling a front end core to an accelerator component (e.g., such as a graphics accelerator). For example, an apparatus is described comprising: an accelerator comprising one or more execution units (EUs) to execute a specified set of instructions; and a front end core comprising a translation lookaside buffer (TLB) communicatively coupled to the accelerator and providing memory access services to the accelerator, the memory access services including performing TLB lookup operations to map virtual to physical addresses on behalf of the accelerator and in response to the accelerator requiring access to a system memory.
摘要:
A computer system may support one or more techniques to allow dynamic pinning of the memory pages accessed by a non-CPU device, such as a graphics processing unit (GPU). The non-CPU may support virtual to physical address mapping and may thus be aware of the memory pages, which may not be pinned but may be accessed by the non-CPU. The non-CPU may notify or send such information to a run-time component such as a device driver associated with the CPU. The device driver may, dynamically, perform pinning of such memory pages, which may be accessed by the non-CPU. The device driver may even unpin the memory pages, which may be no longer accessed by the non-CPU. Such an approach may allow the memory pages, which may be no longer accessed by the non-CPU to be available for allocation to the other CPUs and/or non-CPUs.
摘要:
Some implementations disclosed herein provide techniques and arrangements for an specialized logic engine that includes translation lookaside buffer to support multiple threads executing on multiple cores. The translation lookaside buffer enables the specialized logic engine to directly access a virtual address of a thread executing on one of the plurality of processing cores. For example, an acceleration compute engine may receive one or more instructions from a thread executed by a processing core. The acceleration compute engine may retrieve, based on an address space identifier associated with the one or more instructions, a physical address associated with the one or more instructions from the translation lookaside buffer to execute the one or more instructions using the physical address.
摘要:
A processor may support a two-dimensional (2-D) gather instruction and a 2-D cache. The processor may perform the 2-D gather instruction to access one or more sub-blocks of data from a two-dimensional (2-D) image stored in a memory coupled to the processor. The two-dimensional (2-D) cache may store the sub-blocks of data in a multiple cache lines. Further, the 2-D cache may support access of more than one cache lines while preserving a two-dimensional structure of the 2-D image.
摘要:
An embodiment of the present invention provides an apparatus, comprising a network adapter configured for wireless communication using more than one technology using distributed management and wherein the network adapter is configured to share a plurality of shared hardware components by automatically turning all other comms to OFF when one comm is turned to ON.
摘要:
A processor of an aspect includes an instruction pipeline to process a multiple memory address instruction that indicates multiple memory addresses. The processor also includes multiple page fault aggregation logic coupled with the instruction pipeline. The multiple page fault aggregation logic is to aggregate page fault information for multiple page faults that are each associated with one of the multiple memory addresses of the instruction. The multiple page fault aggregation logic is to provide the aggregated page fault information to a page fault communication interface. Other processors, apparatus, methods, and systems are also disclosed.
摘要:
Methods, apparatus and systems for virtualization of a native instruction set are disclosed. Embodiments include a processor core executing the native instructions and a second core, or alternatively only the second processor core consuming less power while executing a second instruction set that excludes portions of the native instruction set. The second core's decoder detects invalid opcodes of the second instruction set. A microcode layer disassembler determines if opcodes should be translated. A translation runtime environment identifies an executable region containing an invalid opcode, other invalid opcodes and interjacent valid opcodes of the second instruction set. An analysis unit determines an initial machine state prior to execution of the invalid opcode. A partial translation of the executable region that includes encapsulations of the translations of invalid opcodes and state recoveries of the machine states is generated and saved to a translation cache memory.
摘要:
A processor saves micro-architectural contexts to increase the efficiency of code execution and power management. A save instruction is executed to store a micro-architectural state and an architectural state of a processor in a common buffer of a memory upon a context switch that suspends the execution of a process. The micro-architectural state contains performance data resulting from the execution of the process. A restore instruction is executed to retrieve the micro-architectural state and the architectural state from the common buffer upon a resumed execution of the process. Power management hardware then uses the micro-architectural state as an intermediate starting point for the resumed execution.
摘要:
A method for detection of underground anomalies including in a system of distributed antennas (10), which are leaky transmission lines, disposed in 16 boreholes (12) formed in a ground, a transmitter (14) being connected to one of the antennas (10), and a receiver (16) being connected to another of the antennas (10), injecting an electromagnetic pulse into one of the antennas (10), wherein the pulse gradually leaks out, and wherein if a speed of propagation in the line in which the pulse is injected is faster than a speed of propagation in the ground, a shock wave is transmitted through the ground, called a transmitted signal, and received as a received signal at another of the antennas (10), and wherein an underground anomaly diffracts the shock wave, resulting in a detectable disturbance in the received signal, and locating the anomaly as a function of a time delay of the disturbance relative to the transmitted signal.
摘要:
Page faults arising in a graphics processing unit may be handled by an operating system running on the central processing unit. In some embodiments, this means that unpinned memory can be used for the graphics processing unit. Using unpinned memory in the graphics processing unit may expand the capabilities of the graphics processing unit in some cases.