Abstract:
A processor includes at least one execution unit that executes instructions, at least one register file, coupled to the at least one execution unit, that buffers operands for access by the at least one execution unit, and an instruction sequencing unit that fetches instructions for execution by the execution unit. The processor further includes an operand data structure and an address generation accelerator. The operand data structure specifies a first relationship between addresses of sequential accesses within a first address region and a second relationship between addresses of sequential accesses within a second address region. The address generation accelerator computes a first address of a first memory access in the first address region by reference to the first relationship and a second address of a second memory access in the second address region by reference to the second relationship.
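A minimal C sketch of the idea in the abstract above, assuming a fixed per-region stride as the "relationship" between sequential accesses; the structure, field names, and stride values are illustrative and not taken from the patent.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of the operand data structure: each entry maps an
 * address region to the relationship (here, a fixed stride) between
 * consecutive accesses inside that region. */
typedef struct {
    uintptr_t base;    /* first address of the region              */
    uintptr_t limit;   /* one past the last address of the region  */
    ptrdiff_t stride;  /* distance between sequential accesses     */
} region_entry;

/* Sketch of the address generation accelerator: given the last address
 * accessed in a region, produce the address of the next access by
 * applying that region's relationship. */
static uintptr_t next_address(const region_entry *table, size_t n,
                              uintptr_t last_addr)
{
    for (size_t i = 0; i < n; i++) {
        if (last_addr >= table[i].base && last_addr < table[i].limit)
            return last_addr + table[i].stride;
    }
    return last_addr;  /* no matching region: leave the address unchanged */
}

int main(void)
{
    /* Two regions with different access relationships. */
    region_entry table[] = {
        { 0x1000, 0x2000,  8 },   /* region 1: 8-byte stride  */
        { 0x8000, 0x9000, 64 },   /* region 2: 64-byte stride */
    };
    printf("%#lx\n", (unsigned long)next_address(table, 2, 0x1010));
    printf("%#lx\n", (unsigned long)next_address(table, 2, 0x8040));
    return 0;
}
```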
Abstract:
A data processing system has a processor, a memory, and an instruction set architecture (ISA) that includes: (1) an asynchronous memory mover (AMM) store (ST) instruction that initiates an asynchronous memory move operation that moves data from a first memory location having a first real address to a second memory location having a second real address by: (a) first performing a move of the data in virtual address space utilizing a source effective address and a destination effective address; and (b) when the move is completed, completing a physical move of the data to the second memory location, independent of the processor. The ISA further provides (2) an AMM terminate ST instruction for stopping an ongoing AMM operation before completion of the AMM operation, and (3) a LD CMP instruction for checking a status of an AMM operation.
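The three ISA additions can be illustrated with a small software model. The function names (amm_st, amm_terminate_st, ld_cmp) and status values are hypothetical stand-ins for the instructions; on real hardware each would be a single instruction and the physical move would proceed independently of the processor.

```c
#include <string.h>
#include <stdio.h>

/* Hypothetical status values surfaced by the LD CMP check. */
enum amm_status { AMM_IN_PROGRESS, AMM_DONE, AMM_PARTIAL, AMM_FAILED };

/* Minimal software model of the three ISA additions; names are
 * illustrative only. The move is modeled synchronously here. */
static enum amm_status amm_state = AMM_DONE;

static void amm_st(void *dst_ea, const void *src_ea, size_t bytes)
{
    amm_state = AMM_IN_PROGRESS;      /* move initiated in virtual space      */
    memcpy(dst_ea, src_ea, bytes);    /* physical move, modeled synchronously */
    amm_state = AMM_DONE;
}

static void amm_terminate_st(void) { amm_state = AMM_FAILED; }  /* cancel  */
static enum amm_status ld_cmp(void) { return amm_state; }       /* status  */

int main(void)
{
    char src[16] = "hello, AMM";
    char dst[16] = {0};

    amm_st(dst, src, sizeof src);          /* start the asynchronous move    */
    if (ld_cmp() == AMM_IN_PROGRESS)       /* still running? abandon it      */
        amm_terminate_st();
    if (ld_cmp() == AMM_DONE)              /* completion checked via LD CMP  */
        printf("moved: %s\n", dst);
    return 0;
}
```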
Abstract:
A data processing system has an asynchronous memory mover, which includes multiple sets of registers for storing addressing and control parameters utilized to generate one or more asynchronous memory move (AMM) operations. The memory mover detects receipt of a first set of parameters in a first set of registers from the processor. The processor forwards the parameters after the processor initiates a data move in virtual address space, utilizing a source effective address and a destination effective address. The memory mover responds to receiving the first set of parameters by generating and launching a first AMM operation. When the memory mover receives a second set of parameters in a second set of registers before the first AMM operation completes, the memory mover generates and launches a second AMM operation concurrently with the first AMM operation if no address conflicts exist.
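A sketch of the gating decision for concurrent AMM operations, assuming the address-conflict check reduces to read/write range-overlap tests between the two parameter sets; the field names and the exact hazard rules are assumptions, not the patented logic.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>
#include <stdio.h>

/* One set of memory-mover registers: the addressing parameters the
 * processor hands off after performing the virtual-space move.
 * Field names are illustrative. */
typedef struct {
    uintptr_t src_ea;   /* source effective address      */
    uintptr_t dst_ea;   /* destination effective address */
    size_t    length;   /* bytes to move                 */
} amm_params;

/* True if the half-open ranges [a, a+alen) and [b, b+blen) overlap. */
static bool ranges_overlap(uintptr_t a, size_t alen, uintptr_t b, size_t blen)
{
    return a < b + blen && b < a + alen;
}

/* Gating decision: a second AMM operation may launch concurrently with the
 * first only if it creates no read-after-write, write-after-write, or
 * write-after-read conflict with the first operation's addresses. */
static bool can_launch_concurrently(const amm_params *p1, const amm_params *p2)
{
    return !ranges_overlap(p2->src_ea, p2->length, p1->dst_ea, p1->length) &&
           !ranges_overlap(p2->dst_ea, p2->length, p1->dst_ea, p1->length) &&
           !ranges_overlap(p2->dst_ea, p2->length, p1->src_ea, p1->length);
}

int main(void)
{
    amm_params first  = { 0x1000, 0x2000, 256 };
    amm_params second = { 0x3000, 0x4000, 256 };   /* disjoint addresses */
    printf("concurrent launch allowed: %d\n",
           (int)can_launch_concurrently(&first, &second));
    return 0;
}
```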
Abstract:
A method performed in a data processing system initiates an asynchronous memory move (AMM) operation, whereby a processor performs a move of data in virtual address space from a first effective address to a second effective address and forwards parameters of the AMM operation to asynchronous memory mover logic for completion of the physical movement of data from a first memory location to a second memory location. The processor executes a second operation, which checks a status of the completion of the data move and returns a notification indicating the status. The status is one of: data move in progress; data move totally done; data move partially done; data move cannot be performed; and occurrence of a translation look-aside buffer invalidate entry (TLBIE) operation. The processor initiates one or more actions in response to the notification received.
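A small illustration of the notification handling, modeling the five listed statuses as an enum; the enumerator names and the follow-up actions are illustrative only.

```c
#include <stdio.h>

/* The notification statuses listed in the abstract, modeled as an enum.
 * Names are illustrative. */
enum amm_notify {
    AMM_IN_PROGRESS,       /* data move in progress                       */
    AMM_TOTALLY_DONE,      /* data move totally done                      */
    AMM_PARTIALLY_DONE,    /* data move partially done                    */
    AMM_CANNOT_PERFORM,    /* data move cannot be performed               */
    AMM_TLBIE_OCCURRED     /* a TLB invalidate entry operation occurred   */
};

/* Sketch of the follow-up actions the processor might take in response
 * to the notification returned by the status-check operation. */
static void handle_notification(enum amm_notify n)
{
    switch (n) {
    case AMM_IN_PROGRESS:    /* keep executing; poll again later */        break;
    case AMM_TOTALLY_DONE:   puts("move complete");                        break;
    case AMM_PARTIALLY_DONE: puts("restart the remainder of the move");    break;
    case AMM_CANNOT_PERFORM: puts("fall back to a synchronous copy");      break;
    case AMM_TLBIE_OCCURRED: puts("translation changed; retry the move");  break;
    }
}

int main(void)
{
    handle_notification(AMM_PARTIALLY_DONE);
    return 0;
}
```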
Abstract:
A data processing system includes a set of architected registers within which the processor places state and other information to communicate with the asynchronous memory mover in order to initiate and control an AMM operation. The asynchronous memory mover performs an asynchronous memory move (AMM) operation in response to receiving a set of parameters within the architected registers, which parameters are associated with an AMM store instruction executed by the processor to initiate a move of data in virtual space before placing the information in the architected registers. The architected registers are processor architected registers, defined on a per thread basis by a compiler, or memory mapped architected registers allocated for communicating with the asynchronous memory mover during a bind and subsequent execution of an application. The architected registers are also utilized to store state information to enable a restore to a point before execution of the AMM operation.
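A sketch of the hand-off through memory-mapped architected registers, assuming a simple register layout; the struct fields, control bit, and the use of an ordinary static variable in place of a real memory-mapped block are all hypothetical.

```c
#include <stdint.h>

/* Illustrative layout of the architected register set the processor fills
 * in to initiate and control an AMM operation. On real hardware these
 * could be processor-architected (per-thread) or memory-mapped registers. */
typedef struct {
    volatile uint64_t src_ea;      /* source effective address            */
    volatile uint64_t dst_ea;      /* destination effective address       */
    volatile uint64_t length;      /* bytes to move                       */
    volatile uint64_t control;     /* start/stop and status bits          */
    volatile uint64_t saved_state; /* state for restore before the AMM    */
} amm_regs;

#define AMM_CTRL_START 0x1u

/* Sketch of the hand-off: after the virtual-space move, the processor
 * writes the parameters into the architected registers and sets the
 * start bit; the memory mover picks them up asynchronously. */
static void start_amm(amm_regs *regs, uint64_t src, uint64_t dst, uint64_t len)
{
    regs->src_ea  = src;
    regs->dst_ea  = dst;
    regs->length  = len;
    regs->control = AMM_CTRL_START;
}

int main(void)
{
    static amm_regs regs;   /* stands in for the memory-mapped register block */
    start_amm(&regs, 0x1000, 0x2000, 4096);
    return regs.control == AMM_CTRL_START ? 0 : 1;
}
```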
Abstract:
A system, method, and computer program product for semi-synchronously copying data from a first portion of memory to a second portion of memory are disclosed. The method comprises receiving, in a processor, a call for a semi-synchronous memory copy operation. The semi-synchronous memory copy operation preserves temporal persistence of validity for a virtual source address corresponding to a source location in a memory and a virtual target address corresponding to a target location in the memory by setting a flag bit. The call includes at least the virtual source address, the virtual target address, and an indicator identifying a number of bytes to be copied. The memory copy operation is placed in a queue for execution by a memory controller. The queue is coupled to the memory controller. Execution of at least one subsequent instruction continues as the subsequent instruction becomes available from an instruction pipeline.
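A minimal model of the call and its queueing, assuming a software-visible descriptor queue; the descriptor fields, queue depth, and function name are illustrative, and draining of the queue by the memory controller is not modeled.

```c
#include <stddef.h>
#include <stdbool.h>

/* Descriptor for one semi-synchronous copy request; field names are
 * illustrative. The flag bit marks the source and target virtual
 * addresses as needing their translations kept valid until the copy
 * retires (temporal persistence of validity). */
typedef struct {
    const void *virt_src;     /* virtual source address            */
    void       *virt_dst;     /* virtual target address            */
    size_t      nbytes;       /* number of bytes to be copied      */
    bool        pin_flag;     /* temporal persistence of validity  */
} copy_request;

#define QUEUE_DEPTH 8

/* Simple model of the queue the memory controller drains. */
static copy_request queue[QUEUE_DEPTH];
static int          queue_tail;

/* Sketch of the call: set the flag, enqueue the request, and return so the
 * processor can keep executing subsequent instructions from the pipeline. */
static bool memcpy_semi_sync(void *dst, const void *src, size_t nbytes)
{
    if (queue_tail == QUEUE_DEPTH)
        return false;                        /* queue full: caller retries   */
    queue[queue_tail++] = (copy_request){
        .virt_src = src, .virt_dst = dst, .nbytes = nbytes, .pin_flag = true
    };
    return true;                             /* copy proceeds in background  */
}

int main(void)
{
    static char a[64] = "source data", b[64];
    return memcpy_semi_sync(b, a, sizeof a) ? 0 : 1;
}
```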
Abstract:
A method within a data processing system by which a processor executes an asynchronous memory move (AMM) store (ST) instruction to complete a corresponding AMM operation in parallel with an ongoing (not yet completed), previously issued barrier operation. The processor receives the AMM ST instruction after executing the barrier operation (or SYNC instruction) and before the completion of the barrier operation or SYNC on the system fabric. The processor continues executing the AMM ST instruction, which performs a move in virtual address space and then triggers the generation of the AMM operation. The AMM operation proceeds while the barrier operation continues, independent of the processor. The processor stops further execution of all other memory access requests, excluding AMM ST instructions, that are received after the barrier operation but before completion of the barrier operation.
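A tiny sketch of the issue rule this abstract describes, assuming the decision reduces to a predicate over the request kind and the pending-barrier state; the request kinds are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative request kinds; names are assumptions. */
enum req_kind { REQ_LOAD, REQ_STORE, REQ_AMM_ST };

/* Issue decision: while a previously issued barrier (SYNC) is still
 * completing on the system fabric, only AMM ST instructions received
 * after the barrier may issue; all other memory access requests are
 * held until the barrier completes. */
static bool may_issue(enum req_kind kind, bool barrier_pending)
{
    return !barrier_pending || kind == REQ_AMM_ST;
}

int main(void)
{
    bool barrier_pending = true;   /* SYNC issued but not yet complete */
    printf("AMM ST issues: %d\n", (int)may_issue(REQ_AMM_ST, barrier_pending));
    printf("store issues:  %d\n", (int)may_issue(REQ_STORE,  barrier_pending));
    return 0;
}
```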
Abstract:
A method within a data processing system by which a processor handles conflicts that occur while an asynchronous memory mover performs an asynchronous memory move (AMM) operation. The asynchronous memory mover performs the AMM operation, by which the actual data is moved from a source to a destination memory location, independent of the processor. The memory mover sets a flag bit to indicate that the asynchronous memory mover is currently performing an AMM operation at the memory. When the processor receives a memory access operation, the processor checks the value of the flag bit before issuing the new memory access operation, and checks the associated address of the AMM operation to determine possible address conflicts. The processor then evaluates and responds to address conflicts to prevent corruption of data during an AMM operation.
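A sketch of the processor-side check, assuming the flag bit and the AMM source/destination ranges are visible to the processor; the variable names and the overlap rule are assumptions.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative model of the state the memory mover exposes while an AMM
 * operation is in flight: a flag bit plus the ranges being moved. */
static bool      amm_active;               /* flag bit set by the mover */
static uintptr_t amm_src, amm_dst;
static size_t    amm_len;

static bool overlaps(uintptr_t addr, size_t len, uintptr_t base, size_t blen)
{
    return addr < base + blen && base < addr + len;
}

/* Processor-side check: before issuing a new memory access, look at the
 * flag bit; if an AMM operation is under way, compare the new access
 * against the addresses of the move and hold it back on a conflict. */
static bool safe_to_issue(uintptr_t addr, size_t len)
{
    if (!amm_active)
        return true;                        /* no AMM in progress */
    return !overlaps(addr, len, amm_src, amm_len) &&
           !overlaps(addr, len, amm_dst, amm_len);
}

int main(void)
{
    amm_active = true; amm_src = 0x1000; amm_dst = 0x5000; amm_len = 512;
    printf("store to 0x5100 may issue: %d\n", (int)safe_to_issue(0x5100, 8)); /* conflict */
    printf("store to 0x9000 may issue: %d\n", (int)safe_to_issue(0x9000, 8)); /* no conflict */
    return 0;
}
```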
Abstract:
A technique for performing cache injection includes monitoring an instruction stream for a specific instruction sequence. Addresses on a bus are then monitored, at a cache, in response to detecting the specific instruction sequence a determined number of times. Ownership of input/output data on the bus is then acquired by the cache when an address on the bus (that is associated with the input/output data) corresponds to an address of a data block stored in the cache.
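A software model of the arming-and-claiming behavior, assuming a single trigger opcode and a fixed sighting count; the opcode value, the count, and the simple address comparison are placeholders for the hardware monitoring logic.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative parameters: the opcode pattern to watch for and how many
 * sightings arm the bus snooper. */
#define TRIGGER_OPCODE 0xA5u
#define TRIGGER_COUNT  3

static unsigned sightings;        /* times the trigger sequence was seen    */
static bool     snoop_armed;      /* cache is monitoring bus addresses      */

/* Sketch of the instruction-stream monitor. */
static void observe_instruction(uint8_t opcode)
{
    if (opcode == TRIGGER_OPCODE && ++sightings >= TRIGGER_COUNT)
        snoop_armed = true;
}

/* Bus-side check: once armed, an I/O write whose address matches a block
 * already held in the cache is claimed by the cache, so the input/output
 * data is injected instead of going to memory and being re-fetched. */
static bool claim_io_data(uintptr_t bus_addr, uintptr_t cached_block_addr)
{
    return snoop_armed && bus_addr == cached_block_addr;
}

int main(void)
{
    observe_instruction(TRIGGER_OPCODE);
    observe_instruction(TRIGGER_OPCODE);
    observe_instruction(TRIGGER_OPCODE);      /* third sighting arms the snoop */
    printf("inject: %d\n", (int)claim_io_data(0x2000, 0x2000));
    return 0;
}
```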
Abstract:
A processor includes a first address translation engine, a second address translation engine, and a prefetch engine. The first address translation engine is configured to determine a first memory address of a pointer associated with a data prefetch instruction. The prefetch engine is coupled to the first translation engine and is configured to fetch content, included in a first data block (e.g., a first cache line) of a memory, at the first memory address. The second address translation engine is coupled to the prefetch engine and is configured to determine a second memory address based on the content of the memory at the first memory address. The prefetch engine is also configured to fetch (e.g., from the memory or another memory) a second data block (e.g., a second cache line) that includes data at the second memory address.
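A software analogue of the two-step fetch, assuming the pointer chase is a plain load of a pointer followed by a load through it; real hardware would use the two translation engines and fetch whole cache lines, which this sketch does not model.

```c
#include <stdio.h>

/* Step 1: resolve the address of the pointer named by the data prefetch
 * instruction and load its content (the pointer value).
 * Step 2: treat that content as the second memory address and touch the
 * data there, modeling the fetch of the second data block. */
static int prefetch_through_pointer(int *const *pointer_slot)
{
    int *second_addr = *pointer_slot;   /* content at the first memory address     */
    return *second_addr;                /* data in the block at the second address */
}

int main(void)
{
    int data = 42;        /* the block the second fetch would bring in         */
    int *ptr = &data;     /* the pointer associated with the prefetch instruction */
    printf("prefetched value: %d\n", prefetch_through_pointer(&ptr));
    return 0;
}
```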