摘要:
One embodiment of the present invention provides a system that avoids read-after-write (RAW) hazards while speculatively executing instructions on a processor. The system starts in a normal execution mode, wherein the system issues instructions for execution in program order. Upon encountering a stall condition during execution of an instruction, the system generates a checkpoint, and executes the instruction and subsequent instructions in a speculative-execution mode. The system also maintains dependency information for each register indicating whether or not a value in the register depends on an unresolved data-dependency. The system uses this dependency information to avoid RAW hazards during the speculative-execution mode.
摘要:
Techniques have been developed whereby likely pointer values are identified at runtime and contents of corresponding storage location can be prefetched into a cache hierarchy to reduce effective memory access latencies. In some realizations, one or more writable stores are defined in a processor architecture to delimit a portion or portions of a memory address space.
摘要:
One embodiment of the present invention provides a system that facilitates deferring execution of instructions with unresolved data dependencies as they are issued for execution in program order. During a normal execution mode, the system issues instructions for execution in program order. Upon encountering an unresolved data dependency during execution of an instruction, the system generates a checkpoint that can subsequently be used to return execution of the program to the point of the instruction. Next, the system executes subsequent instructions in an execute-ahead mode, wherein instructions that cannot be executed because of an unresolved data dependency are deferred, and wherein other non-deferred instructions are executed in program order.
摘要:
One embodiment of the present invention provides a system that prefetches from memory by using an assist processor that performs data speculation and that executes in advance of a primary processor. The system operates by executing executable code on the primary processor while simultaneously executing a reduced version of the executable code on the assist processor. This allows the assist processor to generate the same pattern of memory references that the primary processor generates in advance of when the primary processor generates the memory references. While executing the reduced version of the executable code, the system predicts a data value returned by a long latency operation within the executable code. The system subsequently uses the predicted data value to continue executing the reduced version of the executable code without having to wait for the long latency operation to complete.
摘要:
One embodiment of the present invention provides a system that facilitates multi-dimensional space and time dimensional execution of computer programs. The system includes a head thread that executes program instructions and a series of speculative threads that execute program instructions in advance of the head thread, wherein each speculative thread executes program instructions in advance of preceding speculative threads in the series. The head thread accesses a primary version of the memory element and the series of speculative threads access space-time dimensioned versions of the memory element. The system starts by receiving a memory access to the memory element. If the memory access is a write operation by the head thread or a speculative thread, the system determines if a version of the memory element associated with the head thread or speculative thread exists. If not, the system creates a version of the memory element for the thread. Next, the system performs the write operation to the version of the memory element. After performing the write operation, the system checks status information associated with the memory element to determine if the memory element has been read by a following speculative thread in the series of speculative threads. If so, the system causes the following speculative thread and any successive speculative threads in the series to roll back so that the following speculative thread and any successive speculative threads in the series can read a result of the write operation. If not, the system performs the write operation to all successive space-time dimensioned versions of the memory element.
摘要:
One embodiment provides for a system that uses a time stamp in order to more efficiently mark objects to keep track of accesses to fields with the objects. Upon receiving a first reference to a first field in an object, the system determines whether the first field has been accessed within a current time period. The system does so by retrieving a time stamp associated with the object. This time stamp indicates the last time any marking bit associated with any field in the object was updated. The system compares the time stamp with a current time value associated with the current time period. The system additionally retrieves a first marking bit associated with the first field and determines if the first marking bit is set. If the first marking bit is set and if the time stamp equals the current time value, the system determines that the first field has been accessed in the current time period. The system indicates that a second field in the object has been accessed in the current time period upon receiving a second reference to the second field. In response to the second reference, the system sets a second marking bit associated with the second field. The system also updates the time stamp associated with the object, if necessary, so that the time stamp contains the current time value.
摘要:
One embodiment of the present invention provides a system that supports space and time dimensional program execution by facilitating accesses to different versions of a memory element. The system supports a head thread that executes program instructions and a speculative thread that executes program instructions in advance of the head thread. The head thread accesses a primary version of the memory element, and the speculative thread accesses a space-time dimensioned version of the memory element. During a reference to the memory element by the head thread, the system accesses the primary version of the memory element. During a reference to the memory element by the speculative thread, the speculative thread accesses a pointer associated with the primary version of the memory element, and accesses a version of the memory element through the pointer. Note that the pointer points to the space-time dimensioned version of the memory element if the space-time dimensioned version of the memory element exists. In one embodiment of the present invention, the pointer points to the primary version of the memory element if the space-time dimensioned version of the memory element does not exist.
摘要:
A system is provided that facilitates space and time dimensional execution of computer programs through selective versioning of memory elements located in a system heap. The system includes a head thread that executes program instructions and a speculative thread that simultaneously executes program instructions in advance of the head thread with respect to the time dimension of sequential execution of the program. The collapsing of the time dimensions is facilitated by expanding the heap into two space-time dimensions, a primary dimension (dimension zero), in which the head thread operates, and a space-time dimension (dimension one), in which the speculative thread operates. In general, each dimension contains its own version of an object and objects created by the thread operating in the dimension. The head thread generally accesses a primary version of a memory element and the speculative thread generally accesses a corresponding space-time dimensioned version of the memory element. During a write by the head thread, the system performs the write to all dimensions of the memory element. Note that if the dimensions are collapsed at this address a single update will update all time dimensions. It also checks status information associated with the memory element to determine if the memory element has been read by the speculative thread. If so, the system causes the speculative thread to roll back so that the speculative thread can read a result of the write operation.
摘要:
One embodiment of the present invention provides a processor that supports multiple-issue execution. This processor includes a register file, which contains an array of memory cells, wherein the memory cells contain bits for architectural registers of the processor. The register file also includes multiple read ports and multiple write ports to support multiple-issue execution. During operation, if multiple read ports simultaneously read from a given register, the register file is configured to: read each bit of the given register out of the array of memory cells through a single bitline associated with the bit; and to use a driver located outside of the array of memory cells to drive the bit to the multiple read ports. In this way, each memory cell only has to drive a single bitline (instead of multiple bitlines) during a multiple-port read operation, thereby allowing memory cells to use smaller and more power-efficient drivers for read operations.
摘要:
One embodiment of the present invention provides a system that synchronizes threads on a multi-threaded processor. The system starts by executing instructions from a multi-threaded program using a first thread and a second thread. When the first thread reaches a predetermined location in the multi-threaded program, the first thread executes a Start-Transactional-Execution (STE) instruction to commence transactional execution, wherein the STE instruction specifies a location to branch to if transactional execution fails. During the subsequent transactional execution, the first thread accesses a mailbox location in memory (which is also accessible by the second thread) and then executes instructions that cause the first thread to wait. When the second thread reaches a second predetermined location in the multi-threaded program, the second thread signals the first thread by accessing the mailbox location, which causes the transactional execution of the first thread to fail, thereby causing the first thread to resume non-transactional execution from the location specified in the STE instruction. In this way, the second thread can signal to the first thread without the first thread having to poll a shared variable.