摘要:
Embodiments of the present invention provide a class of computer architectures generally referred to as lightweight multi-threaded architectures (LIMA). Other embodiments may be described and claimed.
摘要:
Described herein are techniques for generating invocation stubs for a data parallel programming model so that a data parallel program written in a statically-compiled high-level programming language may be more declarative, reusable, and portable than traditional approaches. With some of the described techniques, invocation stubs are generated by a compiler and those stubs bridge a logical arrangement of data parallel computations to the actual physical arrangement of a target data parallel hardware for that data parallel computation.
摘要:
A method and system that prepares a task for being swapped out from processor utilization that is executing on a computer with multiple processors that each support multiple streams. The task has one or more teams of threads, where each team represents threads executing on a single processor. The task designates, for each stream that is executing a thread, one stream as a team master stream and one stream as a task master stream. For each team master stream, the task notifies the operating system that the team is ready to be swapped out when each other thread of the team has saved its state and has quit its stream. Finally, for the task master stream, the task notifies the operating system that the task is ready to be swapped when it has saved its state and each other team has notified that it is ready to be swapped out.
摘要:
A method system for optimizing a computer program. In one embodiment, the system identifies depths of blocks of a computer program and identifies the availability of expressions of the computer program. The system then modifies the computer program when he identified availability of the expression and the identified depth of a block indicate that the expression can be moved to the block. The depth of the block may represent the number of dominator blocks of that block. The availability of the expression may represent the depth of a block to which the expression may be moved. In one embodiment, when the identified availability of the expression is less than the identified depth of the block, the expression can be moved to the block.
摘要:
A computer-based method and system for allocating target registers to branch operations and for determining the location of target definitions for the branch operations within a computer program. The target register allocation system of the present invention allocates a target register to be specified by each branch operation. The target register is to contain the address of the target that is loaded by the target definition. The target register allocation system determines a location in the computer program for a target definition such that whenever the branch operation is executed, the allocated target register contains the address of the target of the branch. The target allocation system may determine the location to be in a dominator block of the branch operation. The target allocation system may also determine the location a target definition so that the address of the target that is loaded by the target definition can be used by multiple branch operations. The target allocation system may also determine the location of the target definition based on execution frequency of locations. The target allocation system may, when a branch operation is in a loop, determine the location of the target definition to be outside the loop. The target allocation system may, when the program is a function, give preference to a non-callee save register in allocating a target register. The target allocation system may give preference to a callee save register of a function whose invocation is located in between the determined location and the location of the branch operation on a path of execution when allocating a target register.
摘要:
A method to exchange data in a shared memory system includes the use of a buffer in communication with a producer processor and a consumer processor. The cache data is temporarily stored in the buffer. The method includes for the consumer and the producer to indicate intent to acquire ownership of the buffer. In response to the indication of intent, the producer, consumer, buffer are prepared for the access. If the consumer intends to acquire the buffer, the producer places the cache data into the buffer. If the producer intends to acquire the buffer, the consumer removes the cache data from the buffer. The access to the buffer, however, is delayed until the producer, consumer, and the buffer are prepared.
摘要:
Various techniques for manipulating data using access states of memory, access control fields of pointers and operations, and exception raising and exception trapping in a multithreaded computer system. In particular, the techniques include synchronization support for a thread blocked in a word, demand evaluation of values, parallel access of multiple threads to a list, synchronized and unsynchronized access to a data buffer, use of forwarding to avoid checking for an end of a buffer, use of sentinel word to detect access past a data structure, concurrent access to a word of memory using different synchronization access modes, and use of trapping to detect access to restricted memory.
摘要:
Various techniques for manipulating data using access states of memory, access control fields of pointers and operations, and exception raising and exception trapping in a multithreaded computer system. In particular, the techniques include synchronization support for a thread blocked in a word, demand evaluation of values, parallel access of multiple threads to a list, synchronized and unsynchronized access to a data buffer, use of forwarding to avoid checking for an end of a buffer, use of sentinel word to detect access past a data structure, concurrent access to a word of memory using different synchronization access modes, and use of trapping to detect access to restricted memory.
摘要:
A method and system in a multithreaded processor for processing events without interrupt notifications. In one aspect of the present invention, an operating system creates a thread to execute on a stream of the processor. During execution of the thread, the thread executes a loop that determines whether an event has occurred and, in response to determining whether an event has occurred, assigns a different thread to process the event so that multiple events can be processed in parallel and so that interrupts are not needed to signal that the event has occurred. Another aspect of the present invention provides a method and system for processing asynchronously occurring events without interrupt notifications. To achieve this processing, a first thread is executed to generate a notification that the event has occurred upon receipt of the asynchronously occurring event. A second thread is also executed that loops determining whether a notification has been generated and, in response to determining that a notification has been generated, performing the processing necessary for the event.
摘要:
Various techniques for manipulating data using access states of memory, access control fields of pointers and operations, and exception raising and exception trapping in a multithreaded computer system. In particular, the techniques include synchronization support for a thread blocked in a word, demand evaluation of values, parallel access of multiple threads to a list, synchronized and unsynchronized access to a data buffer, use of forwarding to avoid checking for an end of a buffer, use of sentinel word to detect access past a data structure, concurrent access to a word of memory using different synchronization access modes, and use of trapping to detect access to restricted memory.