摘要:
Techniques for improved transactional memory management are described. In one embodiment, for example, an apparatus may comprise a processor element, an execution component for execution by the processor element to concurrently execute a software transaction and a hardware transaction according to a transactional memory process, a tracking component for execution by the processor element to activate a global lock to indicate that the software transaction is undergoing execution, and a finalization component for execution by the processor element to commit the software transaction and deactivate the global lock when execution of the software transaction completes, the finalization component to abort the hardware transaction when the global lock is active when execution of the hardware transaction completes. Other embodiments are described and claimed.
摘要:
A method and apparatus to facilitate shared pointers in a heterogeneous platform. In one embodiment of the invention, the heterogeneous or non-homogeneous platform includes, but is not limited to, a central processing core or unit, a graphics processing core or unit, a digital signal processor, an interface module, and any other form of processing cores. The heterogeneous platform has logic to facilitate sharing of pointers to a location of a memory shared by the CPU and the GPU. By sharing pointers in the heterogeneous platform, the data or information sharing between different cores in the heterogeneous platform can be simplified.
摘要:
A method and apparatus for providing optimized strong atomicity operations for non-transactional writes is herein described. Locks are acquired upon initial non-transactional writes to memory locations. The locks are maintained until an event is detected resulting in the release of the locks. As a result, in the intermediary period between acquiring and releasing the locks, any subsequent writes to memory locations that are locked are accelerated through non-execution of lock acquire operations.
摘要:
A method and apparatus for providing optimized strong atomicity operations for non-transactional writes is herein described. Locks are acquired upon initial non-transactional writes to memory locations. The locks are maintained until an event is detected resulting in the release of the locks. As a result, in the intermediary period between acquiring and releasing the locks, any subsequent writes to memory locations that are locked are accelerated through non-execution of lock acquire operations.
摘要:
A method and apparatus for optimizing quiescence in a transactional memory system is herein described. Non-ordering transactions, such as read-only transactions, transactions that do not access non-transactional data, and write-buffering hardware transactions, are identified. Quiescence in weak atomicity software transactional memory (STM) systems is optimized through selective application of quiescence. As a result, transactions may be decoupled from dependency on quiescing/waiting on previous non-ordering transaction to increase parallelization and reduce inefficiency based on serialization of transactions.
摘要:
Thread synchronization with lock inflation methods and apparatus for managed run-time environments are disclosed. An example method disclosed herein comprises determining a locking operation to perform on a lock corresponding to the object, performing an optimistically balanced synchronization of the lock if the locking operation is not unbalanced, and modifying a lock shape of the lock if the locking operation is unbalanced.
摘要:
A method and apparatus for optimizing quiescence in a transactional memory system is herein described. Non-ordering transactions, such as read-only transactions, transactions that do not access non-transactional data, and write-buffering hardware transactions, are identified. Quiescence in weak atomicity software transactional memory (STM) systems is optimized through selective application of quiescence. As a result, transactions may be decoupled from dependency on quiescing/waiting on previous non-ordering transaction to increase parallelization and reduce inefficiency based on serialization of transactions.
摘要:
An apparatus to facilitate compute optimization is disclosed. The apparatus includes sorting logic to sort processing threads into thread groups based on bit depth of floating point thread operations.
摘要:
A machine-learning decision system includes an online decision system and an offline decision system. The online decision system produces a first time slice-specific decision output corresponding to a first time slice based on one or more situational inputs received in the first time slice. The offline decision system produces a second Lime slice-specific decision output corresponding to the first time slice based on one or more situational inputs received in the first time slice and in a plurality of subsequent time slices occurring after the first time slice. The system further includes an online training system that conducts negative-reinforcement training of the online decision system in response to a nonconvergence between the first and the second time slice-specific decision outputs.
摘要:
A method and apparatus for optimizing weak atomicity overhead is herein described. A state table is maintained either during static or dynamic compilation of code to track data non-transactionally accessed. Within execution of a transaction, such as at transactional memory accesses or within a commit function, it is determined if data associated with memory access within the transaction is to be conflictingly accessed outside the transaction from the state table. If the data is not accessed outside the transaction, then the transaction potentially commits without weak atomicity safety mechanisms, such as privatization. Furthermore, even if data is accessed outside the transaction, optimized safety mechanisms may be performed to ensure isolation between the potentially conflicting accesses, while eliding the mechanisms for data not accessed outside the transaction.