Abstract:
Adaptive data collections may include various type of data arrays, sets, bags, maps, and other data structures. A simple interface for each adaptive collection may provide access via a unified API to adaptive implementations of the collection. A single adaptive data collection may include multiple, different adaptive implementations. A system configured to implement adaptive data collections may include the ability to adaptively select between various implementations, either manually or automatically, and to map a given workload to differing hardware configurations. Additionally, hardware resource needs of different configurations may be predicted from a small number of workload measurements. Adaptive data collections may provide language interoperability, such as by leveraging runtime compilation to build adaptive data collections and to compile and optimize implementation code and user code together. Adaptive data collections may also provide language-independent such that implementation code may be written once and subsequently used from multiple programming languages.
Abstract:
Adaptive data collections may include various type of data arrays, sets, bags, maps, and other data structures. A simple interface for each adaptive collection may provide access via a unified API to adaptive implementations of the collection. A single adaptive data collection may include multiple, different adaptive implementations. A system configured to implement adaptive data collections may include the ability to adaptively select between various implementations, either manually or automatically, and to map a given workload to differing hardware configurations. Additionally, hardware resource needs of different configurations may be predicted from a small number of workload measurements. Adaptive data collections may provide language interoperability, such as by leveraging runtime compilation to build adaptive data collections and to compile and optimize implementation code and user code together. Adaptive data collections may also provide language-independent such that implementation code may be written once and subsequently used from multiple programming languages.
Abstract:
Fast modern interconnects may be exploited to control when garbage collection is performed on the nodes (e.g., virtual machines, such as JVMs) of a distributed system in which the individual processes communicate with each other and in which the heap memory is not shared. A garbage collection coordination mechanism (a coordinator implemented by a dedicated process on a single node or distributed across the nodes) may obtain or receive state information from each of the nodes and apply one of multiple supported garbage collection coordination policies to reduce the impact of garbage collection pauses, dependent on that information. For example, if the information indicates that a node is about to collect, the coordinator may trigger a collection on all of the other nodes (e.g., synchronizing collection pauses for batch-mode applications where throughput is important) or may steer requests to other nodes (e.g., for interactive applications where request latencies are important).
Abstract:
A computer system including one or more processors and persistent, word-addressable memory implements a persistent atomic multi-word compare-and-swap operation. On entry, a list of persistent memory locations of words to be updated, respective expected current values contained the persistent memory locations and respective new values to write to the persistent memory locations are provided. The operation atomically performs the process of comparing the existing contents of the persistent memory locations to the respective current values and, should they match, updating the persistent memory locations with the new values and returning a successful status. Should any of the contents of the persistent memory locations not match a respective current value, the operation returns a failed status. The operation is performed such that the system can recover from any failure or interruption by restoring the list of persistent memory locations.
Abstract:
Adaptive data collections may include various type of data arrays, sets, bags, maps, and other data structures. A simple interface for each adaptive collection may provide access via a unified API to adaptive implementations of the collection. A single adaptive data collection may include multiple, different adaptive implementations. A system configured to implement adaptive data collections may include the ability to adaptively select between various implementations, either manually or automatically, and to map a given workload to differing hardware configurations. Additionally, hardware resource needs of different configurations may be predicted from a small number of workload measurements. Adaptive data collections may provide language interoperability, such as by leveraging runtime compilation to build adaptive data collections and to compile and optimize implementation code and user code together. Adaptive data collections may also provide language-independent such that implementation code may be written once and subsequently used from multiple programming languages.
Abstract:
When performing non-sequential accesses to large data sets, hot spots may be avoided by permuting the memory locations being accesses to more evenly spread those accesses across the memory and across multiple memory channels. A permutation step may be used when accessing data, such as to improve the distribution of those memory accesses within the system. Instead of accessing one memory address, that address may be permuted so that another memory address is accessed. Non-sequential accesses to an array may be modified such that each index to the array is permuted to another index in the array. Collisions between pre- and post-translation addresses may be prevented and one-to-one mappings may be used. Permutation mechanisms may be implemented in software, hardware, or a combination of both, with or without the knowledge of the process performing the memory accesses.
Abstract:
Multi-core computers may implement a resource management layer between the operating system and resource-management-enabled parallel runtime systems. The resource management components and runtime systems may collectively implement dynamic co-scheduling of hardware contexts when executing multiple parallel applications, using a spatial scheduling policy that grants high priority to one application per hardware context and a temporal scheduling policy for re-allocating unused hardware contexts. The runtime systems may receive resources on a varying number of hardware contexts as demands of the applications change over time, and the resource management components may co-ordinate to leave one runnable software thread for each hardware context. Periodic check-in operations may be used to determine (at times convenient to the applications) when hardware contexts should be re-allocated. Over-subscription of worker threads may reduce load imbalances between applications. A co-ordination table may store per-hardware-context information about resource demands and allocations.
Abstract:
Transactional Lock Elision allows hardware transactions to execute unmodified critical sections protected by the same lock concurrently, by subscribing to the lock and verifying that it is available before committing the transaction. A “lazy subscription” optimization, which delays lock subscription, can potentially cause behavior that cannot occur when the critical sections are executed under the lock. Hardware extensions may provide mechanisms to ensure that lazy subscriptions are safe (e.g., that they result in correct behavior). Prior to executing a critical section transactionally, its lock and subscription code may be identified (e.g., by writing their locations to special registers). Prior to committing the transaction, the thread executing the critical section may verify that the correct lock was correctly subscribed to. If not, or if locations identified by the special registers have been modified, the transaction may be aborted. Nested critical sections associated with different lock types may invoke different subscription code.