摘要:
The Heartbeat Failure Detector of the present invention, is a new, simple method and device that has several identical modules, and each module is attached to a processor in the system. Roughly speaking, the module of a processor x maintains a heartbeat counter for every other process. For each process y, the counter of y at x (called the heartbeat of y at x) periodically increases while y is alive and in the same network partition, and the counter stops increasing after y crashes or becomes partitioned away. Using such a device, x can solve the communication problem above by resending m only if the heartbeat of y at x increases. Note that if y is crashed (or partitioned away from x), its heartbeat at x stops almost immediately, and so x will also immediately stop sending copies of m to y.
摘要:
A remote procedure call chain is provided that replaces multiple consecutive remote procedure calls to multiple servers from a client by allowing a client to specify multiple functions to be performed consecutively at multiple servers in a single remote procedure call chain. The remote procedure call chain is executed by a sequence of multiple servers. Each server executes a service function and a chaining function of the remote procedure call chain. The chaining function uses the state of the remote procedure call chain in the sequence of servers to determine the next server to receive the remote procedure call chain, and the service function to be executed by that server. After the last service function is performed, the last server in the sequence of servers sends the results of the executed service functions to the client that originated the remote procedure call chain.
摘要:
Embodiments include methods, apparatus, and systems for snapshots in distributed storage systems. One method of software execution includes using a version tree to determine what data blocks are shared between various storage nodes in the version tree in order to create a clone or a snapshot of a storage volume in a distributed storage system that uses quorum-based replication.
摘要:
A cloud statistics server generates statistics for a cloud service based on an identified data item and an identified operation. The cloud service may include various computing nodes and storage nodes. The cloud statistics may include expected completion times for the identified operation and the identified data item with respect to each of the computing nodes. A computing node may be selected to execute the identified operation based on the expected completion times. The generated statistics may be generated by the cloud statistics server using a network topology associated with the data item that is based on the latencies or expected transfer times between the various storage nodes and computing nodes, and a replication strategy used by the cloud service. The topology may be implemented as a directed graph with edge weights corresponding to expected transfer times between each node.
摘要:
A transactional shared memory system has a plurality of discrete application nodes; a plurality of discrete memory nodes; a network interconnecting the application nodes and the memory nodes, and a controller for directing transactions in a distributed system utilizing the shared memory. The memory nodes collectively provide an address space of shared memory that is provided to the application nodes via the network. The controller has instructions to transfer a batched transaction instruction set from an application node to at least one memory node. This instruction set includes one or more write, compare and read instruction subsets, and/or combinations thereof. At least one subset has a valid non null memory node identifier and memory address range. The memory node identifier may be indicated by the memory address range. The controller controls the memory node responsive to receipt of the batched transaction instruction set, to safeguard the associated memory address range during execution of the transaction instruction set. The batched transaction instruction set is collectively executed atomically. A notification instruction set may also be used to establish a notification, triggered upon a subsequent write event upon at least a portion of a specified address range.
摘要:
A cloud statistics server generates statistics for a cloud service based on an identified data item and an identified operation. The cloud service may include various computing nodes and storage nodes. The cloud statistics may include expected completion times for the identified operation and the identified data item with respect to each of the computing nodes. A computing node may be selected to execute the identified operation based on the expected completion times. The generated statistics may be generated by the cloud statistics server using a network topology associated with the data item that is based on the latencies or expected transfer times between the various storage nodes and computing nodes, and a replication strategy used by the cloud service. The topology may be implemented as a directed graph with edge weights corresponding to expected transfer times between each node.
摘要:
According to at least one embodiment, a method comprises identifying at least one causal path that includes a node of a distributed computing environment that is of interest. The method further comprises analyzing the identified at least one causal path to determine at least one time interval when the node is active in such causal path, and correlating consumption of a resource by the node to the node's activity in the at least one causal path.
摘要:
To perform information tracing, at least one signature to be traced is received. It is detected that a first process causes data to be provided to at least one of a second process and a file. It is determined whether the data contains the at least one signature. In response to determining that the data contains the at least one signature, a log is updated. The log contains information identifying at least one of processes and files that are part of a flow of the at least one signature.
摘要:
In accordance with an embodiment of the invention, a method of writing and reading redundant data is provided. Data is written by storing a copy of the data along with a timestamp and a signature at each of a set of storage devices. The data is read by retrieving the copy of the data, the timestamp and the signature from each of a plurality of the set of data storage devices. One of the copies of the data is selected to be provided to a requestor of the data. Each of the storage devices of the set is requested to certify the selected copy of the data. Provided that a proof of certification of the selected copy of the data is valid, the storage devices of the set are instructed to store the selected copy of the data along with a new timestamp.
摘要:
A transactional shared memory system has a plurality of discrete application nodes; a plurality of discrete memory nodes; a network interconnecting the application nodes and the memory nodes, and a controller for directing transactions in a distributed system utilizing the shared memory. The memory nodes collectively provide an address space of shared memory that is provided to the application nodes via the network. The controller has instructions to transfer a batched transaction instruction set from an application node to at least one memory node. This instruction set includes one or more write, compare and read instruction subsets, and/or combinations thereof. At least one subset has a valid non null memory node identifier and memory address range. The memory node identifier may be indicated by the memory address range. The controller controls the memory node responsive to receipt of the batched transaction instruction set, to safeguard the associated memory address range during execution of the transaction instruction set. The batched transaction instruction set is collectively executed atomically. A notification instruction set may also be used to establish a notification, triggered upon a subsequent write event upon at least a portion of a specified address range.