Abstract:
Implementing a fair share of resources among one or more scheduling peers. Resource allocations are received for a plurality of scheduling peers. For each scheduling peer, a usage percentage difference is determined between its respective usage percentage and its configured share ratio. For a first competing peer that is served more than a second competing peer, the resource allocation is adjusted such that resources from the first competing peer are allocated to the second competing peer based, at least in part, on a time decay factor function that gives less weight to the usage percentage difference as the age of the usage percentage difference increases.
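A minimal sketch of the fair-share adjustment described above. The peer record layout, the exponential form of the decay, and the half-life parameter are assumptions for illustration; the abstract only requires that older usage percentage differences receive less weight.

```python
import math
import time

def decayed_difference(usage_pct, share_ratio, observed_at, now=None, half_life=60.0):
    """Weight the usage-vs-share difference by an (assumed) exponential time decay."""
    now = time.time() if now is None else now
    age = max(0.0, now - observed_at)
    decay = math.exp(-math.log(2) * age / half_life)  # older observations count for less
    return (usage_pct - share_ratio) * decay

def rebalance(first_peer, second_peer, transferable):
    """Move resources from the over-served peer toward the under-served peer."""
    diff_first = decayed_difference(**first_peer)
    diff_second = decayed_difference(**second_peer)
    if diff_first > diff_second:
        # Allocate an amount bounded by the decay-weighted gap between the peers.
        return min(transferable, diff_first - diff_second)
    return 0.0
```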
Abstract:
A request arrival rate is obtained at a given computing node in a computing network comprising a plurality of distributed computing nodes. A topology of the computing network is determined at the given computing node so as to identify neighboring computing nodes with respect to the given computing node. A probability is computed at the given computing node based on the obtained request arrival rate and the determined network topology. The computed probability is used to select a decision from a set of decision candidates in response to a request received at the given computing node in a given time slot. The selected decision is the decision with the top average reward across the given computing node and the neighboring computing nodes, determined based on information shared by the neighboring computing nodes with the given computing node.
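A minimal sketch of the probability-driven selection. The mapping from arrival rate and neighbor count to a probability is an invented placeholder, as is the exploration/exploitation split; the abstract fixes only that a probability computed from these inputs governs the choice of the decision with the top average reward across the node and its neighbors.

```python
import random

def decision_probability(arrival_rate, num_neighbors, capacity=100.0):
    # Hypothetical mapping: heavier load and more neighbors raise the chance
    # of using reward information shared by neighbors.
    return min(1.0, (arrival_rate / capacity) * (1 + num_neighbors) / (2 + num_neighbors))

def select_decision(local_rewards, neighbor_rewards, arrival_rate, num_neighbors):
    """local_rewards: dict decision -> avg reward; neighbor_rewards: list of such dicts."""
    p = decision_probability(arrival_rate, num_neighbors)
    if random.random() < p:
        # Pick the decision with the top average reward across node + neighbors.
        combined = {}
        for rewards in [local_rewards, *neighbor_rewards]:
            for decision, reward in rewards.items():
                combined.setdefault(decision, []).append(reward)
        return max(combined, key=lambda d: sum(combined[d]) / len(combined[d]))
    # Otherwise fall back to the locally best decision.
    return max(local_rewards, key=local_rewards.get)
```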
Abstract:
Various embodiments manage dynamic memory allocation data. In one embodiment, a set of memory allocation metadata is extracted from a memory heap space. Process dependent information and process independent information are identified from the set of memory allocation metadata based on the set of memory allocation metadata being extracted. The process dependent information and the process independent information at least identify a set of virtual memory addresses available in the memory heap space and a set of virtual memory addresses allocated to a process associated with the memory heap space. A set of allocation data associated with the memory heap space is stored in a persistent storage based on the process dependent information and the process independent information having been identified. The set of allocation data includes the process independent information and a starting address associated with the memory heap space.
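A minimal sketch of persisting heap allocation data as described above. The metadata field names and the pickle-based storage are illustrative assumptions; the abstract specifies only that the stored allocation data contains the process independent information and the heap's starting address.

```python
import pickle
from dataclasses import dataclass

@dataclass
class HeapMetadata:
    start_address: int
    free_ranges: list        # process independent: virtual addresses still available
    allocated_ranges: list   # process dependent: addresses held by the owning process

def persist_allocation_data(meta: HeapMetadata, path: str) -> None:
    # Store only the process independent information plus the heap's
    # starting address, as in the abstract.
    allocation_data = {
        "start_address": meta.start_address,
        "free_ranges": meta.free_ranges,
    }
    with open(path, "wb") as fh:
        pickle.dump(allocation_data, fh)
```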
Abstract:
Methods and arrangements for assembling tasks in a progressive queue. At least one job is received, each job comprising a dependee set of tasks and a depender set of at least one task. The dependee tasks are assembled in a progressive queue for execution and are then executed. Other variants and embodiments are broadly contemplated herein.
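A minimal sketch of the progressive queue, assuming a job is represented as two task lists and that tasks are plain callables; the names Job, dependees, and dependers are illustrative, not taken from the source.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Job:
    dependees: list   # tasks that other tasks depend on
    dependers: list   # tasks that wait on the dependees

def assemble_progressive_queue(jobs):
    queue = deque()
    for job in jobs:
        # Dependee tasks enter the progressive queue so that dependers
        # only run once their prerequisites have finished.
        queue.extend(job.dependees)
    return queue

def run_dependees(jobs):
    queue = assemble_progressive_queue(jobs)
    while queue:
        task = queue.popleft()
        task()  # execute the dependee task (assumed callable)
```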
Abstract:
A method for selecting a consensus protocol comprises separating a consensus protocol into one or more communication steps, wherein the consensus protocol is useable to substantially maintain data consistency between nodes in a distributed computing system, and wherein a communication step comprises a message transfer, attributable to the consensus protocol, in the distributed computing system, and computing an estimated protocol-level delay based on one or more attributes associated with the separated communication steps of the consensus protocol.
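A minimal sketch of the protocol-level delay estimate. Modeling each separated communication step by a message count and a per-message latency is an assumption for illustration; the abstract says only that the estimate is computed from attributes of the separated steps.

```python
from dataclasses import dataclass

@dataclass
class CommunicationStep:
    messages: int             # message transfers attributable to this step
    per_message_delay: float  # estimated latency per message, in seconds

def estimate_protocol_delay(steps):
    # Sum the delay contributed by each separated communication step.
    return sum(step.messages * step.per_message_delay for step in steps)

# Example: a hypothetical two-round protocol with a broadcast and a reply round.
steps = [CommunicationStep(messages=4, per_message_delay=0.002),
         CommunicationStep(messages=4, per_message_delay=0.002)]
print(estimate_protocol_delay(steps))  # 0.016
```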
Abstract:
A method, apparatus, and computer program product for configuring a computer cluster. Job information identifying a data processing job to be performed is received by a processor unit. The data processing job to be performed comprises a plurality of stages. Cluster information identifying a candidate computer cluster is also received by the processor unit. The processor unit identifies stage performance models for modeled stages that are similar to the plurality of stages. The processor unit predicts stage performance times for performing the plurality of stages on the candidate computer cluster using the stage performance models and combines the predicted stage performance times for the plurality of stages to determine a predicted job performance time. The predicted job performance time may be used to configure the computer cluster.
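A minimal sketch of combining per-stage predictions into a job prediction. Each stage performance model is assumed here to be a callable mapping a cluster description to a predicted stage time, and matching by a "type" field and a deadline-based configuration rule are illustrative assumptions.

```python
def predict_job_time(stages, stage_models, cluster):
    """stages: list of stage descriptors; stage_models: dict stage type -> model."""
    predicted = []
    for stage in stages:
        model = stage_models[stage["type"]]   # model for a similar, already-modeled stage
        predicted.append(model(cluster))      # predicted stage performance time
    return sum(predicted)                     # combined predicted job performance time

def configure_cluster(stages, stage_models, candidate_clusters, deadline):
    # Pick the cheapest candidate whose predicted job time meets the deadline.
    feasible = [c for c in candidate_clusters
                if predict_job_time(stages, stage_models, c) <= deadline]
    return min(feasible, key=lambda c: c["cost"]) if feasible else None
```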
Abstract:
Identifying a resource bottleneck in multi-stage workflow processing may include identifying dependencies between logical stages and physical resources in a computing system to determine which logical stage involves which set of resources; for each of the identified dependencies, determining a functional relationship between the usage level of a physical resource and the concurrency level of a logical stage; estimating consumption of the physical resources by each of the logical stages based on the functional relationship determined for each of the logical stages; and performing predictive modeling based on the estimated consumption to determine the concurrency level at which each of the logical stages will become a bottleneck.
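A minimal sketch of the predictive step, assuming a linear usage-versus-concurrency relationship fit from observations and a fixed resource capacity; the abstract requires only some functional relationship and a predictive model over it.

```python
import numpy as np

def fit_usage_model(concurrency_levels, usage_levels):
    # Least-squares line: usage ~= slope * concurrency + intercept (assumed linear form).
    slope, intercept = np.polyfit(concurrency_levels, usage_levels, 1)
    return slope, intercept

def bottleneck_concurrency(slope, intercept, capacity=100.0):
    # Concurrency level at which predicted resource usage reaches capacity.
    if slope <= 0:
        return float("inf")
    return (capacity - intercept) / slope

# Example with illustrative measurements of one stage against one resource.
slope, intercept = fit_usage_model([1, 2, 4, 8], [12.0, 22.0, 41.0, 80.0])
print(bottleneck_concurrency(slope, intercept))  # concurrency at which the stage becomes a bottleneck
```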
Abstract:
Methods and arrangements for task scheduling. A plurality of jobs is received, each job comprising at least a map phase, a copy/shuffle phase and a reduce phase. For each job, a map phase execution time and a copy/shuffle phase execution time are determined. Each job is classified into at least one group based on at least one of: the determined map phase execution time and the determined copy/shuffle phase execution time. The plurality of jobs are executed via processor sharing, and the executing includes determining a similarity measure between jobs based on current job execution progress. Other variants and embodiments are broadly contemplated herein.
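A minimal sketch of the classification and similarity steps. The threshold-based grouping and the progress-distance similarity are illustrative assumptions; the abstract states only that jobs are grouped by map and copy/shuffle execution times and compared by current execution progress.

```python
def classify(job, map_threshold=60.0, shuffle_threshold=60.0):
    """Group a job by whether its map and copy/shuffle phases are long or short."""
    map_class = "long_map" if job["map_time"] > map_threshold else "short_map"
    shuffle_class = ("long_shuffle" if job["shuffle_time"] > shuffle_threshold
                     else "short_shuffle")
    return (map_class, shuffle_class)

def similarity(job_a, job_b):
    # Jobs at similar execution progress are treated as more alike
    # (1.0 = identical progress, 0.0 = maximally different).
    return 1.0 - abs(job_a["progress"] - job_b["progress"])
```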
Abstract:
Methods and arrangements for yielding resources in data processing. At least one job is received, each job comprising a dependee set of tasks and a depender set of at least one task, and at least one of the dependee set of tasks is executed. At least one resource of the at least one of the dependee set of tasks is yielded upon detection of resource underutilization in at least one other location. Other variants and embodiments are broadly contemplated herein.
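A minimal sketch of the yielding step, assuming utilization is reported as a fraction per location and that half of the held resources are released; both the threshold and the released share are illustrative assumptions.

```python
def yield_resources(task, utilization_by_location, threshold=0.5):
    """Release part of the task's resources when another location is underutilized."""
    underutilized = [loc for loc, util in utilization_by_location.items()
                     if util < threshold]
    if underutilized:
        released = task["resources"] // 2   # yield a portion of currently held resources
        task["resources"] -= released
        return {"released": released, "target": underutilized[0]}
    return None
```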