摘要:
Techniques for determining feasibility of a set of one or more operator partitioning constraints are provided. The techniques include receiving one or more sets of operator partitioning constraints, wherein each set of one or more constraints define one or more desired conditions for grouping together of operators into partitions and placing partitions on hosts, wherein each operator is embodied as software that performs a particular function, processing each set of one or more operator partitioning constraints to determine feasibility of each set of one or more operator partitioning constraints, creating and outputting one or more candidate partitions and one or more host placements for each set of feasible partitioning constraints, and creating and outputting a certificate of infeasibility for each set of infeasible partitioning constraints, wherein the certificate of infeasibility outlines one or more reasons for infeasibility.
摘要:
An apparatus and method for making fractional assignments of processing elements to processing nodes for stream-based applications in a distributed computer system includes determining an amount of processing power to give to each processing element. Based on a list of acceptable processing nodes, a determination of fractions of which processing nodes will work on each processing element is made. To update allocations of the amount of processing power and the fractions, the process is repeated.
摘要:
A method, system, and computer program product for implementing stream processing are provided. The system includes an application framework and applications containing dataflow graphs managed by the application framework running on a first network. The system also includes at least one circuit switch in the first network having a configuration that is controlled by the application framework, a plurality of processing nodes interconnected by the first network over one of wireline and wireless links, and a second network for providing at least one of control and additional data transfer over the first network. The application framework reconfigures circuit switches in response to monitoring aspects of the applications and the first network.
摘要:
A method of choosing jobs to run in a stream based distributed computer system includes determining jobs to be run in a distributed stream-oriented system by deciding a priority threshold above which jobs will be accepted, below which jobs will be rejected. Overall importance is maximized relative to the priority threshold based on importance values assigned to all jobs. System constraints are applied to ensure jobs meet set criteria.
摘要:
An apparatus and method for scheduling stream-based applications in a distributed computer system includes a scheduler configured to schedule work using three temporal levels. Each temporal level includes a method. A macro method is configured to schedule jobs that will run, in a highest temporal level, in accordance with a plurality of operation constraints to optimize importance of work. A micro method is configured to fractionally allocate, at a medium temporal level, processing elements to processing nodes in the system to react to changing importance of the work. A nano method is configured to revise, at a lowest temporal level, fractional allocations on a continual basis.
摘要:
A method, apparatus, and computer program product for scheduling stream-based applications in a distributed computer system with configurable networks are provided. The method includes choosing, at a highest temporal level, jobs that will run, an optimal template alternative for the jobs that will run, network topology, and candidate processing nodes for processing elements of the optimal template alternative for each running job to maximize importance of work performed by the system. The method further includes making, at a medium temporal level, fractional allocations and re-allocations of the candidate processing elements to the processing nodes in the system to react to changing importance of the work. The method also includes revising, at a lowest temporal level, the fractional allocations and re-allocations on a continual basis to react to burstiness of the work, and to differences between projected and real progress of the work.
摘要:
A method for scheduling a data processing job includes receiving the data processing job formed of a plurality of computing units, combining the plurality of computing units into a plurality of sets of tasks, each set including tasks of about equal estimated size, and different sets having different sized tasks, and assigning the tasks to a plurality of processors using a dynamic longest processing time (DLPT) scheme.
摘要:
Techniques for scheduling multiple flows in a multi-platform cluster environment are provided. The techniques include partitioning a cluster into one or more platform containers associated with one or more platforms in the cluster, scheduling one or more flows in each of the one or more platform containers, wherein the one or more flows are created as one or more flow containers, scheduling one or more individual jobs into the one or more flow containers to create a moldable schedule of one or more jobs, flows and platforms, and automatically converting the moldable schedule into a malleable schedule.
摘要:
An apparatus and method for making fractional assignments of processing elements to processing nodes for stream-based applications in a distributed computer system includes determining an amount of processing power to give to each processing element. Based on a list of acceptable processing nodes, a determination of fractions of which processing nodes will work on each processing element is made. To update allocations of the amount of processing power and the fractions, the process is repeated.
摘要:
Techniques for scheduling a plurality of jobs sharing input are provided. The techniques include partitioning one or more input datasets into multiple subcomponents, analyzing a plurality of jobs to determine which of the plurality of jobs require scanning of one or more common subcomponents of the one or more input datasets, and scheduling a plurality of jobs that require scanning of one or more common subcomponents of the one or more input datasets, facilitating a single scanning of the one or more common subcomponents to be used as input by each of the plurality of jobs.