Abstract:
Described herein are methods, systems, apparatuses and products for cost-aware replication of intermediate data in dataflows. An aspect provides receiving at least one measurement indicative of a reliability cost associated with executing a dataflow; computing a degree of replication of at least one intermediate data set in the dataflow based on the reliability cost; and communicating at least one replication factor to at least one component of a system responsible for replication of the at least one intermediate data set in the dataflow; wherein the at least one intermediate data set is replicated according to the replication factor. Other embodiments are disclosed.
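As a rough illustration of how a replication factor might be computed from a reliability cost, the sketch below picks the number of replicas that minimizes an expected total cost. The cost model (a per-replica storage cost plus the recomputation cost weighted by the probability that every replica is lost) and all names are assumptions made for illustration; the abstract does not specify them.

```python
def replication_factor(failure_prob, recompute_cost, storage_cost, max_replicas=5):
    """Pick the replica count with the lowest expected cost.

    Assumed (hypothetical) model, not taken from the abstract:
      expected_cost(r) = r * storage_cost + failure_prob**r * recompute_cost
    where failure_prob**r is the chance that all r replicas are lost,
    assuming independent failures.
    """
    best_r, best_cost = 1, None
    for r in range(1, max_replicas + 1):
        expected = r * storage_cost + (failure_prob ** r) * recompute_cost
        if best_cost is None or expected < best_cost:
            best_r, best_cost = r, expected
    return best_r


# Expensive-to-recompute data justifies more replicas than cheap data.
costly = replication_factor(failure_prob=0.1, recompute_cost=1000.0, storage_cost=5.0)
cheap = replication_factor(failure_prob=0.1, recompute_cost=1.0, storage_cost=5.0)
```

In a full system the returned value would then be communicated to whichever component performs the replication; that step is omitted here.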
Abstract:
A technique is provided for creating virtual units in a computing environment. A processor receives a virtual system definition that is used to create the virtual units for a virtual system. The processor also receives relationship constraints between the virtual units in the virtual system. The relationship constraints include a communication link requirement between the virtual units and/or a location requirement between the virtual units. The processor deploys the virtual units in the virtual system according to the relationship constraints.
Abstract:
A method, system and computer program product for distributing intermediate data of a multistage computer application to a plurality of computers. In one embodiment, a data manager calculates a data usage demand of generated intermediate data. A computer manager calculates a computer usage demand for each computer, which is the sum of the data usage demands of all intermediate data stored at that computer. A scheduler selects a target computer from the plurality of computers for storage of the generated intermediate data such that the variance of the computer usage demand across the plurality of computers is minimized.
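The scheduler's selection rule can be sketched as a greedy choice: try placing the new intermediate data on each computer in turn and keep the placement that yields the lowest variance of usage demand. The function name, the dictionary representation, and the use of population variance are assumptions for illustration, not details given in the abstract.

```python
from statistics import pvariance


def select_target(usage, demand):
    """Return the computer that minimizes usage variance after placement.

    usage:  hypothetical mapping of computer name -> current usage demand
    demand: usage demand of the newly generated intermediate data
    """
    best_name, best_var = None, None
    for name in sorted(usage):  # sorted for deterministic tie-breaking
        trial = dict(usage)
        trial[name] += demand
        var = pvariance(list(trial.values()))
        if best_var is None or var < best_var:
            best_name, best_var = name, var
    return best_name


current = {"a": 10.0, "b": 4.0, "c": 7.0}
target = select_target(current, demand=3.0)
```

With these example numbers, placing the data on the least loaded computer gives the most even distribution.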
Abstract:
A method and structure for processing an application program on a computer. In a memory of the computer executing the application, an in-memory cache structure is provided that normally serves as temporary storage for data produced in the processing. An in-memory storage outside the in-memory cache structure is also provided in the memory; under a predetermined condition it temporarily stores data, bypassing the in-memory cache structure. A sensor detects the amount of usage of the in-memory cache structure used to store data during the processing. When the detected amount of usage exceeds a predetermined threshold, the processing is controlled so that the data produced in the processing is stored in the in-memory storage rather than in the in-memory cache structure.
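A minimal sketch of the bypass behavior follows, assuming cache usage is measured as the fraction of a fixed entry capacity that is occupied. The class name, the two dictionaries standing in for the cache structure and the bypass storage, and the usage metric are all hypothetical.

```python
class BypassingStore:
    """Toy model: route writes to a bypass area once cache usage is too high."""

    def __init__(self, cache_capacity, threshold):
        self.cache = {}    # stands in for the in-memory cache structure
        self.bypass = {}   # stands in for the in-memory storage outside the cache
        self.cache_capacity = cache_capacity
        self.threshold = threshold  # assumed to be a fraction of capacity

    def cache_usage(self):
        # "Sensor": fraction of the cache capacity currently occupied.
        return len(self.cache) / self.cache_capacity

    def put(self, key, value):
        # Bypass the cache only when usage exceeds the threshold.
        if self.cache_usage() > self.threshold:
            self.bypass[key] = value
            return "bypass"
        self.cache[key] = value
        return "cache"


store = BypassingStore(cache_capacity=4, threshold=0.5)
routes = [store.put(f"k{i}", i) for i in range(4)]
```

Here the first three writes land in the cache and the fourth is diverted, because usage has reached 0.75 by then.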
Abstract:
A method, system and computer program product for storing data in memory. An example system includes at least one multistage application configured to generate intermediate data in a generating stage of the application and consume the intermediate data in a subsequent consuming stage of the application. A runtime profiler is configured to monitor the application's execution and dynamically allocate memory to the application from an in-memory data grid.