Abstract:
A grid change controller within a particular grid environment detects an unintended change within that grid environment. In particular, the grid change controller monitors potential change indicators received from multiple disparate resource managers across the grid environment, where each resource manager manages a selection of resources within the grid environment. The grid change controller then determines a necessary response to the unintended change and communicates with at least one independent manager within the grid environment to resolve the unintended change, such that the grid environment maintains its performance requirements.
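The detect-then-respond flow described above can be sketched roughly as follows. This is only an illustration: the `ChangeIndicator` fields, the threshold values in `REQUIREMENTS`, and the manager and action names in `RESPONSES` are all hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeIndicator:
    manager: str   # resource manager that reported the indicator
    resource: str  # resource the indicator refers to
    metric: str    # what was measured
    value: float   # observed value


# Hypothetical performance requirements: metric -> maximum acceptable value.
REQUIREMENTS = {"latency_ms": 200.0, "error_rate": 0.05}

# Hypothetical mapping: violated metric -> (independent manager, action).
RESPONSES = {
    "latency_ms": ("workload_manager", "reroute_jobs"),
    "error_rate": ("provisioning_manager", "replace_resource"),
}


def detect_unintended_changes(indicators, requirements=REQUIREMENTS):
    """Return the indicators whose value exceeds the required maximum."""
    return [
        i for i in indicators
        if i.value > requirements.get(i.metric, float("inf"))
    ]


def plan_responses(changes, responses=RESPONSES):
    """Pair each detected change with the manager and action that resolve it."""
    return [(*responses[c.metric], c.resource) for c in changes]
```

A caller would feed indicators gathered from every resource manager into `detect_unintended_changes` and forward each planned response to the named manager.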
Abstract:
A method, system, and program for facilitating overall grid environment management by monitoring grid activity across disparate grid resources and distributing grid activity to decisional grid modules are provided. A grid workload controller within a computational grid environment monitors real-time grid activity at an application level from multiple disparate grid application environments. The grid workload controller then determines a selection of grid modules within the computational grid environment that require the real-time grid activity to make decisions about the management of the computational grid environment. The grid workload controller distributes the real-time grid activity to the selection of grid modules, wherein the selection of grid modules then make automated decisions within the grid environment to maintain performance requirements.
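One way to picture the distribution step is a small publish/subscribe router: modules declare which kinds of activity they need, and the controller forwards each activity record only to those modules. The class name matches the abstract, but the `register` and `distribute` methods and the dictionary-shaped activity record are assumptions made for this sketch.

```python
from collections import defaultdict


class GridWorkloadController:
    """Routes application-level activity to the modules that need it."""

    def __init__(self):
        # activity type -> list of module callbacks interested in it
        self._subscribers = defaultdict(list)

    def register(self, activity_type, module):
        """Record that `module` needs activity of `activity_type`."""
        self._subscribers[activity_type].append(module)

    def distribute(self, activity):
        """Send one activity record to every interested module.

        Returns the number of modules that received it.
        """
        modules = self._subscribers.get(activity["type"], [])
        for module in modules:
            module(activity)
        return len(modules)
```

Each registered module is a plain callable here, so a module's automated decision logic lives entirely inside its own callback.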
Abstract:
A grid service detects a current software environment for a grid job within a grid environment, wherein the grid environment includes multiple grid resources. The grid service searches a catalog of multiple software images to determine whether an image for the current software environment matches any software images in the catalog. Each of the software images includes an index into at least one installation image. Storage of the software images is structured in the catalog for automated efficient access to each software image by multiple resource nodes within the grid environment. If the grid service does not locate a software image for the current software environment in the catalog, the grid service captures at least one installation image for the current software environment for storage in the catalog as an additional software image.
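The lookup-or-capture behavior can be sketched as a keyed catalog. The abstract does not say how environments are matched, so hashing a canonical JSON form of the environment description is an assumption of this sketch, as are the `ImageCatalog` class and the `capture` callback.

```python
import hashlib
import json


def environment_key(env: dict) -> str:
    """Derive a stable key from a software environment description."""
    canonical = json.dumps(env, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ImageCatalog:
    """Maps environment keys to software images (hypothetical structure)."""

    def __init__(self):
        self._images = {}

    def find(self, env):
        """Return the matching software image, or None if absent."""
        return self._images.get(environment_key(env))

    def find_or_capture(self, env, capture):
        """Return (image, captured_now).

        `capture` is called only when no image matches; it returns the
        list of installation-image references to index.
        """
        key = environment_key(env)
        if key in self._images:
            return self._images[key], False
        image = {"install_images": capture(env)}
        self._images[key] = image
        return image, True
```

Because the key is computed from sorted JSON, two descriptions that list the same settings in a different order resolve to the same catalog entry.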
Abstract:
A method, system, and program for verifying resource functionality before use by a grid job submitted to a grid environment are provided. When a new resource is allocated to a particular execution environment within a grid environment managed by a grid management system, a grid verification service automatically selects and runs at least one functionality test on the new resource as controlled by the grid management system. Responsive to a result of the functionality test, the grid verification service verifies whether the result meets an expected result before enabling routing of the grid job to the new resource, such that the functionality of the new resource is automatically verified before access to the new resource is allowed, to maintain quality of service in processing grid jobs.
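A minimal sketch of the verify-before-routing gate follows. The `Router` class, the shape of a test as a `(function, expected result)` pair, and the dictionary-shaped resource are hypothetical choices for illustration only.

```python
def verify_resource(resource, tests):
    """Run every functionality test and compare against its expected result.

    `tests` is a list of (test_function, expected_result) pairs.
    """
    return all(test(resource) == expected for test, expected in tests)


class Router:
    """Only routes grid jobs to resources that passed verification."""

    def __init__(self):
        self.enabled = set()

    def allocate(self, name, resource, tests):
        """Verify a newly allocated resource; enable routing only on success."""
        if verify_resource(resource, tests):
            self.enabled.add(name)
            return True
        return False

    def can_route_to(self, name):
        return name in self.enabled
```

A resource that fails any test simply never enters `enabled`, so no job can reach it until a later allocation attempt succeeds.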
Abstract:
Computing environments within a grid computing system are dynamically built in response to specific job resource requirements from a grid resource allocator, including activating needed hardware, provisioning operating systems, application programs, and software drivers. Optimally, prior to building a computing environment for a particular job, cost/revenue analysis is performed, and if operational objectives would not be met by building the environment and executing the job, a job sell-off process is initiated.
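The cost/revenue check before building an environment could look something like the sketch below. The abstract does not define the operational objectives, so a single minimum-profit threshold (`min_margin`) stands in for them here, and all parameter names are assumptions.

```python
def decide_job_handling(job_revenue, build_cost, run_cost, min_margin=0.0):
    """Choose between building the environment and selling the job off.

    Profit is revenue minus the cost of building the environment and
    running the job. `min_margin` is a hypothetical stand-in for the
    operational objective.
    """
    profit = job_revenue - (build_cost + run_cost)
    return "build" if profit >= min_margin else "sell_off"
```

For instance, a job worth 100 with a build cost of 30 and a run cost of 50 clears a required margin of 10, while the same job with a run cost of 80 would be sold off.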
Abstract:
The present invention is a method for scheduling jobs in a grid computing environment without having to monitor the state of the resources on the grid. The method uses a Global Scheduling Program (GSP) and a Local Scheduling Program (LSP). The GSP receives jobs submitted to the grid and distributes each job to the closest resource. The resource then runs the LSP to determine whether the resource can execute the job under the conditions specified in the job. The LSP either rejects or accepts the job based on the current state of the resource properties and informs the GSP of the acceptance or rejection. If the job is rejected, the GSP uses a resource table to randomly select another resource to send the job to. The resource table contains the state-independent properties of every resource on the grid.
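The GSP/LSP handshake can be sketched as below. Several details are assumptions not stated in the abstract: distance is a single number per resource, the LSP's acceptance test is a free-CPU check, and the GSP never retries a resource that already rejected the job and gives up once every resource has rejected it.

```python
import random


class Resource:
    def __init__(self, name, distance, total_cpus, free_cpus):
        self.name = name
        self.distance = distance      # state-independent
        self.total_cpus = total_cpus  # state-independent
        self.free_cpus = free_cpus    # current state, known only locally

    def lsp_accepts(self, job):
        """Local Scheduling Program: decide using the resource's current state."""
        return self.free_cpus >= job["cpus"]


def gsp_schedule(job, resources, rng=random):
    """Global Scheduling Program: closest resource first, then random picks.

    Returns the name of the accepting resource, or None if all reject.
    """
    # The resource table holds only state-independent properties.
    table = {
        r.name: {"distance": r.distance, "total_cpus": r.total_cpus}
        for r in resources
    }
    by_name = {r.name: r for r in resources}
    untried = set(table)
    if not untried:
        return None
    candidate = min(untried, key=lambda n: table[n]["distance"])
    while True:
        untried.discard(candidate)
        if by_name[candidate].lsp_accepts(job):
            return candidate
        if not untried:
            return None
        candidate = rng.choice(sorted(untried))
```

Note that `gsp_schedule` never reads `free_cpus`; the only state-dependent decision is made inside `lsp_accepts`, which mirrors the division of labor the abstract describes.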
Abstract:
A method is provided for managing network errors communicated in a message transaction with error information using a troubleshooting agent. A network facilitates message transactions between a requester and a responder for facilitating web services. When a non-application specific error occurs in relation to a particular message transaction, such as a network error, a protocol layer assigns an error code and either the requester or responder encodes the error code in the body of an envelope added to the particular message transaction. The message transaction is an XML message with a Simple Object Access Protocol (SOAP) envelope encoded with the error code to which the XML message is then attached. The error encoded message transaction is forwarded to a troubleshooting agent. The troubleshooting agent facilitates resolution of the non-application specific error and returns a descriptive message indicating the resolution of the non-application specific error to at least one of the requester and the responder.
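As a rough illustration of encoding an error code in an envelope body, the sketch below wraps an XML message in a SOAP 1.1 envelope using Python's standard `xml.etree.ElementTree`. The `ErrorCode` element, its `urn:example:grid-errors` namespace, and the choice to attach the original message as a sibling inside the body are assumptions; the abstract does not specify the layout.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ERR_NS = "urn:example:grid-errors"                     # hypothetical namespace


def wrap_with_error(message_xml: str, error_code: str) -> str:
    """Wrap an XML message in a SOAP envelope whose body carries the error code."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    code = ET.SubElement(body, f"{{{ERR_NS}}}ErrorCode")
    code.text = error_code
    # Attach the original message after the error code.
    body.append(ET.fromstring(message_xml))
    return ET.tostring(envelope, encoding="unicode")


def extract_error_code(envelope_xml: str):
    """What a troubleshooting agent might do first: read the error code."""
    root = ET.fromstring(envelope_xml)
    node = root.find(f"{{{SOAP_NS}}}Body/{{{ERR_NS}}}ErrorCode")
    return node.text if node is not None else None
```

A production design might instead use the standard SOAP `Fault` element; the custom element is used here only to keep the sketch short.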
Abstract:
A system, method, and service associated with a computing grid or a virtual organization include a request for proposal (RFP) generator, where the RFP describes a data processing task. The RFP is provided to multiple resource providers via the computing grid where each of the resource providers is potentially suitable for performing the data processing task on behalf of the resource consumer. An RFP response processor receives and evaluates RFP responses generated by one or more of the resource providers. An exception processor accessible to the RFP response processor evaluates any exception in the RFP to determine if the exception disqualifies the RFP response. The exceptions may include, for example, job time limit exceptions, resource requirement exceptions, hardware/software platform requirement exceptions and others. Exception rules may be defined to guide the evaluation of the exception.
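The exception evaluation could be sketched as a rule table keyed by exception type. The rule contents, the field names, and the policy that an exception with no matching rule disqualifies the response are all assumptions made for this example.

```python
# Hypothetical exception rules: each returns True if the exception is tolerable.
EXCEPTION_RULES = {
    # Accept up to 20% more time than the RFP's limit.
    "job_time_limit": lambda e: e["offered_hours"] <= 1.2 * e["limit_hours"],
    # Accept providers offering at least 90% of the requested CPUs.
    "resource_requirement": lambda e: e["offered_cpus"] >= 0.9 * e["required_cpus"],
    # Platform mismatches are never tolerated in this sketch.
    "platform_requirement": lambda e: False,
}


def response_qualifies(response, rules=EXCEPTION_RULES):
    """Return False if any exception in the RFP response disqualifies it.

    Assumption: an exception type without a rule is disqualifying.
    """
    for exc in response.get("exceptions", []):
        rule = rules.get(exc["type"])
        if rule is None or not rule(exc):
            return False
    return True


def evaluate_responses(responses, rules=EXCEPTION_RULES):
    """Keep only the providers whose responses survive exception processing."""
    return [r["provider"] for r in responses if response_qualifies(r, rules)]
```

Keeping the rules in a plain dictionary makes it straightforward for a resource consumer to tighten or relax individual tolerances without touching the evaluation loop.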
Abstract:
A method, system, and program for managing network errors communicated in a message transaction with error information using a troubleshooting agent are provided. A network facilitates message transactions between a requester and a responder for facilitating web services. When a non-application specific error occurs in relation to a particular message transaction, such as a network error, a protocol layer assigns an error code and either the requester or responder encodes the error code in the body of an envelope added to the particular message transaction. In particular, the message transaction is an XML message with a Simple Object Access Protocol (SOAP) envelope encoded with the error code to which the XML message is then attached. The error encoded message transaction is forwarded to a troubleshooting agent. The troubleshooting agent facilitates resolution of the non-application specific error and returns a descriptive message indicating the resolution of the non-application specific error to at least one of the requester and the responder.