Abstract:
A distributed work processing system for computational tasks is scalable and fault-tolerant without requiring centralized control. Worker processes running on worker hosts are organized into a logical group, and worker coordinators running on worker coordinator hosts coordinate the tasks assigned to those worker processes. A task store might hold a collection of tasks to be performed by the logical group. A lock database can be used for locking the logical group so that only one worker coordinator process coordinates it at a time. A membership store contains mappings of worker processes to logical groups, and an assignment store indicates which tasks are assigned to which workers. The worker coordinator process has a scanner process that handles unassigned tasks and removes duplicate assignments. If a worker coordinator does not see enough worker processes, it can instantiate more. If a worker process does not see a worker coordinator, it can instantiate one.
Abstract:
A distributed work processing system for computational tasks is scalable and fault-tolerant without requiring centralized control. Worker processes running on worker hosts and worker coordinators running on worker coordinator hosts interact with a task store that holds a collection of tasks to be performed by a logical group of worker processes, a lock database used for locking the logical group so that only one worker coordinator process coordinates it at a time, a membership store that contains mappings of worker processes to logical groups, and an assignment store indicating which tasks are assigned to which workers. The worker coordinator process has a scanner process that handles unassigned tasks and removes duplicate assignments. If a worker coordinator does not see enough worker processes, it can instantiate more. If a worker process does not see a worker coordinator, it can instantiate one.
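
The scanner's two duties, assigning unassigned tasks and removing duplicate assignments, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the abstract: the function name `scan`, the in-memory lists standing in for the task, membership, and assignment stores, the keep-the-first-assignment rule for duplicates, and the least-loaded rule for placing new tasks. The sketch also assumes the caller already holds the lock on the logical group.

```python
from collections import Counter


def scan(tasks, workers, assignments):
    """One hypothetical scanner pass over a logical group.

    tasks:       task ids read from the task store
    workers:     worker ids read from the membership store
    assignments: (task, worker) pairs read from the assignment store
    Returns the cleaned assignments as a sorted list of pairs.
    """
    owner = {}
    for task, worker in assignments:
        # Deduplicate: keep the first assignment seen for each task
        # (an assumed policy) and drop any later duplicates.
        if task not in owner:
            owner[task] = worker

    if not workers:
        # Nobody to give work to; a coordinator could start workers here.
        return sorted(owner.items())

    # Count current load so new work goes to the least-loaded worker.
    load = Counter(owner.values())
    for task in tasks:
        if task not in owner:
            target = min(workers, key=lambda w: load[w])
            owner[task] = target
            load[target] += 1

    return sorted(owner.items())
```

For example, with tasks `t1`, `t2`, `t3`, workers `w1` and `w2`, and `t1` assigned to both workers, one pass keeps `t1` on `w1`, gives `t2` to the idle `w2`, and gives `t3` to `w1` because ties go to the first worker listed.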