-
Publication No.: US20080077925A1
Publication date: 2008-03-27
Application No.: US11535083
Filing date: 2006-09-26
Applicants: Yariv Aridor, Tamar Domany, Yevgeny Kliteynik, Edi Shmueli
Inventors: Yariv Aridor, Tamar Domany, Yevgeny Kliteynik, Edi Shmueli
IPC classification: G06F9/46
CPC classification: G06F9/4843
Abstract: The present invention provides a fault-tolerant system and method for parallel job execution. In the proposed solution, the job state and the state transition control are decoupled. The job execution infrastructure maintains the state information for all executing jobs, and the job control units, one per job, control the state transitions of their jobs. Because the control units are stateless, the system and method allow jobs to continue executing uninterrupted even when the corresponding control units fail.
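The decoupling described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent text: the class names `ExecutionInfrastructure` and `ControlUnit`, the three job states, and the transition table are all hypothetical. The sketch only shows the general idea that job state lives in the infrastructure, so a per-job control unit can fail and be replaced without losing the job.

```python
from enum import Enum


class JobState(Enum):
    # Hypothetical states; the abstract does not enumerate them.
    QUEUED = "queued"
    RUNNING = "running"
    DONE = "done"


# Hypothetical transition table: each state has at most one successor.
NEXT_STATE = {
    JobState.QUEUED: JobState.RUNNING,
    JobState.RUNNING: JobState.DONE,
    JobState.DONE: None,
}


class ExecutionInfrastructure:
    """Owns the state information for every executing job."""

    def __init__(self):
        self._states = {}

    def submit(self, job_id):
        self._states[job_id] = JobState.QUEUED

    def state(self, job_id):
        return self._states[job_id]

    def transition(self, job_id, new_state):
        current = self._states[job_id]
        if NEXT_STATE[current] is not new_state:
            raise ValueError(f"illegal transition {current} -> {new_state}")
        self._states[job_id] = new_state


class ControlUnit:
    """Per-job controller that keeps no job state of its own.

    It stores only a job identifier and a handle to the infrastructure,
    and reads the current state from the infrastructure on every call.
    """

    def __init__(self, infra, job_id):
        self._infra = infra
        self._job_id = job_id

    def advance(self):
        current = self._infra.state(self._job_id)
        target = NEXT_STATE[current]
        if target is not None:
            self._infra.transition(self._job_id, target)
        return self._infra.state(self._job_id)


def demo():
    infra = ExecutionInfrastructure()
    infra.submit("job-1")

    unit = ControlUnit(infra, "job-1")
    unit.advance()                      # QUEUED -> RUNNING
    del unit                            # simulate a control unit failure

    after_failure = infra.state("job-1")  # state survives in the infrastructure

    replacement = ControlUnit(infra, "job-1")
    final = replacement.advance()       # RUNNING -> DONE
    return after_failure, final
```

Calling `demo()` drops the first control unit while the job is running; because the unit held no job state, a replacement unit can pick up the same job from the infrastructure and drive the next transition.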
-
Publication No.: US08291419B2
Publication date: 2012-10-16
Application No.: US11535083
Filing date: 2006-09-26
Applicants: Yariv Aridor, Tamar Domany, Yevgeny Kliteynik, Edi Shmueli
Inventors: Yariv Aridor, Tamar Domany, Yevgeny Kliteynik, Edi Shmueli
IPC classification: G06F9/46, G06F15/16, G06F15/173, G06F11/00
CPC classification: G06F9/4843
Abstract: The present invention provides a fault-tolerant system and method for parallel job execution. In the proposed solution, the job state and the state transition control are decoupled. The job execution infrastructure maintains the state information for all executing jobs, and the job control units, one per job, control the state transitions of their jobs. Because the control units are stateless, the system and method allow jobs to continue executing uninterrupted even when the corresponding control units fail.
-