Granted invention patent
- Patent title: Recording a communication pattern and replaying messages in a parallel computing system
- Patent title (Chinese): 记录通信模式并在并行计算系统中重播消息
- Application number: US12500715
- Filing date: 2009-07-10
- Publication number: US08407376B2
- Publication date: 2013-03-26
- Inventors: Philip Heidelberger, Sameer Kumar
- Applicants: Philip Heidelberger, Sameer Kumar
- Applicant address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current assignee: International Business Machines Corporation
- Current assignee address: US NY Armonk
- Agent: Ryan, Mason & Lewis, LLP
- Main classification: G06F13/28
- IPC classification: G06F13/28
Abstract:
A parallel computer system includes a plurality of compute nodes. Each of the compute nodes includes at least one processor, at least one memory, and a direct memory address engine coupled to the at least one processor and the at least one memory. The system also includes a network interconnecting the plurality of compute nodes. The network operates a global message-passing application for performing communications across the network. Local instances of the global message-passing application operate at each of the compute nodes to carry out local processing operations independent of processing operations carried out at another one of the compute nodes. The direct memory address engines are configured to interact with the local instances of the global message-passing application via injection FIFO metadata describing an injection FIFO in a corresponding one of the memories. The local instances of the global message passing application are configured to record, in the injection FIFO in the corresponding one of the memories, message descriptors associated with messages of an arbitrary communication pattern in an iteration of an executing application program. The local instances of the global message passing application are configured to replay the message descriptors during a subsequent iteration of the executing application program.
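The record-and-replay idea in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the patented implementation: the names `MessageDescriptor`, `InjectionFifo`, `record`, `replay`, and `dma_drain` are hypothetical, the descriptor fields are invented for illustration, and the head/tail indices stand in for the injection FIFO metadata that the abstract mentions without detailing.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class MessageDescriptor:
    """Hypothetical descriptor: destination node and source buffer region."""
    dest_node: int
    buffer_offset: int
    length: int


@dataclass
class InjectionFifo:
    """Toy stand-in for an injection FIFO held in node memory.

    head and tail play the role of the FIFO metadata (an assumption):
    the simulated DMA engine sends every descriptor in slots[head:tail].
    """
    slots: List[MessageDescriptor] = field(default_factory=list)
    head: int = 0
    tail: int = 0

    def record(self, desc: MessageDescriptor) -> None:
        """First iteration: store the descriptor and advance the tail."""
        self.slots.append(desc)
        self.tail = len(self.slots)

    def replay(self) -> None:
        """Later iterations: rewind head so the stored descriptors run again."""
        self.head = 0
        self.tail = len(self.slots)


def dma_drain(fifo: InjectionFifo, memory: bytes) -> list:
    """Simulated DMA engine: process descriptors from head up to tail."""
    sent = []
    while fifo.head < fifo.tail:
        d = fifo.slots[fifo.head]
        payload = memory[d.buffer_offset:d.buffer_offset + d.length]
        sent.append((d.dest_node, payload))
        fifo.head += 1
    return sent
```

In this sketch the first iteration calls `record` once per message and then drains the FIFO; each later iteration calls only `replay` before draining, so no descriptor is rebuilt. Because the descriptors hold buffer offsets rather than copies of the data, a replayed iteration sends whatever the buffers contain at that time.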