Abstract:
Visual representations of multiple call stacks in a parallel programming system include a stack segments graph constructed by coalescing data from multiple stacks. The graph has nodes that represent stack segments and has arcs between adjacent segments. Similar stack frames are represented by the same node. In a stack prefix view of the graph, arcs are directed from a node representing stack frames to a node representing subsequently executed stack frames. In a method-centered view, an arc is shown between a node representing stack frames of a selected method and a node representing adjacent stack frames. The graph can be based on call stacks of all tasks or all threads, or based on call stacks of tasks or threads flagged by a user. Stack frame, thread, and/or task details are also displayed.
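The coalescing step for the stack prefix view can be illustrated with a small sketch. It is not the patented implementation: the names `Node`, `build_prefix_trie`, and `coalesce` are hypothetical, and it assumes that frames are plain method-name strings listed outermost first and that two frames are "similar" when they have the same name at the same stack prefix.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One stack segment: a run of consecutive frames shared by some threads."""
    frames: list
    threads: set = field(default_factory=set)
    children: dict = field(default_factory=dict)  # first frame of child -> Node


def build_prefix_trie(stacks):
    """stacks maps a thread id to its frame names, outermost frame first."""
    root = Node(frames=[])
    for tid, frames in stacks.items():
        node = root
        node.threads.add(tid)
        for frame in frames:
            child = node.children.get(frame)
            if child is None:
                child = Node(frames=[frame])
                node.children[frame] = child
            child.threads.add(tid)  # identical prefixes share one node
            node = child
    return root


def coalesce(node):
    """Merge single-child chains with identical thread sets into one segment."""
    while len(node.children) == 1 and node.frames:
        (child,) = node.children.values()
        if child.threads != node.threads:
            break  # some thread stops here, so keep the segments separate
        node.frames.extend(child.frames)
        node.children = child.children
    for child in node.children.values():
        coalesce(child)
    return node


stacks = {
    1: ["main", "run", "work_a"],
    2: ["main", "run", "work_b"],
    3: ["main", "run", "work_a"],
}
graph = coalesce(build_prefix_trie(stacks))
```

In this sketch each parent-to-child link plays the role of an arc directed toward the subsequently executed frames, and `threads` holds the per-node details that a viewer could display.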
Abstract:
A technology is described for debugging in a cluster processing network. A scheduler can dispatch a process that is part of a cluster job for execution. A compute node can execute the process dispatched to it by the scheduler. A debugger can be activated in response to an unhandled suspension event in the process on the compute node, and can send notification messages regarding that event. A job monitor can receive a notification from the debugger that an unhandled suspension event has occurred and can display the notification to a user.
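The notification path from the debugger to the job monitor might look roughly like the sketch below. Everything in it is an assumption made for illustration: the abstract does not define the message format or transport, so an in-process queue stands in for the network, an uncaught Python exception stands in for an "unhandled suspension event", and the names `Notification`, `JobMonitor`, and `run_process` are hypothetical.

```python
import queue
from dataclasses import dataclass


@dataclass
class Notification:
    job_id: str
    node: str
    reason: str


class JobMonitor:
    """Collects notifications so they can be shown to the user."""

    def __init__(self):
        self.inbox = queue.Queue()

    def pending(self):
        """Drain and return all notifications received so far."""
        items = []
        while not self.inbox.empty():
            items.append(self.inbox.get())
        return items


def run_process(job_id, node, work, monitor):
    """Run one dispatched process; report an unhandled failure to the monitor."""
    try:
        return work()
    except Exception as exc:  # assumption: models the unhandled suspension event
        monitor.inbox.put(Notification(job_id, node, repr(exc)))
        return None


monitor = JobMonitor()
run_process("job-7", "node-03", lambda: 1 / 0, monitor)
for note in monitor.pending():
    print(f"{note.job_id} on {note.node}: {note.reason}")
```

A real system would keep the process suspended so that a debugger can attach; the sketch only shows how the event reaches the monitor.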
Abstract:
Technology is described for debugging in a multi-processor environment. An example system can include a plurality of process icons representing processes executing on compute nodes. A plurality of relationship arc icons between the process icons can represent messages being sent between source processes and destination processes on the compute nodes. A tabular display control can have rows to display attributes for relationship arc icons representing the messages being sent. In addition, a grouping module can be used to identify groups of messages that are related and to highlight relationship arc icons which are part of a group.
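A minimal sketch of the grouping module follows. The abstract does not say what makes messages "related", so the sketch assumes a hypothetical `tag` attribute as the grouping key; `Message`, `group_messages`, and `arcs_to_highlight` are illustrative names only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    source: int  # rank of the sending process
    dest: int    # rank of the receiving process
    tag: str     # assumption: hypothetical key that marks related messages


def group_messages(messages):
    """Bucket messages by tag; each bucket is one group of related messages."""
    groups = defaultdict(list)
    for msg in messages:
        groups[msg.tag].append(msg)
    return dict(groups)


def arcs_to_highlight(messages, tag):
    """Return the (source, dest) arcs that belong to the selected group."""
    return {(m.source, m.dest) for m in messages if m.tag == tag}


messages = [
    Message(0, 1, "halo"),
    Message(1, 2, "halo"),
    Message(0, 2, "reduce"),
]
groups = group_messages(messages)
highlighted = arcs_to_highlight(messages, "halo")
```

Each `Message` corresponds to one row of the tabular display, and the returned arc set is what a viewer would highlight when the user selects a group.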
Abstract:
A method for launching a debugging process is described. The method includes receiving, at a compute node on a cluster private network, a debug job sent by a client on a public network via a scheduler of a head node. The head node is connected to both the cluster private network and the public network, and the public network is external to the cluster private network. The method further includes beginning to process the debug job and, as a result, initiating debugging by starting one or more debugger remote agents at the compute node. The method further includes beginning to process a user job at the compute node in the presence of the started debugger remote agents. The client is informed that the one or more debugger remote agents are ready to debug the user job, and a debugger client at the client is connected to the one or more debugger remote agents.
Abstract:
A debugging interface for use with parallel computing is displayed. When a particular execution context (such as a thread) enters a break state in a particular code context (such as a method), related execution contexts that were also executing in that code context are found. While in the break state, multiple expressions are evaluated for each of these execution contexts. The results are then displayed, optionally with navigation controls that allow them to be navigated efficiently.
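The evaluation step can be sketched as building one result row per related execution context. This is an illustration under stated assumptions, not the described system: each thread is modeled as a dictionary with hypothetical `id`, `stack`, and `locals` keys, a thread counts as related when the method appears anywhere on its stack, and each expression is a Python callable over the thread's local variables.

```python
def related_contexts(threads, method):
    """Select the threads that have the given method on their call stack."""
    return [t for t in threads if method in t["stack"]]


def evaluate_all(threads, method, expressions):
    """Evaluate every expression for every related thread, one row per thread."""
    rows = []
    for thread in related_contexts(threads, method):
        row = {"thread": thread["id"]}
        for name, fn in expressions.items():
            try:
                row[name] = fn(thread["locals"])
            except Exception as exc:
                # A failure in one thread must not hide the other results.
                row[name] = f"<error: {exc}>"
        rows.append(row)
    return rows


threads = [
    {"id": 1, "stack": ["main", "step"], "locals": {"i": 2, "n": 10}},
    {"id": 2, "stack": ["main", "idle"], "locals": {}},
    {"id": 3, "stack": ["main", "step"], "locals": {"i": 5}},
]
expressions = {
    "i": lambda v: v["i"],
    "remaining": lambda v: v["n"] - v["i"],
}
rows = evaluate_all(threads, "step", expressions)
```

The resulting rows map directly onto a table with one line per thread and one column per expression, which is the kind of view that navigation controls could page or filter.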