摘要:
Call stack protection, including executing at least one application program on the one or more computer processors, including initializing threads of execution, each thread having a call stack, each call stack characterized by a separate guard area defining a maximum extent of the call stack, dispatching one of the threads of the process, including loading a guard area specification for the dispatched thread's call stack guard area from thread context storage into address comparison registers of a processor; determining by use of address comparison logic in dependence upon a guard area specification for the dispatched thread whether each access of memory by the dispatched thread is a precluded access of memory in the dispatched thread's call stack's guard area; and effecting by the address comparison logic an address comparison interrupt for each access of memory that is a precluded access of memory in the dispatched thread's guard area.
摘要:
An apparatus and method for evaluating a state of an electronic or integrated circuit (IC), each IC including one or more processor elements for controlling operations of IC sub-units, and each the IC supporting multiple frequency clock domains. The method comprises: generating a synchronized set of enable signals in correspondence with one or more IC sub-units for starting operation of one or more IC sub-units according to a determined timing configuration; counting, in response to one signal of the synchronized set of enable signals, a number of main processor IC clock cycles; and, upon attaining a desired clock cycle number, generating a stop signal for each unique frequency clock domain to synchronously stop a functional clock for each respective frequency clock domain; and, upon synchronously stopping all on-chip functional clocks on all frequency clock domains in a deterministic fashion, scanning out data values at a desired IC chip state. The apparatus and methodology enables construction of a cycle-by-cycle view of any part of the state of a running IC chip, using a combination of on-chip circuitry and software.
摘要:
In an embodiment, if a self thread has more than one conflict, a transaction of the self thread is aborted and restarted. If the self thread has only one conflict and an enemy thread of the self thread has more than one conflict, the transaction of the self thread is committed. If the self thread only conflicts with the enemy thread and the enemy thread only conflicts with the self thread and the self thread has a key that has a higher priority than a key of the enemy thread, the transaction of the self thread is committed. If the self thread only conflicts with the enemy thread, the enemy thread only conflicts with the self thread, and the self thread has a key that has a lower priority than the key of the enemy thread, the transaction of the self thread is aborted.
摘要:
Executing application function calls in response to an interrupt including creating a thread; receiving an interrupt having an interrupt type; determining whether a value of a semaphore represents that interrupts are disabled; if the value of the semaphore represents that interrupts are not disabled: calling, by the thread, one or more preconfigured functions in dependence upon the interrupt type of the interrupt; yielding the thread; and if the value of the semaphore represents that interrupts are disabled: setting the value of the semaphore to represent to a kernel that interrupts are hard-disabled; and hard-disabling interrupts at the kernel.
摘要:
Executing application function calls in response to an interrupt including creating a thread; receiving an interrupt having an interrupt type; determining whether a value of a semaphore represents that interrupts are disabled; if the value of the semaphore represents that interrupts are not disabled: calling, by the thread, one or more preconfigured functions in dependence upon the interrupt type of the interrupt; yielding the thread; and if the value of the semaphore represents that interrupts are disabled: setting the value of the semaphore to represent to a kernel that interrupts are hard-disabled; and hard-disabling interrupts at the kernel.
摘要:
A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources and enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application by application basis, and preferably, enable adaptive partitioning of functions in accordance with various algorithmic phases within an application, or if I/O or other processors are underutilized, then can participate in computation or communication nodes are interconnected by a five dimensional torus network with DMA that optimally maximize the throughput of packet communications between nodes and minimize latency.
摘要翻译:具有100 petaOPS规模计算的多Petascale高效并行超级计算机,其成本,功耗和占地面积都在降低,并且允许从互连角度来看处理节点的最大封装密度。 超级计算机利用了VLSI的技术进步,实现了许多处理器可以集成到单个专用集成电路(ASIC)中的计算模型。 每个ASIC计算节点包括利用集成到一个管芯中的四个或更多个处理器的片上系统ASIC,每个处理器具有对所有系统资源的完全访问,并且使得处理器能够对诸如计算或消息传递I / O 并且优选地,根据应用内的各种算法阶段实现功能的自适应分割,或者如果I / O或其他处理器未被充分利用,则可以参与计算或通信节点通过五维环面网络互连 使用DMA来最大限度地最大化节点之间的分组通信的吞吐量并最小化等待时间。
摘要:
An apparatus and method for evaluating a state of an electronic or integrated circuit (IC), each IC including one or more processor elements for controlling operations of IC sub-units, and each the IC supporting multiple frequency clock domains. The method comprises: generating a synchronized set of enable signals in correspondence with one or more IC sub-units for starting operation of one or more IC sub-units according to a determined timing configuration; counting, in response to one signal of the synchronized set of enable signals, a number of main processor IC clock cycles; and, upon attaining a desired clock cycle number, generating a stop signal for each unique frequency clock domain to synchronously stop a functional clock for each respective frequency clock domain; and, upon synchronously stopping all on-chip functional clocks on all frequency clock domains in a deterministic fashion, scanning out data values at a desired IC chip state. The apparatus and methodology enables construction of a cycle-by-cycle view of any part of the state of a running IC chip, using a combination of on-chip circuitry and software.
摘要:
An apparatus and method for controlling power usage in a computer includes a plurality of computers communicating with a local control device, and a power source supplying power to the local control device and the computer. A plurality of sensors communicate with the computer for ascertaining power usage of the computer, and a system control device communicates with the computer for controlling power usage of the computer.
摘要:
In an embodiment, if a self thread has more than one conflict, a transaction of the self thread is aborted and restarted. If the self thread has only one conflict and an enemy thread of the self thread has more than one conflict, the transaction of the self thread is committed. If the self thread only conflicts with the enemy thread and the enemy thread only conflicts with the self thread and the self thread has a key that has a higher priority than a key of the enemy thread, the transaction of the self thread is committed. If the self thread only conflicts with the enemy thread, the enemy thread only conflicts with the self thread, and the self thread has a key that has a lower priority than the key of the enemy thread, the transaction of the self thread is aborted.
摘要:
A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources and enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application by application basis, and preferably, enable adaptive partitioning of functions in accordance with various algorithmic phases within an application, or if I/O or other processors are underutilized, then can participate in computation or communication nodes are interconnected by a five dimensional torus network with DMA that optimally maximize the throughput of packet communications between nodes and minimize latency.
摘要翻译:具有100 petaOPS规模计算的多Petascale高效并行超级计算机,其成本,功耗和占地面积都在降低,并且允许从互连角度来看处理节点的最大封装密度。 超级计算机利用了VLSI的技术进步,实现了许多处理器可以集成到单个专用集成电路(ASIC)中的计算模型。 每个ASIC计算节点包括利用集成到一个管芯中的四个或更多个处理器的片上系统ASIC,每个处理器具有对所有系统资源的完全访问,并且使得处理器能够对诸如计算或消息传递I / O 并且优选地,根据应用内的各种算法阶段实现功能的自适应分割,或者如果I / O或其他处理器未被充分利用,则可以参与计算或通信节点通过五维环面网络互连 使用DMA来最大限度地最大化节点之间的分组通信的吞吐量并最小化等待时间。