摘要:
A method, system, and computer program product for on-chip control of thermal cycling in an integrated circuit (IC) are provided in the illustrative embodiments. A first circuit is configured on the IC for adjusting a first voltage being applied to a first part of the IC. A first temperature of the first part is measured at a first time. A determination is made that the first temperature is outside a temperature range defined by an upper temperature threshold and a lower temperature threshold. The first voltage is adjusted by reducing the first voltage when the first temperature exceeds the upper temperature threshold and by increasing the first voltage when the first temperature is below the lower temperature threshold, thereby causing the first temperature of the first part to attain a value within the temperature range.
摘要:
A method for fast remote communication and computation between processors is provided in the illustrative embodiments. A direct core to core communication unit (DCC) is configured to operate with a first processor, the first processor being a remote processor. A memory associated with the DCC receives a set of bytes, the set of bytes being sent from a second processor. An operation specified in the set of bytes is executed at the remote processor such that the operation is invoked without causing a software thread to execute.
摘要:
Methods, systems, and media for reducing memory latency seen by processors by providing a measure of control over on-chip memory (OCM) management to software applications, implicitly and/or explicitly, via an operating system are contemplated. Many embodiments allow part of the OCM to be managed by software applications via an application program interface (API), and part managed by hardware. Thus, the software applications can provide guidance regarding address ranges to maintain close to the processor to reduce unnecessary latencies typically encountered when dependent upon cache controller policies. Several embodiments utilize a memory internal to the processor or on a processor node so the memory block used for this technique is referred to as OCM.
摘要:
Methods, systems, and media for reducing memory latency seen by processors by providing a measure of control over on-chip memory (OCM) management to software applications, implicitly and/or explicitly, via an operating system are contemplated. Many embodiments allow part of the OCM to be managed by software applications via an application program interface (API), and part managed by hardware. Thus, the software applications can provide guidance regarding address ranges to maintain close to the processor to reduce unnecessary latencies typically encountered when dependent upon cache controller policies. Several embodiments utilize a memory internal to the processor or on a processor node so the memory block used for this technique is referred to as OCM.
摘要:
A method for allocating memory in a data processing system in which a configuration table indicative of the system's physical memory is generated following a boot event. The configuration table is then modified to identify a portion of the system's physical memory thereby hiding the remaining portion from the operating system. Subsequently, a memory allocation request is initiated by an application program. A device driver invoked by the application program then maps physical memory from the hidden portion to the application's virtual address space to satisfy the application request. The application program may be executing on a first node of a multi-node system in which each node is associated with its own local memory, In this embodiment, the node on which the allocated physical memory is located may be derived from the allocation request thereby facilitating application level, allocation of specified portions of physical memory.
摘要:
A method and system for reliably and consistently delivering client requests in a computer network having at least one client connectable to one or more servers among a group of servers, wherein each server among the group of servers replicates a particular network service to ensure that the particular network service remains uninterrupted in the event of a server failure. A particular server is designated among the group of servers to manage client requests which seek to update a particular network service state, prior to any receipt of a client request which seeks to update the particular network service state by any remaining servers among the group of servers. Thereafter, an executable order is specified in which client requests which seek to update the particular network service state are processed among the remaining servers, such that the executable order, upon execution, sequences the client request which seeks to update the particular network service state with respect to all prior and subsequent client requests. The executable order and the client request which seeks to update the particular network service state are automatically transferred to the remaining servers from the particular server, in response to initiating the client request. Thereafter, the client request which seeks to update the particular network service state is processed in a tentative mode at the particular server without waiting for the executable order to be executed through to completion among the remaining servers.
摘要:
A method for a framework for scheduling tasks in a multi-core processor or multiprocessor system is provided in the illustrative embodiments. A thread is selected according to an order in a scheduling discipline, the thread being a thread of an application executing in the data processing system, the thread forming the leader thread in a bundle of threads. A value of a core attribute in a set of core attributes is determined according to a corresponding thread attribute in a set of thread attributes associated with the leader thread. A determination is made whether a second thread can be added to the bundle such that the bundle including the second thread will satisfy a policy. If the determining is affirmative, the second thread is added to the bundle. The bundle is scheduled for execution using a core of the multi-core processor.
摘要:
A method for fast remote communication and computation between processors is provided in the illustrative embodiments. A direct core to core communication unit (DCC) is configured to operate with a first processor, the first processor being a remote processor. A memory associated with the DCC receives a set of bytes, the set of bytes being sent from a second processor. An operation specified in the set of bytes is executed at the remote processor such that the operation is invoked without causing a software thread to execute.
摘要:
A system, and computer usable program product for a framework for scheduling tasks in a multi-core processor or multiprocessor system are provided in the illustrative embodiments. A thread is selected according to an order in a scheduling discipline, the thread being a thread of an application executing in the data processing system, the thread forming the leader thread in a bundle of threads. A value of a core attribute in a set of core attributes is determined according to a corresponding thread attribute in a set of thread attributes associated with the leader thread. A determination is made whether a second thread can be added to the bundle such that the bundle including the second thread will satisfy a policy. If the determining is affirmative, the second thread is added to the bundle. The bundle is scheduled for execution using a core of the multi-core processor.
摘要:
A method and system for accelerating recovery in an MPI environment are provided in the illustrative embodiments. A first portion of a distributed application executes using a first processor and a second portion using a second processor in a distributed computing environment. After a failure of operation of the first portion, the first portion is restored to a checkpoint. A first part of the first portion is distributed to a third processor and a second part to a fourth processor. A computation of the first portion is performed using the first and the second parts in parallel. A first message is computed in the first portion and sent to the second portion, the message having been initially computed after a time of the checkpoint. A second message is replayed from the second portion without computing the second message in the second portion.