摘要:
Systems and methods according to the present invention provide techniques which modify programs having barrier statements. Dependence relations between statements, and enforcement associations between the barrier statements and the dependence relations, in the program are identified. The dependence relations are classified as being either enforceable by point-to-point synchronization or not enforceable by point-to-point synchronization. A subset of the barrier statements, which will enforce those dependence relations that are unenforceable by point-to-point synchronization, are determined. Other(s) of the barrier statements are replaced with a point-to-point synchronization routine.
摘要:
A technique includes identifying an address of a head end of a queue and monitoring a coherent interconnect to identify a data transfer that is communicated by a producer, which targets the address. The technique includes storing the data of the data transfer in the queue and selectively storing at least a portion of the data in a head-of-queue cache memory based at least in part on whether the monitoring identifies the address. At least a portion of the data is selectively retrieved from the head-of-queue cache memory instead of from the queue for transmission to a consumer.
摘要:
Message queue tuning is disclosed. A communication routine is provided that is able to record a log entry for a message. A log is obtained by running a software application with the communication routine. The log is evaluated to determine a desired value of a tuning knob for a message queue parameter of the communication routine, and the tuning knob is adjusted to the desired value for running the software application with the communication routine.
摘要:
A method of scheduling optional instructions in a compiler targets a processor. The scheduling includes indicating a limit on the additional processor computations that are available for executing an optional code, generating one or more required instructions corresponding to a source code and one or more optional instructions corresponding to the optional code used with the source code and scheduling all of the one or more required instructions with as many of the one or more optional instructions as possible without exceeding the indicated limit on the additional processor computations for executing the optional code.
摘要:
Techniques for modifying applications to implement memory allocation are disclosed. The application is executed using a default memory allocation scheme. A log is generated that identifies which memory addresses are requested by which instructions of the application. The log is evaluated to identify changes to be made to the default memory allocation scheme and, after execution, the application is modified by adding instructions to implement the identified changes.
摘要:
Disclosed are embodiments of a method and system for calculating an unrolling factor for software loops. The unrolling factor may be calculated by applying a formula that takes into account issue constraints of a processor. The issue constraints may include the total issue width of the processor, and may also include individual issue constraints for each instruction type. The software loop may be unrolled by the calculated unrolling factor and may be software pipelined. Other embodiments are also described and claimed.
摘要:
In one aspect, machine-executable code is generated. The machine-executable code includes machine-readable instructions for detecting a memory address bounds violation by the program code based on a determination that a boundary memory address stored in a hardware table has been accessed during execution of the program code. The boundary memory address delimits a boundary for a set of memory addresses allocated for execution of the program code. The machine-executable code is stored in a machine-readable medium. In another aspect, a boundary memory address delimiting a boundary for a set of memory addresses allocated for execution of the program code is stored in a hardware table. The program code is executed. A memory address bounds violation by the program code is detected based on a determination that the boundary memory address stored in the hardware table has been accessed during execution of the program code.
摘要:
Disclosed are embodiments of a compiler, methods, and system for resource-aware scheduling of instructions. A list scheduling approach is augmented to take into account resource constraints when determining priority for scheduling of instructions. Other embodiments are also described and claimed.
摘要:
Methods and systems for prioritizing virtual network interface controllers (VNICs) are described. Each VNIC is assigned a priority level and a maximum current priority level associated with VNICs which are requesting service is determined. Fairness is enforced by using a round robin approach to selection among those currently requesting VNICs which have the same, maximum current priority level.
摘要:
Methods and systems for communicating between network interface controllers (NICs) in networked systems are described. Enhanced command functionality for NICs include the ability to perform sequences of operations and/or conditional operations. Messages can be used to communicate embedded commands which are interpreted by NICs to enhance their functionality.