Abstract:
Instructions and logic provide user-level thread synchronization with MONITOR and MWAIT instructions. One or more model specific registers (MSRs) in a processor may be configured in a first execution state to specify support of a user-level thread synchronization architecture. Embodiments include multiple hardware threads or processing cores, corresponding monitored address state storage to store a last monitored address for each of a plurality of execution threads that issues a MONITOR request, cache memory to record MONITOR requests and associated states for addresses of memory storage locations, and responsive to receipt of an MWAIT request for the address, to record an associated wait-to-trigger state of monitored addresses for execution cores associated with an MWAIT request; wherein the execution core is to transition a requesting thread to an optimized sleep state responsive to the receipt of said MWAIT request when said one or more MSRs are configured in the first execution state.
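The bookkeeping described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the class `MonitorUnit`, the boolean `user_level_enabled` (standing in for the MSR configuration), and the choice to simply refuse an MWAIT request when that flag is off are all hypothetical.

```python
class MonitorUnit:
    """Toy model of per-thread MONITOR/MWAIT state (hypothetical names)."""

    def __init__(self, user_level_enabled):
        # Stands in for the MSRs being configured in the "first execution state".
        self.user_level_enabled = user_level_enabled
        self.monitored = {}   # thread id -> last monitored address
        self.waiting = set()  # thread ids in the wait-to-trigger state

    def monitor(self, tid, address):
        # Record the last monitored address for the requesting thread.
        self.monitored[tid] = address

    def mwait(self, tid):
        # Assumption: the request is honored only when user-level support is
        # enabled and the thread has previously issued a MONITOR request.
        if not self.user_level_enabled or tid not in self.monitored:
            return False
        self.waiting.add(tid)  # thread enters the sleep state
        return True

    def store(self, address):
        # A write to a monitored address wakes every thread waiting on it.
        woken = sorted(t for t in self.waiting if self.monitored[t] == address)
        self.waiting.difference_update(woken)
        return woken
```

In this model a thread calls `monitor`, then `mwait`, and is returned by `store` once another agent writes the monitored address.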
Abstract:
A computer system comprises a processor unit arranged to run a hypervisor running one or more virtual machines; a cache connected to the processor unit and comprising a plurality of cache rows, each cache row comprising a memory address, a cache line and an image modification flag; and a memory connected to the cache and arranged to store an image of at least one virtual machine. The processor unit is arranged to define a log in the memory and the cache further comprises a cache controller arranged to set the image modification flag for a cache line modified by a virtual machine being backed up, but not for a cache line modified by the hypervisor operating in privilege mode; periodically check the image modification flags; and write only the memory address of the flagged cache rows in the defined log.
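The flag-and-log behavior of the cache controller can be sketched in software as follows. The names `CacheRow`, `LoggingCache`, and `flush_flags` are hypothetical, the Python list stands in for the log defined in memory, and clearing a flag once its address has been logged is an assumption the abstract does not state.

```python
from dataclasses import dataclass


@dataclass
class CacheRow:
    address: int
    line: bytes
    image_modified: bool = False  # the image modification flag


class LoggingCache:
    """Toy model of the cache controller (hypothetical names)."""

    def __init__(self):
        self.rows = {}  # memory address -> CacheRow
        self.log = []   # stands in for the log defined in memory

    def write(self, address, data, by_hypervisor=False):
        row = self.rows.get(address)
        if row is None:
            row = CacheRow(address, data)
            self.rows[address] = row
        else:
            row.line = data
        # Flag only lines modified by the virtual machine being backed up,
        # not lines modified by the hypervisor in privilege mode.
        if not by_hypervisor:
            row.image_modified = True

    def flush_flags(self):
        # Periodic check: write only the addresses of flagged rows to the log.
        for row in self.rows.values():
            if row.image_modified:
                self.log.append(row.address)
                row.image_modified = False  # assumption: flag cleared once logged
```

Only addresses reach the log, so the cache line data is not copied by the periodic check itself.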
Abstract:
Web conference performance monitoring systems enable presenters to monitor their audience's content receiving experience and modify their content's transmission characteristics to resolve technical difficulties. A system for monitoring a Web conference's performance includes a local processor; memory operably connected to the local processor; a monitor operably connected to the local processor; content loaded into memory and operable by the local processor; and an audience screen preview program loaded into the memory and operable by the local processor, wherein the audience screen preview program instructs the local processor to measure network throughput of a network connection between the local processor and a remote processor and display at least a portion of the content on the monitor operably connected to the local processor by simulating the content being transmitted to the monitor operably connected to the local processor over the network connection.
Abstract:
A data processing apparatus comprising a processor for executing a data processing process and a processor for executing a tuning process is disclosed. The data processing apparatus is arranged such that the tuning process, which is a different process from the data processing process, can access the parameters of speculative mechanisms of the data processing process and tune those parameters so that the mechanisms speculate differently. In this way, the performance of the data processing process can be improved.
Abstract:
A dynamic performance profiler is operable to receive, in substantially real-time, raw performance data from a testing platform. A software-based image is executing on a target hardware platform (e.g., either simulated or actual) on the testing platform, and the testing platform monitors such execution to generate corresponding raw performance data, which is communicated, in substantially real-time, as it is generated during execution of the software-based image to a dynamic profiler. The dynamic profiler may be configured to archive select portions of the received raw performance data to data storage. As the raw performance data is received, the dynamic profiler analyzes the data to determine whether the performance of the software-based image on the target hardware platform violates a predefined performance constraint. When the performance constraint is violated, the dynamic profiler archives a portion of the received raw performance data.
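The archive-on-violation behavior can be sketched as below. This is a minimal illustration, assuming that the performance constraint is a simple upper bound on a numeric sample and that the archived portion is a sliding window of the most recent samples. The class name `DynamicProfiler` and the parameters `max_value` and `window` are hypothetical.

```python
from collections import deque


class DynamicProfiler:
    """Toy profiler that archives recent samples when a constraint is violated."""

    def __init__(self, max_value, window=4):
        self.max_value = max_value          # assumed constraint: sample <= max_value
        self.recent = deque(maxlen=window)  # most recent raw samples
        self.archive = []                   # stands in for data storage

    def receive(self, sample):
        # Called as each raw performance sample arrives from the testing platform.
        self.recent.append(sample)
        if sample > self.max_value:
            # Constraint violated: archive the window leading up to the violation.
            self.archive.append(list(self.recent))
            return True
        return False
```

Keeping only a bounded window means the profiler does not need to store the full raw stream to capture the context around a violation.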
Abstract:
A system, method, and computer program product capture performance-characteristic data from the execution of a program and model system performance based on that data. The approach targets performance-characterization data based on easily captured reuse distance metrics, where reuse distance is defined as the total number of memory references between two accesses to the same piece of data. Methods for efficiently capturing such metrics are described. These data can be refined into easily interpreted performance metrics, such as performance data related to caches with LRU and random replacement strategies, in combination with fully associative as well as limited-associativity cache organizations. Methods for assessing cache utilization as well as parallel execution are covered.
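The reuse distance defined in this abstract can be computed from a memory reference trace in a single pass. The function name `reuse_distances` and the list-of-addresses trace format are assumptions for illustration. The sketch computes exact distances over a full trace and is not the efficient capture method the abstract refers to.

```python
def reuse_distances(trace):
    """Return, for each reference, the number of memory references since the
    previous access to the same address, or None for a first access."""
    last_seen = {}  # address -> index of its most recent access
    distances = []
    for i, addr in enumerate(trace):
        if addr in last_seen:
            # Count every intervening reference, whether distinct or repeated.
            distances.append(i - last_seen[addr] - 1)
        else:
            distances.append(None)
        last_seen[addr] = i
    return distances
```

For the trace `a b c a`, the second access to `a` has a reuse distance of 2, because two references (`b` and `c`) occur between the two accesses.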