Abstract:
Virtual-machine-monitor operation and implementation are facilitated by a number of easily implemented features and extensions added to the features of a processor architecture. These features, one or more of which are used in various embodiments of the present invention, include a vmsw instruction that provides a means for transitioning between virtualization mode and non-virtualization mode without an interruption, a virtualization fault that occurs when a priority-0 routine in virtualization mode attempts to execute a privileged instruction, and a flexible highest-implemented-address mechanism that partitions the virtual address space into a virtualization address space and a non-virtualization address space.
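As a rough sketch of the virtualization-fault behavior described above, the following C model encodes the decision; the type, field, and function names (cpu_state, check_privileged, and so on) are illustrative assumptions, not taken from the patent:

    #include <stdbool.h>

    /* Hypothetical model; names and fields are illustrative only. */
    typedef struct {
        bool virtualization_mode;  /* set on entry via the vmsw instruction */
        int  privilege_level;      /* 0 = most privileged */
    } cpu_state;

    typedef enum { EXECUTE, VIRTUALIZATION_FAULT } decode_action;

    /* Decide whether a privileged instruction may execute directly or
     * must fault to the virtual-machine monitor for emulation. */
    decode_action check_privileged(const cpu_state *cpu, bool is_privileged_insn)
    {
        if (is_privileged_insn &&
            cpu->virtualization_mode &&
            cpu->privilege_level == 0) {
            /* A guest kernel runs at priority 0 in virtualization mode;
             * trap so the monitor can emulate the instruction. */
            return VIRTUALIZATION_FAULT;
        }
        return EXECUTE;
    }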
Abstract:
A method and apparatus that generate a simplified, localized version (“a local stall”) of a global stall to improve the performance of a pipelined microprocessor. The local stall is generated when a data-dependency hazard is detected for a local consumer. Because it reuses the pipelined microprocessor's existing data-forwarding circuitry, the local stall requires only a relatively minor increase in circuitry. The local stall is also generated much earlier than the global stall and therefore arrives much sooner in a local pipeline. The local pipeline uses the local stall to override the global stall when appropriate, ensuring that correct data is read for the local consumer and operating more efficiently than a standard pipeline without a local stall.
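A minimal C model of the hazard check is sketched below; the stage fields (valid, dest_reg, data_ready) are assumptions, and the sketch shows how the register comparators already present for data forwarding can also assert an early, local stall:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical model of the hazard check; stage names and fields
     * are illustrative, not taken from the patent. */
    typedef struct {
        bool    valid;       /* stage holds an instruction writing a register */
        uint8_t dest_reg;    /* destination register number */
        bool    data_ready;  /* result available for forwarding this cycle */
    } pipe_stage;

    /* The same register-number comparators that drive the forwarding
     * muxes can drive a local stall: if a producer matches the local
     * consumer's source but its result is not yet forwardable, stall
     * locally now instead of waiting for the slower global stall. */
    bool local_stall(uint8_t src_reg, const pipe_stage *stages, int n_stages)
    {
        for (int i = 0; i < n_stages; i++) {
            if (stages[i].valid && stages[i].dest_reg == src_reg &&
                !stages[i].data_ready)
                return true;  /* data-dependency hazard: assert local stall */
        }
        return false;  /* no hazard, or the data can be forwarded */
    }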
Abstract:
A processing unit of the invention has multiple instruction pipelines for processing multi-threaded instructions. Each thread may have an urgency associated with its program instructions. The processing unit has a thread switch controller that monitors the processing of instructions through the various pipelines. The thread controller also controls switch events that move execution from one thread to another within the pipelines. The controller may modify the urgency of any thread, for example by issuing an additional instruction. The thread controller preferably uses certain heuristics in making switch-event decisions. A time-slice expiration unit may also monitor the expiration of each thread's allotted time slice.
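The following C sketch illustrates one possible switch-event heuristic of the kind the abstract describes; the scoring policy and all field names are assumptions for illustration only:

    #include <stdbool.h>

    /* Hypothetical sketch of a switch-event heuristic; the fields and
     * the policy below are assumptions, not the patent's rules. */
    typedef struct {
        int  urgency;            /* higher value = more urgent thread */
        bool stalled;            /* waiting on memory or another long event */
        bool time_slice_expired; /* flagged by the time-slice expiration unit */
    } thread_state;

    /* Pick the next thread: prefer runnable threads with the highest
     * urgency, and deprioritize the current thread once its slice expires. */
    int pick_next_thread(const thread_state *t, int n, int current)
    {
        int best = current, best_score = -1;
        for (int i = 0; i < n; i++) {
            if (t[i].stalled)
                continue;                   /* skip non-runnable threads */
            int score = t[i].urgency;
            if (i == current && t[i].time_slice_expired)
                score -= 1;                 /* encourage a switch event */
            if (score > best_score) {
                best_score = score;
                best = i;
            }
        }
        return best;  /* may equal current if no better candidate exists */
    }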
Abstract:
In a computer system including a plurality of resources, a device receives a request from a software program to access a specified one of the plurality of resources and determines whether the specified resource is a protected resource. If the specified resource is a protected resource, the device denies the request if the computer system is operating in a protected mode of operation, and processes the request based on access rights associated with the software program if the computer system is not operating in the protected mode of operation.
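The decision described above reduces to a small predicate; the C sketch below models it, with illustrative type and field names (handling of unprotected resources is not specified by the abstract and is assumed here to fall back to the rights check):

    #include <stdbool.h>

    /* Hypothetical model of the access-control decision. */
    typedef struct {
        bool is_protected;      /* resource is a protected resource */
    } resource;

    typedef struct {
        bool access_rights_ok;  /* program's rights cover this resource */
    } program;

    /* Deny protected-resource requests outright in protected mode;
     * otherwise fall back to the program's ordinary access rights. */
    bool grant_access(const resource *r, const program *p, bool protected_mode)
    {
        if (r->is_protected && protected_mode)
            return false;               /* unconditional denial */
        return p->access_rights_ok;     /* normal rights-based processing */
    }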
Abstract:
Systems, methodologies, media, and other embodiments associated with mitigating the effects of context-switch cache and TLB misses are described. One exemplary system embodiment includes a processor configured to run a multiprocessing, virtual-memory operating system. The processor may be operably connected to a memory and may include a cache and a translation lookaside buffer (TLB) configured to store TLB entries. The exemplary system may include a context control logic configured to selectively copy data from the TLB to a data store for a first process being swapped out of the processor and to selectively copy data from the data store to the TLB for a second process being swapped into the processor.
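A simplified C sketch of the selective TLB save and restore is given below; the entry layout, the ASID-based selection, and the fixed TLB size are all assumptions for illustration:

    #include <stddef.h>
    #include <stdint.h>

    #define TLB_SIZE 64  /* illustrative size; not specified by the patent */

    /* Hypothetical TLB entry layout; field names are assumptions. */
    typedef struct {
        uint64_t virt_page;
        uint64_t phys_page;
        uint16_t asid;      /* address-space id of the owning process */
        uint8_t  valid;
    } tlb_entry;

    /* On switch-out, selectively save the outgoing process's entries
     * to a per-process data store; returns the number saved. */
    size_t tlb_save(const tlb_entry *tlb, tlb_entry *store, uint16_t asid)
    {
        size_t n = 0;
        for (size_t i = 0; i < TLB_SIZE; i++)
            if (tlb[i].valid && tlb[i].asid == asid)
                store[n++] = tlb[i];   /* copy only this process's entries */
        return n;
    }

    /* On switch-in, preload the incoming process's saved entries so it
     * does not start with a cold TLB. */
    void tlb_restore(tlb_entry *tlb, const tlb_entry *store, size_t n_saved)
    {
        for (size_t i = 0; i < n_saved && i < TLB_SIZE; i++)
            tlb[i] = store[i];
    }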
Abstract:
In one embodiment of the present invention, a computer-implemented method is provided for use in a computer system including a plurality of resources. The plurality of resources includes protected resources and unprotected resources. The unprotected resources include critical resources and non-critical resources. The method includes the steps of: (A) receiving a request from a software program to access a specified one of the unprotected resources; (B) granting the request if the computer system is operating in a non-protected mode of operation; and (C) if the computer system is operating in a protected mode of operation, denying the request if the computer system is not operating in a protected diagnostic mode of operation.
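Steps (A) through (C) can be modeled directly; in the C sketch below, the modes are simplified into a single enum (treating the protected diagnostic mode as a distinct value rather than a sub-mode), which is an illustrative assumption:

    #include <stdbool.h>

    /* Hypothetical sketch of steps (A)-(C); the mode encoding is an
     * assumption for illustration. */
    typedef enum {
        MODE_NON_PROTECTED,
        MODE_PROTECTED,
        MODE_PROTECTED_DIAGNOSTIC
    } sys_mode;

    /* Decide a request for an unprotected resource. */
    bool handle_unprotected_request(sys_mode mode)
    {
        if (mode == MODE_NON_PROTECTED)
            return true;    /* step (B): grant in non-protected mode */
        /* step (C): in a protected mode, deny unless the system is in
         * the protected diagnostic mode of operation */
        return mode == MODE_PROTECTED_DIAGNOSTIC;
    }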
Abstract:
In a computer system including a plurality of resources, techniques are disclosed for receiving a request from a software program to access a specified one of the plurality of resources, determining whether the specified resource is a protected resource, and, if it is a protected resource, denying the request if the computer system is operating in a protected mode of operation and processing the request based on access rights associated with the software program if the computer system is not operating in the protected mode of operation.
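Since this abstract claims the same decision procedure modeled by the grant_access sketch above, a brief usage example suffices here; the values chosen are illustrative:

    #include <stdio.h>

    /* Minimal usage of the grant_access() sketch above, which models
     * the decision procedure these techniques describe. */
    int main(void)
    {
        resource r = { .is_protected = true };
        program  p = { .access_rights_ok = true };

        /* Denied: protected resource while in protected mode. */
        printf("%d\n", grant_access(&r, &p, true));   /* prints 0 */

        /* Granted via ordinary rights: not in protected mode. */
        printf("%d\n", grant_access(&r, &p, false));  /* prints 1 */
        return 0;
    }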
Abstract:
In accordance with embodiments disclosed herein, there are provided methods, systems, mechanisms, techniques, and apparatuses for implementing a balanced P-LRU tree for a cache with a “multiple of 3” number of ways. For example, in one embodiment, such means may include an integrated circuit having a cache and a plurality of ways, in which the plurality of ways is a quantity that is a multiple of three and not a power of two, and in which the ways are organized into a plurality of pairs. In such an embodiment, the means further include a single bit for each of the pairs, each single bit operating as an intermediate-level decision node representing its associated pair of ways, and a root-level decision node having exactly two bits that point to one of the intermediate-level decision nodes. In this exemplary embodiment, the total number of bits is N−1, where N is the total number of ways. Alternative structures are also presented for a full LRU implementation, a “multiple of 5” number of cache ways, and variations of the “multiple of 3” number of cache ways.
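For the six-way case (three pairs), the five-bit structure can be sketched in C as follows; the victim-selection path follows the abstract (two root bits select a pair, the pair's single bit selects a way), while the update policy on an access is one plausible choice rather than the patent's stated rule:

    #include <stdint.h>

    /* Sketch of the 5-bit balanced P-LRU for a 6-way set (N-1 = 5 bits). */
    typedef struct {
        uint8_t root;        /* 2 bits: which of the 3 pairs to victimize next */
        uint8_t pair_bit[3]; /* 1 bit per pair: which way within the pair */
    } plru6;

    /* Follow the tree from root to leaf to pick a victim way (0..5). */
    int plru6_victim(const plru6 *s)
    {
        return s->root * 2 + s->pair_bit[s->root];
    }

    /* On an access to way w, point the tree away from w. */
    void plru6_touch(plru6 *s, int w)
    {
        int pair = w / 2;
        s->pair_bit[pair] = (uint8_t)(1 - (w & 1)); /* victim = sibling way */
        if (s->root == pair)
            s->root = (uint8_t)((pair + 1) % 3);    /* move root off this pair */
    }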
Abstract:
In an embodiment, a page miss handler includes paging caches and a first walker to receive a first linear address portion and to obtain a corresponding portion of a physical address from a paging structure, a second walker to operate concurrently with the first walker, and logic to prevent the first walker from storing the obtained physical address portion in a paging cache responsive to the first linear address portion matching a corresponding linear address portion of a concurrent paging structure access by the second walker. Other embodiments are described and claimed.
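The suppression logic can be modeled as a simple check between walkers; in the C sketch below, the structure and field names are illustrative assumptions:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical sketch of the duplicate-fill check. */
    typedef struct {
        bool     active;              /* walker is mid-walk */
        uint64_t linear_addr_portion; /* linear address portion being walked */
    } walker_state;

    /* The first walker may store the physical-address portion it
     * obtained in a paging cache only if no concurrent walker is
     * already walking the same linear address portion; this avoids
     * redundant or conflicting paging-cache fills when two walks race
     * on the same translation. */
    bool may_fill_paging_cache(const walker_state *me, const walker_state *other)
    {
        if (other->active &&
            other->linear_addr_portion == me->linear_addr_portion)
            return false;  /* matching concurrent walk: suppress the fill */
        return true;
    }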