Abstract:
An allocator assigns entries for a circular buffer. The allocator receives requests for storing data in entries of the circular buffer, and generates a head pointer to identify a starting entry in the circular buffer for which circular buffer entries are not allocated. In addition to pointing to an entry in the circular buffer, the head pointer includes a wrap bit. The allocator toggles the wrap bit each time it traverses the linear queue of the circular buffer. A tail pointer is generated, also including a wrap bit, to identify an ending entry in the circular buffer for which circular buffer entries are allocated. In response to the requests for entries, the allocator sequentially assigns entries located between the head pointer and the tail pointer. The allocator has application for use in a microprocessor performing out-of-order dispatch and speculative execution. The allocator is coupled to a reorder buffer, configured as a circular buffer, to permit allocation of entries. The allocator utilizes an all-or-nothing allocation policy, such that either all or none of the incoming instructions are allocated during an allocation period.
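A minimal C sketch of the wrap-bit idea follows: the head and tail pointers each carry an extra bit that toggles on every traversal, so equal indices can be disambiguated as "empty" versus "full", and the all-or-nothing policy falls out of a simple free-entry count. The buffer size, field names, and policy check are illustrative assumptions, not the patented circuit.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_ENTRIES 64   /* assumed buffer size */

typedef struct {
    uint8_t index; /* entry index, 0..NUM_ENTRIES-1 */
    bool    wrap;  /* toggled on each traversal of the buffer */
} pointer_t;

typedef struct {
    pointer_t head; /* starting entry not yet allocated */
    pointer_t tail; /* ending entry still allocated */
} allocator_t;

/* Equal indices mean "empty" when the wrap bits match and "full"
 * when they differ, so no separate occupancy counter is needed. */
static unsigned free_entries(const allocator_t *a)
{
    if (a->head.wrap == a->tail.wrap)
        return NUM_ENTRIES - (unsigned)(a->head.index - a->tail.index);
    return (unsigned)(a->tail.index - a->head.index);
}

/* All-or-nothing policy: either every requested entry is assigned
 * this allocation period, or none are. */
static bool allocate(allocator_t *a, unsigned requested)
{
    if (requested > free_entries(a))
        return false;
    unsigned next = a->head.index + requested;
    if (next >= NUM_ENTRIES) {
        next -= NUM_ENTRIES;
        a->head.wrap = !a->head.wrap; /* wrapped: toggle the bit */
    }
    a->head.index = (uint8_t)next;
    return true;
}
```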
Abstract:
A circuit comprising a number of address range registers and complementary decoding/matching circuits is provided to a processor for determining the memory type of a physical address, thereby allowing the memory type to be determined as soon as the physical address is available, in an execution stage preceding cache access. Additionally, a memory type field is provided in each address translation lookaside buffer entry of the data and instruction memory subsystems for storing the determined memory type. The memory type determination circuit is disposed in the page miss handler, thereby allowing the memory type to be determined at the same time as the physical address is determined.
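The sketch below illustrates the base/mask range-matching idea in C, assuming eight registers and three memory types; in hardware every register is matched in parallel, modeled here as a loop, and the result could then be stored in a TLB entry's memory-type field. All names and the default type are assumptions.

```c
#include <stdint.h>

enum mem_type { MT_UNCACHEABLE, MT_WRITE_THROUGH, MT_WRITE_BACK };

typedef struct {
    uint64_t base;      /* physical base address of the range */
    uint64_t mask;      /* address bits that must match the base */
    enum mem_type type; /* memory type assigned to the range */
    int valid;
} range_reg_t;

#define NUM_RANGE_REGS 8

/* Decode/match the physical address against every range register as
 * soon as the address is available, before cache access. */
static enum mem_type lookup_mem_type(const range_reg_t regs[NUM_RANGE_REGS],
                                     uint64_t paddr)
{
    for (int i = 0; i < NUM_RANGE_REGS; i++)
        if (regs[i].valid &&
            (paddr & regs[i].mask) == (regs[i].base & regs[i].mask))
            return regs[i].type; /* may be cached in a TLB entry */
    return MT_UNCACHEABLE;      /* assumed default when nothing matches */
}
```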
Abstract:
Entries in a reservation station are efficiently scanned to find data-ready instructions for dispatch. A pseudo-FIFO scheduling approach is implemented wherein, rather than scanning every entry in the reservation station, the reservation station is segmented into groups of entries, with each group being scanned to determine which group contains the oldest entry. Within the group containing the oldest entry, a ready pointer is cycled to search for data-ready instructions for dispatch to waiting execution units.
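A minimal C sketch of the pseudo-FIFO scan, assuming a 24-entry station split into four groups with age tags; the sizes, field names, and tie-breaking are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define RS_ENTRIES 24
#define GROUPS 4
#define GROUP_SIZE (RS_ENTRIES / GROUPS)

typedef struct {
    bool     valid;
    bool     data_ready; /* all source operands available */
    uint32_t age;        /* allocation sequence number; lower = older */
} rs_entry_t;

/* Find which group contains the oldest valid entry. */
static int oldest_group(const rs_entry_t rs[RS_ENTRIES])
{
    int best_group = 0;
    uint32_t best_age = UINT32_MAX;
    for (int i = 0; i < RS_ENTRIES; i++)
        if (rs[i].valid && rs[i].age < best_age) {
            best_age = rs[i].age;
            best_group = i / GROUP_SIZE;
        }
    return best_group;
}

/* Cycle a ready pointer within the chosen group only, returning the
 * index of a data-ready entry to dispatch, or -1 if none is ready. */
static int pick_for_dispatch(const rs_entry_t rs[RS_ENTRIES], int *ready_ptr)
{
    int g = oldest_group(rs);
    for (int n = 0; n < GROUP_SIZE; n++) {
        int idx = g * GROUP_SIZE + ((*ready_ptr + n) % GROUP_SIZE);
        if (rs[idx].valid && rs[idx].data_ready) {
            *ready_ptr = (idx % GROUP_SIZE + 1) % GROUP_SIZE;
            return idx;
        }
    }
    return -1;
}
```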
Abstract:
A system and method are described for integrating a memory and storage hierarchy, including a non-volatile memory tier, within a computer system. In one embodiment, PCMS memory devices are used as one tier in the hierarchy, sometimes referred to as “far memory.” Higher performance memory devices such as DRAM are placed in front of the far memory and are used to mask some of the performance limitations of the far memory. These higher performance memory devices are referred to as “near memory.” In one embodiment, the near memory is configured to operate in a plurality of different modes of operation including (but not limited to) a first mode in which the near memory operates as a memory cache for the far memory and a second mode in which the near memory is allocated a first address range of a system address space, with the far memory being allocated a second address range of the system address space, wherein the first and second ranges together represent the entire system address space.
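The two near-memory modes can be sketched in C as a routing decision: in the direct-address mode the near and far tiers split the system address space into adjacent ranges, while in cache mode every address maps to far memory and near memory is only consulted as a cache. Tier sizes, names, and the stubbed cache lookup are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

enum nm_mode {
    NM_AS_CACHE, /* near memory acts as a cache for far memory */
    NM_AS_MEMORY /* near memory is directly addressed system RAM */
};

#define NEAR_SIZE (4ULL << 30)  /* e.g. 4 GiB DRAM (assumed) */
#define FAR_SIZE  (60ULL << 30) /* e.g. 60 GiB PCMS (assumed) */

typedef enum { TIER_NEAR, TIER_FAR } tier_t;

static bool near_cache_hit(uint64_t addr)
{
    (void)addr; /* cache lookup elided in this sketch */
    return false;
}

/* Route a system address to a memory tier. In NM_AS_MEMORY mode the
 * two ranges together cover the entire system address space. */
static tier_t route(enum nm_mode mode, uint64_t addr)
{
    if (mode == NM_AS_MEMORY)
        return addr < NEAR_SIZE ? TIER_NEAR : TIER_FAR;
    return near_cache_hit(addr) ? TIER_NEAR : TIER_FAR;
}
```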
Abstract:
Examples are disclosed for establishing a window for a queue structure maintained in a cache for a processing element of a network device. The processing element may be configured to operate in cooperation with an input/output device such as a network interface card. In some of these examples, the window may include portions of the queue structure holding identifiers for active allocated buffers maintained in memory for the network device. The active allocated buffers may be configured to maintain or store data received, or to be forwarded, by the input/output device. For these examples, the window may be adjusted based on information gathered while the identifiers are read from or written to the portions of the queue structure.
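A minimal C sketch of the windowing idea: only the window's slots, those holding identifiers of active buffers, are kept warm in the processing element's cache, and the window grows or shrinks based on activity observed over a sampling period. The queue size, field names, and the specific resizing thresholds are illustrative assumptions.

```c
#include <stdint.h>

#define QUEUE_SLOTS 1024

typedef struct {
    uint32_t buf_id[QUEUE_SLOTS]; /* identifiers of allocated buffers */
    uint32_t head;                /* next slot the I/O device consumes */
    uint32_t window;              /* slots currently kept in cache */
    uint32_t used_last_period;    /* slots touched since last adjust */
} queue_window_t;

/* Grow the window when it was nearly exhausted during the last
 * sampling period; shrink it when most of it went untouched. */
static void adjust_window(queue_window_t *q)
{
    if (q->used_last_period > (q->window * 3) / 4) {
        q->window *= 2;
        if (q->window > QUEUE_SLOTS)
            q->window = QUEUE_SLOTS;
    } else if (q->used_last_period < q->window / 4 && q->window > 8) {
        q->window /= 2;
    }
    q->used_last_period = 0;
}
```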
Abstract:
A technology for implementing a machine check architecture environment is described. A method of the disclosure includes detecting an occurrence of an error. The occurrence of the error causes a non-microcoded processing device to enter an error monitoring state, which is dedicated to error processing. The method further processes the error using a dedicated memory portion for the error monitoring state while the non-microcoded processing device is in that state. The method further determines information associated with the error, where the information is in a predefined format.
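A minimal C sketch of the flow described above: on an error the core enters the error monitoring state and writes a record, in a fixed predefined layout, into a memory region dedicated to that state. The record fields, log size, and function names are illustrative assumptions.

```c
#include <stdint.h>

typedef struct {
    uint32_t error_code; /* predefined-format fields (assumed layout) */
    uint64_t fault_addr;
    uint64_t timestamp;
} mc_record_t;

#define MC_LOG_SLOTS 16
static mc_record_t mc_log[MC_LOG_SLOTS]; /* dedicated memory portion */
static unsigned mc_next;

/* Runs only while the device is in the error monitoring state,
 * which performs error processing and nothing else. */
static void log_machine_check(uint32_t code, uint64_t addr, uint64_t now)
{
    mc_record_t *rec = &mc_log[mc_next++ % MC_LOG_SLOTS];
    rec->error_code = code;
    rec->fault_addr = addr;
    rec->timestamp  = now;
}
```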
Abstract:
A method and apparatus are disclosed for staggering execution of an instruction. According to one embodiment of the invention, a single macro instruction is received, wherein the single macro instruction specifies at least two logical registers and wherein the two logical registers respectively store first and second packed data operands having corresponding data elements. An operation specified by the single macro instruction is then performed independently on a first and a second plurality of the corresponding data elements from the first and second packed data operands, at different times and using the same circuit, to independently generate a first and a second plurality of resulting data elements. The first and second pluralities of resulting data elements are stored in a single logical register as a third packed data operand.
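A minimal C sketch of staggered execution: one macro operation on two packed operands is carried out as two passes through the same narrower datapath, low half first, high half later. The 128-bit operand width, 64-bit circuit width, and all names are illustrative assumptions.

```c
#include <stdint.h>

typedef struct { uint32_t elem[4]; } packed128_t; /* 4 x 32-bit elements */

/* The shared 64-bit-wide circuit: adds two elements per invocation. */
static void add_half(const uint32_t *a, const uint32_t *b, uint32_t *out)
{
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
}

/* One macro instruction, two staggered passes through the same unit. */
static packed128_t packed_add(packed128_t src1, packed128_t src2)
{
    packed128_t dst;
    add_half(&src1.elem[0], &src2.elem[0], &dst.elem[0]); /* cycle N   */
    add_half(&src1.elem[2], &src2.elem[2], &dst.elem[2]); /* cycle N+1 */
    return dst; /* stored to a single logical register */
}
```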
Abstract:
Maximum throughput or “back-to-back” scheduling of dependent instructions in a pipelined processor is achieved by maximizing the efficiency with which the processor determines the availability of the source operands of a dependent instruction and provides those operands to an execution unit executing the dependent instruction. These two operations are implemented through a number of mechanisms. One mechanism for determining the availability of source operands, and hence the readiness of a dependent instruction for dispatch to an available execution unit, relies on the early setting of a source valid bit during allocation when a source operand is a retired or immediate value. This allows the ready logic of a reservation station to begin scheduling the instruction for dispatch.
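A minimal C sketch of the early source-valid mechanism: at allocation, a source that is an immediate or an already-retired register value is marked valid immediately, so the ready logic can consider the instruction for dispatch without waiting on a writeback match. Field and function names are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { SRC_IMMEDIATE, SRC_RETIRED_REG, SRC_INFLIGHT } src_kind_t;

typedef struct {
    src_kind_t kind;
    bool       valid;    /* operand value is available */
    uint8_t    producer; /* tag of producing uop when SRC_INFLIGHT */
} source_t;

/* Called during allocation into the reservation station: immediates
 * and retired values need no bypass, so mark them valid early. */
static void allocate_source(source_t *s)
{
    s->valid = (s->kind == SRC_IMMEDIATE || s->kind == SRC_RETIRED_REG);
}

/* Ready logic: the instruction may dispatch once every source is valid. */
static bool instr_ready(const source_t *srcs, int n)
{
    for (int i = 0; i < n; i++)
        if (!srcs[i].valid)
            return false;
    return true;
}
```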
Abstract:
A cache memory is constituted of a data array and control logic. The data array includes a number of data lines, and the control logic operates to store a number of trace segments of instructions in the data lines, including trace segments that span multiple data lines. In one embodiment, each trace segment includes one or more trace segment members having one or more instructions, with each trace segment member occupying one data line, and the data lines of a multi-line trace segment being sequentially associated (logically). Retrieval of the trace segment members of a multi-line trace segment is accomplished by first locating the data line storing the first trace segment member of the trace segment, and then successively locating the remaining data lines storing the remaining trace segment members based on the data lines' logical sequential associations.
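A minimal C sketch of multi-line retrieval: each data line holds one trace-segment member, the lines of a multi-line segment are logically chained, and retrieval starts at the first member's line and follows the chain. Line counts, field names, and the emit callback are illustrative assumptions.

```c
#include <stdint.h>

#define DATA_LINES      256
#define INSTRS_PER_LINE   6
#define NO_LINE          -1

typedef struct {
    uint32_t instrs[INSTRS_PER_LINE]; /* one trace-segment member */
    int      next_line;               /* logical sequential link */
} data_line_t;

/* Emit every member of the trace segment whose first member sits at
 * data line 'first'; 'emit' stands in for delivery to the pipeline. */
static void fetch_trace_segment(const data_line_t cache[DATA_LINES],
                                int first,
                                void (*emit)(const uint32_t *, int))
{
    for (int line = first; line != NO_LINE; line = cache[line].next_line)
        emit(cache[line].instrs, INSTRS_PER_LINE);
}
```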