Abstract:
A plurality of memory modules, which may be used to form a heterogeneous memory system, are connected to a plurality of prefetchers. Each prefetcher is independently configured to prefetch information from a corresponding one of the plurality of memory modules in response to feedback from the corresponding one of the plurality of memory modules.
Abstract:
According to one embodiment, a data storage device includes a first storage unit, a second storage unit, a first queue, a second queue, and a distributor. The second storage unit is used as a cache of the first storage unit and has a lower write transfer rate and a faster response time than the first storage unit. The first queue corresponds to the first storage unit. The second queue corresponds to the second storage unit. The distributor distributes a write command received presently from a host to one of the first and second queues in which the number of write commands registered presently is smaller.
Abstract:
In one embodiment, the present invention is directed to a processor having a plurality of cores and a cache memory coupled to the cores and including a plurality of partitions. The processor can further include a logic to dynamically vary a size of the cache memory based on a memory boundedness of a workload executed on at least one of the cores. Other embodiments are described and claimed.
Abstract:
A translation lookaside buffer (TLB) stores translation entries. The translation entries include a virtual address, a physical address and a memory local/not-local flag. When a processor is in a low power/local memory mode a virtual address is received. A matching translation entry has a local/not-local flag. Upon the local/not-local flag indicating the physical address of the matching translation entry being outside the local memory, an out-of-access-range memory access exception is generated.
Abstract:
A system and method for allocating different temperature data to storage devices within a computer system including inexpensive non-volatile storage, such as hard disk drive (HDD) storage devices; expensive non-volatile storage, such as solid-state drive (SSD) storage devices; and expensive volatile storage, such as system cache memory. The system and method allocates cold to warm data having access frequencies up to a first access frequency threshold to inexpensive non-volatile storage; allocates hot data having access frequencies greater than the first access frequency value and ranging up to a second access frequency threshold, to expensive non-volatile storage; and allocates very hot data having access frequencies greater than the second access frequency value and which resides during normal system operation in expensive volatile storage, to said inexpensive non-volatile storage.
Abstract:
Processing of an instruction fetch from an instruction cache is provided, which includes: determining whether the next instruction fetch is in a same cache line of the instruction cache as a last instruction fetch; and based, at least in part, on determining that the next instruction fetch is in the same cache line, suppressing for the next instruction fetch one or more instruction cache-related directory accesses, and forcing for the next instruction an address match signal for the same cache line. The suppressing may include generating a known-to-hit signal where the next fetch is in the same cache line, and the last fetch is not a branch instruction, and issuing an instruction cache hit where a cache line segment of the same cache line having the next instruction has a valid validity bit, the valid validity bit having been retrieved and maintained based on a most-recent, instruction cache-directory-accessed fetch.
Abstract:
A hardware prefetch tablewalk system for a microprocessor including a tablewalk engine that is configured to perform hardware prefetch tablewalk operations without blocking software-based tablewalk operations. Tablewalk requests include a priority value, in which the tablewalk engine is configured to compare priorities of requests in which a higher priority request may terminate a current tablewalk operation. Hardware prefetch tablewalk requests having the lowest possible priority so that they do not bump higher priority tablewalk operations and are bumped by higher priority tablewalk requests. The priority values may be in the form of age values indicative of relative ages of operations being performed. The microprocessor may include a hardware prefetch engine that performs boundless hardware prefetch pattern detection that is not limited by page boundaries to provide the hardware prefetch tablewalk requests.
Abstract:
Provided are methods and systems for de-duplicating cache lines in physical memory by detecting cache line data patterns and building a link-list between multiple physical addresses and their common data value. In this manner, the methods and systems are applied to achieve de-duplication of an on-chip cache. A cache line filter includes one table that defines the most commonly duplicated content patterns and a second table that saves pattern numbers from the first table and the physical address for she duplicated cache line. Since a cache line duplicate can be detected during a write operation, each write can involve table lookup and comparison. If there is a hit in the table, only the address is saved instead of the entire data string.
Abstract:
A method of managing peripherals is performed in a device coupled to a processor in a computer system. In the method, information associated with I/O activity for one or more peripherals is recorded in a first segment of a log. A second segment of the log is identified based on a next-segment pointer associated with the first segment of the log. In response to detecting a lack of available capacity in the first segment of the log, information associated with further I/O activity for the one or more peripherals is recorded in the second segment of the log.
Abstract:
An arithmetic processing device promotes transmission efficiency between a processor and a memory. The arithmetic processing device has an arithmetic processing unit which issues an instruction accompanying with data which is sent to the memory, a judgment unit which judges whether or not a redundancy degree of the data which is accompanied with the instruction is more than a predetermined value, a compression unit which judges whether or not compress the data based on an waiting time and a compression time when the redundancy degree of the data is more than the predetermined value, and compress the data when judging that performs the compression, and an instruction arbitration unit which transfers the instruction accompanying with the compressed data to the memory when the compression unit performs the compression and transfers the instruction accompanying with the non-compressed data to the memory when the compression unit does not perform the compression.