Abstract:
A multi-processor computer system is disclosed that reduces the occurrences of invalidate and copyback operations through a memory interconnect by disabling a first write optimization of a cache coherency protocol for data that is not likely to be written by a requesting processor. Such data include read-only code segments. The code segments, including instructions and data, are shared among the multiple processors. The requesting processor generates a Read to Share Always request upon a cache miss of a read-only datablock, and generates a Read to Share request otherwise. The Read to Share Always request results in the datablock stored in cache memory being labeled as in a "shared" state, while the Read to Share request results in the datablock being labeled as in an "exclusive" state.
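The request-selection policy above can be shown in a short C sketch; the type and function names are invented for illustration, and the fill-state rule assumes Read to Share grants exclusive ownership only when no other cache holds the block:

    #include <stdbool.h>
    #include <stdio.h>

    /* Cache states relevant to the protocol described in the abstract. */
    typedef enum { INVALID, SHARED, EXCLUSIVE } cache_state_t;

    /* Read to Share may grant exclusive ownership (the first-write
     * optimization); Read to Share Always installs the block shared. */
    typedef enum { READ_TO_SHARE, READ_TO_SHARE_ALWAYS } request_t;

    /* On a cache miss, pick the request type from whether the missing
     * datablock belongs to a read-only segment (e.g., code). */
    static request_t cache_miss_request(bool block_is_read_only) {
        return block_is_read_only ? READ_TO_SHARE_ALWAYS : READ_TO_SHARE;
    }

    /* State in which the block is installed once the data returns. */
    static cache_state_t fill_state(request_t req, bool other_sharers) {
        if (req == READ_TO_SHARE_ALWAYS || other_sharers)
            return SHARED;      /* no invalidate traffic for readers    */
        return EXCLUSIVE;       /* a first write needs no bus operation */
    }

    int main(void) {
        request_t r = cache_miss_request(true);  /* miss on read-only code */
        printf("state = %d\n", fill_state(r, false));  /* SHARED (1) */
        return 0;
    }

Because read-only code is never written, installing it shared costs nothing, while blocks fetched with Read to Share still benefit from the first-write optimization.
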
Abstract:
A computer system is disclosed including a memory subsystem and a processor subsystem having an external cache and an external mechanism for invalidating cached datablocks in the processor subsystem and for reducing false invalidation operations. The processor subsystem issues a write invalidate message to the memory subsystem that specifies a datablock and includes an invalidate advisory indication of whether the datablock is present in the external cache. The invalidate advisory indication determines whether the memory subsystem returns an invalidate message to the processor subsystem for the write invalidate operation.
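A minimal C sketch of the advisory filtering described above; the message layout and all names are assumptions made for the example:

    #include <stdbool.h>
    #include <stdio.h>

    /* Write invalidate message: the advisory bit tells the memory
     * subsystem whether the block is present in the issuer's external
     * cache (field names are illustrative). */
    typedef struct {
        unsigned long block_addr;
        bool cached_in_external;   /* invalidate advisory indication */
    } write_invalidate_t;

    /* Memory subsystem side: return an invalidate message only when the
     * advisory bit says the block may actually be cached, suppressing
     * false invalidation operations. */
    static bool returns_invalidate(const write_invalidate_t *msg) {
        return msg->cached_in_external;
    }

    int main(void) {
        write_invalidate_t m = { 0x1000, false };
        printf("invalidate returned: %d\n", returns_invalidate(&m)); /* 0 */
        return 0;
    }
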
Abstract:
A multiprocessor computer system is provided having a multiplicity of sub-systems and a main memory coupled to a system controller. An interconnect module interconnects the main memory and sub-systems in accordance with interconnect control signals received from the system controller. At least two of the sub-systems are data processors, each having a respective cache memory that stores multiple blocks of data and a respective master cache index. Each master cache index has a set of master cache tags (Etags), including one cache tag for each data block stored by the cache memory. Each data processor includes a master interface for sending memory transaction requests to the system controller and for receiving cache access requests from the system controller corresponding to memory transaction requests by other ones of the data processors. In the preferred embodiment, each memory transaction request is classified into one of two distinct master classes: a first transaction class including read memory access requests and a second transaction class including writeback memory access requests. The master interface and system controller have corresponding parallel request queues, one for each master class, for transmitting and receiving memory access requests. The system controller further includes memory transaction request logic for processing each memory transaction request and a duplicate cache index having a set of duplicate cache tags (Dtags), including one cache tag corresponding to each master cache tag in an associated data processor.
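The two master classes can be sketched as a pair of parallel queues; the transaction names and the queue depth below are invented for the example:

    #include <stdio.h>

    typedef enum { READ_BLOCK, READ_TO_OWN, WRITEBACK } txn_t;

    /* Two distinct master classes, each with its own request queue, so
     * a writeback is never stuck behind a stalled read. */
    typedef enum { CLASS_READ, CLASS_WRITEBACK, NUM_CLASSES } class_t;

    static class_t classify(txn_t t) {
        return (t == WRITEBACK) ? CLASS_WRITEBACK : CLASS_READ;
    }

    #define QDEPTH 4
    typedef struct { txn_t req[QDEPTH]; int head, tail, count; } queue_t;

    static int enqueue(queue_t *q, txn_t t) {
        if (q->count == QDEPTH) return -1;     /* full: caller retries */
        q->req[q->tail] = t;
        q->tail = (q->tail + 1) % QDEPTH;
        q->count++;
        return 0;
    }

    int main(void) {
        queue_t queues[NUM_CLASSES] = { 0 };   /* one queue per class */
        enqueue(&queues[classify(READ_BLOCK)], READ_BLOCK);
        enqueue(&queues[classify(WRITEBACK)], WRITEBACK);
        printf("read q: %d, writeback q: %d\n",
               queues[CLASS_READ].count, queues[CLASS_WRITEBACK].count);
        return 0;
    }
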
Abstract:
A multiprocessor computer system has a multiplicity of sub-systems and a main memory coupled to a system controller. An interconnect module interconnects the main memory and sub-systems in accordance with interconnect control signals received from the system controller. All of the sub-systems include a port that transmits and receives data as data packets of a fixed size. At least two of the sub-systems are data processors, each having a respective cache memory and a respective set of master cache tags (Etags), including one cache tag for each data block stored by the cache memory. The system controller maintains a set of duplicate cache tags (Dtags) for each of the data processors. The data processors each include master cache logic for updating the master cache tags, while the system controller includes logic for updating the duplicate cache tags. Memory transaction request logic simultaneously looks up the duplicate cache tag corresponding to each memory transaction request in each of the sets of duplicate cache tags. It then determines which one of the cache memories and main memory to couple to the requesting data processor based on the cache states and the address tags stored in the corresponding duplicate cache tags. Duplicate cache update logic simultaneously updates all of the corresponding duplicate cache tags in accordance with predefined cache tag update criteria.
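A toy C model of the Dtag lookup; the processor count, set count, block size, and state names are all assumptions for the example, and the "simultaneous" lookup across processors is modelled as a loop:

    #include <stdio.h>

    #define NPROC 4
    #define NSETS 256

    typedef enum { D_INVALID, D_SHARED, D_OWNED } dstate_t;

    /* One duplicate tag (Dtag) per cache line per processor, mirrored
     * in the system controller (layout is illustrative). */
    typedef struct { unsigned long addr_tag; dstate_t state; } dtag_t;

    static dtag_t dtags[NPROC][NSETS];

    /* Consult every other processor's Dtag set for 'addr' and pick the
     * data source: an owning cache if one exists, else main memory (-1). */
    static int pick_source(unsigned long addr, int requester) {
        unsigned set = (unsigned)((addr >> 6) % NSETS); /* 64-byte blocks */
        unsigned long tag = addr >> 6;
        for (int p = 0; p < NPROC; p++) {
            if (p == requester) continue;
            dtag_t *d = &dtags[p][set];
            if (d->state == D_OWNED && d->addr_tag == tag)
                return p;                  /* copyback from this cache */
        }
        return -1;                         /* supply from main memory  */
    }

    int main(void) {
        dtags[2][(0x4000 >> 6) % NSETS] = (dtag_t){ 0x4000 >> 6, D_OWNED };
        printf("source = %d\n", pick_source(0x4000, 0));  /* prints 2 */
        return 0;
    }
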
Abstract:
This invention describes a link-by-link flow control method for packet-switched uniprocessor and multiprocessor computer systems that maximizes system resource utilization and throughput, and minimizes system latency. The computer system comprises one or more master interfaces, one or more slave interfaces, and an interconnect system controller which provides dedicated transaction request queues for each master interface and controls the forwarding of transactions to each slave interface. The master interface keeps track of the number of requests in its dedicated queue in the system controller, and the system controller keeps track of the number of requests in each slave interface queue. Both the master interface and the system controller know the maximum capacity of the queue immediately downstream, and neither issues more transaction requests than the downstream queue can accommodate. An acknowledgment from the downstream queue indicates to the sender that the queue has space for another transaction. Thus no system resources are wasted trying to send a request to a queue that is already full.
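This is credit-based counting, and the rule at one hop fits in a short C sketch; the queue depth and names are invented for the example:

    #include <stdbool.h>
    #include <stdio.h>

    /* Per-link state: the sender knows the fixed capacity of the queue
     * immediately downstream and counts requests in flight. */
    typedef struct {
        int capacity;     /* depth of the downstream queue (fixed, known) */
        int outstanding;  /* requests sent but not yet acknowledged       */
    } link_t;

    /* Issue a request only when the downstream queue has room, so no
     * bandwidth is wasted on a request that would be refused. */
    static bool try_send(link_t *l) {
        if (l->outstanding >= l->capacity)
            return false;          /* hold the request at this hop */
        l->outstanding++;
        return true;
    }

    /* An acknowledgment from downstream means one entry freed up. */
    static void on_ack(link_t *l) {
        if (l->outstanding > 0)
            l->outstanding--;
    }

    int main(void) {
        link_t master_to_sc = { 2, 0 };           /* queue of depth 2   */
        printf("%d\n", try_send(&master_to_sc));  /* 1: slot available  */
        printf("%d\n", try_send(&master_to_sc));  /* 1: queue now full  */
        printf("%d\n", try_send(&master_to_sc));  /* 0: held back       */
        on_ack(&master_to_sc);                    /* credit returned    */
        printf("%d\n", try_send(&master_to_sc));  /* 1: space freed     */
        return 0;
    }
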
Abstract:
A method and apparatus for actively managing the overall power consumption of a computer network which includes a plurality of interconnected computer systems. In turn, each computer system has one or more modules. Each computer system of the computer network is capable of independently initiating a transition into a power-conserving mode, i.e., a "sleep" state, while keeping its network interface "alive" and fully operational. Subsequently, each computer system can independently transition back into the fully operational state, i.e., an "awake" state, when triggered by either a deterministic or an asynchronous event. As a result, the sleep states of the computer systems are transparent to the computer network. Deterministic events are triggered internally by a computer system, e.g., an internal timer waking the computer system up at midnight to perform housekeeping chores such as daily tape backups. Conversely, asynchronous events originate externally and include input/output (I/O) activity. The illusion that the entire network is always fully operational is possible because the system controllers, the interconnects, and the network interfaces of each computer system remain fully operational while selected modules and peripheral devices are powered down. As a result, each computer system is able to awaken rapidly from the sleep state in response to stimuli, and powering down selected modules accomplishes power conservation without requiring a static shutdown of the computer network, i.e., without sacrificing the overall performance and response of the computer network.
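A minimal sketch of the per-system power state machine implied above, with event names and fields invented for the example; the point is that the network interface stays powered in both states:

    #include <stdbool.h>
    #include <stdio.h>

    typedef enum { AWAKE, SLEEP } power_state_t;
    typedef enum { EV_TIMER, EV_IO, EV_IDLE } event_t;

    /* The network interface and system controller stay powered in both
     * states, so the sleep state is invisible to the network. */
    typedef struct {
        power_state_t state;
        bool nic_powered;       /* always true: NIC stays "alive"     */
        bool modules_powered;   /* selected modules gated while asleep */
    } node_t;

    static void on_event(node_t *n, event_t ev) {
        switch (n->state) {
        case AWAKE:
            if (ev == EV_IDLE) {             /* self-initiated transition */
                n->state = SLEEP;
                n->modules_powered = false;  /* power down selected modules */
            }
            break;
        case SLEEP:
            /* Deterministic (timer) or asynchronous (I/O) wake stimulus. */
            if (ev == EV_TIMER || ev == EV_IO) {
                n->modules_powered = true;
                n->state = AWAKE;
            }
            break;
        }
    }

    int main(void) {
        node_t n = { AWAKE, true, true };
        on_event(&n, EV_IDLE);   /* go to sleep, NIC left powered */
        on_event(&n, EV_IO);     /* packet arrives: wake rapidly  */
        printf("state=%d nic=%d\n", n.state, n.nic_powered); /* 0 1 */
        return 0;
    }
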
Abstract:
A camera system and a method for zooming the camera system are disclosed. The method generally includes the steps of (A) generating an electronic image by sensing an optical image received by the camera, the sensing including electronic cropping to a window size to establish an initial resolution for the electronic image, (B) generating a final image by decimating the electronic image by a decimation factor to a final resolution smaller than the initial resolution and (C) changing a zoom factor for the final image by adjusting both the decimation factor and the window size.
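The arithmetic linking the two knobs can be shown with a short C sketch using assumed numbers (a hypothetical 1920-pixel-wide sensor and a fixed 480-pixel-wide final image): final width = window width / decimation factor, so changing the zoom moves both quantities together.

    #include <stdio.h>

    #define SENSOR_W 1920   /* assumed sensor width in pixels        */
    #define FINAL_W   480   /* assumed fixed final image width       */

    /* Given a requested zoom factor, pick the crop window and the
     * matching decimation factor so window / decimation stays FINAL_W. */
    static void plan_zoom(double zoom, int *window_w, double *decimation) {
        *window_w = (int)(SENSOR_W / zoom);   /* smaller window = more zoom */
        *decimation = (double)*window_w / FINAL_W;
    }

    int main(void) {
        int w; double d;
        plan_zoom(1.0, &w, &d);   /* full field: window 1920, decimate 4.0 */
        printf("zoom 1x: window=%d decimation=%.2f\n", w, d);
        plan_zoom(2.0, &w, &d);   /* 2x zoom:    window  960, decimate 2.0 */
        printf("zoom 2x: window=%d decimation=%.2f\n", w, d);
        plan_zoom(4.0, &w, &d);   /* 4x zoom:    window  480, decimate 1.0 */
        printf("zoom 4x: window=%d decimation=%.2f\n", w, d);
        return 0;
    }

Under these assumed numbers, at maximum zoom the window shrinks to the final resolution and the decimation factor reaches 1, so the zoom range is bounded by the ratio of sensor width to final width.
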
Abstract:
In one embodiment, a processor is provided. The processor includes at least two cores, where each of the cores includes a first level cache memory. Each of the cores is multi-threaded. In another embodiment, each of the cores includes four threads. In another embodiment a crossbar is included. A plurality of cache bank memories in communication with the cores through the crossbar is provided. Each of the plurality of cache bank memories is in communication with a main memory interface. In another embodiment a buffer switch core in communication with each of the plurality of cache bank memories is also included. A server and a method for optimizing the utilization of a multithreaded processor core are also provided.
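One common way such shared cache banks are addressed, sketched here purely as an assumption since the abstract does not specify the mapping, is to interleave cache blocks across banks by low-order address bits so that the many concurrent threads spread their traffic over the crossbar:

    #include <stdio.h>

    /* Assumed organization: 8 cores x 4 hardware threads sharing 4
     * cache banks through a crossbar; 64-byte blocks interleaved on
     * low-order block-address bits (all figures illustrative). */
    #define NCORES   8
    #define NTHREADS 4
    #define NBANKS   4
    #define BLOCK_SHIFT 6

    static unsigned bank_of(unsigned long addr) {
        return (unsigned)((addr >> BLOCK_SHIFT) % NBANKS);
    }

    int main(void) {
        /* Consecutive blocks land in different banks, so the
         * NCORES * NTHREADS threads rarely contend on one bank. */
        for (unsigned long a = 0; a < 4 * 64; a += 64)
            printf("addr 0x%03lx -> bank %u\n", a, bank_of(a));
        return 0;
    }
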
Abstract:
A non-fault-only (NFO) bit is included in the translation table entry for each page. If the NFO bit is set, non-faulting loads accessing the page are translated normally. Any other access to the non-fault-only page is an error and will cause the processor to fault. A non-faulting load behaves like a normal load except that it never produces a fault, even when applied to a page with the NFO bit set. The NFO bit in a translation table entry thus marks a page that is mapped for safe access by non-faulting loads but that still causes a fault on other, normal accesses; in effect, it indicates which pages are illegal for ordinary references. Selected pages, such as the virtual page 0x0, can be mapped in the translation table. Whenever a null pointer is dereferenced by a non-faulting load, a translation lookaside buffer (TLB) hit will occur, and zero will be returned immediately without trapping to software to find the requested page. A second embodiment provides that when the operating system software routine invoked by a TLB miss discovers that a non-faulting load has attempted to access an illegal virtual page that was not previously translated in the translation table, the operating system creates a translation table entry for that virtual page, mapping it to a physical page of all zeros and asserting the NFO bit for that virtual page.
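A toy C model of both behaviors, the access check and the second embodiment's miss handling; the table layout and names are invented for the example:

    #include <stdbool.h>
    #include <stdio.h>

    typedef struct { bool valid; bool nfo; unsigned long phys; } tte_t;

    #define NPAGES 16
    static tte_t tlb[NPAGES];   /* toy translation table, one entry/page */
    #define ZERO_PAGE 0         /* physical page of all zeros            */

    /* Access check for a mapped page: an NFO page is legal only for
     * non-faulting loads; every other access traps. */
    static bool access_ok(const tte_t *t, bool nonfaulting_load) {
        return !t->nfo || nonfaulting_load;
    }

    /* Second embodiment: when a non-faulting load misses on an unmapped
     * (illegal) virtual page, map it to the zero page with NFO asserted,
     * so later non-faulting loads hit the TLB and return zero. */
    static long load(unsigned vpage, bool nonfaulting) {
        tte_t *t = &tlb[vpage];
        if (!t->valid) {
            if (!nonfaulting) { printf("fault\n"); return -1; }
            *t = (tte_t){ true, true, ZERO_PAGE };
        }
        if (!access_ok(t, nonfaulting)) { printf("fault\n"); return -1; }
        return t->phys == ZERO_PAGE ? 0 : 1;  /* 1 stands for real data */
    }

    int main(void) {
        printf("%ld\n", load(0, true));  /* null deref, non-faulting: 0 */
        printf("%ld\n", load(0, false)); /* normal access to NFO page: fault */
        return 0;
    }
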
Abstract:
A dual-ported tag array of a cache allows the tag array to be accessed simultaneously by miss data of older LOAD instructions being returned during the same cycle that a new LOAD instruction is accessing the tag array to check for a cache hit. Because a load buffer queues LOAD instructions, the cache tags for older LOAD instructions that missed the cache return later, while new LOAD instructions are accessing the tag array to check for cache hits. A method and apparatus for calculating and maintaining a hit bit in a load buffer determine whether or not a newly dispatched LOAD will hit the cache after it has been queued in the load buffer and has waited for all older LOADs to be processed. A load buffer data entry includes the hit bit and all information necessary to process the LOAD instruction and to calculate the hit bits for future LOAD instructions that must be buffered. A method and apparatus for servicing LOAD instructions, in which access to the data array portion of the cache is decoupled from access to the tag array portion, allow delayed access to the data array after a LOAD has been held in the load buffer, without reaccessing the tag array.
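A C sketch of the hit-bit rule: at dispatch, the tag-array probe is combined with the older buffered LOADs, since an older miss to the same line will have filled the cache by the time the younger LOAD drains. The entry layout and buffer depth are assumptions for the example:

    #include <stdbool.h>
    #include <stdio.h>

    /* One load buffer entry: the hit bit plus what is needed to process
     * the LOAD later and to compute hit bits for younger LOADs. */
    typedef struct {
        unsigned long line_addr;
        bool hit;        /* will this LOAD hit when it drains? */
        bool valid;
    } lb_entry_t;

    #define LB_DEPTH 8
    static lb_entry_t lb[LB_DEPTH];
    static int lb_count;

    /* An older buffered LOAD to the same line will bring the line into
     * the cache before this LOAD is serviced, so it counts as a hit. */
    static bool compute_hit_bit(unsigned long line, bool tag_array_hit) {
        for (int i = 0; i < lb_count; i++)
            if (lb[i].valid && lb[i].line_addr == line)
                return true;
        return tag_array_hit;
    }

    static void dispatch_load(unsigned long line, bool tag_array_hit) {
        lb[lb_count] = (lb_entry_t){ line,
                                     compute_hit_bit(line, tag_array_hit),
                                     true };
        lb_count++;
    }

    int main(void) {
        dispatch_load(0x40, false);  /* older LOAD misses line 0x40      */
        dispatch_load(0x40, false);  /* younger LOAD: hit bit set anyway */
        printf("older hit=%d, younger hit=%d\n", lb[0].hit, lb[1].hit);
        return 0;                    /* prints: older hit=0, younger hit=1 */
    }

With the hit bit recorded at dispatch, a buffered LOAD that drains later can go straight to the data array without reaccessing the tag array, which is the decoupling the abstract describes.
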