摘要:
According to embodiments of the invention, a distributed time base signal may be coupled to a memory directory which provides address translation for data located within a memory cache. The memory directory may have attribute bits which indicate whether or not the memory entries have been accessed by the distributed time base signal. Furthermore, the memory directory may have attribute bits which indicate whether or not a memory directory entry should be considered invalid after an access to the memory entry by the distributed time base signal. If the memory directory entry has been accessed by the distributed time base signal and the memory directory entry should be considered invalid after the access by the time base signal, any attempted address translation using the memory directory entry may cause a cache miss. The cache miss may initiate the retrieval of valid data from memory.
摘要:
Disclosed is an apparatus and method for granting guaranteed bandwidth between one or more data transmission priority requesting sources and one or more resources upon request. Data sources that do not request an assigned bandwidth are served on a “best efforts” basis. The system allows additional bandwidth to priority requesting sources when it is determined that the resource and/or the communication path to the resource is under-utilized. The system further allows the granted bandwidth to be shared by more than one source in a multiprocessor system.
摘要:
A component of a microprocessor-based data processing system, which includes features for regulating power consumption in snoopable components and has gating off memory coherency properties, is determined to be in a relatively inactive state and is transitioned to a non-snoopable low power mode. Then, when a snoop request occurs, a retry protocol is sent in response to the snoop request. In conjunction with the retry protocol, a signal is sent to bring the component into snoopable mode. When the retry snoop is requested, the component is in full power mode and can properly respond to the snoop request. After the snoop request has been satisfied, the component again enters into a low power mode. Therefore, the component is able to enter into a low power mode in between snoops
摘要:
According to embodiments of the invention, rays may be stochastically culled before they are issued into the three-dimensional scene. Stochastically culling rays may reduce the number of rays which need to be traced by the image processing system. Furthermore, by stochastically culling rays before they are issued into the three-dimensional scene, minor imperfections may be added to the final rendered image, thereby improving the realism of the rendered image. Therefore, stochastic culling of rays may improve the performance of the image processing system by reducing workload imposed on the image processing system and improving the realism of the images rendered by the image processing system. According to another embodiment of the invention, the realism of images rendered by the image processing system may also be improved by stochastically adding secondary rays after ray-primitive intersections have occurred.
摘要:
By mapping leaf nodes of a spatial index to processing elements, efficient distribution of workload in an image processing system may be achieved. In addition, processing elements may use a thread table to redistribute workload from processing elements which are experiencing an increased workload to processing elements which may be idle. Furthermore, the workload experienced by processing elements may be monitored in order to determine if workload is balanced. Periodically the leaf nodes for which processing elements are responsible may be remapped in response to a detected imbalance in workload. By monitoring the workload experienced by the processing elements and remapping leaf nodes to different processing elements in response to unbalanced workload, efficient distribution of workload may be maintained. Efficient distribution of workload may improve the performance of the image processing system.
摘要:
The present invention provides for an integrated circuit (IC) bus system. A local IC is coupled to a remote IC through a bus interface. A local memory is coupled to the local IC. A bus interface controller is employable to track data transfer requests from the remote IC for data address that are contained within at least one segment of the first partitioned memory range. The bus interface controller is further employable to stop the forwarding of a data transfer request generated within the local IC to the remote IC, if the memory segment count corresponding to the data address of the locally generated data transfer request equals zero.
摘要:
An improved method and apparatus for resource arbitration. Four priority classes, managed high (MH), managed low (ML), opportunistic high (OH) and opportunistic low (OL), are defined. A priority class is assigned to each resource access request. An access request concentrator (ARC) is created for each resource, through which the resource is accessed. An access request is chosen at each ARC using the priority order MH, ML, OH, and OL, in decreasing order of priority. If OH priority class resource access requests are locked out, the priority order is temporarily changed to OH, OL, MH, and ML, in decreasing order of priority. If OL priority class resource access requests are locked out, the priority order is temporarily changed to MH, OL, OH, and ML, in decreasing order of priority.
摘要:
By mapping leaf nodes of a spatial index to processing elements, efficient distribution of workload in an image processing system may be achieved. In addition, processing elements may use a thread table to redistribute workload from processing elements which are experiencing an increased workload to processing elements which may be idle. Furthermore, the workload experienced by processing elements may be monitored in order to determine if workload is balanced. Periodically the leaf nodes for which processing elements are responsible may be remapped in response to a detected imbalance in workload. By monitoring the workload experienced by the processing elements and remapping leaf nodes to different processing elements in response to unbalanced workload, efficient distribution of workload may be maintained. Efficient distribution of workload may improve the performance of the image processing system.
摘要:
According to embodiments of the invention, secondary rays may be pooled after they are generated by a vector throughput engine. After pooling the secondary rays, they may be reordered according to similarities in trajectory and originating location. The secondary rays may be sent in the new order to a workload manager for spatial index traversal. The reordering of the secondary rays may cause rays which traverse similar portions of the spatial index to be traversed immediately following (or shortly thereafter) one another. Consequently, the necessary portions of the spatial index may remain within the workload manager's memory cache, thereby reducing the number of cache misses and the amount of time necessary to traverse secondary rays through the spatial index. The reduction in time necessary to traverse the secondary rays through the spatial index may improve the overall performance of the image processing system.
摘要:
In a first aspect, a first method of reducing command processing latency while maintaining memory coherence is provided. The first method includes the steps of (1) providing a memory map including memory addresses available to a system; and (2) arranging the memory addresses into a plurality of groups. At least one of the groups does not require the system, in response to a command that requires access to a memory address in the group from a bus unit, to get permission from all remaining bus units included in the system to maintain memory coherence. Numerous other aspects are provided.