摘要:
According to embodiments of the invention, a step value and a step-interval cache coherency protocol may be used to update and invalidate data stored within cache memory. A step value may be an integer value and may be stored within a cache directory entry associated with data in the memory cache. Upon reception of a cache read request, along with the normal address comparison to determine if the data is located within the cache a current step value may be compared with the stored step value to determine if the data is current. If the step values match, the data may be current and a cache hit may occur. However, if the step values do not match, the requested data may be provided from another source. Furthermore, an application may update the current step value to invalidate old data stored within the cache and associated with a different step value.
摘要:
According to embodiments of the invention, a step value and a step-interval cache coherency protocol may be used to update and invalidate data stored within cache memory. A step value may be an integer value and may be stored within a cache directory entry associated with data in the memory cache. Upon reception of a cache read request, along with the normal address comparison to determine if the data is located within the cache a current step value may be compared with the stored step value to determine if the data is current. If the step values match, the data may be current and a cache hit may occur. However, if the step values do not match, the requested data may be provided from another source. Furthermore, an application may update the current step value to invalidate old data stored within the cache and associated with a different step value.
摘要:
According to one embodiment of the invention, by increasing the number of rays issued through adjacent pixels with colors of high contrast while maintaining the number of rays issued through adjacent pixels which do not have colors of high contrast, a ray tracing image processing system may render an anti-aliased image while minimizing the increase in workload experienced by the image processing system. Additionally, according to another embodiment of the invention, by maintaining the number of rays issued through adjacent pixels which have colors of low contrast while increasing the number of rays issued through adjacent pixels which do not have colors of low contrast, the image processing system may reduce workload experienced while performing ray tracing while maintaining the quality of the rendered image.
摘要:
According to embodiments of the invention, a normally recursive ray tracing algorithm may be partitioned to form an iterative ray tracing algorithm. The resulting portions of the iterative ray tracing algorithm may be executed by a plurality of processing elements. Furthermore, according to embodiments of the invention, a network of inboxes may be used to transfer information which defines original rays and secondary rays (information unlikely to be reused for subsequently issued rays and subsequently rendered frames) between processing elements, and a shared memory cache may store information relating to a three dimensional scene (information likely to be reused for subsequently issued rays and subsequently rendered frames). Using a plurality of processing elements to perform ray tracing and storing information in the shared memory cache which is likely to be reused for subsequent rays and subsequent frames, the performance of a ray tracing image processing system may be improved.
摘要:
According to embodiments of the invention, a distributed time base signal may be coupled to a memory directory which provides address translation for data located within a memory cache. The memory directory may have attribute bits which indicate whether or not the memory entries have been accessed by the distributed time base signal. Furthermore, the memory directory may have attribute bits which indicate whether or not a memory directory entry should be considered invalid after an access to the memory entry by the distributed time base signal. If the memory directory entry has been accessed by the distributed time base signal and the memory directory entry should be considered invalid after the access by the time base signal, any attempted address translation using the memory directory entry may cause a cache miss. The cache miss may initiate the retrieval of valid data from memory.
摘要:
According to embodiments of the invention, rays may be stochastically culled before they are issued into the three-dimensional scene. Stochastically culling rays may reduce the number of rays which need to be traced by the image processing system. Furthermore, by stochastically culling rays before they are issued into the three-dimensional scene, minor imperfections may be added to the final rendered image, thereby improving the realism of the rendered image. Therefore, stochastic culling of rays may improve the performance of the image processing system by reducing workload imposed on the image processing system and improving the realism of the images rendered by the image processing system. According to another embodiment of the invention, the realism of images rendered by the image processing system may also be improved by stochastically adding secondary rays after ray-primitive intersections have occurred.
摘要:
By mapping leaf nodes of a spatial index to processing elements, efficient distribution of workload in an image processing system may be achieved. In addition, processing elements may use a thread table to redistribute workload from processing elements which are experiencing an increased workload to processing elements which may be idle. Furthermore, the workload experienced by processing elements may be monitored in order to determine if workload is balanced. Periodically the leaf nodes for which processing elements are responsible may be remapped in response to a detected imbalance in workload. By monitoring the workload experienced by the processing elements and remapping leaf nodes to different processing elements in response to unbalanced workload, efficient distribution of workload may be maintained. Efficient distribution of workload may improve the performance of the image processing system.
摘要:
According to embodiments of the invention, a data structure may be created which may be used by both a ray tracing unit and by a rendering engine. The data structure may have an initial or upper portion representing bounding volumes which partition a three-dimensional scene and a second or lower portion representing objects within the three-dimensional scene. The integrated acceleration data structure may be used by a rendering engine to render a two-dimensional image from a three-dimensional scene, and by a ray tracing unit to perform intersection tests.
摘要:
According to embodiments of the invention, rays may be stochastically culled before they are issued into the three-dimensional scene. Stochastically culling rays may reduce the number of rays which need to be traced by the image processing system. Furthermore, by stochastically culling rays before they are issued into the three-dimensional scene, minor imperfections may be added to the final rendered image, thereby improving the realism of the rendered image. Therefore, stochastic culling of rays may improve the performance of the image processing system by reducing workload imposed on the image processing system and improving the realism of the images rendered by the image processing system. According to another embodiment of the invention, the realism of images rendered by the image processing system may also be improved by stochastically adding secondary rays after ray-primitive intersections have occurred.
摘要:
By mapping leaf nodes of a spatial index to processing elements, efficient distribution of workload in an image processing system may be achieved. In addition, processing elements may use a thread table to redistribute workload from processing elements which are experiencing an increased workload to processing elements which may be idle. Furthermore, the workload experienced by processing elements may be monitored in order to determine if workload is balanced. Periodically the leaf nodes for which processing elements are responsible may be remapped in response to a detected imbalance in workload. By monitoring the workload experienced by the processing elements and remapping leaf nodes to different processing elements in response to unbalanced workload, efficient distribution of workload may be maintained. Efficient distribution of workload may improve the performance of the image processing system.