-
Publication No.: US12061555B1
Publication Date: 2024-08-13
Application No.: US18199784
Filing Date: 2023-05-19
IPC Classes: G06F12/0888, G06F12/0837, G06F12/0891, G06F12/1045
CPC Classes: G06F12/0888, G06F12/0837, G06F12/0891, G06F12/1063
Abstract: A load/store circuit performs a first lookup of a load virtual address in a virtually-indexed, virtually-tagged first-level data cache (VIVTFLDC) that misses and generates a fill request that causes translation of the load virtual address into a load physical address, receives a response that indicates the load physical address is in a non-cacheable memory region and that includes no data from the load physical address, allocates a data-less VIVTFLDC entry that includes an indication that the entry is associated with a non-cacheable memory region, performs a second lookup of the load virtual address in the VIVTFLDC and determines that the load virtual address hits on the data-less entry, determines from the hit data-less entry that it is associated with a non-cacheable memory region, and generates a read request to read data from a processor bus at the load physical address rather than providing data from the hit data-less entry.
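The flow in this abstract can be pictured with a small sketch. This is not taken from the patent; the names (FldcEntry, Fldc, handleLoad, busRead) and the single-map "cache" are hypothetical and only illustrate the idea of caching a data-less translation for a non-cacheable region so that a later hit goes straight to the processor bus.

```cpp
#include <cstdint>
#include <iostream>
#include <optional>
#include <unordered_map>

// Hypothetical VIVT L1 data cache entry that may be "data-less": it records a
// translation and the non-cacheable property, but holds no data.
struct FldcEntry {
    uint64_t physAddr;      // translated load physical address
    bool     nonCacheable;  // region is non-cacheable
    bool     hasData;       // false for a data-less entry
    uint64_t data;          // valid only when hasData is true
};

struct Fldc {
    std::unordered_map<uint64_t, FldcEntry> entries;  // keyed by virtual address (simplified)

    // First lookup misses; the fill response says "non-cacheable, no data",
    // so a data-less entry is allocated with only the translation.
    void fillNonCacheable(uint64_t va, uint64_t pa) {
        entries[va] = FldcEntry{pa, /*nonCacheable=*/true, /*hasData=*/false, 0};
    }

    std::optional<FldcEntry> lookup(uint64_t va) const {
        auto it = entries.find(va);
        if (it == entries.end()) return std::nullopt;
        return it->second;
    }
};

// Stand-in for a read request on the processor bus.
uint64_t busRead(uint64_t pa) { return pa ^ 0xFFu; }

uint64_t handleLoad(Fldc& cache, uint64_t va) {
    auto hit = cache.lookup(va);
    if (hit && hit->nonCacheable) {
        // Hit on a data-less entry: reuse the cached translation but read the
        // data from the bus instead of from the cache.
        return busRead(hit->physAddr);
    }
    if (hit && hit->hasData) return hit->data;
    return 0;  // miss path (translation + fill) omitted in this sketch
}

int main() {
    Fldc cache;
    cache.fillNonCacheable(0x1000, 0x80000000);                 // fill: non-cacheable, no data
    std::cout << std::hex << handleLoad(cache, 0x1000) << "\n";  // second lookup hits the data-less entry
}
```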
-
Publication No.: US11789869B2
Publication Date: 2023-10-17
Application No.: US17580360
Filing Date: 2022-01-20
Applicant: NVIDIA Corporation
Inventors: Anurag Chaudhary, Christopher Richard Feilbach, Jasjit Singh, Manuel Gautho, Aprajith Thirumalai, Shailender Chaudhry
IPC Classes: G06F12/00, G06F12/0837
CPC Classes: G06F12/0837, G06F2212/1032
Abstract: The technology disclosed herein involves tracking contention and using the tracked contention to reduce the latency of exclusive memory operations. The technology enables a processor to track which locations in main memory are contentious and to modify the order in which exclusive memory operations are processed based on that contention. A thread can include multiple exclusive operations for the same memory location (e.g., an exclusive load and a complementary exclusive store). The multiple exclusive memory operations can be added to a queue with one or more intervening operations between them. The processor may process the operations in the queue based on the order in which they were added and may use the tracked contention to process some of the exclusive operations out of order. For example, the processor can execute the exclusive load operation and, because the corresponding location is contentious, process the complementary exclusive store operation before the intervening operations.
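A minimal sketch of the reordering idea follows. It is not the patent's implementation; the MemOp structure, the schedule function, and the contentious-address set are assumptions that only show hoisting a complementary exclusive store ahead of intervening operations when its location is contentious.

```cpp
#include <cstdint>
#include <deque>
#include <iostream>
#include <unordered_set>

// Hypothetical queue of memory operations; "exclusive" marks load-exclusive /
// store-exclusive pairs, addr identifies the memory location they target.
struct MemOp {
    enum Kind { Load, Store } kind;
    bool exclusive;
    uint64_t addr;
};

// If the location of an exclusive load is known to be contentious, pull the
// complementary exclusive store forward, ahead of intervening operations, to
// shorten the window in which another thread can claim the location.
std::deque<MemOp> schedule(std::deque<MemOp> queue,
                           const std::unordered_set<uint64_t>& contentious) {
    std::deque<MemOp> out;
    while (!queue.empty()) {
        MemOp op = queue.front();
        queue.pop_front();
        out.push_back(op);
        if (op.kind == MemOp::Load && op.exclusive && contentious.count(op.addr)) {
            // Find the matching exclusive store and process it next.
            for (auto it = queue.begin(); it != queue.end(); ++it) {
                if (it->kind == MemOp::Store && it->exclusive && it->addr == op.addr) {
                    out.push_back(*it);
                    queue.erase(it);
                    break;
                }
            }
        }
    }
    return out;
}

int main() {
    std::deque<MemOp> q = {
        {MemOp::Load,  true,  0x40},   // exclusive load of a contentious location
        {MemOp::Load,  false, 0x80},   // intervening operation
        {MemOp::Store, false, 0x90},   // intervening operation
        {MemOp::Store, true,  0x40},   // complementary exclusive store
    };
    for (const MemOp& op : schedule(q, {0x40}))
        std::cout << (op.kind == MemOp::Load ? "load " : "store ")
                  << std::hex << op.addr << (op.exclusive ? " (excl)\n" : "\n");
}
```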
-
Publication No.: US11768602B2
Publication Date: 2023-09-26
Application No.: US17395781
Filing Date: 2021-08-06
Applicant: Ultrata, LLC
Inventors: Steven J. Frank, Larry Reback
IPC Classes: G06F3/06, H04L67/1097, G06F12/0837, G06F12/0817, G06F12/06
CPC Classes: G06F3/061, G06F3/0604, G06F3/065, G06F3/067, G06F3/0613, G06F3/0631, G06F3/0632, G06F3/0644, G06F3/0647, G06F3/0659, G06F3/0673, G06F3/0683, G06F3/0685, G06F12/0646, G06F12/0824, G06F12/0837, H04L67/1097, G06F2212/2542
Abstract: Embodiments of the invention provide systems and methods for managing processing, memory, storage, network, and cloud computing to significantly improve the efficiency and performance of processing nodes. More specifically, embodiments of the present invention are directed to an instruction set of an object memory fabric. This object memory fabric instruction set can be used to provide a unique instruction model based on triggers defined in the metadata of the memory objects. This model represents a dynamic dataflow method of execution in which processes are performed based on the actual dependencies of the memory objects. This provides a high degree of memory and execution parallelism, which in turn provides tolerance of variations in access delays between memory objects. In this model, sequences of instructions are executed and managed based on data access. These sequences can be of arbitrary length, but short sequences are more efficient and provide greater parallelism.
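The trigger-based dataflow model can be sketched in a few lines. This is not the object memory fabric instruction set itself; MemoryObject, ObjectStore, and the lambda "triggers" are hypothetical stand-ins showing how short sequences attached to an object's metadata run when the object's data actually arrives.

```cpp
#include <functional>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sketch of trigger-driven execution: each memory object carries
// metadata listing short instruction sequences to run when the object is
// written, so execution order follows actual data dependencies.
struct MemoryObject {
    int value = 0;
    std::vector<std::function<void(int)>> triggers;  // metadata: actions fired on write
};

struct ObjectStore {
    std::unordered_map<std::string, MemoryObject> objects;

    void write(const std::string& name, int value) {
        MemoryObject& obj = objects[name];
        obj.value = value;
        for (auto& t : obj.triggers) t(value);  // dataflow: dependents run when data arrives
    }
};

int main() {
    ObjectStore store;
    // A short sequence that depends on object "a": it runs only once "a" is written.
    store.objects["a"].triggers.push_back([&](int v) {
        store.write("b", v * 2);  // producing "b" may in turn fire its dependents
    });
    store.objects["b"].triggers.push_back([](int v) {
        std::cout << "b became " << v << "\n";
    });
    store.write("a", 21);  // prints "b became 42"
}
```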
-
Publication No.: US11755201B2
Publication Date: 2023-09-12
Application No.: US15001494
Filing Date: 2016-01-20
Applicant: ULTRATA, LLC
Inventors: Steven J. Frank, Larry Reback
IPC Classes: G06F12/00, G06F13/00, G06F3/06, G06F12/0817, G06F12/0837, G06F12/0877
CPC Classes: G06F3/0604, G06F3/061, G06F3/064, G06F3/067, G06F3/0631, G06F3/0635, G06F3/0638, G06F3/0644, G06F3/0647, G06F3/0659, G06F3/0683, G06F3/0685, G06F12/0817, G06F12/0837, G06F12/0877, G06F2212/2542, G06F2212/6012
Abstract: Embodiments of the invention provide systems and methods to implement an object memory fabric including hardware-based processing nodes having memory modules that store and manage memory objects created natively within, and managed by, the memory modules at a memory layer, where the physical addresses of memory and storage are managed with the memory objects based on an object address space allocated on a per-object basis with an object addressing scheme. Each node may utilize the object addressing scheme to couple to additional nodes and operate as a set of nodes, so that all memory objects of the set are accessible based on the object addressing scheme, which defines invariant object addresses for the memory objects that are invariant with respect to physical memory storage locations and to storage-location changes of the memory objects within the memory module and across all modules interfacing with the object memory fabric.
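The separation between an invariant object address and a movable physical location can be illustrated with a small sketch. It is not the patent's addressing scheme; ObjectAddress, PhysicalLocation, and ObjectDirectory are hypothetical names, and a single map stands in for the fabric-wide directory.

```cpp
#include <cstdint>
#include <iostream>
#include <optional>
#include <unordered_map>
#include <utility>

// Hypothetical sketch of an object address space: an object ID plus offset is
// invariant, while the physical location it resolves to can move between
// modules/nodes without changing the object address.
struct ObjectAddress {
    uint64_t objectId;  // allocated per object
    uint64_t offset;    // offset within the object
};

struct PhysicalLocation {
    int      node;      // which hardware node currently holds the object
    uint64_t base;      // base physical address on that node
};

class ObjectDirectory {
    std::unordered_map<uint64_t, PhysicalLocation> where_;  // objectId -> current location
public:
    void place(uint64_t objectId, PhysicalLocation loc) { where_[objectId] = loc; }

    // Resolve an invariant object address to its current physical location.
    std::optional<std::pair<int, uint64_t>> resolve(ObjectAddress a) const {
        auto it = where_.find(a.objectId);
        if (it == where_.end()) return std::nullopt;
        return std::make_pair(it->second.node, it->second.base + a.offset);
    }
};

int main() {
    ObjectDirectory dir;
    ObjectAddress addr{42, 0x10};           // the object address never changes
    dir.place(42, {/*node=*/0, 0x1000});
    auto p1 = dir.resolve(addr);
    dir.place(42, {/*node=*/3, 0x8000});    // the object migrates to another node
    auto p2 = dir.resolve(addr);
    std::cout << std::hex << p1->second << " then " << p2->second
              << " (same object address)\n";
}
```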
-
Publication No.: US11645197B2
Publication Date: 2023-05-09
Application No.: US17091101
Filing Date: 2020-11-06
Applicant: SK hynix Inc.
Inventors: Dong Young Seo
IPC Classes: G06F12/02, G06F12/0837, G06F12/06
CPC Classes: G06F12/0246, G06F12/0646, G06F12/0837, G06F2212/7201
Abstract: Memory controller devices, memory systems, and operating methods for memory controller devices and memory systems are disclosed. In one aspect, a memory controller having improved wear-leveling performance is disclosed. The memory controller may control a first memory area and a second memory area, and include a first software layer configured to control the first memory area based on first logical addresses, a second software layer configured to control the second memory area based on second logical addresses, and a logical address manager configured to compare a logical address received from a host with a reference address selected from among a plurality of logical addresses to be used by the host, and to transmit the logical address received from the host to the first software layer or the second software layer according to a criterion selected from a first criterion and a second criterion based on the comparison.
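The routing step can be shown with a minimal sketch. It is not SK hynix's implementation; the route function, the Criterion/Layer enums, and the chosen reference value are assumptions that only illustrate comparing a host logical address with a reference address and switching the target software layer depending on which of two criteria is selected.

```cpp
#include <cstdint>
#include <iostream>

// Hypothetical sketch: a reference address is chosen from the host's logical
// address range, and each incoming logical address is sent to the first or
// second software layer depending on the comparison and on which of two
// criteria is currently selected.
enum class Criterion { First, Second };
enum class Layer { FirstFTL, SecondFTL };

Layer route(uint64_t hostLba, uint64_t referenceLba, Criterion crit) {
    bool belowReference = hostLba < referenceLba;
    // Under the first criterion, addresses below the reference go to the first
    // layer; under the second criterion the mapping is flipped, shifting write
    // traffic between the two memory areas (a wear-leveling lever).
    if (crit == Criterion::First)
        return belowReference ? Layer::FirstFTL : Layer::SecondFTL;
    return belowReference ? Layer::SecondFTL : Layer::FirstFTL;
}

int main() {
    const uint64_t reference = 0x1000;  // selected from the host's logical addresses
    for (uint64_t lba : {0x0800ULL, 0x2000ULL}) {
        Layer a = route(lba, reference, Criterion::First);
        Layer b = route(lba, reference, Criterion::Second);
        std::cout << std::hex << lba << " -> "
                  << (a == Layer::FirstFTL ? "first" : "second") << " / "
                  << (b == Layer::FirstFTL ? "first" : "second") << " layer\n";
    }
}
```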
-
Publication No.: US11582299B2
Publication Date: 2023-02-14
Application No.: US15397374
Filing Date: 2017-01-03
Inventors: Ilir Iljazi, Jason K. Resch, Ethan S. Wozniak
IPC Classes: G06F12/0871, G06F12/122, G06F12/128, H04L67/1095, H04L67/1097, G06F16/21, G06F16/22, G06F16/23, G06F3/06, H03M13/37, H04L67/55, H04L67/568, G06F11/10, G06F12/0813, G06F12/0837, H03M13/15, H04H60/27, G06N3/00, G06F9/50, G06F12/12
Abstract: A method for execution by a dispersed storage network (DSN) managing unit includes receiving access information from a plurality of distributed storage and task (DST) processing units via a network. Cache memory utilization data is generated based on the access information. Configuration instructions are generated for transmission via the network to the plurality of DST processing units based on the cache memory utilization data.
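A compact sketch of that management loop follows. It is not from the patent; AccessInfo, ConfigInstruction, the manage function, and the thresholds are all made-up stand-ins showing how per-unit access reports could be turned into cache configuration instructions.

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// Hypothetical sketch: gather per-unit access counts, derive a utilization
// figure for each unit's cache, and send back a larger or smaller cache
// allocation accordingly.
struct AccessInfo {
    int      unitId;
    uint64_t hits;
    uint64_t misses;
};

struct ConfigInstruction {
    int      unitId;
    uint64_t cacheBytes;   // new cache allocation for this DST processing unit
};

std::vector<ConfigInstruction> manage(const std::vector<AccessInfo>& reports) {
    std::vector<ConfigInstruction> out;
    for (const AccessInfo& r : reports) {
        uint64_t total = r.hits + r.misses;
        double hitRate = total ? static_cast<double>(r.hits) / total : 0.0;
        // Heavily-missing units get more cache; mostly-hitting units keep less.
        uint64_t bytes = hitRate < 0.5 ? 256ull << 20 : 64ull << 20;
        out.push_back({r.unitId, bytes});
    }
    return out;
}

int main() {
    for (const ConfigInstruction& c : manage({{1, 900, 100}, {2, 200, 800}}))
        std::cout << "unit " << c.unitId << ": " << (c.cacheBytes >> 20) << " MiB cache\n";
}
```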
-
Publication No.: US20220261289A1
Publication Date: 2022-08-18
Application No.: US17686089
Filing Date: 2022-03-03
Applicant: Intel Corporation
Inventors: Ben Ashbaugh, Jonathan Pearce, Murali Ramadoss, Vikranth Vemulapalli, William B. Sadler, Sungye Kim, Marian Alin Petre
Abstract: Embodiments are generally directed to thread group scheduling for graphics processing. An embodiment of an apparatus includes a plurality of processors, including a plurality of graphics processors, to process data; a memory; and one or more caches for storage of data for the plurality of graphics processors, wherein the plurality of processors are to schedule a plurality of groups of threads for processing by the plurality of graphics processors, the scheduling including the plurality of processors applying a bias for scheduling the groups of threads according to cache locality for the one or more caches.
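A small sketch of locality-biased scheduling follows. It is not Intel's scheduler; ThreadGroup, Scheduler, and the "last GPU per region" map are hypothetical, and a real implementation would consult actual cache residency rather than this crude hint.

```cpp
#include <cstdint>
#include <iostream>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: each thread group names the data region it will touch,
// and the scheduler prefers the graphics processor whose cache most recently
// held that region, falling back to the least-loaded processor otherwise.
struct ThreadGroup {
    int      id;
    uint64_t region;   // data region (e.g., surface/tile) the group will access
};

struct Scheduler {
    int numGpus;
    std::unordered_map<uint64_t, int> lastGpuForRegion;  // crude cache-locality hint
    std::vector<int> load;                               // groups assigned per GPU

    explicit Scheduler(int n) : numGpus(n), load(n, 0) {}

    int schedule(const ThreadGroup& g) {
        int gpu;
        auto it = lastGpuForRegion.find(g.region);
        if (it != lastGpuForRegion.end()) {
            gpu = it->second;                  // bias toward cached locality
        } else {
            gpu = 0;                           // otherwise, least-loaded GPU
            for (int i = 1; i < numGpus; ++i)
                if (load[i] < load[gpu]) gpu = i;
        }
        load[gpu]++;
        lastGpuForRegion[g.region] = gpu;
        return gpu;
    }
};

int main() {
    Scheduler s(4);
    std::vector<ThreadGroup> groups = {{0, 0xA}, {1, 0xB}, {2, 0xA}, {3, 0xA}};
    for (const ThreadGroup& g : groups)
        std::cout << "group " << g.id << " -> GPU " << s.schedule(g) << "\n";
}
```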
-
Publication No.: US20220075746A1
Publication Date: 2022-03-10
Application No.: US17014023
Filing Date: 2020-09-08
Applicant: Intel Corporation
Inventors: Hema Chand Nalluri, Ankur Shah, Joydeep Ray, Aditya Navale, Altug Koker, Murali Ramadoss, Niranjan L. Cooray, Jeffery S. Boles, Aravindh Anantaraman, David Puffer, James Valerio, Vasanth Ranganathan
IPC Classes: G06F13/40, G06F13/16, G06F9/30, G06F9/52, G06F12/0837, G06F12/0888
Abstract: An apparatus to facilitate memory barriers is disclosed. The apparatus comprises an interconnect; a device memory; a plurality of processing resources, coupled to the device memory, to execute a plurality of execution threads as memory data producers and memory data consumers for the device memory and a system memory; and fence hardware to generate fence operations that enforce data ordering on memory operations issued to the device memory and the system memory coupled via the interconnect.
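The ordering contract such fence operations enforce has a familiar software analogue. The sketch below is not the patent's fence hardware; it only shows the producer/consumer ordering guarantee using C++ release/acquire fences, with a plain variable and an atomic flag standing in for device-memory data and the flag a consumer polls.

```cpp
#include <atomic>
#include <cstdint>
#include <iostream>
#include <thread>

uint64_t deviceBuffer = 0;           // stands in for data in device memory
std::atomic<bool> ready{false};      // stands in for the flag a consumer polls

void producer() {
    deviceBuffer = 0xCAFE;                                // write the data
    std::atomic_thread_fence(std::memory_order_release);  // fence: data ordered before flag
    ready.store(true, std::memory_order_relaxed);
}

void consumer() {
    while (!ready.load(std::memory_order_relaxed)) {}     // wait for the flag
    std::atomic_thread_fence(std::memory_order_acquire);  // fence: flag ordered before data read
    std::cout << std::hex << deviceBuffer << "\n";        // guaranteed to see 0xCAFE
}

int main() {
    std::thread c(consumer), p(producer);
    p.join();
    c.join();
}
```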
-
Publication No.: US11086521B2
Publication Date: 2021-08-10
Application No.: US15001366
Filing Date: 2016-01-20
Applicant: ULTRATA, LLC
Inventors: Steven J. Frank, Larry Reback
IPC Classes: G06F3/06, H04L29/08, G06F12/0837, G06F12/0817, G06F12/06
Abstract: Embodiments of the invention provide systems and methods for managing processing, memory, storage, network, and cloud computing to significantly improve the efficiency and performance of processing nodes. More specifically, embodiments of the present invention are directed to an instruction set of an object memory fabric. This object memory fabric instruction set can be used to provide a unique instruction model based on triggers defined in the metadata of the memory objects. This model represents a dynamic dataflow method of execution in which processes are performed based on the actual dependencies of the memory objects. This provides a high degree of memory and execution parallelism, which in turn provides tolerance of variations in access delays between memory objects. In this model, sequences of instructions are executed and managed based on data access. These sequences can be of arbitrary length, but short sequences are more efficient and provide greater parallelism.
-
Publication No.: US10956330B2
Publication Date: 2021-03-23
Application No.: US16727127
Filing Date: 2019-12-26
Applicant: Intel Corporation
Inventors: Chandrasekaran Sakthivel, Prasoonkumar Surti, John C. Weast, Sara S. Baghsorkhi, Justin E. Gottschlich, Abhishek R. Appu, Nicolas C. Galoppo Von Borries, Joydeep Ray, Narayan Srinivasa, Feng Chen, Ben J. Ashbaugh, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Eriko Nurvitadhi, Balaji Vembu, Altug Koker
IPC Classes: G06F12/0837, G06N3/08, G06N20/00, G06T1/20, G06F12/0815, G06N3/063, G06N3/04
Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.
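The abstract only states that coherency data is shared with a model, so the sketch below shows just that data path under stated assumptions: CoherencyEvent, EventCounts, and the linear score function are hypothetical stand-ins, and the actual model, features, and use of its output are not specified by the abstract.

```cpp
#include <array>
#include <cstddef>
#include <iostream>

// Hypothetical sketch: cache modules report coherency events (simplified MESI
// state transitions) as a feature vector, and a tiny stand-in "model" turns
// them into a score a cache policy could consult.
enum CoherencyEvent { ToModified, ToExclusive, ToShared, ToInvalid, kNumEvents };

struct EventCounts {
    std::array<double, kNumEvents> counts{};   // per-cache-module tallies
    void record(CoherencyEvent e) { counts[e] += 1.0; }
};

// A linear scorer stands in for the machine learning model the cache modules
// share their coherency data with.
double score(const EventCounts& x) {
    const std::array<double, kNumEvents> weights{0.8, 0.2, -0.1, -0.6};
    double s = 0.0;
    for (std::size_t i = 0; i < static_cast<std::size_t>(kNumEvents); ++i)
        s += weights[i] * x.counts[i];
    return s;
}

int main() {
    EventCounts cacheModule;
    cacheModule.record(ToModified);
    cacheModule.record(ToInvalid);
    cacheModule.record(ToInvalid);
    std::cout << "sharing score " << score(cacheModule) << " with the model\n";
}
```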