Hardware support for optimizing huge memory page selection

    公开(公告)号:US12130750B2

    公开(公告)日:2024-10-29

    申请号:US18118020

    申请日:2023-03-06

    CPC classification number: G06F12/1027

    Abstract: Computer systems often employ virtual address translation hierarchies in which virtual memory addresses are mapped to physical memory. Use of the virtual address translation hierarchy speeds up the virtual address translation when the required mapping is stored in one of the higher levels of the hierarchy. To reduce a number of misses occurring in the virtual address translation hierarchy, huge memory pages may be selectively employed, which map larger continuous regions of virtual memory to continuous regions of physical memory, thereby increasing the coverage of each entry in the virtual address translation hierarchy. The present disclosure provides hardware support for optimizing this huge memory page selection.

    AUTOMATIC METHOD FOR POWER MANAGEMENT TUNING IN COMPUTING SYSTEMS

    公开(公告)号:US20230079978A1

    公开(公告)日:2023-03-16

    申请号:US17709720

    申请日:2022-03-31

    Abstract: A system, method, and apparatus of power management for computing systems are included herein that optimize individual frequencies of components of the computing systems using machine learning. The computing systems can be tightly integrated systems that consider an overall operating budget that is shared between the components of the computing system while adjusting the frequencies of the individual components. An example of an automated method of power management includes: (1) learning, using a power management (PM) agent, frequency settings for different components of a computing system during execution of a repetitive application, and (2) adjusting the frequency settings of the different components using the PM agent, wherein the adjusting is based on the repetitive application and one or more limitations corresponding to a shared operating budget for the computing system.

    HARDWARE SUPPORT FOR OPTIMIZING HUGE MEMORY PAGE SELECTION

    公开(公告)号:US20240303201A1

    公开(公告)日:2024-09-12

    申请号:US18118020

    申请日:2023-03-06

    CPC classification number: G06F12/1027

    Abstract: Computer systems often employ virtual address translation hierarchies in which virtual memory addresses are mapped to physical memory. Use of the virtual address translation hierarchy speeds up the virtual address translation when the required mapping is stored in one of the higher levels of the hierarchy. To reduce a number of misses occurring in the virtual address translation hierarchy, huge memory pages may be selectively employed, which map larger continuous regions of virtual memory to continuous regions of physical memory, thereby increasing the coverage of each entry in the virtual address translation hierarchy. The present disclosure provides hardware support for optimizing this huge memory page selection.

    Read-write page replication for multiple compute units

    公开(公告)号:US11625279B2

    公开(公告)日:2023-04-11

    申请号:US16787967

    申请日:2020-02-11

    Abstract: In general, an application executes on a compute unit, such as a central processing unit (CPU) or graphics processing unit (GPU), to perform some function(s). In some circumstances, improved performance of an application, such as a graphics application, may be provided by executing the application across multiple compute units. However, when using multiple compute units in this manner, synchronization must be provided between the compute units. Synchronization, including the sharing of the data, is typically accomplished through memory. While a shared memory may cause bottlenecks, employing local memory for each compute unit may itself require synchronization (coherence) which can be costly in terms of resources, delay, etc. The present disclosure provides read-write page replication for multiple compute units that avoids the traditional challenges associated with coherence.

    Automatic method for power management tuning in computing systems

    公开(公告)号:US11880261B2

    公开(公告)日:2024-01-23

    申请号:US17709720

    申请日:2022-03-31

    CPC classification number: G06F1/324 G06F1/206 G06F11/3495

    Abstract: A system, method, and apparatus of power management for computing systems are included herein that optimize individual frequencies of components of the computing systems using machine learning. The computing systems can be tightly integrated systems that consider an overall operating budget that is shared between the components of the computing system while adjusting the frequencies of the individual components. An example of an automated method of power management includes: (1) learning, using a power management (PM) agent, frequency settings for different components of a computing system during execution of a repetitive application, and (2) adjusting the frequency settings of the different components using the PM agent, wherein the adjusting is based on the repetitive application and one or more limitations corresponding to a shared operating budget for the computing system.

    TECHNIQUE FOR AUTONOMOUSLY MANAGING CACHE USING MACHINE LEARNING

    公开(公告)号:US20230137205A1

    公开(公告)日:2023-05-04

    申请号:US17514735

    申请日:2021-10-29

    Abstract: Introduced herein is a technique that uses ML to autonomously find a cache management policy that achieves an optimal execution of a given workload of an application. Leveraging ML such as reinforcement learning, the technique trains an agent in an ML environment over multiple episodes of a stabilization process. For each time step in these training episodes, the agent executes the application while making an incremental change to the current policy, i.e., cache-residency statuses of memory address space associated with the workload, until the application can be executed at a stable level. The stable level of execution, for example, can be indicated by performance variations, such as standard deviations, between a certain number of neighboring measurement periods remaining within a certain threshold. The agent, who has been trained in the training episodes, infers the final cache management policy during the final, inferring episode.

    READ-WRITE PAGE REPLICATION FOR MULTIPLE COMPUTE UNITS

    公开(公告)号:US20210248014A1

    公开(公告)日:2021-08-12

    申请号:US16787967

    申请日:2020-02-11

    Abstract: In general, an application executes on a compute unit, such as a central processing unit (CPU) or graphics processing unit (GPU), to perform some function(s). In some circumstances, improved performance of an application, such as a graphics application, may be provided by executing the application across multiple compute units. However, when using multiple compute units in this manner, synchronization must be provided between the compute units. Synchronization, including the sharing of the data, is typically accomplished through memory. While a shared memory may cause bottlenecks, employing local memory for each compute unit may itself require synchronization (coherence) which can be costly in terms of resources, delay, etc. The present disclosure provides read-write page replication for multiple compute units that avoids the traditional challenges associated with coherence.

Patent Agency Ranking