Memory hierarchy using page-based compression

    Publication Number: US11132300B2

    Publication Date: 2021-09-28

    Application Number: US13939380

    Application Date: 2013-07-11

    Abstract: A system includes a device coupleable to a first memory. The device includes a second memory to cache data from the first memory. The second memory is to store a set of compressed pages of the first memory and a set of page descriptors. Each compressed page includes a set of compressed data blocks. Each page descriptor represents a corresponding page and includes a set of location identifiers that identify the locations of the compressed data blocks of the corresponding page in the second memory. The device further includes compression logic to compress data blocks of a page to be stored to the second memory and decompression logic to decompress compressed data blocks of a page accessed from the second memory.
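
    A minimal C++ sketch of the page-descriptor structure this abstract describes: one descriptor per cached page, holding a location identifier for each compressed block. The field names, page size, and block count are illustrative assumptions, not values from the patent.

        #include <array>
        #include <cstdint>

        constexpr int kBlocksPerPage = 64;   // assumed: 4 KiB page, 64 B blocks

        struct PageDescriptor {
            uint64_t page_number;   // page of the first memory this descriptor represents
            // Location of each compressed block within the second (caching) memory.
            std::array<uint32_t, kBlocksPerPage> block_location;
        };

        // Look up where one compressed block of a page resides in the second memory.
        uint32_t locate_block(const PageDescriptor& d, int block_index) {
            return d.block_location.at(block_index);
        }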

    Expandable buffer for memory transactions

    Publication Number: US10740029B2

    Publication Date: 2020-08-11

    Application Number: US15824539

    Application Date: 2017-11-28

    Abstract: A processing system employs an expandable memory buffer that supports enlarging the memory buffer when the processing system generates a large number of long latency memory transactions. The hybrid structure of the memory buffer allows a memory controller of the processing system to store a larger number of memory transactions while still maintaining adequate transaction throughput and also ensuring a relatively small buffer footprint and power consumption. Further, the hybrid structure allows different portions of the buffer to be placed on separate integrated circuit dies, which in turn allows the memory controller to be used in a wide variety of integrated circuit configurations, including configurations that use only one portion of the memory buffer.
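
    As a rough illustration, the hybrid structure can be modeled as a small fast portion that spills into a larger overflow portion. This C++ sketch assumes a simple FIFO discipline and an arbitrary split size; both are placeholders, not details from the patent.

        #include <cstdint>
        #include <deque>
        #include <optional>

        struct MemTxn { uint64_t addr; bool is_write; };

        class HybridBuffer {
            static constexpr std::size_t kFastSlots = 16;   // assumed fast capacity
            std::deque<MemTxn> fast_;       // small, low-latency portion
            std::deque<MemTxn> overflow_;   // larger portion, possibly on another die
        public:
            void push(const MemTxn& t) {
                // Spill into the overflow portion once the fast portion fills.
                (fast_.size() < kFastSlots ? fast_ : overflow_).push_back(t);
            }
            std::optional<MemTxn> pop() {
                if (fast_.empty() && !overflow_.empty()) {   // refill from overflow
                    fast_.push_back(overflow_.front());
                    overflow_.pop_front();
                }
                if (fast_.empty()) return std::nullopt;
                MemTxn t = fast_.front();
                fast_.pop_front();
                return t;
            }
        };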

Runtime extension for neural network training with heterogeneous memory

    Publication Number: US20200042859A1

    Publication Date: 2020-02-06

    Application Number: US16194958

    Application Date: 2018-11-19

    Abstract: Systems, apparatuses, and methods for managing buffers in a neural network implementation with heterogeneous memory are disclosed. A system includes a neural network coupled to a first memory and a second memory. The first memory is a relatively low-capacity, high-bandwidth memory while the second memory is a relatively high-capacity, low-bandwidth memory. During a forward propagation pass of the neural network, a run-time manager monitors the usage of the buffers for the various layers of the neural network. During a backward propagation pass of the neural network, the run-time manager determines how to move the buffers between the first and second memories based on the monitored buffer usage during the forward propagation pass. As a result, the run-time manager is able to reduce memory access latency for the layers of the neural network during the backward propagation pass.
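
    The scheduling idea can be sketched in a few lines: record the order in which layer buffers are touched during the forward pass, then walk that order in reverse during the backward pass, migrating each buffer to the fast memory before its layer runs. Everything below (the names, the migration stub) is a hypothetical illustration.

        #include <cstdio>
        #include <vector>

        enum class Mem { HighBandwidth, HighCapacity };

        struct LayerBuffer { int layer; Mem location; };

        void backward_pass(std::vector<LayerBuffer>& forward_order) {
            // The backward pass visits layers in reverse of the forward order,
            // so each buffer's next use is predictable from the forward trace.
            for (auto it = forward_order.rbegin(); it != forward_order.rend(); ++it) {
                if (it->location == Mem::HighCapacity) {
                    it->location = Mem::HighBandwidth;   // stand-in for a real copy/DMA
                    std::printf("moved buffer of layer %d to fast memory\n", it->layer);
                }
                // ... run the backward computation for it->layer here ...
            }
        }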

    Mechanism for reducing page migration overhead in memory systems

    Publication Number: US10339067B2

    Publication Date: 2019-07-02

    Application Number: US15626623

    Application Date: 2017-06-19

    Abstract: A technique for use in a memory system includes swapping a first plurality of pages of a first memory of the memory system with a second plurality of pages of a second memory of the memory system. The first memory has a first latency and the second memory has a second latency. The first latency is less than the second latency. The technique includes updating a page table and triggering a translation lookaside buffer shootdown to associate a virtual address of each of the first plurality of pages with a corresponding physical address in the second memory and to associate a virtual address for each of the second plurality of pages with a corresponding physical address in the first memory.
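
    A small software model of the remap step, assuming the page table is a flat map from virtual to physical pages and modeling the TLB shootdown as a stub; the helper names are hypothetical.

        #include <cstdint>
        #include <unordered_map>
        #include <utility>

        using VirtPage = uint64_t;
        using PhysPage = uint64_t;

        std::unordered_map<VirtPage, PhysPage> page_table;   // virtual -> physical

        void tlb_shootdown() { /* would interrupt all cores to drop stale entries */ }

        // After the page contents have been exchanged between the fast and slow
        // memories, swap the two mappings and make the change globally visible.
        void remap_after_swap(VirtPage va_in_fast, VirtPage va_in_slow) {
            std::swap(page_table[va_in_fast], page_table[va_in_slow]);
            tlb_shootdown();
        }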

Activation function functional block for electronic devices

    Publication Number: US20190180182A1

    Publication Date: 2019-06-13

    Application Number: US15836080

    Application Date: 2017-12-08

    Inventor: Gabriel H. Loh

    Abstract: An electronic device has an activation function functional block that implements an activation function. During operation, the activation function functional block receives an input including a plurality of bits representing a numerical value. The activation function functional block then determines a range from among a plurality of ranges into which the input falls, each range including a separate portion of possible numerical values of the input. The activation function functional block next generates a result of a linear function associated with the range. Generating the result includes using a separate linear function that is associated with each range in the plurality of ranges to approximate results of the activation function within that range.
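
    In C++, the range-selection-plus-linear-evaluation step might look like the sketch below, here fit loosely to a sigmoid. The breakpoints and coefficients are illustrative guesses, not the patent's tables.

        #include <cstddef>

        struct Segment { float lo, slope, intercept; };   // y = slope * x + intercept

        // Ranges cover the input domain; each has its own linear approximation.
        constexpr Segment kSegments[] = {
            {-1e30f, 0.00f, 0.0f},   // x < -4: saturated near 0
            { -4.0f, 0.05f, 0.2f},
            { -1.5f, 0.20f, 0.5f},   // near-linear region around 0
            {  1.5f, 0.05f, 0.8f},
            {  4.0f, 0.00f, 1.0f},   // x >= 4: saturated near 1
        };

        float piecewise_sigmoid(float x) {
            constexpr std::size_t n = sizeof(kSegments) / sizeof(kSegments[0]);
            std::size_t i = 0;
            while (i + 1 < n && x >= kSegments[i + 1].lo) ++i;   // pick the range
            return kSegments[i].slope * x + kSegments[i].intercept;
        }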

    Preemptive cache management policies for processing units

    Publication Number: US10303602B2

    Publication Date: 2019-05-28

    Application Number: US15475435

    Application Date: 2017-03-31

    Abstract: A processing system includes at least one central processing unit (CPU) core, at least one graphics processing unit (GPU) core, a main memory, and a coherence directory for maintaining cache coherence. The at least one CPU core receives a CPU cache flush command to flush cache lines stored in cache memory of the at least one CPU core prior to launching a GPU kernel. The coherence directory transfers data associated with a memory access request by the at least one GPU core from the main memory without issuing coherence probes to caches of the at least one CPU core.
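
    The launch-side flow reduces to: flush the kernel's working set from the CPU caches, then launch, so subsequent GPU accesses need no CPU-side probes. In this C++ sketch the flush helper is a stand-in for a real instruction such as x86 CLFLUSH, and the launch call is hypothetical.

        #include <cstddef>
        #include <cstdint>

        constexpr std::size_t kLineSize = 64;   // assumed cache-line size

        inline void flush_line(const void* p) { (void)p; /* e.g. _mm_clflush(p) */ }

        void flush_then_launch(const uint8_t* buf, std::size_t bytes) {
            for (std::size_t off = 0; off < bytes; off += kLineSize)
                flush_line(buf + off);         // write back and invalidate CPU copies
            // launch_gpu_kernel(buf, bytes);  // hypothetical: GPU reads now come
            //                                 // from main memory, probe-free
        }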

    Pinning objects in multi-level memory hierarchies

    Publication Number: US10055359B2

    Publication Date: 2018-08-21

    Application Number: US15040195

    Application Date: 2016-02-10

    Abstract: The described embodiments include a computer system having a multi-level memory hierarchy with two or more levels of memory, each level being one of two or more types of memory. The computer system handles storing objects in the multi-level memory hierarchy. During operation, a system runtime in the computer system identifies an object to be stored in the multi-level memory hierarchy. The system runtime then determines, based on one or more attributes of the object, that the object is to be pinned in a level of the multi-level memory hierarchy. The system runtime then pins the object in the level of the multi-level memory hierarchy. In the described embodiments, the pinning includes hard pinning and soft pinning, which are each associated with corresponding retention policies for pinned objects.
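
    One way to picture the two pin strengths: hard-pinned objects are never eviction candidates for their level, while soft-pinned objects may be demoted, but only after unpinned ones. This C++ sketch is an assumption about the retention policies, not the patent's exact rules.

        #include <algorithm>
        #include <string>
        #include <vector>

        enum class Pin { None, Soft, Hard };

        struct Object { std::string name; int level; Pin pin; };

        // Collect eviction candidates for one memory level under capacity pressure.
        std::vector<Object*> eviction_candidates(std::vector<Object>& objs, int level) {
            std::vector<Object*> out;
            for (auto& o : objs) {
                if (o.level != level || o.pin == Pin::Hard) continue;   // hard: retained
                out.push_back(&o);
            }
            // Evict unpinned objects before soft-pinned ones.
            std::stable_sort(out.begin(), out.end(),
                             [](const Object* a, const Object* b) { return a->pin < b->pin; });
            return out;
        }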
