-
Publication Number: US11860755B2
Publication Date: 2024-01-02
Application Number: US17861435
Application Date: 2022-07-11
Applicant: Advanced Micro Devices, Inc.
Inventor: Sergey Blagodurov, Jinyoung Choi
CPC classification number: G06F11/3037, G06F11/076, G06F11/3471, G06F11/3476
Abstract: An approach is provided for implementing memory profiling aggregation. A hardware aggregator provides memory profiling aggregation by controlling the execution of a plurality of hardware profilers that monitor memory performance in a system. For each hardware profiler of the plurality of hardware profilers, a hardware counter value is compared to a threshold value. When a threshold value is satisfied, execution of a respective hardware profiler of the plurality of hardware profilers is initiated to monitor memory performance. Multiple hardware profilers of the plurality of hardware profilers may execute concurrently and each generate a result counter value. The result counter values generated by each hardware profiler of the plurality of hardware profilers are aggregated to generate an aggregate result counter value. The aggregate result counter value is stored in memory that is accessible by software processes for use in optimizing memory-management policy decisions.
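The control flow described in the abstract can be pictured with a minimal software model. This is only a sketch: the HardwareProfiler and ProfilingAggregator classes, the threshold values, and the counter names are hypothetical stand-ins for the hardware blocks, not the patented design.

class HardwareProfiler:
    """Toy model of one hardware profiler with a trigger threshold."""

    def __init__(self, threshold):
        self.threshold = threshold      # value the hardware counter must satisfy
        self.result_counter = 0         # events counted while the profiler runs
        self.running = False

    def maybe_start(self, hardware_counter_value):
        # Execution is initiated only when the counter satisfies the threshold.
        if hardware_counter_value >= self.threshold:
            self.running = True

    def sample(self, observed_events):
        # Concurrently running profilers each accumulate their own result counter.
        if self.running:
            self.result_counter += observed_events


class ProfilingAggregator:
    """Compares counters to thresholds, runs profilers, aggregates results."""

    def __init__(self, profilers):
        self.profilers = profilers

    def step(self, counter_values, observed_events):
        for profiler, counter in zip(self.profilers, counter_values):
            profiler.maybe_start(counter)
            profiler.sample(observed_events)

    def aggregate(self):
        # The aggregate value would be stored where software can read it
        # when making memory-management policy decisions.
        return sum(p.result_counter for p in self.profilers)


aggregator = ProfilingAggregator([HardwareProfiler(10), HardwareProfiler(50)])
aggregator.step(counter_values=[12, 7], observed_events=3)   # only the first starts
aggregator.step(counter_values=[12, 60], observed_events=3)  # now both run
print(aggregator.aggregate())                                # 6 + 3 = 9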
-
Publication Number: US20230420036A1
Publication Date: 2023-12-28
Application Number: US18240770
Application Date: 2023-08-31
Applicant: Advanced Micro Devices, Inc.
Inventor: Sriseshan Srikanth, Vignesh Adhinarayanan, Jagadish B. Kotra, Sergey Blagodurov
IPC: G11C11/4093, G11C8/18, H03K19/17728, G11C11/4096, H03K19/173, G11C11/408
CPC classification number: G11C11/4093, G11C8/18, H03K19/17728, G11C11/4096, H03K19/1737, G11C11/4087
Abstract: A fine-grained dynamic random-access memory (DRAM) includes a first memory bank, a second memory bank, and a dual-mode I/O circuit. The first memory bank includes a memory array divided into a plurality of grains, each grain including a row buffer and input/output (I/O) circuitry. The dual-mode I/O circuit is coupled to the I/O circuitry of each grain in the first memory bank, and operates in a first mode in which commands having a first data width are routed to and fulfilled individually at each grain, and a second mode in which commands having a second data width different from the first data width are fulfilled by at least two of the grains in parallel.
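A rough behavioral model (not RTL, and not the patented circuit) may help picture the two modes. The grain count, data widths, and command format below are illustrative assumptions only.

NUM_GRAINS = 4
NARROW_BITS = 16   # assumed width served by a single grain
WIDE_BITS = 64     # assumed width served by the grains operating in parallel

class Grain:
    """One grain: a slice of the array with its own row buffer and I/O."""

    def __init__(self):
        self.rows = {}

    def read(self, row, bits):
        # Return 'bits' worth of data from this grain's row buffer (0 if unwritten).
        return self.rows.get(row, 0) & ((1 << bits) - 1)

def fulfill(command, grains, mode):
    if mode == "fine":
        # First mode: a narrow command is routed to and fulfilled by one grain.
        return grains[command["grain"]].read(command["row"], NARROW_BITS)
    # Second mode: a wide command is fulfilled by the grains in parallel,
    # each contributing an equal share of the wider data width.
    share = WIDE_BITS // len(grains)
    return [g.read(command["row"], share) for g in grains]

grains = [Grain() for _ in range(NUM_GRAINS)]
print(fulfill({"grain": 2, "row": 5}, grains, mode="fine"))
print(fulfill({"row": 5}, grains, mode="wide"))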
-
Publication Number: US11854652B2
Publication Date: 2023-12-26
Application Number: US17984796
Application Date: 2022-11-10
Applicant: Advanced Micro Devices, Inc.
Inventor: Russell J. Schreiber, Ryan T. Freese, Eric W. Busta
IPC: G11C7/06
CPC classification number: G11C7/065
Abstract: A sense amplifier is biased to reduce leakage current and equalize matched transistor bias during an idle state. A first read select transistor couples a true bit line and a sense amplifier true (SAT) signal line and a second read select transistor couples a complement bit line and a sense amplifier complement (SAC) signal line. The SAT and SAC signal lines are precharged by a precharge circuit during a precharge state. An equalization circuit shorts the SAT and SAC signal lines during the precharge state. A differential sense amplifier circuit for latching the memory cell value is coupled to the SAT signal line and the SAC signal line. The precharge circuit and the differential sense amplifier circuit are turned off during a sleep state to cause the SAT and SAC signal lines to float. A sleep circuit shorts the SAT and SAC signal lines during the sleep state.
-
Publication Number: US20230409868A1
Publication Date: 2023-12-21
Application Number: US17844204
Application Date: 2022-06-20
Applicant: Advanced Micro Devices, Inc.
Inventor: Hai Xiao, Adam H Li, Harris Eleftherios Gasparakis
Abstract: Activation scaled clipping layers for neural networks are described. An activation scaled clipping layer processes an output of a neuron in a neural network using a scaling parameter and a clipping parameter. The scaling parameter defines how numerical values are amplified relative to zero. The clipping parameter specifies a numerical threshold that causes the neuron output to be expressed as a value defined by the numerical threshold if the neuron output satisfies the numerical threshold. In some implementations, the scaling parameter is linear and treats numbers within a numerical range as being equivalent, such that any number in the range is scaled by a defined magnitude, regardless of value. Alternatively, the scaling parameter is nonlinear, which causes the activation scaled clipping layer to amplify numbers within a range by different magnitudes. Each scaling and clipping parameter is learnable during training of a machine learning model implementing the neural network.
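A compact NumPy sketch of the layer behavior as described above. The function name, the symmetric clipping range, and the exponent used for the nonlinear variant are illustrative assumptions, not the patented formulation; in a training framework, scale and clip would be registered as trainable parameters so both are learned with the model.

import numpy as np

def activation_scaled_clipping(x, scale, clip, nonlinear=False):
    # Scaling: amplify values relative to zero. The linear case scales every
    # value in the range by the same magnitude; the nonlinear case (assumed
    # here to be a signed power) amplifies different values by different amounts.
    y = scale * x if not nonlinear else scale * np.sign(x) * np.abs(x) ** 1.5
    # Clipping: outputs beyond the threshold are expressed as the threshold itself.
    return np.clip(y, -clip, clip)

neuron_output = np.array([-2.0, -0.4, 0.1, 0.9, 3.0])
print(activation_scaled_clipping(neuron_output, scale=2.0, clip=1.5))
print(activation_scaled_clipping(neuron_output, scale=2.0, clip=1.5, nonlinear=True))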
-
Publication Number: US11847055B2
Publication Date: 2023-12-19
Application Number: US17364854
Application Date: 2021-06-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Shaizeen Aga, Nuwan Jayasena
IPC: G06F12/00, G06F12/084, G06F12/02, G06F12/0862, G06F12/0811
CPC classification number: G06F12/084, G06F12/0238, G06F12/0811, G06F12/0862
Abstract: A technical solution to the technical problem of how to reduce the undesirable side effects of offloading computations to memory uses read hints to preload results of memory-side processing into a processor-side cache. A cache controller, in response to identifying a read hint in a memory-side processing instruction, causes results of the memory-side processing to be preloaded into a processor-side cache. Implementations include, without limitation: enabling or disabling the preloading based upon cache thrashing levels; preloading results, or portions of results, of memory-side processing to particular destination caches; preloading results based upon priority and/or degree of confidence, and/or during periods of low data bus and/or command bus utilization; taking last stores into consideration; and enforcing an ordering constraint to ensure that preloading occurs after memory-side processing results are complete.
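One way to picture the controller behavior is a small simulation. The instruction fields, the thrashing-level gate, and the class names below are assumptions made for illustration rather than the patented implementation.

class ProcessorSideCache:
    def __init__(self):
        self.lines = {}

    def preload(self, address, data):
        # Results of memory-side processing are installed before the CPU asks.
        self.lines[address] = data

class MemorySideProcessor:
    def execute(self, instruction):
        # Stand-in for near-memory compute; the result exists in memory first.
        return sum(instruction["operands"])

def issue(instruction, pim, cache, thrashing_level, thrash_limit=0.8):
    result = pim.execute(instruction)
    # Preloading is gated on the read hint and can be disabled when cache
    # thrashing is high; the ordering constraint is respected because the
    # preload happens only after the memory-side result is complete.
    if instruction.get("read_hint") and thrashing_level < thrash_limit:
        cache.preload(instruction["result_address"], result)
    return result

cache = ProcessorSideCache()
issue({"operands": [1, 2, 3], "result_address": 0x1000, "read_hint": True},
      MemorySideProcessor(), cache, thrashing_level=0.2)
print(cache.lines)   # {4096: 6} -- the result was preloaded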
-
Publication Number: US11837588B2
Publication Date: 2023-12-05
Application Number: US17978389
Application Date: 2022-11-01
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Milind S. Bhagavat, Rahul Agarwal
CPC classification number: H01L25/16, H01L21/561, H01L21/565, H01L21/568, H01L23/49838, H05K1/0231, H05K3/284, H05K3/303, H05K3/3442, H05K2201/10015, H05K2201/10522, H05K2201/10636
Abstract: Various circuit boards with mounted passive components and methods of making the same are disclosed. In one aspect, a method of manufacturing is provided that includes at least partially encapsulating a first plurality of passive components in a molding material to create a first molded passive component group. The first molded passive component group is mounted on a surface of a circuit board. The first plurality of passive components are electrically connected to the circuit board.
-
Publication Number: US11836549B2
Publication Date: 2023-12-05
Application Number: US17071876
Application Date: 2020-10-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Samantray Biplab Raut
Abstract: Computer-implemented techniques for fast block-based parallel message passing interface (MPI) transpose are disclosed. The techniques achieve an in-place parallel matrix transpose of an input matrix in a distributed-memory multiprocessor environment with reduced consumption of computer processing time and storage media resources. An in-memory copy of the input matrix or a submatrix thereof to use as the send buffer for MPI send operations is not needed. Instead, the input matrix is divided in-place into data blocks of at most a predetermined size, and the data block(s) for a given submatrix are sent using an MPI API before any data block(s) for that submatrix are received, using an MPI API, in the place of the sent data block(s). This avoids making an in-memory copy to use as a send buffer, yet the input matrix is still transposed in-place.
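The send-before-receive, in-place idea can be sketched with mpi4py. The square row-partitioned layout, the block size equal to the per-rank row count, and the use of Sendrecv_replace (which exchanges a block in place without a dedicated send buffer) are simplifying assumptions, not the patented method.

# Run with, e.g.: mpirun -np 4 python transpose_sketch.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

N = 4 * size                  # assumed global dimension, divisible by the rank count
rows = N // size              # contiguous rows owned by this rank
local = np.arange(rank * rows * N, (rank + 1) * rows * N,
                  dtype=np.float64).reshape(rows, N)

for peer in range(size):
    block = np.ascontiguousarray(local[:, peer * rows:(peer + 1) * rows])
    if peer == rank:
        block = block.T.copy()                     # diagonal block: transpose locally
    else:
        # Send this block to 'peer' and receive peer's counterpart block in its
        # place, so no extra in-memory copy is kept as a dedicated send buffer.
        comm.Sendrecv_replace(block, dest=peer, source=peer)
        block = block.T
    local[:, peer * rows:(peer + 1) * rows] = block
# 'local' now holds this rank's rows of the transposed matrix.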
-
Publication Number: US11836088B2
Publication Date: 2023-12-05
Application Number: US17557793
Application Date: 2021-12-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Jeffrey Christopher Allan
IPC: G06F12/08, G06F12/0893
CPC classification number: G06F12/0893, G06F2212/6042
Abstract: Guided cache replacement is described. In accordance with the described techniques, a request to access a cache is received, and a cache replacement policy which controls loading data into the cache is accessed. The cache replacement policy includes a tree structure having nodes corresponding to cachelines of the cache and a traversal algorithm controlling traversal of the tree structure to select one of the cachelines. Traversal of the tree structure is guided using the traversal algorithm to select a cacheline to allocate to the request. The guided traversal modifies at least one decision of the traversal algorithm to avoid selection of a non-replaceable cacheline.
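As a concrete, deliberately simplified picture, the sketch below uses a binary tree-PLRU over an 8-way set and overrides a traversal decision whenever the chosen subtree holds only non-replaceable lines. The way count, the lock flags, and the PLRU bit convention are assumptions, not the patented policy.

WAYS = 8
tree_bits = [0] * (WAYS - 1)    # one decision bit per internal tree node
locked = [False] * WAYS         # True marks a non-replaceable cacheline

def replaceable_under(node, direction):
    # Check whether any leaf (way) under the chosen child is replaceable.
    stack = [2 * node + 1 + direction]
    while stack:
        n = stack.pop()
        if n >= WAYS - 1:                       # leaf node -> way index
            if not locked[n - (WAYS - 1)]:
                return True
        else:
            stack += [2 * n + 1, 2 * n + 2]
    return False

def select_victim():
    node = 0
    while node < WAYS - 1:                      # walk internal nodes down to a leaf
        go_right = tree_bits[node]
        # Guided step: override the algorithm's decision when it would lead
        # only to non-replaceable cachelines.
        if not replaceable_under(node, go_right):
            go_right = 1 - go_right
        tree_bits[node] = 1 - go_right          # point the bit away, PLRU-style
        node = 2 * node + 1 + go_right
    return node - (WAYS - 1)                    # selected way

locked[0] = locked[1] = True                    # e.g., lines that must not be evicted
print(select_victim())                          # never picks way 0 or 1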
-
Publication Number: US11836085B2
Publication Date: 2023-12-05
Application Number: US17514792
Application Date: 2021-10-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul J. Moyer
IPC: G06F12/08, G06F12/0891, G06F9/30, G06F12/0831
CPC classification number: G06F12/0891, G06F9/30043, G06F9/30047, G06F12/0833
Abstract: Techniques for performing cache operations are provided. The techniques include recording an entry indicating that a cache line is exclusive-upgradeable; removing the cache line from a cache; and converting a request to insert the cache line into the cache into a request to insert the cache line in an exclusive state.
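A toy model of that flow, with a simple set standing in for the recorded entries and strings standing in for coherence states; both are assumptions made for illustration only.

exclusive_upgradeable = set()   # addresses recorded as exclusive-upgradeable
cache = {}                      # address -> coherence state

def evict(address):
    # Record that the line is exclusive-upgradeable, then remove it from the cache.
    exclusive_upgradeable.add(address)
    cache.pop(address, None)

def insert(address, requested_state):
    # Convert the insert request into an exclusive insert if the line was
    # previously recorded as exclusive-upgradeable.
    state = "exclusive" if address in exclusive_upgradeable else requested_state
    cache[address] = state
    return state

cache[0x40] = "shared"
evict(0x40)
print(insert(0x40, "shared"))   # 'exclusive' -- the request was upgraded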
-
Publication Number: US20230384947A1
Publication Date: 2023-11-30
Application Number: US18208639
Application Date: 2023-06-12
Applicant: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULC
Inventor: Joseph L. Greathouse, Alan D. Smith, Francisco L. Duran, Felix Kuehling, Anthony Asaro
CPC classification number: G06F3/0619, G06F3/064, G06F12/0607, G06F3/0659, G06F3/0673, G06F3/0644
Abstract: Systems and methods for dynamic repartitioning of physical memory address mapping involve relocating data stored at one or more physical memory locations of one or more memory devices to another memory device or mass storage device, repartitioning one or more corresponding physical memory maps to include new mappings between physical memory addresses and physical memory locations of the one or more memory devices, then loading the relocated data back onto the one or more memory devices at physical memory locations determined by the new physical address mapping. Such dynamic repartitioning of the physical memory address mapping does not require a processing system to be rebooted and has various applications in connection with interleaving reconfiguration and error correcting code (ECC) reconfiguration of the processing system.
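The relocate / repartition / reload sequence can be mocked up in a few lines. The interleaving function, the dict-based device model, and the host-side staging area are simplifying assumptions, not the patented mechanism.

def address_to_location(addr, num_devices, interleave_bytes):
    # Map a physical address to (device index, offset) for a given interleave width.
    chunk, within = divmod(addr, interleave_bytes)
    device = chunk % num_devices
    offset = (chunk // num_devices) * interleave_bytes + within
    return device, offset

def repartition(devices, live_data, new_interleave_bytes):
    # 1) Relocate data stored on the memory devices to a staging area
    #    (another memory device or mass storage device in the abstract).
    staged = dict(live_data)
    for device in devices:
        device.clear()
    # 2) Repartition: adopt the new physical-address-to-location mapping.
    # 3) Reload the data at the locations the new mapping dictates,
    #    without rebooting the processing system.
    for addr, value in staged.items():
        dev, off = address_to_location(addr, len(devices), new_interleave_bytes)
        devices[dev][off] = value

devices = [{}, {}]                       # two memory devices modeled as dicts
data = {0: "a", 64: "b", 128: "c"}       # live data keyed by physical address
repartition(devices, data, new_interleave_bytes=128)
print(devices)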