-
Publication No.: US20190286362A1
Publication date: 2019-09-19
Application No.: US16432391
Filing date: 2019-06-05
Applicant: Advanced Micro Devices, Inc.
Inventor: Arkaprava Basu, Mitesh R. Meswani, Dibakar Gope, Sooraj Puthoor
Abstract: A processing apparatus is provided that includes NVRAM and one or more processors configured to process a first set and a second set of instructions according to a hierarchical processing scope and process a scoped persistence barrier residing in the program after the first instruction set and before the second instruction set. The barrier includes an instruction to cause first data to persist in the NVRAM before second data persists in the NVRAM. The first data results from execution of each of the first set of instructions processed according to the one hierarchical processing scope. The second data results from execution of each of the second set of instructions processed according to the one hierarchical processing scope. The processing apparatus also includes a controller configured to cause the first data to persist in the NVRAM before the second data persists in the NVRAM based on the scoped persistence barrier.
-
Publication No.: US20190286209A1
Publication date: 2019-09-19
Application No.: US15923153
Filing date: 2018-03-16
Applicant: Advanced Micro Devices, Inc.
Inventor: Shijia Wei, Joseph L. Greathouse, John Kalamatianos
IPC: G06F1/32
Abstract: A processor utilizes instruction based sampling to generate sampling data sampled on a per instruction basis during execution of an instruction. The sampling data indicates what processor hardware was used due to the execution of the instruction. Software receives the sampling data and generates an estimate of energy used by the instruction based on the sampling data. The sampling data may include microarchitectural events and the energy estimate utilizes a base energy amount corresponding to the instruction executed along with energy amounts corresponding to the microarchitectural events in the sampling data. The sampling data may include switching events associated with hardware blocks that switched due to execution of the instruction and the energy estimate for the instruction is based on the switching events and capacitance estimates associated with the hardware blocks.
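Note (illustrative sketch, not part of the patent record): the two estimates described in the abstract can be modeled roughly as follows. The event names, energy values, and the use of the standard 0.5 * C * V^2 energy per switching event are assumptions made for illustration; the abstract does not specify them.

```python
# Hypothetical per-event energy costs in nanojoules (placeholder values).
EVENT_ENERGY_NJ = {"l1_miss": 0.5, "l2_miss": 2.0, "branch_mispredict": 1.5}

def estimate_energy_nj(base_nj, events):
    """Base energy for the instruction plus the cost of each sampled event."""
    return base_nj + sum(EVENT_ENERGY_NJ[e] for e in events)

def estimate_switching_energy_j(switch_counts, capacitance_f, vdd):
    """Switching-based variant: per hardware block, count * 0.5 * C * V^2.

    switch_counts and capacitance_f are dicts keyed by hardware block name.
    The 0.5 * C * V^2 term is an assumed textbook model, not the patent's.
    """
    return sum(
        count * 0.5 * capacitance_f[block] * vdd * vdd
        for block, count in switch_counts.items()
    )
```

For example, `estimate_energy_nj(1.0, ["l1_miss", "l2_miss"])` adds the two event costs to the base amount of 1.0 nJ.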
-
Publication No.: US10417140B2
Publication date: 2019-09-17
Application No.: US15442487
Filing date: 2017-02-24
Applicant: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventor: Wade K. Smith, Kostantinos Danny Christidis
IPC: G06F12/1027, G06F12/1081, G06F12/1009, G06F9/38
Abstract: Techniques are provided for using a translation lookaside buffer to provide low latency memory address translations for data streams. Clients of a memory system first prepare the address translation cache hierarchy by requesting that a translation pre-fetch stream is initialized. After the translation pre-fetch stream is initialized, the cache hierarchy returns an acknowledgment of completion to the client, which then begins to access memory. Pre-fetch streams are specified in terms of address ranges and are performed for large contiguous portions of the virtual memory address space.
-
Publication No.: US10409524B1
Publication date: 2019-09-10
Application No.: US15962997
Filing date: 2018-04-25
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexander J. Branover, Thomas James Gibney
IPC: G06F3/06, G06F1/3234, G06F1/3218
Abstract: Systems, apparatuses, and methods for dynamically optimizing memory traffic in multi-client systems are disclosed. A system includes a plurality of client devices, a memory subsystem, and a communication fabric coupled to the client devices and the memory subsystem. The system includes a first client which generates memory access requests targeting the memory subsystem. Prior to sending a given memory access request to the fabric, the first client analyzes metadata associated with data targeted by the given memory access request. If the metadata indicates the targeted data is the same as or is able to be derived from previously retrieved data, the first client prevents the request from being sent out on the fabric on the data path to memory subsystem. This helps to reduce memory bandwidth consumption and allows the fabric and the memory subsystem to stay in a low-power state for longer periods of time.
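Note (illustrative sketch, not part of the patent record): the client-side check described in the abstract might look like the following. The metadata keys `same_as_previous` and `derivable_from_previous` are hypothetical names chosen for this example.

```python
def should_send_request(metadata):
    """Return False when the metadata says the data need not be fetched.

    metadata is a dict; a missing key is treated as "not known to be
    redundant", so the request is sent by default.
    """
    redundant = metadata.get("same_as_previous", False) or metadata.get(
        "derivable_from_previous", False
    )
    return not redundant
```

Suppressed requests never reach the fabric, which is what lets the fabric and memory subsystem remain in a low-power state longer.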
-
Publication No.: US10402327B2
Publication date: 2019-09-03
Application No.: US15358318
Filing date: 2016-11-22
Applicant: Advanced Micro Devices, Inc.
Inventor: David A. Roberts, Ehsan Fatehi
IPC: G06F12/08, G06F12/0815, G06F12/084, G06F12/0817, G06F13/16
Abstract: A non-uniform memory access system includes several nodes that each have one or more processors, caches, local main memory, and a local bus that connects a node's processor(s) to its memory. The nodes are coupled to one another over a collection of point-to-point interconnects, thereby permitting processors in one node to access data stored in another node. Memory access time for remote memory takes longer than local memory because remote memory accesses have to travel across a communications network to arrive at the requesting processor. In some embodiments, inter-cache and main-memory-to-cache latencies are measured to determine whether it would be more efficient to satisfy memory access requests using cached copies stored in caches of owning nodes or from main memory of home nodes.
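Note (illustrative sketch, not part of the patent record): the latency comparison described in the abstract can be reduced to a simple selection. The function name, return labels, and tie-breaking toward main memory are assumptions for illustration.

```python
def choose_source(inter_cache_latency_ns, memory_to_cache_latency_ns):
    """Pick the lower-latency source for a memory access request.

    Ties go to the home node's main memory (an arbitrary choice here).
    """
    if inter_cache_latency_ns < memory_to_cache_latency_ns:
        return "owning_node_cache"
    return "home_node_memory"
```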
-
Publication No.: US20190268086A1
Publication date: 2019-08-29
Application No.: US15903253
Filing date: 2018-02-23
Applicant: Advanced Micro Devices, Inc.
Inventor: John Wuu, Samuel Naffziger, Michael K. Ciraula, Russell Schreiber
Abstract: An integrated circuit includes first and second through-silicon via (TSV) circuits and a steering logic circuit. The first TSV circuit has a first TSV and a first multiplexer for selecting between a first TSV data signal received from the first TSV and a first local data signal for transmission to a first TSV output terminal. The second TSV circuit includes a second TSV and a second multiplexer for selecting between a second TSV data signal received from the second TSV and the first local data signal for transmission to a second TSV output terminal. The steering logic circuit controls the first multiplexer to select the first local data signal and the second multiplexer to select the second TSV data signal in a first mode, and the first multiplexer to select the first TSV data signal and the second multiplexer to select the first local data signal in a second mode.
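Note (illustrative sketch, not part of the patent record): the two steering modes in the abstract can be written as a small truth-table function. The function name and the integer mode encoding are assumptions.

```python
def tsv_outputs(mode, tsv1_data, tsv2_data, local_data):
    """Return (first_output, second_output) for the given steering mode."""
    if mode == 1:
        # First mux selects the local signal; second mux selects its TSV signal.
        return local_data, tsv2_data
    if mode == 2:
        # First mux selects its TSV signal; second mux selects the local signal.
        return tsv1_data, local_data
    raise ValueError("mode must be 1 or 2")
```

In both modes the local data signal drives exactly one of the two TSV output terminals, while the other terminal passes through its own TSV signal.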
-
Publication No.: US10394726B2
Publication date: 2019-08-27
Application No.: US15229708
Filing date: 2016-08-05
Applicant: Advanced Micro Devices, Inc.
Inventor: Gabriel Loh
Abstract: A memory network includes a plurality of memory nodes each identifiable by an ordinal number m, and a set of links divided into N subsets of links, where each subset of links is identifiable by an ordinal number n. For each subset of the plurality of N subsets of links, each link in the subset connects two memory nodes that have ordinal numbers m differing by b^(n-1), where b is a positive number. Each of the memory nodes is communicatively coupled to a processor via at least two non-overlapping pathways through the plurality of links.
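Note (illustrative sketch, not part of the patent record): the link subsets can be enumerated as below, reading the node-number difference as b raised to the power (n - 1), which is an assumption about the original notation. Zero-based node numbering, an integer b, and the absence of wraparound links are also assumptions.

```python
def build_links(num_nodes, num_subsets, b=2):
    """Return {n: [(m, m + b**(n - 1)), ...]} for n = 1..num_subsets.

    Nodes are numbered 0..num_nodes - 1 and links do not wrap around.
    """
    links = {}
    for n in range(1, num_subsets + 1):
        stride = b ** (n - 1)
        links[n] = [(m, m + stride) for m in range(num_nodes - stride)]
    return links
```

With four nodes, two subsets, and b = 2, subset 1 links adjacent nodes and subset 2 links nodes two apart, so every node has more than one route toward node 0.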
-
Publication No.: US20190257696A1
Publication date: 2019-08-22
Application No.: US15902101
Filing date: 2018-02-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Ravinder Reddy Rachala, Stephen Victor Kosonocky, Stephen C. Ennis
Abstract: A calibrated temperature sensor includes a power on oscillator responsive to a calibration enable signal for providing a power on clock signal, a temperature dependent oscillator responsive to said calibration enable signal for providing a temperature dependent clock signal, and a measurement logic circuit. The measurement logic circuit counts a first number of pulses of the temperature dependent clock signal during a first calibration period using the power on clock signal, a second number of pulses of the temperature dependent clock signal during a second calibration period using a system clock signal, and a third number of pulses of the power on clock signal over a third calibration period using the system clock signal, and a fourth number of pulses of the temperature dependent clock signal using the system clock signal during a normal operation mode, wherein the first calibration period precedes both the second and third calibration periods.
-
Publication No.: US20190235838A1
Publication date: 2019-08-01
Application No.: US16378055
Filing date: 2019-04-08
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Greg SADOWSKI, Wayne Burleson
CPC classification number: G06F7/4824, G06F7/729
Abstract: A conversion unit converts operands from a conventional number system that represents each binary number in the operands as one bit to redundant number system (RNS) operands that represent each binary number as a plurality of bits. An arithmetic logic unit performs an arithmetic operation on the RNS operands in a direction from a most significant bit (MSB) to a least significant bit (LSB). The arithmetic logic unit stops performing the arithmetic operation prior to performing the arithmetic operation on a target binary number indicated by a dynamic precision associated with the RNS operands. In some cases, a power supply provides power to bit slices in the arithmetic logic unit and a clock signal generator provides clock signals to the bit slices. Gate logic is configured to gate the power or the clock signals provided to a subset of the bit slices.
-
Publication No.: US10366734B2
Publication date: 2019-07-30
Application No.: US15424418
Filing date: 2017-02-03
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexander W. Schaefer, Ravi T. Jotwani, Samiul Haque Khan, David Hugh McIntyre, Stephen Victor Kosonocky, John J. Wuu, Russell Schreiber
IPC: G11C8/08, G11C11/418, G11C5/14, G11C11/413, G11C11/419
Abstract: A system and method for efficient power, performance and stability tradeoffs of memory accesses under a variety of conditions are described. A system management unit in a computing system interfaces with a memory and a processing unit, and uses boosting of word line voltage levels in the memory to assist write operations. The computing system supports selecting one of multiple word line boost values, each with an associated cross-over region. A cross-over region is a range of operating voltages for the memory used for determining whether to enable or disable boosting of word line voltage levels in the memory. The system management unit selects between enabling and disabling the boosting of word line voltage levels based on a target operational voltage for the memory and the cross-over region prior to updating the operating parameters of the memory to include the target operational voltage.
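Note (illustrative sketch, not part of the patent record): one way to read the cross-over decision is as a hysteresis check on the target voltage. The abstract does not say which side of the region enables boosting; this sketch assumes boosting is enabled below the region, disabled above it, and left unchanged inside it. All names are hypothetical.

```python
def boost_enabled(target_v, region_low_v, region_high_v, currently_enabled):
    """Decide the word line boost state before applying the target voltage.

    Assumed policy: boost below the cross-over region, no boost above it,
    and keep the current state when the target falls inside the region.
    """
    if target_v < region_low_v:
        return True
    if target_v > region_high_v:
        return False
    return currently_enabled
```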