-
Publication number: US20240004721A1
Publication date: 2024-01-04
Application number: US17853294
Filing date: 2022-06-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Indrani Paul, Alexander J. Branover, Benjamin Tsien, Elliot H. Mednick
IPC: G06F9/50
CPC classification number: G06F9/5083, G06F9/5038, G06F9/5033, G06F9/5016
Abstract: An apparatus and method for efficiently performing power management for a multi-client computing system. In various implementations, a computing system includes multiple clients that process tasks corresponding to applications. The clients store generated requests of a particular type while processing tasks. A client receives an indication specifying that another client is having requests of the particular type being serviced. In response to receiving this indication, the client inserts a first urgency level in one or more stored requests of the particular type prior to sending the requests for servicing. When the client determines a particular time interval has elapsed, the client sends an indication to other clients specifying that requests of the particular type are being serviced. The client also inserts a second urgency level different from the first urgency level in one or more stored requests of the particular type prior to sending the requests for servicing.
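The urgency-tagging behavior described in this abstract can be sketched as follows. This is a minimal illustration, not the patented design: the `Client` class, the tick-based `interval`, and the numeric values of the two urgency levels are all assumptions made for the example (the abstract only states that the two levels differ).

```python
from dataclasses import dataclass, field

# Hypothetical values: the abstract only says the two levels are different.
FIRST_URGENCY = 1
SECOND_URGENCY = 2


@dataclass
class Client:
    name: str
    interval: int                                 # assumed: ticks to wait with pending requests
    peers: list = field(default_factory=list)     # other clients to notify
    stored: list = field(default_factory=list)    # stored requests of the particular type
    sent: list = field(default_factory=list)      # (request, urgency) pairs sent for servicing
    elapsed: int = 0

    def store(self, request):
        """Hold a generated request instead of sending it immediately."""
        self.stored.append(request)

    def _flush(self, urgency):
        """Insert the urgency level into every stored request and send it."""
        self.sent.extend((request, urgency) for request in self.stored)
        self.stored.clear()
        self.elapsed = 0

    def on_peer_indication(self):
        """Another client is having requests serviced: send with the first level."""
        if self.stored:
            self._flush(FIRST_URGENCY)

    def tick(self):
        """Advance time; once the interval elapses, notify peers and send with the second level."""
        if not self.stored:
            return
        self.elapsed += 1
        if self.elapsed >= self.interval:
            for peer in self.peers:
                peer.on_peer_indication()
            self._flush(SECOND_URGENCY)
```

In this sketch, a client whose interval expires first notifies its peers, so their stored requests go out in the same servicing window with the first urgency level while its own go out with the second.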
-
Publication number: US20190171420A1
Publication date: 2019-06-06
Application number: US15833287
Filing date: 2017-12-06
Applicant: Advanced Micro Devices, Inc.
Inventor: Nicholas P. Malaya, Elliot H. Mednick
IPC: G06F7/57, G06N3/04, G06F7/544, H03K19/177
Abstract: A method includes providing a set of one or more computational units implemented in a set of one or more field programmable gate array (FPGA) devices, where the set of one or more computational units is configured to generate a plurality of output values based on one or more input values. The method further includes, for each computational unit of the set of computational units, performing a first calculation in the computational unit using a first number representation, where a first output of the plurality of output values is based on the first calculation, determining a second number representation based on the first output value, and performing a second calculation in the computational unit using the second number representation, where a second output of the plurality of output values is based on the second calculation.
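The two-step calculation in this abstract can be illustrated in software. The sketch below is an assumption-laden analogy rather than the FPGA implementation: it models a "number representation" as a signed 16-bit fixed-point format with a chosen number of fractional bits, and the helper names `quantize`, `pick_frac_bits`, and `compute_unit` are hypothetical.

```python
def quantize(x, frac_bits, word_bits=16):
    """Round x to a signed fixed-point value with frac_bits fractional bits, saturating on overflow."""
    scale = 1 << frac_bits
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    q = max(lo, min(hi, round(x * scale)))
    return q / scale


def pick_frac_bits(value, word_bits=16):
    """Choose a second representation from the first output: keep just enough integer bits."""
    int_bits = max(1, int(abs(value)).bit_length())
    return max(0, word_bits - 1 - int_bits)


def compute_unit(a, b, c, first_frac_bits=8):
    """First calculation in the first representation, second calculation in the derived one."""
    first = quantize(a * b, first_frac_bits)
    second_frac_bits = pick_frac_bits(first)
    second = quantize(first + c, second_frac_bits)
    return first, second_frac_bits, second
```

For example, `compute_unit(1.5, 2.0, 0.001)` produces a first output of 3.0, from which the sketch selects 13 fractional bits, so the small addend is no longer rounded away as it would be with 8 fractional bits.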
-
Publication number: US20190102175A1
Publication date: 2019-04-04
Application number: US15843965
Filing date: 2017-12-15
Applicant: Advanced Micro Devices, Inc.
Inventor: David A. Roberts, Elliot H. Mednick, David John Cownie
Abstract: A hybrid floating-point arithmetic processor includes a scheduler, a hybrid register file, and a hybrid arithmetic operation circuit. The scheduler has an input for receiving floating-point instructions, and an output for providing decoded register numbers in response to the floating-point instructions. The hybrid register file is coupled to the scheduler and contains circuitry for storing a plurality of floating-point numbers each represented by a digital sign bit, a digital exponent, and an analog mantissa. The hybrid register file has an output for providing selected ones of the plurality of floating-point numbers in response to the decoded register numbers. The hybrid arithmetic operation circuit is coupled to the scheduler and to the hybrid register file, for performing a hybrid arithmetic operation between two floating-point numbers selected by the scheduler and providing a hybrid result represented by a result digital sign bit, a result digital exponent, and a result analog mantissa.
-
Publication number: US11216250B2
Publication date: 2022-01-04
Application number: US15833287
Filing date: 2017-12-06
Applicant: Advanced Micro Devices, Inc.
Inventor: Nicholas P. Malaya, Elliot H. Mednick
Abstract: A method includes providing a set of one or more computational units implemented in a set of one or more field programmable gate array (FPGA) devices, where the set of one or more computational units is configured to generate a plurality of output values based on one or more input values. The method further includes, for each computational unit of the set of computational units, performing a first calculation in the computational unit using a first number representation, where a first output of the plurality of output values is based on the first calculation, determining a second number representation based on the first output value, and performing a second calculation in the computational unit using the second number representation, where a second output of the plurality of output values is based on the second calculation.
-
Publication number: US10452548B2
Publication date: 2019-10-22
Application number: US15718564
Filing date: 2017-09-28
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: David A. Roberts, Elliot H. Mednick
IPC: G06F12/0804, G06F12/0811, G06F12/0831, G06F12/084, G06F12/0891, G06F12/126
Abstract: A method of preemptive cache writeback includes transmitting, from a first cache controller of a first cache to a second cache controller of a second cache, an unused bandwidth message representing an unused bandwidth between the first cache and the second cache during a first cycle. During a second cycle, a cache line containing dirty data is preemptively written back from the second cache to the first cache based on the unused bandwidth message. Further, the cache line in the second cache is written over in response to a cache miss to the second cache.
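A toy model of the preemptive writeback flow in this abstract is sketched below. It is illustrative only: the `SecondCache` class, the dictionary standing in for the first cache, and the `slots` count carried by the unused bandwidth message are assumptions, not details taken from the patent.

```python
class SecondCache:
    """Toy model of the second cache, which writes dirty lines back to the first cache."""

    def __init__(self):
        self.lines = {}  # address -> [data, dirty flag]

    def write(self, addr, data):
        """A store makes the line dirty."""
        self.lines[addr] = [data, True]

    def on_unused_bandwidth(self, slots, first_cache):
        """Handle an unused bandwidth message: preemptively write back up to `slots` dirty lines."""
        written = 0
        for addr, line in self.lines.items():
            if written == slots:
                break
            if line[1]:
                first_cache[addr] = line[0]  # write back over the otherwise idle link
                line[1] = False              # the line is now clean
                written += 1
        return written

    def evict_for_miss(self, victim, new_addr, new_data, first_cache):
        """On a miss, overwrite the victim; return True if a demand writeback was still needed."""
        data, dirty = self.lines.pop(victim)
        if dirty:
            first_cache[victim] = data
        self.lines[new_addr] = [new_data, False]
        return dirty
```

In this sketch, a line that was cleaned during an idle cycle can be overwritten on a later miss without a writeback on the critical path, while a line that was never cleaned still needs one.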
-
Publication number: US20190243772A1
Publication date: 2019-08-08
Application number: US15891322
Filing date: 2018-02-07
Applicant: Advanced Micro Devices, Inc.
Inventor: David A. Roberts, Elliot H. Mednick
IPC: G06F12/0895, G06F12/06, G06F3/06
CPC classification number: G06F12/0895, G06F3/0619, G06F12/0646
Abstract: A method includes, for each data value in a set of one or more data values, determining a boundary between a high order portion of the data value and a low order portion of the data value, storing the low order portion at a first memory location utilizing a low data fidelity storage scheme, and storing the high order portion at a second memory location utilizing a high data fidelity storage scheme for recording data at a higher data fidelity than the low data fidelity storage scheme.
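The split described in this abstract can be shown with plain bit operations. The sketch below assumes unsigned integer data values and a boundary expressed as a bit position; the two dictionaries merely stand in for the high fidelity and low fidelity memory locations, and all names are hypothetical.

```python
def split_value(value, boundary_bit):
    """Split an unsigned integer into (high order portion, low order portion) at boundary_bit."""
    low_mask = (1 << boundary_bit) - 1
    return value >> boundary_bit, value & low_mask


def join_value(high, low, boundary_bit):
    """Recombine the two portions into the original value."""
    return (high << boundary_bit) | low


# Stand-ins for the two storage schemes (assumed, for illustration only).
high_fidelity_store = {}
low_fidelity_store = {}


def store(key, value, boundary_bit):
    """Store each portion of the value in the location matching its required fidelity."""
    high, low = split_value(value, boundary_bit)
    high_fidelity_store[key] = high
    low_fidelity_store[key] = low


def load(key, boundary_bit):
    """Read both portions back and reassemble the value."""
    return join_value(high_fidelity_store[key], low_fidelity_store[key], boundary_bit)
```

The point of the split is that an error in the low order portion changes the reassembled value by less than `2 ** boundary_bit`, whereas the high order portion is kept where errors are less likely.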
-
Publication number: US10164639B1
Publication date: 2018-12-25
Application number: US15812411
Filing date: 2017-11-14
Applicant: Advanced Micro Devices, Inc.
Inventor: David A. Roberts, Andrew G. Kegel, Elliot H. Mednick
IPC: H03K19/177, G06F17/50, G06F15/78
Abstract: A macro scheduler includes a resource tracking module configured to update a database enumerating a plurality of macro components of a set of field programmable gate array (FPGA) devices, a communication interface configured to receive from a first client device a first design definition indicating one or more specified macro components for a design, resource allocation logic configured to allocate a first set of macro components for the design by allocating one of the plurality of macro components for each of the one or more specified macro components indicated in the first design definition, and configuration logic configured to implement the design in the set of FPGA devices by configuring the first set of allocated macro components according to the first design definition.
-
Publication number: US11586472B2
Publication date: 2023-02-21
Application number: US16709404
Filing date: 2019-12-10
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexander J. Branover, Benjamin Tsien, Elliot H. Mednick
Abstract: A method, system, and apparatus determine that one or more tasks should be relocated from a first processor to a second processor by comparing performance metrics to associated thresholds or by using other indications. To relocate the one or more tasks from the first processor to the second processor, the first processor is stalled and state information from the first processor is copied to the second processor. The second processor uses the state information and then services incoming tasks instead of the first processor.
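The relocation sequence in this abstract (decide, stall, copy state, redirect) can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `Processor` class, the metric names, and the rule that a metric exceeding its threshold triggers relocation are all hypothetical, since the abstract does not specify the direction of the comparison.

```python
import copy


class Processor:
    """Toy processor that services tasks and keeps some state."""

    def __init__(self, name):
        self.name = name
        self.state = {}
        self.stalled = False
        self.handled = []

    def service(self, task):
        if self.stalled:
            raise RuntimeError(f"{self.name} is stalled")
        self.state["serviced"] = self.state.get("serviced", 0) + 1
        self.handled.append(task)


def should_relocate(metrics, thresholds):
    """Assumed policy: relocate when any metric exceeds its associated threshold."""
    return any(metrics[name] > limit for name, limit in thresholds.items())


def relocate(first, second):
    """Stall the first processor, copy its state, and return the processor that now services tasks."""
    first.stalled = True
    second.state = copy.deepcopy(first.state)
    return second
```

A caller would keep a reference to the currently active processor and replace it with the return value of `relocate` whenever `should_relocate` reports true, so that incoming tasks reach the second processor instead of the first.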
-
Publication number: US10289413B2
Publication date: 2019-05-14
Application number: US15843965
Filing date: 2017-12-15
Applicant: Advanced Micro Devices, Inc.
Inventor: David A. Roberts, Elliot H. Mednick, David John Cownie
Abstract: A hybrid floating-point arithmetic processor includes a scheduler, a hybrid register file, and a hybrid arithmetic operation circuit. The scheduler has an input for receiving floating-point instructions, and an output for providing decoded register numbers in response to the floating-point instructions. The hybrid register file is coupled to the scheduler and contains circuitry for storing a plurality of floating-point numbers each represented by a digital sign bit, a digital exponent, and an analog mantissa. The hybrid register file has an output for providing selected ones of the plurality of floating-point numbers in response to the decoded register numbers. The hybrid arithmetic operation circuit is coupled to the scheduler and to the hybrid register file, for performing a hybrid arithmetic operation between two floating-point numbers selected by the scheduler and providing a hybrid result represented by a result digital sign bit, a result digital exponent, and a result analog mantissa.
-
Publication number: US20170212760A1
Publication date: 2017-07-27
Application number: US15353161
Filing date: 2016-11-16
Applicant: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventor: Meenakshi Sundaram Bhaskaran, Elliot H. Mednick, David A. Roberts, Anthony Asaro, Amin Farmahini-Farahani
IPC: G06F9/30, G06F12/0817, G06F12/0875, G06F9/38
CPC classification number: G06F9/30043, G06F9/3005, G06F9/3016, G06F9/3802, G06F12/084, G06F12/0862, G06F12/0875, G06F12/1027, G06F2212/1024, G06F2212/452, G06F2212/6028
Abstract: A system and method for reducing latencies of main memory data accesses are described. A non-blocking load (NBLD) instruction identifies an address of requested data and a subroutine. The subroutine includes instructions dependent on the requested data. A processing unit verifies that address translations are available for both the address and the subroutine. The processing unit continues processing instructions with no stalls caused by younger-in-program-order instructions waiting for the requested data. The non-blocking load unit performs a cache coherent data read request on behalf of the NBLD instruction and requests that the processing unit perform an asynchronous jump to the subroutine upon return of the requested data from lower-level memory.