-
Publication No.: US20210096864A1
Publication date: 2021-04-01
Application No.: US16584775
Filing date: 2019-09-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Michael John Bedy
IPC: G06F9/30
Abstract: Described herein are techniques for saving registers in the event of a function call. The techniques include modifying a program including a block of code designated as a calling code that calls a function. The modifying includes modifying the calling code to set a register usage mask indicating which registers are in use at the time of the function call. The modifying also includes modifying the function to combine the information of the register usage mask with information indicating registers used by the function to generate registers to be saved and save the registers to be saved.
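The mask combination described in this abstract can be sketched as a bitwise operation. This is only an illustrative sketch: the abstract does not say how the two pieces of information are combined, so the use of a bitwise AND (save only registers that are both live in the caller and used by the function) is an assumption, and the names `registers_to_save` and `mask_to_registers` are hypothetical.

```python
def registers_to_save(caller_live_mask: int, callee_used_mask: int) -> int:
    """Return a bitmask of registers the function should save.

    Assumption: bit i set in caller_live_mask means register i is in use
    at the call site; bit i set in callee_used_mask means the function
    writes register i. Only the overlap needs to be preserved.
    """
    return caller_live_mask & callee_used_mask


def mask_to_registers(mask: int) -> list[int]:
    """Expand a bitmask into a sorted list of register indices."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


# Caller has r0, r1, r3 live; the function uses r1 and r2.
save_mask = registers_to_save(0b1011, 0b0110)
print(mask_to_registers(save_mask))
```

Under this assumption only r1 is saved: r2 is used by the function but holds nothing live, and r0 and r3 are live but never touched.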
-
Publication No.: US20210096174A1
Publication date: 2021-04-01
Application No.: US16585963
Filing date: 2019-09-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Venkat Krishnan Ravikumar , Jiann Min Chin , Joel Yang Kwang Wei , Pei Kin Leong
Abstract: A reconfigurable optic probe is used to measure signals from a device under test. The reconfigurable optic probe is positioned at a target probe location within a cell of the device under test. The cell includes a target net to be measured and non-target nets. A test pattern is applied to the cell and a laser probe (LP) waveform is obtained in response. A target net waveform is extracted from the LP waveform by: (i) configuring the reconfigurable optic probe to produce a ring-shaped beam having a relatively low-intensity region central to the ring-shaped beam; (ii) re-applying the test pattern to the cell at the target probe location with the relatively low-intensity region applied to the target net and obtaining a cross-talk LP waveform in response; (iii) normalizing the cross-talk LP waveform; and (iv) determining a target net waveform by subtracting the normalized cross-talk LP waveform from the LP waveform.
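Steps (iii) and (iv) of this abstract amount to a scale-and-subtract on sampled waveforms. The sketch below is illustrative only: the abstract does not define the normalization, so modelling it as multiplication by a single factor `scale` is an assumption, and the function name and sample values are hypothetical.

```python
def extract_target_waveform(lp, crosstalk, scale):
    """Subtract a normalized cross-talk waveform from the LP waveform.

    lp        -- samples measured with the ordinary beam (target + cross-talk)
    crosstalk -- samples measured with the ring-shaped beam
    scale     -- assumed normalization factor applied to the cross-talk samples
    """
    if len(lp) != len(crosstalk):
        raise ValueError("waveforms must have the same number of samples")
    normalized = [scale * c for c in crosstalk]
    return [a - b for a, b in zip(lp, normalized)]


# Made-up sample values, chosen only to show the arithmetic.
target = extract_target_waveform([1.0, 3.0, 2.0], [0.5, 1.0, 0.5], scale=2.0)
print(target)
```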
-
Publication No.: US10963299B2
Publication date: 2021-03-30
Application No.: US16134695
Filing date: 2018-09-18
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Anthony Gutierrez , Sooraj Puthoor
Abstract: A processor core is configured to execute a parent task that is described by a data structure stored in a memory. A coprocessor is configured to dispatch a child task to the at least one processor core in response to the coprocessor receiving a request from the parent task concurrently with the parent task executing on the at least one processor core. In some cases, the parent task registers the child task in a task pool and the child task is a future task that is configured to monitor a completion object and enqueue another task associated with the future task in response to detecting the completion object. The future task is configured to self-enqueue by adding a continuation future task to a continuation queue for subsequent execution in response to the future task failing to detect the completion object.
-
Publication No.: US20210090205A1
Publication date: 2021-03-25
Application No.: US16580654
Filing date: 2019-09-24
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Michael MANTOR , Alexander Fuad ASHKAR , Randy RAMSEY , Mangesh P. NIJASURE , Brian EMBERLING
Abstract: The address of the draw or dispatch packet responsible for creating an exception is tied to a shader/wavefront back to the draw command from which it originated. In various embodiments, a method of operating a graphics pipeline and exception handling includes receiving, at a command processor of a graphics processing unit (GPU), an exception signal indicating an occurrence of a pipeline exception at a shader stage of a graphics pipeline. The shader stage generates an exception signal in response to a pipeline exception and transmits the exception signal to the command processor. The command processor determines, based on the exception signal, an address of a command packet responsible for the occurrence of the pipeline exception.
-
Publication No.: US20210089304A1
Publication date: 2021-03-25
Application No.: US16581252
Filing date: 2019-09-24
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Bin HE , Michael MANTOR , Jiasheng CHEN , Jian HUANG
Abstract: A processing unit such as a graphics processing unit (GPU) includes a plurality of vector signal processors (VSPs) that include multiply/accumulate elements. The processing unit also includes a plurality of registers associated with the plurality of VSPs. First portions of first and second matrices are fetched into the plurality of registers prior to a first round that includes a plurality of iterations. The multiply/accumulate elements perform matrix multiplication and accumulation on different combinations of subsets of the first portions of the first and second matrices in the plurality of iterations prior to fetching second portions of the first and second matrices into the plurality of registers for a second round. The accumulated results of multiplying the first portions of the first and second matrices are written into an output buffer in response to completing the plurality of iterations.
-
Publication No.: US10956536B2
Publication date: 2021-03-23
Application No.: US16176662
Filing date: 2018-10-31
Applicant: Advanced Micro Devices, Inc.
Inventor: Shaizeen Aga , Nuwan Jayasena , Allen H. Rush , Michael Ignatowski
Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
-
Publication No.: US10956157B1
Publication date: 2021-03-23
Application No.: US16293154
Filing date: 2019-03-05
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: David Kaplan , Marius Evers
Abstract: A subset of a set of architectural registers in a processing system is marked (or “tainted”) to indicate that speculative use of data in the subset of the architectural registers is constrained based on a taint handling policy. One or more speculation features supported by the processing system are disabled for the instruction so that the one or more speculation features cannot be used on data in the subset. In some cases, values of bits associated with the subset of architectural registers are modified to indicate that the subset is tainted. The taint handling policy can be indicated by values stored in a policy register. Taint markings are tracked in response to values stored in the tainted architectural registers being written to a memory or read from the memory.
-
Publication No.: US20210083677A1
Publication date: 2021-03-18
Application No.: US16570334
Filing date: 2019-09-13
Applicant: Advanced Micro Devices, Inc.
Inventor: Achal Kathuria , Pradeep Jayaraman
Abstract: Systems, apparatuses, and methods for conveying and receiving information as electrical signals in a computing system are disclosed. A computing system includes multiple transmitters sending single-ended data signals to multiple receivers. A termination voltage is generated and sent to the multiple receivers. The termination voltage is coupled to each of signal termination circuitry and signal sampling circuitry within each of the multiple receivers. Any change in the termination voltage affects the termination circuitry and affects comparisons performed by the sampling circuitry. Received signals are reconstructed at the receivers using the received signals, the signal termination circuitry and the signal sampling circuitry.
-
Publication No.: US10951892B2
Publication date: 2021-03-16
Application No.: US16263630
Filing date: 2019-01-31
Applicant: Advanced Micro Devices, Inc.
Inventor: Adam H. Li
IPC: H04N19/124 , H04N19/176 , H04N19/137 , H04N19/146
Abstract: Systems, apparatuses, and methods for performing efficient bitrate control of video compression are disclosed. Logic in a bitrate controller of a video encoder receives a target block bitstream length for a block of pixels of a video frame. When the logic determines a count of previously compressed blocks does not exceed a count threshold, the logic selects a quantization parameter from a full range of available quantization parameters. After encoding the block, the logic determines a parameter based on a first ratio of the achieved block bitstream length to an exponential value of an actual quantization parameter used to generate the achieved block bitstream length. For another block, when the count exceeds the count threshold, the logic generates a quantization parameter based on a ratio of the target block bitstream length to an average of parameters of previously encoded blocks.
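The two ratios in this abstract can be illustrated with a simple rate model. This is a sketch under stated assumptions, not the patented method: the abstract does not give the base or sign of the exponential, so the model `length = parameter * 2**(-QP / QP_SCALE)` and the value `QP_SCALE = 6.0` are assumptions, as are the function names and the 0 to 51 clamping range. How a quantization parameter is picked from the full range before the count threshold is reached is not modelled.

```python
import math

QP_SCALE = 6.0  # assumed: bitstream length halves every 6 QP steps


def block_parameter(achieved_len: float, qp: int) -> float:
    """Ratio of achieved block length to an exponential of the QP used."""
    return achieved_len / math.pow(2.0, -qp / QP_SCALE)


def next_qp(target_len: float, params: list[float],
            qp_min: int = 0, qp_max: int = 51) -> int:
    """Derive a QP from the ratio of target length to the average parameter."""
    avg = sum(params) / len(params)
    qp = -QP_SCALE * math.log2(target_len / avg)
    return max(qp_min, min(qp_max, round(qp)))


# One previously encoded block: 1000 bits at QP 24.
history = [block_parameter(1000, 24)]
# Ask for a block half that size.
print(next_qp(500, history))
```

Under the assumed model, halving the target length raises the quantization parameter by `QP_SCALE` steps, so the example moves from QP 24 to QP 30.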
-
Publication No.: US10938709B2
Publication date: 2021-03-02
Application No.: US16224739
Filing date: 2018-12-18
Applicant: Advanced Micro Devices, Inc.
Inventor: Mohamed Assem Ibrahim , Onur Kayiran , Yasuko Eckert , Jieming Yin
IPC: H04L12/761 , H04L12/781 , H04L12/715 , H04L12/931 , H04L12/729 , H04L12/733
Abstract: A method includes receiving, from an origin computing node, a first communication addressed to multiple destination computing nodes in a processor interconnect fabric, measuring a first set of one or more communication metrics associated with a transmission path to one or more of the multiple destination computing nodes, and for each of the destination computing nodes, based on the set of communication metrics, selecting between a multicast transmission mode and unicast transmission mode as a transmission mode for transmitting the first communication to the destination computing node.
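The per-destination selection in this abstract can be sketched as a small decision function. Everything concrete here is hypothetical: the abstract names neither the metrics nor the decision rule, so treating the metric as a path utilization between 0 and 1, the threshold of 0.75, and the choice to fall back to unicast on busy paths are all assumptions made for illustration.

```python
def select_transmission_modes(path_utilization: dict[str, float],
                              threshold: float = 0.75) -> dict[str, str]:
    """Pick a transmission mode for each destination node.

    path_utilization -- assumed metric: measured utilization (0.0 to 1.0)
                        of the path to each destination
    threshold        -- assumed cutoff above which the path is treated as
                        too busy for multicast
    """
    return {
        dest: "unicast" if utilization > threshold else "multicast"
        for dest, utilization in path_utilization.items()
    }


# Made-up measurements for three destination nodes.
modes = select_transmission_modes({"node1": 0.20, "node2": 0.90, "node3": 0.75})
print(modes)
```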