-
Publication No.: US11537319B2
Publication date: 2022-12-27
Application No.: US16710563
Filing date: 2019-12-11
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Alexander Fuad Ashkar, James R. Klobcar, Harry J. Wise
Abstract: A processing system includes a content addressable memory (CAM) in an input/output path to selectively modify register writes on a per-pipeline basis. The CAM compares an address of a register write to an address field of each entry of the CAM. If a match is found, the CAM modifies the register write data as defined by a function for the matching entry of the CAM. In some embodiments, each entry of the CAM includes a data mask defining subfields of the register write data, wherein each subfield includes subfield data including one or more bits.
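The address match and masked modification described in this abstract can be sketched as follows. This is a minimal illustration, not the patented design: the `CamEntry` layout, the `filter_register_write` name, and the idea that the per-entry function receives only the masked subfield are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CamEntry:
    address: int                # register address this entry matches
    mask: int                   # data mask selecting the subfield bits to modify
    func: Callable[[int], int]  # hypothetical per-entry function applied to the subfield


def filter_register_write(cam: List[CamEntry], address: int, data: int) -> int:
    """Return the write data, modified only if a CAM entry matches the address."""
    for entry in cam:
        if entry.address == address:
            # Apply the entry's function to the masked subfield only,
            # then merge the result back into the untouched bits.
            modified = entry.func(data & entry.mask) & entry.mask
            return (data & ~entry.mask) | modified
    return data  # no match: the write passes through unchanged


# One entry that forces bits 7..4 of register 0x2000 to the value 0x3.
cam = [CamEntry(address=0x2000, mask=0x000000F0, func=lambda field: 0x30)]
hit = filter_register_write(cam, 0x2000, 0xABCD12EF)
miss = filter_register_write(cam, 0x2004, 0xABCD12EF)
```

A per-pipeline arrangement could be modeled by keeping one such `cam` list per pipeline; the abstract does not say how entries are organized in hardware.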
-
Publication No.: US11900123B2
Publication date: 2024-02-13
Application No.: US16713432
Filing date: 2019-12-13
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Alexander Fuad Ashkar, Manu Rastogi, Harry J. Wise
CPC classification number: G06F9/3869, G06F9/3836, G06F15/80, G06T1/20
Abstract: A system includes a processing unit such as a GPU that itself includes a command processor configured to receive instructions for execution from a software application. A processor pipeline coupled to the processing unit includes a set of parallel processing units for executing the instructions in sets. A set manager is coupled to one or more of the processor pipeline and the command processor. The set manager includes at least one table for storing a set start time, a set end time, and a set execution time. The set manager determines an execution time for one or more sets of instructions of a first window of sets of instructions submitted to the processor pipeline. Based on the execution time of the one or more sets of instructions, a set limit is determined and applied to one or more sets of instructions of a second window subsequent to the first window.
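The windowed feedback loop in this abstract can be sketched as below. The abstract does not state how an execution time is turned into a set limit, so the policy here (scale the limit down in proportion to how far the average exceeds a target) is purely an assumption, as are the names `SetManager`, `target_time`, and `max_limit`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SetRecord:
    start: float
    end: float
    execution_time: float


class SetManager:
    def __init__(self, target_time: float, max_limit: int):
        self.table: Dict[int, SetRecord] = {}  # set id -> timing record
        self.target_time = target_time         # assumed tuning parameter
        self.max_limit = max_limit             # assumed upper bound on the set limit

    def record(self, set_id: int, start: float, end: float) -> None:
        """Store start time, end time, and derived execution time for one set."""
        self.table[set_id] = SetRecord(start, end, end - start)

    def limit_for_next_window(self, window: List[int]) -> int:
        """Derive the set limit for the next window from the given window's sets."""
        times = [self.table[set_id].execution_time for set_id in window]
        average = sum(times) / len(times)
        if average <= self.target_time:
            return self.max_limit
        # Assumed policy: shrink the limit in proportion to the overshoot.
        return max(1, int(self.max_limit * self.target_time / average))


manager = SetManager(target_time=10.0, max_limit=8)
manager.record(0, start=0.0, end=10.0)
manager.record(1, start=10.0, end=50.0)
slow_window_limit = manager.limit_for_next_window([0, 1])
```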
-
Publication No.: US11809558B2
Publication date: 2023-11-07
Application No.: US17032969
Filing date: 2020-09-25
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Harry J. Wise, Alexander Fuad Ashkar, Manu Rastogi
CPC classification number: G06F21/566, G06F9/30076, G06F9/3802, G06F21/54, G06F21/567, H04L69/22
Abstract: A method of packet attribute confirmation includes receiving, at a command processor of a parallel processor, a command packet including a received packet attribute, such as a packet size, of the command packet. The command processor compares the received packet attribute of the command packet relative to an expected packet attribute of the command packet. The command processor passes one or more commands to a prefetch parser such that a summed total size of the one or more commands is equal to the received packet size of the command packet. The command processor passes, based at least on determining a match between the received packet size and the expected packet size, the received command packet to the prefetch parser. Otherwise, the command processor passes, based at least on determining a mismatch between the received packet size and the expected packet size, one or more no-operation instructions to the prefetch parser.
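The size check described here can be illustrated with a small sketch. The opcode names, the `EXPECTED_SIZE` table, and the one-word `NOP` are hypothetical; the only behavior taken from the abstract is that a matching packet is forwarded as received, while a mismatching one is replaced by no-operation instructions whose total size equals the received packet size.

```python
from typing import Dict, List

NOP = "NOP"  # hypothetical one-word no-operation instruction

# Hypothetical table of expected packet sizes, in words, per opcode.
EXPECTED_SIZE: Dict[str, int] = {"DRAW": 4, "SET_REG": 3}


def confirm_packet(opcode: str, received_size: int, words: List[str]) -> List[str]:
    """Return the words the command processor forwards to the prefetch parser."""
    if EXPECTED_SIZE.get(opcode) == received_size:
        return list(words)        # sizes match: forward the packet as received
    return [NOP] * received_size  # mismatch: forward the same number of words as NOPs


good = confirm_packet("DRAW", 4, ["DRAW", "a", "b", "c"])
bad = confirm_packet("DRAW", 6, ["DRAW", "a", "b", "c", "x", "y"])
```

In both cases the number of words handed to the parser equals the received size, so the parser's position in the command stream stays consistent.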
-
Publication No.: US10558489B2
Publication date: 2020-02-11
Application No.: US15438466
Filing date: 2017-02-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexander Fuad Ashkar, Michael J. Mantor, Randy Wayne Ramsey, Rex Eldon McCrary, Harry J. Wise
Abstract: Systems, apparatuses, and methods for suspending and restoring operations on a processor are disclosed. In one embodiment, a processor includes at least a control unit, multiple execution units, and multiple work creation units. In response to detecting a request to suspend a software application executing on the processor, the control unit sends requests to the plurality of work creation units to stop creating new work. The control unit waits until receiving acknowledgements from the work creation units prior to initiating a suspend operation. Once all work creation units have acknowledged that they have stopped creating new work, the control unit initiates the suspend operation. Also, when a restore operation is initiated, the control unit prevents any work creation units from launching new work-items until all previously in-flight work-items have been restored to the same work creation units and execution units to which they were previously allocated.
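The suspend handshake in this abstract amounts to a barrier on acknowledgements. The sketch below models only that part; the class and message names are assumptions, and the restore path is not modeled.

```python
from typing import Iterable, List, Set, Tuple


class ControlUnit:
    def __init__(self, unit_ids: Iterable[str]):
        self.unit_ids: Set[str] = set(unit_ids)  # known work creation units
        self.pending_acks: Set[str] = set()
        self.suspend_requested = False
        self.suspend_started = False

    def request_suspend(self) -> List[Tuple[str, str]]:
        """Ask every work creation unit to stop creating new work."""
        self.suspend_requested = True
        self.pending_acks = set(self.unit_ids)
        return [("STOP_CREATING_WORK", unit) for unit in sorted(self.unit_ids)]

    def on_ack(self, unit_id: str) -> bool:
        """Record one acknowledgement; start the suspend once all have arrived."""
        self.pending_acks.discard(unit_id)
        if self.suspend_requested and not self.pending_acks:
            self.suspend_started = True
        return self.suspend_started


control = ControlUnit(["wcu0", "wcu1"])
requests = control.request_suspend()
after_first_ack = control.on_ack("wcu0")   # still waiting on wcu1
after_second_ack = control.on_ack("wcu1")  # all units acknowledged
```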
-
Publication No.: US20180239635A1
Publication date: 2018-08-23
Application No.: US15438466
Filing date: 2017-02-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexander Fuad Ashkar, Michael J. Mantor, Randy Wayne Ramsey, Rex Eldon McCrary, Harry J. Wise
IPC: G06F9/48
Abstract: Systems, apparatuses, and methods for suspending and restoring operations on a processor are disclosed. In one embodiment, a processor includes at least a control unit, multiple execution units, and multiple work creation units. In response to detecting a request to suspend a software application executing on the processor, the control unit sends requests to the plurality of work creation units to stop creating new work. The control unit waits until receiving acknowledgements from the work creation units prior to initiating a suspend operation. Once all work creation units have acknowledged that they have stopped creating new work, the control unit initiates the suspend operation. Also, when a restore operation is initiated, the control unit prevents any work creation units from launching new work-items until all previously in-flight work-items have been restored to the same work creation units and execution units to which they were previously allocated.
-
Publication No.: US20180210657A1
Publication date: 2018-07-26
Application No.: US15417011
Filing date: 2017-01-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Rex Eldon McCrary, Michael J. Mantor, Alexander Fuad Ashkar, Harry J. Wise
CPC classification number: G06F3/0607, G06F3/0619, G06F3/0634, G06F3/065, G06F3/067, G06F9/4881, G06F9/50
Abstract: Systems, apparatuses, and methods for implementing software control of state sets are disclosed. In one embodiment, a processor includes at least an execution unit and a plurality of state registers. The processor is configured to detect a command to allocate a first state set for storing a first state, wherein the command is generated by software, and wherein the first state specifies values for the plurality of state registers. The command is executed on the execution unit while the processor is in a second state, wherein the second state is different from the first state. The first state set of the processor is allocated with the first state responsive to executing the command on the execution unit. The processor is configured to allocate the first state set for the first state prior to the processor entering the first state.
-
Publication No.: US20180082398A1
Publication date: 2018-03-22
Application No.: US15270679
Filing date: 2016-09-20
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexander Fuad Ashkar, Harry J. Wise, Rex Eldon McCrary, Angel E. Socarras
CPC classification number: G06T1/20, G06F1/325, G06F3/0604, G06F3/0638, G06F3/0685, G06T1/60
Abstract: An adaptive list stores previously received hardware state information that has been used to configure a graphics processing core. One or more filters are configured to filter packets from a packet stream directed to the graphics processing core. The packets are filtered based on a comparison of hardware state information included in the packet and hardware state information stored in the adaptive list. The adaptive list is modified in response to filtering the first packet. The filters can include a hardware filter and a software filter that selectively filters the packets based on whether the graphics processing core is limiting throughput. The adaptive list can be implemented as content-addressable memory (CAM), a cache, or a linked list.
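One way to picture the adaptive list is as a small cache of the last value written to each hardware state register, used to drop packets that would set a value already in effect. The sketch below assumes a packet is a `(register, value)` pair and uses least-recently-used eviction; neither detail comes from the abstract.

```python
from collections import OrderedDict
from typing import List, Tuple


class AdaptiveList:
    def __init__(self, capacity: int):
        self.capacity = capacity  # assumed fixed number of entries
        self.entries: "OrderedDict[int, int]" = OrderedDict()

    def is_redundant(self, register: int, value: int) -> bool:
        """Return True if the packet repeats known state and can be filtered out."""
        if self.entries.get(register) == value:
            self.entries.move_to_end(register)  # refresh recency on a hit
            return True
        # New or changed state: remember it and let the packet through.
        self.entries[register] = value
        self.entries.move_to_end(register)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)    # evict the least recently used entry
        return False


adaptive = AdaptiveList(capacity=2)
stream: List[Tuple[int, int]] = [(0x10, 1), (0x10, 1), (0x14, 7), (0x10, 2), (0x10, 2)]
forwarded = [packet for packet in stream if not adaptive.is_redundant(*packet)]
```

Eviction only ever causes a packet to be forwarded rather than dropped, so a small list loses filtering efficiency but not correctness in this model.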
-
Publication No.: US20240378790A1
Publication date: 2024-11-14
Application No.: US18374736
Filing date: 2023-09-29
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Alexander Fuad Ashkar, Manu Rastogi, Nishank Pathak, Harry J. Wise
IPC: G06T15/00
Abstract: A processor includes a plurality of state registers and a command processor. The plurality of state registers is configured to maintain context states for a plurality of graphics contexts. The command processor includes a processing unit and a fixed-function hardware circuit. The processing unit is configured to place a plurality of graphics commands into a queue. The fixed-function hardware circuit is configured to monitor a graphics command stream output by the queue and detect a specified graphics command in the monitored graphics command stream. In response to the detected specified graphics command, the fixed-function hardware circuit is further configured to perform at least one graphics command management operation that includes one or more of a graphics context management operation or a graphics persistent state management operation.
-
Publication No.: US11900499B2
Publication date: 2024-02-13
Application No.: US17028803
Filing date: 2020-09-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Anirudh Rajendra Acharya, Ruijin Wu, Alexander Fuad Ashkar, Harry J. Wise
CPC classification number: G06T1/20, G06F9/3836, G06F9/544, G06T7/60
Abstract: A technique for executing commands for an accelerated processing device is provided. The technique includes obtaining an iteration number and predication data from metadata for an iterative indirect command buffer; for each iteration indicated by the iteration number, performing commands of the iterative indirect command buffer as specified by the predication data; and ending processing of the iterative indirect command buffer in response to processing a number of iterations equal to the iteration number.
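The control flow of an iterative indirect command buffer can be sketched as a loop driven by metadata. The abstract does not define the format of the predication data, so this example assumes one bitmask per iteration in which bit `k` enables command `k`; the metadata keys and the `execute` callback are likewise hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def run_iterative_icb(
    metadata: Dict[str, object],
    commands: List[str],
    execute: Callable[[int, str], None],
) -> None:
    """Run the buffer's commands once per iteration, gated by predication bits."""
    iteration_count = int(metadata["iteration_count"])
    predication: List[int] = list(metadata["predication"])  # assumed: one mask per iteration
    for iteration in range(iteration_count):
        mask = predication[iteration]
        for index, command in enumerate(commands):
            if (mask >> index) & 1:  # bit set: this command runs in this iteration
                execute(iteration, command)
    # Processing ends once iteration_count iterations have been handled.


log: List[Tuple[int, str]] = []
run_iterative_icb(
    {"iteration_count": 2, "predication": [0b01, 0b11]},
    ["draw", "dispatch"],
    lambda iteration, command: log.append((iteration, command)),
)
```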
-
Publication No.: US11144329B2
Publication date: 2021-10-12
Application No.: US16427407
Filing date: 2019-05-31
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Alexander Fuad Ashkar, Rakan Khraisha, Rex Eldon McCrary, Harry J. Wise
Abstract: A processing unit employs microcode wherein the jump table associated with the microcode is embedded in the microcode itself. When the microcode is compiled based on a set of programmer instructions, the compiler prepares the jump table for the microcode and stores the jump table in the same file or other storage unit as the microcode. When the processing unit is initialized to execute a program, such as an operating system, the processing unit retrieves the microcode corresponding to the program from memory, stores the microcode in a cache or other memory module for execution, and automatically loads the embedded jump table from the microcode to a specified set of jump table registers, thereby preparing the processing unit to process received packets.
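The idea of shipping the jump table inside the microcode image can be illustrated with a toy file layout. The layout below (a 32-bit little-endian entry count, then the table entries, then the microcode bytes) is invented for the example and is not taken from the patent; `build_image` stands in for the compiler step and `load_image` for the initialization step.

```python
import struct
from typing import List, Tuple


def build_image(jump_table: List[int], microcode: bytes) -> bytes:
    """Compiler side: embed the jump table ahead of the microcode in one image."""
    header = struct.pack("<I", len(jump_table))
    table = struct.pack(f"<{len(jump_table)}I", *jump_table)
    return header + table + microcode


def load_image(image: bytes) -> Tuple[List[int], bytes]:
    """Init side: recover the jump table 'registers' and the microcode from one image."""
    (count,) = struct.unpack_from("<I", image, 0)
    jump_registers = list(struct.unpack_from(f"<{count}I", image, 4))
    microcode = image[4 + 4 * count:]
    return jump_registers, microcode


image = build_image([0x0040, 0x0080, 0x0100], b"\x01\x02\x03\x04")
jump_registers, microcode = load_image(image)
```

Because the table travels with the microcode, the loader needs no separate table file or programming step in this model.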