-
Publication No.: US10366008B2
Publication Date: 2019-07-30
Application No.: US15376275
Application Date: 2016-12-12
Applicant: Advanced Micro Devices, Inc.
Inventor: Ganesh Balakrishnan , Vydhyanathan Kalyanasundharam , Kevin M. Lepak
IPC: G06F12/08 , G06F12/0853 , G06F12/0811 , G06F12/084
Abstract: A data processing system includes a processor and a cache controller coupled to the processor, and adapted to be coupled to a memory. The cache controller uses the memory to form a pseudo direct mapped cache having a plurality of groups of pages. The memory forms a first number of selected pages, including a first page for storing a plurality of sets of tags and a plurality of remaining pages for storing data. Each set of the plurality of sets of tags stores tags for respective entries in a corresponding one of the plurality of remaining pages.
-
Publication No.: US10365824B2
Publication Date: 2019-07-30
Application No.: US15495296
Application Date: 2017-04-24
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Wade K. Smith , Anthony Asaro
IPC: G06F3/06 , G06F12/1009 , G06F12/1027
Abstract: Systems, apparatuses, and methods for migrating memory pages are disclosed herein. In response to detecting that a migration of a first page between memory locations is being initiated, a first page table entry (PTE) corresponding to the first page is located and a migration pending indication is stored in the first PTE. In one embodiment, the migration pending indication is encoded in the first PTE by disabling read and write permissions. If a translation request targeting the first PTE is received by the MMU and the translation request corresponds to a read request, a read operation is allowed to the first page. Otherwise, if the translation request corresponds to a write request, a write operation to the first page is blocked and a silent retry request is generated and conveyed to the requesting client.
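
The translation behavior described in this abstract can be illustrated with a minimal sketch. It assumes the embodiment in which the migration pending indication is encoded by disabling both read and write permissions; the names `PageTableEntry`, `begin_migration`, `finish_migration`, and `translate`, and the string status codes, are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class PageTableEntry:
    """Hypothetical PTE model with only the fields this sketch needs."""
    physical_page: int
    readable: bool = True
    writable: bool = True

    @property
    def migration_pending(self) -> bool:
        # Assumed encoding: both permissions disabled means "migration pending".
        return not self.readable and not self.writable


def begin_migration(pte: PageTableEntry) -> None:
    """Store the migration pending indication in the PTE."""
    pte.readable = False
    pte.writable = False


def finish_migration(pte: PageTableEntry, new_physical_page: int) -> None:
    """Point the PTE at the new location and restore permissions."""
    pte.physical_page = new_physical_page
    pte.readable = True
    pte.writable = True


def translate(pte: PageTableEntry, is_write: bool):
    """Return (status, page) for a translation request targeting this PTE."""
    if pte.migration_pending:
        if is_write:
            # Writes are blocked; the client is told to retry silently.
            return ("silent_retry", None)
        # Reads are still allowed to the page while it is migrating.
        return ("ok", pte.physical_page)
    allowed = pte.writable if is_write else pte.readable
    return ("ok", pte.physical_page) if allowed else ("fault", None)
```

In this model a client that receives `"silent_retry"` would simply reissue the write later, after `finish_migration` has restored the permissions.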
-
Publication No.: US20190229736A1
Publication Date: 2019-07-25
Application No.: US16370479
Application Date: 2019-03-29
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Stephen Victor Kosonocky , Mikhail Rodionov , Joyce Cheuk Wai Wong
Abstract: An oscillator circuit is provided that adapts to voltage supply variations. The circuit includes first and second delay lines connected to inputs of an edge detector, one delay line supplied by a reference voltage and the other by a drooping supply voltage. The edge detector generates an output clock based on a relationship between the inputs. The output clock is applied to the signal inputs of the first and second delay lines. The output clock has a voltage dependent frequency performance curve with a slope dependent at least on the second delay line delay and a delay of the edge detector. At least one of the first delay line delay, the second delay line delay, and the edge detector delay is adjusted to change the slope of the performance curve.
-
Publication No.: US10360177B2
Publication Date: 2019-07-23
Application No.: US15189054
Application Date: 2016-06-22
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Syed Zohaib M. Gilani , Jiasheng Chen , QingCheng Wang , YunXiao Zou , Michael Mantor , Bin He , Timour T. Paltashev
IPC: G06F15/80 , G06F1/3234 , G06T15/00
Abstract: Described is a method and processing apparatus to improve power efficiency by gating redundant thread processing. In particular, the method for gating redundant threads in a graphics processor includes determining if data for a thread and data for at least another thread are within a predetermined similarity threshold, gating execution of the at least another thread if the data for the thread and the data for the at least another thread are within the predetermined similarity threshold, and using an output data from the thread as an output data for the at least another thread.
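
As a rough software illustration of the gating idea in this abstract, the sketch below skips a thread whenever its input lies within a threshold of the input of a thread that already ran, and reuses that thread's output. The function name `run_with_gating`, the scalar inputs, and the absolute-difference similarity test are assumptions made for the example; the patent describes hardware in a graphics processor and does not specify this comparison.

```python
def run_with_gating(thread_inputs, kernel, threshold):
    """Run `kernel` once per group of similar inputs.

    Returns (outputs, run_count), where run_count is the number of
    threads that actually executed the kernel.
    """
    outputs = [None] * len(thread_inputs)
    executed = []  # (input, output) pairs for threads that actually ran
    run_count = 0
    for i, x in enumerate(thread_inputs):
        for ref_in, ref_out in executed:
            # Assumed similarity test: absolute difference within threshold.
            if abs(x - ref_in) <= threshold:
                outputs[i] = ref_out  # gate this thread, reuse the output
                break
        else:
            # No similar thread has run yet, so execute this one.
            outputs[i] = kernel(x)
            executed.append((x, outputs[i]))
            run_count += 1
    return outputs, run_count
```

With inputs `[1.0, 1.001, 5.0, 1.0005]`, a doubling kernel, and a threshold of `0.01`, only the first and third threads execute; the other two reuse the first thread's output.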
-
Publication No.: US10355966B2
Publication Date: 2019-07-16
Application No.: US15081558
Application Date: 2016-03-25
Applicant: Advanced Micro Devices, Inc.
Inventor: Samuel Lawrence Wasmundt , Leonardo Piga , Indrani Paul , Wei Huang , Manish Arora
Abstract: Systems, apparatuses, and methods for managing variations among nodes in parallel system frameworks are disclosed. Sensor and performance data associated with the nodes of a multi-node cluster may be monitored to detect variations among the nodes. A variability metric may be calculated for each node of the cluster based on the sensor and performance data associated with the node. The variability metrics may then be used by a mapper to efficiently map tasks of a parallel application to the nodes of the cluster. In one embodiment, the mapper may assign the critical tasks of the parallel application to the nodes with the lowest variability metrics. In another embodiment, the hardware of the nodes may be reconfigured so as to reduce the node-to-node variability.
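
The mapping embodiment in this abstract, in which critical tasks go to the nodes with the lowest variability metrics, might look like the following sketch. The abstract does not define the metric, so the coefficient of variation used here is purely an assumption, as are the names `variability_metric` and `map_tasks` and the round-robin placement of the remaining tasks.

```python
from statistics import mean, pstdev


def variability_metric(samples):
    """Assumed metric: coefficient of variation of a node's samples."""
    m = mean(samples)
    return pstdev(samples) / m if m else float("inf")


def map_tasks(node_samples, critical_tasks, other_tasks):
    """Assign tasks to nodes, most stable nodes first.

    node_samples maps a node name to a list of performance samples.
    Critical tasks are placed before other tasks, so they land on the
    nodes with the lowest variability metrics.
    """
    ranked = sorted(node_samples,
                    key=lambda n: variability_metric(node_samples[n]))
    ordered = list(critical_tasks) + list(other_tasks)
    # Round-robin over the ranked nodes (an assumption for this sketch).
    return {task: ranked[i % len(ranked)] for i, task in enumerate(ordered)}
```

For example, with three nodes whose samples are `[100, 130, 70]`, `[100, 101, 99]`, and `[100, 110, 90]`, the single critical task is mapped to the second node, which has the smallest spread.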
-
656.
Publication No.: US10353708B2
Publication Date: 2019-07-16
Application No.: US15273916
Application Date: 2016-09-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Anupama Rajesh Rasale , Dibyendu Das , Ashutosh Nema , Md Asghar Ahmad Shahid , Prathiba Kumar
Abstract: Systems, apparatuses, and methods for utilizing efficient vectorization techniques for operands in non-sequential memory locations are disclosed. A system includes a vector processing unit (VPU) and one or more memory devices. In response to determining that a plurality of vector operands are stored in non-sequential memory locations, the VPU performs a plurality of vector load operations to load the plurality of vector operands into a plurality of vector registers. Next, the VPU performs a shuffle operation to consolidate the plurality of vector operands from the plurality of vector registers into a single vector register. Then, the VPU performs a vector operation on the vector operands stored in the single vector register. The VPU can also perform a vector store operation by permuting and storing a plurality of vector operands in appropriate locations within multiple vector registers and then storing the vector registers to locations in memory using a mask.
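
The load path described in this abstract (several vector loads, then a shuffle that consolidates the wanted operands into one register, then the vector operation) can be modeled in plain Python. This is only a toy model: registers are lists, the vector length `VLEN = 4`, the stride-2 layout, and the helper names `vector_load` and `shuffle` are all assumptions for illustration, not the patent's instruction set.

```python
VLEN = 4  # assumed vector register width, in elements


def vector_load(memory, base):
    """Load VLEN contiguous elements starting at index `base`."""
    return memory[base:base + VLEN]


def shuffle(reg_a, reg_b, selectors):
    """Pick elements from the concatenation of two registers.

    Selector values 0..VLEN-1 choose from reg_a; VLEN..2*VLEN-1
    choose from reg_b.
    """
    combined = reg_a + reg_b
    return [combined[s] for s in selectors]


# Operands of interest sit at indices 0, 2, 4, 6 (non-sequential, stride 2).
memory = list(range(100, 116))

r0 = vector_load(memory, 0)             # elements at indices 0..3
r1 = vector_load(memory, 4)             # elements at indices 4..7
packed = shuffle(r0, r1, [0, 2, 4, 6])  # consolidate into one register
result = [x * 2 for x in packed]        # the vector operation itself
```

Here `packed` ends up holding the four strided operands in a single register, so the multiply touches only useful lanes.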
-
Publication No.: US10353591B2
Publication Date: 2019-07-16
Application No.: US15442499
Application Date: 2017-02-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Michael L. Schmitt , Radhakrishna Giduthuri
Abstract: Improvements in compute shader programs executed on parallel processing hardware are disclosed. An application or other entity defines a sequence of shader programs to execute. Each shader program defines inputs and outputs which would, if unmodified, execute as loads and stores to a general purpose memory, incurring high latency. A compiler combines the shader programs into groups that can operate in a lower-latency, but lower-capacity local data store memory. The boundaries of these combined shader programs are defined by several aspects including where memory barrier operations are to execute, whether combinations of shader programs can execute using only the local data store and not the global memory (except for initial reads and writes) and other aspects.
-
Publication No.: US20190204902A1
Publication Date: 2019-07-04
Application No.: US15858138
Application Date: 2017-12-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Thomas J. Gibney , Sonu Arora
Abstract: A method and apparatus control power consumption of at least one functional unit on an integrated circuit by determining that a change in a first performance state is required for the at least one functional unit, and changing the first performance state to a second performance state that sets voltage for the functional unit to be at an under-voltage margin setting with respect to a nominal product minimum voltage of the functional unit.
-
Publication No.: US20190204899A1
Publication Date: 2019-07-04
Application No.: US15856546
Application Date: 2017-12-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Benjamin Tsien , Greggory D. Donley , Bryan P. Broussard
IPC: G06F1/32
CPC classification number: G06F1/3287 , G06F1/3234 , G06F1/3296 , G06F9/5094
Abstract: Systems, apparatuses, and methods for performing efficient power management for a multi-node computing system are disclosed. A computing system includes multiple nodes. When power down negotiation is distributed, negotiation for system-wide power down occurs within a lower level of a node hierarchy prior to negotiation for power down occurring at a higher level of the node hierarchy. When power down negotiation is centralized, a given node combines a state of its clients with indications received on its downstream link and sends an indication on an upstream link based on the combining. Only a root node sends power down requests.
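
The centralized case in this abstract, where each node combines the state of its clients with the indications from its downstream links and reports the result upstream, can be sketched as a simple tree reduction. The `Node` class, the boolean "idle" state, the logical-AND combining rule, and the function `root_should_request_power_down` are assumptions for illustration; the abstract does not say how the combining is done.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node in the power management hierarchy."""
    name: str
    clients_idle: bool
    children: list = field(default_factory=list)

    def upstream_indication(self) -> bool:
        # Assumed combining rule: this node is ready to power down only
        # if its own clients are idle and every downstream link says so.
        return self.clients_idle and all(
            child.upstream_indication() for child in self.children
        )


def root_should_request_power_down(root: Node) -> bool:
    """Only the root node decides whether to send power down requests."""
    return root.upstream_indication()
```

A single busy client anywhere in the tree keeps the root from issuing a power down request, which matches the intent that the decision reflects the whole system.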
-
660.
Publication No.: US10340916B1
Publication Date: 2019-07-02
Application No.: US15859124
Application Date: 2017-12-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Thomas J. Gibney , Sridhar V. Gada , Alexander J. Branover , Benjamin Tsien
IPC: G06F17/50 , H03K19/0175 , H03K19/173
CPC classification number: H03K19/017509 , H03K19/1733
Abstract: An electronic device includes a plurality of hardware functional blocks, the hardware functional blocks being logically grouped into two or more islands, with each island including a different one or more of the hardware functional blocks. A hardware controller in the electronic device is configured to determine a present activity being performed by at least one of the hardware functional blocks. The hardware controller then, based on the present activity, configures supply voltages for the hardware functional blocks in some or all of the islands.
-