-
Publication No.: US10649810B2
Publication date: 2020-05-12
Application No.: US14981257
Filing date: 2015-12-28
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Jimshed Mirza , YunPeng Zhu
IPC: G06F9/50
Abstract: Methods, devices, and systems for data driven scheduling of a plurality of computing cores of a processor. A plurality of threads may be executed on the plurality of computing cores, according to a default schedule. The plurality of threads may be analyzed, based on the execution, to determine correlations among the plurality of threads. A data driven schedule may be generated based on the correlations. The plurality of threads may be executed on the plurality of computing cores according to the data driven schedule.
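The analyze-then-reschedule flow in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: it assumes that "correlation" means the number of memory blocks two threads both touched during the default run, and it uses a simple greedy grouping. The names `correlations`, `data_driven_schedule`, and `trace` are hypothetical.

```python
from itertools import combinations


def correlations(accesses):
    """Pairwise correlation: how many memory blocks two threads both touched (assumed metric)."""
    return {
        (a, b): len(accesses[a] & accesses[b])
        for a, b in combinations(sorted(accesses), 2)
    }


def data_driven_schedule(accesses, num_cores):
    """Group threads onto cores, favoring cores that already hold correlated threads."""
    per_core = -(-len(accesses) // num_cores)  # ceiling division: slots per core
    corr = correlations(accesses)
    cores = [[] for _ in range(num_cores)]
    for t in sorted(accesses):
        def score(core):
            affinity = sum(corr[(min(t, u), max(t, u))] for u in core)
            return (affinity, -len(core))  # break ties toward the emptier core
        open_cores = [c for c in cores if len(c) < per_core]
        max(open_cores, key=score).append(t)
    return cores


# Hypothetical trace from a default-schedule run: thread id -> blocks touched.
trace = {0: {1, 2}, 1: {9, 10}, 2: {1, 2, 3}, 3: {9, 11}}
schedule = data_driven_schedule(trace, num_cores=2)
```

With this trace, threads 0 and 2 share blocks and end up on one core, while threads 1 and 3 share the other.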
-
Publication No.: US10353859B2
Publication date: 2019-07-16
Application No.: US15432173
Filing date: 2017-02-14
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: YunPeng Zhu , Jimshed Mirza
Abstract: A method for allocating registers in a compute unit of a vector processor includes determining a maximum number of registers that are to be used concurrently by a plurality of threads of a kernel at the compute unit. The method further includes setting a mode of register allocation at the compute unit based on a comparison of the determined maximum number of registers and a total number of physical registers implemented at the compute unit.
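The comparison described in this abstract might look like the sketch below. The abstract does not name the allocation modes or say which outcome selects which mode, so `DEDICATED` and `SHARED` and their mapping are assumptions, as is estimating the maximum as registers per thread times threads in flight.

```python
from enum import Enum


class AllocMode(Enum):
    # Both mode names are hypothetical labels, not terms from the abstract.
    DEDICATED = "dedicated"  # every thread keeps its own registers
    SHARED = "shared"        # registers must be shared or recycled among threads


def max_concurrent_registers(regs_per_thread, threads_in_flight):
    """Assumed estimate of the registers a kernel's threads use at once."""
    return regs_per_thread * threads_in_flight


def select_alloc_mode(regs_per_thread, threads_in_flight, physical_regs):
    """Pick a mode by comparing demand with the compute unit's physical registers."""
    needed = max_concurrent_registers(regs_per_thread, threads_in_flight)
    return AllocMode.DEDICATED if needed <= physical_regs else AllocMode.SHARED
```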
-
Publication No.: US10241925B2
Publication date: 2019-03-26
Application No.: US15433560
Filing date: 2017-02-15
Applicant: ATI Technologies ULC
Inventor: Jimshed Mirza , Anthony Chan , Edwin Chi Yeung Pang
IPC: G06F12/10 , G06F12/1027 , G06F12/1009
Abstract: Systems, apparatuses, and methods for selecting default page sizes in a variable page size translation lookaside buffer (TLB) are disclosed. In one embodiment, a system includes at least one processor, a memory subsystem, and a first TLB. The first TLB is configured to allocate a first entry for a first request responsive to detecting a miss for the first request in the first TLB. Prior to determining a page size targeted by the first request, the first TLB specifies, in the first entry, that the first request targets a page of a first page size. Responsive to determining that the first request actually targets a second page size, the first TLB reissues the first request with an indication that the first request targets the second page size. On the reissue, the first TLB allocates a second entry and specifies the second page size for the first request.
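The allocate-then-reissue sequence in this abstract can be sketched as below. It is an illustration under stated assumptions: a TLB entry is modeled as a `(page_base, page_size)` tuple, `walk_page_size` is a hypothetical stand-in for whatever later reveals the real page size, and the first entry is simply left in place because the abstract does not say what happens to it.

```python
def page_base(vaddr, size):
    """Align a virtual address down to the start of its page."""
    return vaddr - (vaddr % size)


def handle_tlb_miss(entries, vaddr, default_size, walk_page_size):
    """Allocate with the default page size first; reissue if the real size differs."""
    # The page size is not yet known, so the first entry records the default size.
    first = (page_base(vaddr, default_size), default_size)
    entries.append(first)

    actual_size = walk_page_size(vaddr)  # hypothetical lookup of the real page size
    if actual_size == default_size:
        return [first]

    # Reissue: allocate a second entry tagged with the page size actually targeted.
    second = (page_base(vaddr, actual_size), actual_size)
    entries.append(second)
    return [first, second]
```

For example, a miss on address `0x201000` with a 4 KiB default and a 2 MiB actual page allocates `(0x201000, 4096)` and then `(0x200000, 2097152)`.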
-
Publication No.: US10223280B2
Publication date: 2019-03-05
Application No.: US16025449
Filing date: 2018-07-02
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Vydhyanathan Kalyanasundharam , Yaniv Adiri , Philip Ng , Maggie Chan , Vincent Cueva , Anthony Asaro , Jimshed Mirza , Greggory D. Donley , Bryan Broussard , Benjamin Tsien
IPC: G06F3/14 , G06F13/38 , G06F12/1009 , G06F12/12 , G06F12/1045
Abstract: A system including a gasket communicatively coupled between a unified northbridge (UNB) having a cache coherent interconnect (CCI) interface and a processor having an Advanced eXtensible Interface (AXI) coherency extension (ACE). The gasket is configured to translate requests from the processor that include ACE commands into equivalent CCI commands, wherein each request from the processor maps onto a specific CCI request type. The gasket is further configured to translate ACE tags into CCI tags. The gasket is further configured to translate CCI encoded probes from a system resource interface (SRI) into equivalent ACE snoop transactions. The gasket is further configured to translate the memory map to inter-operate with a UNB/coherent HyperTransport (cHT) environment. The gasket is further configured to receive a barrier transaction that is used to provide ordering for transactions.
-
Publication No.: US20180314528A1
Publication date: 2018-11-01
Application No.: US15607118
Filing date: 2017-05-26
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Yunpeng Zhu , Jimshed Mirza
Abstract: Systems, apparatuses, and methods for generating flexibly addressed memory requests are disclosed. In one embodiment, a system includes a processor, control unit, and memory subsystem. The processor launches a plurality of threads on a plurality of compute units, wherein each thread generates memory requests without specifying target memory addresses. The threads executing on the plurality of compute units convey a plurality of memory requests to the control unit. The control unit generates target memory addresses for the plurality of received memory requests. In one embodiment, the memory requests are write requests, and the control unit interleaves write requests from the plurality of threads into a single output buffer stored in the memory subsystem. The control unit can be located in a cache, in a memory controller, or in another location within the system.
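A rough model of the write case in this abstract: threads submit values without addresses, and a control unit assigns the next free slot in a single output buffer. The `ControlUnit` class, its one-slot-per-request addressing, and the arrival-order policy are assumptions made for illustration.

```python
class ControlUnit:
    """Assigns target addresses to write requests that arrive without one."""

    def __init__(self, base_address, capacity):
        self.base_address = base_address
        self.capacity = capacity
        self.output_buffer = []  # single buffer shared by all threads

    def write(self, thread_id, value):
        """Place the value in the next free slot and return the address chosen."""
        if len(self.output_buffer) >= self.capacity:
            raise MemoryError("output buffer full")
        address = self.base_address + len(self.output_buffer)
        self.output_buffer.append((thread_id, value))
        return address


# Requests from two threads interleave in arrival order.
cu = ControlUnit(base_address=0x1000, capacity=4)
addresses = [cu.write(t, v) for t, v in [(0, "a"), (1, "b"), (0, "c")]]
```

Because the addresses are generated centrally, the writes land densely packed regardless of which thread issued them.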
-
Publication No.: US20180307619A1
Publication date: 2018-10-25
Application No.: US16025449
Filing date: 2018-07-02
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Vydhyanathan Kalyanasundharam , Philip Ng , Maggie Chan , Vincent Cueva , Anthony Asaro , Jimshed Mirza , Greggory D. Donley , Bryan Broussard , Benjamin Tsien , Yaniv Adiri
IPC: G06F12/1009 , G06F12/1045 , G06F12/12
CPC classification number: G06F12/1009 , G06F12/1045 , G06F12/12 , G06F2212/684
Abstract: A system including a gasket communicatively coupled between a unified northbridge (UNB) having a cache coherent interconnect (CCI) interface and a processor having an Advanced eXtensible Interface (AXI) coherency extension (ACE). The gasket is configured to translate requests from the processor that include ACE commands into equivalent CCI commands, wherein each request from the processor maps onto a specific CCI request type. The gasket is further configured to translate ACE tags into CCI tags. The gasket is further configured to translate CCI encoded probes from a system resource interface (SRI) into equivalent ACE snoop transactions. The gasket is further configured to translate the memory map to inter-operate with a UNB/coherent HyperTransport (cHT) environment. The gasket is further configured to receive a barrier transaction that is used to provide ordering for transactions.
-
Publication No.: US10025721B2
Publication date: 2018-07-17
Application No.: US14523705
Filing date: 2014-10-24
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Vydhyanathan Kalyanasundharam , Philip Ng , Maggie Chan , Vincent Cueva , Anthony Asaro , Jimshed Mirza , Greggory D. Donley , Bryan Broussard , Benjamin Tsien , Yaniv Adiri
IPC: G06F12/10 , G06F12/12 , G06F12/1009 , G06F12/1045 , G06F13/38
Abstract: The present invention provides for page table access and dirty bit management in hardware via a new atomic test[0] and OR and Mask. The present invention also provides for a gasket that enables ACE to CCI translations. This gasket further provides request translation between ACE and CCI, deadlock avoidance for victim and probe collision, ARM barrier handling, and power management interactions. The present invention also provides a solution for ARM victim/probe collision handling which deadlocks the unified northbridge. These solutions include a dedicated writeback virtual channel, probes for IO requests using 4-hop protocol, and a WrBack Reorder Ability in MCT where victims update older requests with data as they pass the requests.
-
Publication No.: US20170315927A1
Publication date: 2017-11-02
Application No.: US15139902
Filing date: 2016-04-27
Applicant: ATI Technologies ULC , Advanced Micro Devices, Inc.
Inventor: Gabriel H. Loh , Jimshed Mirza
IPC: G06F12/1027 , G06F12/1009
CPC classification number: G06F12/1027 , G06F12/1009 , G06F2212/1021 , G06F2212/401 , G06F2212/502 , G06F2212/656 , G06F2212/684 , Y02D10/13
Abstract: Methods and apparatus obtain one or more system page table entries that represent virtual system (e.g., memory) page to physical system page translations. A number of the obtained system page table entries that can be encoded in each of a plurality of translation lookaside buffer (TLB) entry encoding formats are determined. The method and apparatus may select one of the TLB entry encoding formats that encode a number of the obtained system page table entries. The method and apparatus may encode a number of obtained system page table entries in the TLB entry encoding format selected into a compressed encoding format TLB entry. The method and apparatus may associate the compressed encoding format TLB entry with an encoding format indication of the encoding format selected. The method and apparatus may decode a compressed encoding format TLB entry based on a determined TLB entry encoding format.
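The format-selection step in this abstract can be illustrated as below. The two encoding formats are invented for the example (the abstract does not list any), and picking the format that covers the most entries is an assumed policy. Page table entries are reduced to physical frame numbers.

```python
def count_contiguous(frames):
    """Leading frames that form a consecutive run starting at frames[0]."""
    n = 0
    for i, frame in enumerate(frames):
        if frame != frames[0] + i:
            break
        n += 1
    return n


def count_shared_upper_bits(frames, low_bits=8):
    """Leading frames whose bits above low_bits match those of frames[0]."""
    n = 0
    for frame in frames:
        if frame >> low_bits != frames[0] >> low_bits:
            break
        n += 1
    return n


# Hypothetical formats: name -> how many entries that format could encode.
FORMATS = {
    "contiguous": count_contiguous,
    "base_plus_offsets": count_shared_upper_bits,
}


def select_format(frames):
    """Choose the format encoding the most entries and tag the result with it."""
    counts = {name: count(frames) for name, count in FORMATS.items()}
    best = max(counts, key=counts.get)
    return {"format": best, "encoded_ptes": counts[best]}
```

The returned `format` field plays the role of the encoding format indication that a decoder would consult before unpacking the compressed entry.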
-
Publication No.: US09239804B2
Publication date: 2016-01-19
Application No.: US14045701
Filing date: 2013-10-03
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Andrew Kegel , Jimshed Mirza , Paul Blinzer , Philip Ng
CPC classification number: G06F13/00 , G06F11/073 , G06F11/0745 , G06F11/0793 , G06F12/00 , G06F12/10 , G06F12/1009 , G06F12/1081 , G06F13/385 , Y02D10/14 , Y02D10/151
Abstract: A system and method of managing requests from peripherals in a computer system are provided. In the system and method, an input/output memory management unit (IOMMU) receives a peripheral page request (PPR) from a peripheral. In response to a determination that a criterion regarding an available capacity of a PPR log is satisfied, a completion message is sent to the peripheral indicating that the PPR is complete and the PPR is discarded without queuing the PPR in the PPR log.
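The discard-with-completion behavior in this abstract can be sketched as follows. The capacity criterion is assumed here to be "free log slots at or below a threshold"; the `Iommu` class, `min_free` parameter, and `completions` list are hypothetical names for illustration.

```python
from collections import deque


class Iommu:
    """Toy model of PPR handling when the PPR log is nearly full."""

    def __init__(self, log_capacity, min_free=0):
        self.log_capacity = log_capacity
        self.min_free = min_free   # assumed threshold for the capacity criterion
        self.ppr_log = deque()
        self.completions = []      # completion messages sent to peripherals

    def receive_ppr(self, ppr):
        """Queue the PPR, or complete and discard it if the log is too full."""
        free_slots = self.log_capacity - len(self.ppr_log)
        if free_slots <= self.min_free:
            # Tell the peripheral the PPR is complete, then drop it unqueued.
            self.completions.append(ppr)
            return False
        self.ppr_log.append(ppr)
        return True


iommu = Iommu(log_capacity=2)
results = [iommu.receive_ppr(p) for p in ["ppr-a", "ppr-b", "ppr-c"]]
```

In this run the first two requests are queued and the third is answered with a completion message without entering the log.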