-
Publication number: US10320695B2
Publication date: 2019-06-11
Application number: US15165953
Application date: 2016-05-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Steven K. Reinhardt , Marc S. Orr , Bradford M. Beckmann , Shuai Che , David A. Wood
IPC: G06F15/173 , H04L12/805 , H04L12/811
Abstract: A system and method for efficient management of network traffic in highly data parallel computing. A processing node includes one or more processors capable of generating network messages. A network interface is used to receive and send network messages across a network. The processing node reduces at least one of the number or the storage size of the original network messages by combining them into one or more new network messages. The new network messages are sent to the network interface to be sent across the network.
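The reduction step described in this abstract can be illustrated with a small sketch. This is not the patented mechanism, only one plausible reading of it: messages bound for the same destination are merged into a single message. The `Message` type, the `coalesce` function, and the choice to concatenate payloads are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    dest: int        # hypothetical destination node id
    payload: bytes   # hypothetical opaque payload


def coalesce(messages):
    """Merge messages that share a destination into one message per destination."""
    by_dest = defaultdict(list)
    for msg in messages:
        by_dest[msg.dest].append(msg.payload)
    # One new message per destination; payloads are joined in arrival order.
    # A real design would also keep per-message boundaries (assumption: not needed here).
    return [Message(dest, b"".join(parts)) for dest, parts in sorted(by_dest.items())]


original = [Message(1, b"a"), Message(2, b"b"), Message(1, b"c")]
combined = coalesce(original)
```

Here three original messages become two, which reduces the message count that reaches the network interface.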
-
Publication number: US10209990B2
Publication date: 2019-02-19
Application number: US14728643
Application date: 2015-06-02
Applicant: Advanced Micro Devices, Inc.
Inventor: David A. Wood , Steven K. Reinhardt , Bradford M. Beckmann , Marc S. Orr
Abstract: A conditional fetch-and-phi operation tests a memory location to determine if the memory location stores a specified value and, if so, modifies the value at the memory location. The conditional fetch-and-phi operation can be implemented so that it can be concurrently executed by a plurality of concurrently executing threads, such as the threads of a wavefront at a GPU. To execute the conditional fetch-and-phi operation, one of the concurrently executing threads is selected to execute a compare-and-swap (CAS) operation at the memory location, while the other threads await the results. The CAS operation tests the value at the memory location and, if the CAS operation is successful, the value is passed to each of the concurrently executing threads.
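The leader-based scheme in this abstract can be sketched in plain Python as a single-threaded simulation. Everything named here is hypothetical: `compare_and_swap` is a software stand-in for an atomic CAS, the lowest lane id is assumed to be the selected thread, and the result is broadcast as a `(success, observed)` pair to every lane.

```python
def compare_and_swap(memory, addr, expected, new):
    """Software stand-in for an atomic CAS: returns (success, value observed)."""
    observed = memory[addr]
    if observed == expected:
        memory[addr] = new
        return True, observed
    return False, observed


def wavefront_conditional_fetch_and_phi(memory, addr, expected, phi, lanes):
    """Simulate one wavefront issuing a conditional fetch-and-phi.

    Only the leader lane touches memory; the other lanes receive its result.
    """
    leader = min(lanes)  # assumption: lowest lane id is the selected thread
    success, observed = compare_and_swap(memory, addr, expected, phi(expected))
    # Broadcast the leader's result to every lane in the wavefront.
    results = {lane: (success, observed) for lane in lanes}
    return leader, results


memory = [0]
leader, first = wavefront_conditional_fetch_and_phi(
    memory, 0, 0, lambda v: v + 1, [3, 1, 2]
)
# A second attempt still expects 0, so its CAS fails and memory is unchanged.
_, second = wavefront_conditional_fetch_and_phi(
    memory, 0, 0, lambda v: v + 1, [3, 1, 2]
)
```

The point of electing a leader is that the wavefront issues one CAS instead of one per thread.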
-
Publication number: US10198261B2
Publication date: 2019-02-05
Application number: US15096205
Application date: 2016-04-11
Applicant: Advanced Micro Devices, Inc.
Inventor: Shuai Che , Marc S. Orr , Bradford M. Beckmann
IPC: G06F9/30 , G06F12/10 , G06F12/0802 , G06F12/0811 , G06F12/0815 , G06F12/0837 , G06F12/0875 , G06F12/0897
Abstract: A method of performing memory synchronization operations is provided that includes receiving, at a programmable cache controller in communication with one or more caches, an instruction in a first language to perform a memory synchronization operation of synchronizing a plurality of instruction sequences executing on a processor, mapping the received instruction in the first language to one or more selected cache operations in a second language executable by the cache controller and executing the one or more cache operations to perform the memory synchronization operation. The method further comprises receiving a second mapping that provides mapping instructions to map the received instruction to one or more other cache operations, mapping the received instruction to one or more other cache operations and executing the one or more other cache operations to perform the memory synchronization operation.
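The remapping idea in this abstract can be sketched as a lookup table held by the controller. This is an illustrative model only: the class name, the instruction name `acquire`, and the cache operation names (`invalidate_l1`, `invalidate_l2`) are hypothetical and do not come from the patent.

```python
class ProgrammableCacheController:
    """Toy controller that maps a synchronization instruction to cache operations."""

    def __init__(self, mapping):
        self.mapping = dict(mapping)  # instruction name -> list of cache operations
        self.log = []                 # operations "executed" so far

    def load_mapping(self, mapping):
        """Install a second mapping that replaces the first."""
        self.mapping = dict(mapping)

    def execute(self, instruction):
        ops = self.mapping[instruction]
        for op in ops:
            self.log.append(op)       # a real controller would drive the caches here
        return list(ops)


controller = ProgrammableCacheController({"acquire": ["invalidate_l1"]})
before = controller.execute("acquire")
controller.load_mapping({"acquire": ["invalidate_l1", "invalidate_l2"]})
after = controller.execute("acquire")
```

The same received instruction yields different cache operations once the second mapping is loaded, which is the behavior the abstract describes.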
-
Publication number: US20140250442A1
Publication date: 2014-09-04
Application number: US13782063
Application date: 2013-03-01
Applicant: Advanced Micro Devices, Inc.
Inventor: Steven K. Reinhardt , Marc S. Orr , Bradford M. Beckmann
IPC: G06F9/54
CPC classification number: G06F9/542 , G06F2209/543
Abstract: The described embodiments include a computing device. In these embodiments, an entity in the computing device receives an identification of a memory location and a condition to be met by a value in the memory location. Upon a predetermined event occurring, the entity causes an operation to be performed when the value in the memory location meets the condition.
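A minimal sketch of the behavior in this abstract, under stated assumptions: the "entity" is modeled as a `ConditionMonitor` object, the "predetermined event" is an explicit `on_event` call, and each watch fires once. None of these names or choices come from the patent.

```python
class ConditionMonitor:
    """Toy entity that runs an operation when a watched value meets a condition."""

    def __init__(self, memory):
        self.memory = memory
        self.watches = []  # list of (address, condition, operation) tuples

    def register(self, addr, condition, operation):
        """Record the memory location and the condition its value must meet."""
        self.watches.append((addr, condition, operation))

    def on_event(self):
        """Called when the predetermined event occurs; checks every watch."""
        remaining = []
        for addr, condition, operation in self.watches:
            value = self.memory[addr]
            if condition(value):
                operation(value)  # assumption: a watch fires once, then is dropped
            else:
                remaining.append((addr, condition, operation))
        self.watches = remaining


memory = {0x10: 0}
fired = []
monitor = ConditionMonitor(memory)
monitor.register(0x10, lambda v: v >= 2, fired.append)

memory[0x10] = 1
monitor.on_event()   # condition not met, nothing happens
memory[0x10] = 2
monitor.on_event()   # condition met, the operation runs
```

The waiting party does not poll the location itself; the check happens only when the event is signaled.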
-
Publication number: US10360652B2
Publication date: 2019-07-23
Application number: US14304483
Application date: 2014-06-13
Applicant: Advanced Micro Devices, Inc.
Inventor: Marc S. Orr , Bradford M. Beckmann , Benedict R. Gaster , Steven K. Reinhardt , David A. Wood
IPC: G06T1/20
Abstract: A processor comprising hardware logic configured to execute a first wavefront in a hardware resource and stop execution of the first wavefront before the first wavefront completes. The processor schedules a second wavefront for execution in the hardware resource.
-
Publication number: US20170293487A1
Publication date: 2017-10-12
Application number: US15096205
Application date: 2016-04-11
Applicant: Advanced Micro Devices, Inc.
Inventor: Shuai Che , Marc S. Orr , Bradford M. Beckmann
CPC classification number: G06F9/30087 , G06F12/0802 , G06F12/0811 , G06F12/0815 , G06F12/0837 , G06F12/0875 , G06F12/0897 , G06F12/10 , G06F2212/1016 , G06F2212/452
Abstract: A method of performing memory synchronization operations is provided that includes receiving, at a programmable cache controller in communication with one or more caches, an instruction in a first language to perform a memory synchronization operation of synchronizing a plurality of instruction sequences executing on a processor, mapping the received instruction in the first language to one or more selected cache operations in a second language executable by the cache controller and executing the one or more cache operations to perform the memory synchronization operation. The method further comprises receiving a second mapping that provides mapping instructions to map the received instruction to one or more other cache operations, mapping the received instruction to one or more other cache operations and executing the one or more other cache operations to perform the memory synchronization operation.
-
Publication number: US20160357551A1
Publication date: 2016-12-08
Application number: US14728643
Application date: 2015-06-02
Applicant: Advanced Micro Devices, Inc.
Inventor: David A. Wood , Steven K. Reinhardt , Bradford M. Beckmann , Marc S. Orr
IPC: G06F9/30
CPC classification number: G06F9/3004 , G06F9/30072 , G06F9/30087 , G06F9/345 , G06F9/3851 , G06F9/52 , G06F9/526
Abstract: A conditional fetch-and-phi operation tests a memory location to determine if the memory location stores a specified value and, if so, modifies the value at the memory location. The conditional fetch-and-phi operation can be implemented so that it can be concurrently executed by a plurality of concurrently executing threads, such as the threads of a wavefront at a GPU. To execute the conditional fetch-and-phi operation, one of the concurrently executing threads is selected to execute a compare-and-swap (CAS) operation at the memory location, while the other threads await the results. The CAS operation tests the value at the memory location and, if the CAS operation is successful, the value is passed to each of the concurrently executing threads.
-
Publication number: US10025605B2
Publication date: 2018-07-17
Application number: US15094615
Application date: 2016-04-08
Applicant: Advanced Micro Devices, Inc.
Inventor: Shuai Che , Marc S. Orr
IPC: G06F9/455
Abstract: A receiving node in a computer system that includes a plurality of types of execution units receives an active message from a sending node. The receiving node compiles an intermediate language message handler corresponding to the active message into a machine instruction set architecture (ISA) message handler and the receiver executes the ISA message handler on a selected one of the execution units. If the active message handler is not available at the receiver, the sender sends an intermediate language version of the message handler to the receiving node. The execution unit selected to execute the message handler is chosen based on a field in the active message or on runtime criteria in the receiving system.
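The receive path in this abstract can be sketched as follows. This is a simplified model with several assumptions: Python source text stands in for the intermediate language, the built-in `compile` and `exec` stand in for compilation to a machine ISA, and the message layout (`handler`, `args`, `unit`) and the `need_handler` reply are hypothetical.

```python
class Receiver:
    """Toy receiving node with several execution unit types."""

    def __init__(self, units):
        self.units = units       # e.g. ["cpu", "gpu"]; the first is the default
        self.il_handlers = {}    # handler name -> source text (stand-in for IL)
        self.compiled = {}       # handler name -> callable (stand-in for ISA code)

    def install_il(self, name, source):
        """Store the intermediate-language version sent by the sender."""
        self.il_handlers[name] = source

    def _compile(self, name):
        namespace = {}
        exec(compile(self.il_handlers[name], name, "exec"), namespace)
        self.compiled[name] = namespace["handler"]

    def receive(self, message):
        name = message["handler"]
        if name not in self.il_handlers:
            return ("need_handler", name)   # ask the sender for the IL version
        if name not in self.compiled:
            self._compile(name)             # compile once, reuse afterwards
        # Unit comes from a message field, else a runtime default (assumption).
        unit = message.get("unit") or self.units[0]
        return (unit, self.compiled[name](message["args"]))


receiver = Receiver(["cpu", "gpu"])
msg = {"handler": "sum", "args": [1, 2, 3], "unit": "gpu"}
first = receiver.receive(msg)
receiver.install_il("sum", "def handler(args):\n    return sum(args)\n")
second = receiver.receive(msg)
third = receiver.receive({"handler": "sum", "args": [4]})
```

The handler here runs on the host interpreter regardless of the selected unit; the returned unit name only records which unit a real system would dispatch to.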
-
Publication number: US09804883B2
Publication date: 2017-10-31
Application number: US14542042
Application date: 2014-11-14
Applicant: Advanced Micro Devices, Inc.
Inventor: Marc S. Orr , Bradford M. Beckmann , Ayse Yilmazer , Shuai Che , David A. Wood , Mark D. Hill
IPC: G06F9/46 , G06F12/0806 , G06F3/06 , G06F12/0871
CPC classification number: G06F9/46 , G06F3/06 , G06F12/0806 , G06F12/0871 , G06F2212/222 , G06F2212/313 , G06F2212/401
Abstract: Described herein is an apparatus and method for remote scoped synchronization, which is a new semantic that allows a work-item to order memory accesses with a scope instance outside of its scope hierarchy. More precisely, remote synchronization expands visibility at a particular scope to all scope-instances encompassed by that scope. Remote scoped synchronization operation allows smaller scopes to be used more frequently and defers added cost to only when larger scoped synchronization is required. This enables programmers to optimize the scope that memory operations are performed at for important communication patterns like work stealing. Executing memory operations at the optimum scope reduces both execution time and energy. In particular, remote synchronization allows a work-item to communicate with a scope that it otherwise would not be able to access. Specifically, work-items can pull valid data from and push updates to scopes that do not (hierarchically) contain them.
-
Publication number: US20170293499A1
Publication date: 2017-10-12
Application number: US15094615
Application date: 2016-04-08
Applicant: Advanced Micro Devices, Inc.
Inventor: Shuai Che , Marc S. Orr
IPC: G06F9/455
CPC classification number: G06F9/4552
Abstract: A receiving node in a computer system that includes a plurality of types of execution units receives an active message from a sending node. The receiving node compiles an intermediate language message handler corresponding to the active message into a machine instruction set architecture (ISA) message handler and the receiver executes the ISA message handler on a selected one of the execution units. If the active message handler is not available at the receiver, the sender sends an intermediate language version of the message handler to the receiving node. The execution unit selected to execute the message handler is chosen based on a field in the active message or on runtime criteria in the receiving system.