-
Publication No.: US10366013B2
Publication Date: 2019-07-30
Application No.: US14997026
Filing Date: 2016-01-15
Applicant: Futurewei Technologies, Inc.
Inventor: Lee McFearin, Sushma Wokhlu, Alan Gatherer
IPC: G06F12/12, G06F12/08, G06F12/121, G06F12/0891, G06F12/0897, G06F12/0804, G06F12/126
Abstract: The present disclosure relates to a system and method of managing operation of a cache memory. The system and method assign each nested task a level, and each task within a nested level an instance. Using the assigned task levels and instances, the cache management module is able to determine which cache entries to evict from cache when space is needed, and which evicted cache entries to recover upon completion of preempting tasks.
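The level-and-instance tagging described in the abstract above can be modeled as a toy cache sketch. The class, the lowest-level-first victim policy, and the recovery hook are illustrative assumptions, not details taken from the claims:

```python
from collections import defaultdict

class NestedTaskCache:
    """Toy model of cache eviction and recovery keyed by (task level, instance)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}                  # key -> (level, instance, value)
        self.evicted = defaultdict(list)   # (level, instance) -> [(key, value)]

    def put(self, key, value, level, instance):
        if len(self.entries) >= self.capacity:
            self._evict_victim()
        self.entries[key] = (level, instance, value)

    def _evict_victim(self):
        # Assumed policy: evict an entry owned by the shallowest-level
        # (i.e. preempted) task first, remembering it for later recovery.
        victim = min(self.entries, key=lambda k: self.entries[k][0])
        lvl, inst, val = self.entries.pop(victim)
        self.evicted[(lvl, inst)].append((victim, val))

    def on_task_complete(self, level, instance):
        # Once the preempting task has completed, recover the entries that
        # were evicted from the given preempted task, space permitting.
        for key, val in self.evicted.pop((level, instance), []):
            if len(self.entries) < self.capacity:
                self.entries[key] = (level, instance, val)
```

The tags let the eviction policy distinguish which task owns each line, and the recovery step restores the preempted task's working set when cache space frees up again.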
-
Publication No.: US20170163698A1
Publication Date: 2017-06-08
Application No.: US14958773
Filing Date: 2015-12-03
Applicant: Futurewei Technologies, Inc.
Inventor: Ashish Rai Shrivastava, Alan Gatherer, Sushma Wokhlu
IPC: H04L29/06
CPC classification number: H04L65/4069, G06F9/30032, G06F9/3004, G06F9/3455, G06F9/383, G06F12/00, G06F15/76
Abstract: A data streaming unit (DSU) and a method for operating a DSU are disclosed. In an embodiment the DSU includes a memory interface configured to be connected to a storage unit, a compute engine interface configured to be connected to a compute engine (CE) and an address generator configured to manage address data representing address locations in the storage unit. The data streaming unit further includes a data organization unit configured to access data in the storage unit and to reorganize the data to be forwarded to the compute engine, wherein the memory interface is communicatively connected to the address generator and the data organization unit, wherein the address generator is communicatively connected to the data organization unit, and wherein the data organization unit is communicatively connected to the compute engine interface.
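The blocks named in the abstract above (memory interface, address generator, data organization unit, compute engine interface) can be sketched roughly as follows; the strided address pattern and gather-style reorganization are assumptions for illustration only:

```python
class DataStreamingUnit:
    """Toy sketch of a DSU: an address generator plus a data
    organization unit sitting between storage and a compute engine."""

    def __init__(self, storage):
        # `storage` stands in for the storage unit behind the memory interface.
        self.storage = storage

    def generate_addresses(self, base, stride, count):
        # Address generator: manages address locations in the storage unit.
        return [base + i * stride for i in range(count)]

    def stream(self, base, stride, count):
        # Data organization unit: access data in storage and reorganize it
        # (here: gather strided words into a contiguous block) before
        # forwarding it over the compute engine interface.
        return [self.storage[a] for a in self.generate_addresses(base, stride, count)]
```

Separating address generation from data reorganization mirrors the abstract's point that the two units are distinct but communicatively connected.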
-
Publication No.: US11334355B2
Publication Date: 2022-05-17
Application No.: US15586937
Filing Date: 2017-05-04
Applicant: Futurewei Technologies, Inc.
Inventor: Alan Gatherer, Sushma Wokhlu, Peter Yan, Ywhpyng Harn, Ashish Rai Shrivastava, Tong Sun, Lee Dobson McFearin
IPC: G06F9/38, G06F9/30, G06F13/16, G06F12/0853, G06F9/46, G06F12/0884, G06F3/06, G06F12/0868, G06F12/0855
Abstract: Technology for providing data to a processing unit is disclosed. A computer processor may be divided into a master processing unit and consumer processing units. The master processing unit at least partially decodes a machine instruction and determines whether data is needed to execute the machine instruction. The master processing unit sends a request to memory for the data. The request may indicate that the data is to be sent from the memory to a consumer processing unit. The data sent by the memory in response to the request may be stored in local read storage that is close to the consumer processing unit for fast access. The master processing unit may also provide the machine instruction to the consumer processing unit. The consumer processing unit may access the data from the local read storage and execute the machine instruction based on the accessed data.
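A rough model of the master/consumer split described above might look like this; the instruction tuples, the queue-backed local read storage, and the request format are all assumed for illustration:

```python
import queue

class Memory:
    """Stand-in memory that delivers data straight to a consumer's
    local read storage, as the request indicates."""
    def __init__(self, data):
        self.data = data
    def request(self, addr, dest):
        dest.put((addr, self.data[addr]))

class MasterUnit:
    """Partially decodes instructions and requests any needed data
    on behalf of a consumer processing unit."""
    def __init__(self, memory):
        self.memory = memory
    def dispatch(self, instr, consumer):
        # Partial decode: does this instruction need data from memory?
        if instr[0] == "ADD_MEM":
            self.memory.request(instr[1], consumer.local_read_storage)
        consumer.instructions.put(instr)

class ConsumerUnit:
    """Executes instructions using data already staged nearby."""
    def __init__(self):
        self.local_read_storage = queue.Queue()
        self.instructions = queue.Queue()
    def step(self):
        instr = self.instructions.get()
        if instr[0] == "ADD_MEM":
            _addr, value = self.local_read_storage.get()
            return value + instr[2]
```

The point of the split is that by the time the consumer reaches the instruction, its operand is already waiting in fast local read storage rather than behind a memory round trip.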
-
Publication No.: US10042773B2
Publication Date: 2018-08-07
Application No.: US14811436
Filing Date: 2015-07-28
Applicant: Futurewei Technologies, Inc.
Inventor: Sushma Wokhlu, Lee McFearin, Alan Gatherer, Ashish Shrivastava, Peter Yifey Yan
IPC: G06F12/08, G06F12/0879, G06F12/0895
Abstract: Systems and techniques for advance cache allocation are described. A described technique includes selecting a job from a plurality of jobs; selecting a processor core from a plurality of processor cores to execute the selected job; receiving a message which describes future memory accesses that will be generated by the selected job; generating a memory burst request based on the message; performing the memory burst request to load data from a memory to at least a dedicated portion of a cache, the cache corresponding to the selected processor core; and starting the selected job on the selected processor core. The technique can include performing an action indicated by a send message to write one or more values from another dedicated portion of the cache to the memory.
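The allocation steps listed in the abstract above can be sketched as one hypothetical function; the `(base, length)` hint format and the first-come job/core selection are assumptions:

```python
def advance_allocate(jobs, cores, memory, caches, access_hints):
    """Toy sketch of advance cache allocation: preload a job's future
    working set into its core's dedicated cache portion before the
    job starts."""
    job = jobs.pop(0)                    # select a job
    core = cores.pop(0)                  # select a processor core
    base, length = access_hints[job]     # message describing future accesses
    burst = memory[base:base + length]   # memory burst request
    caches[core]["dedicated"] = burst    # load into the dedicated cache portion
    return job, core                     # start the job on the core
```

The abstract's complementary "send message" path (flushing values from another dedicated cache portion back to memory) would be the symmetric write-side step and is omitted here.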
-
Publication No.: US20170031829A1
Publication Date: 2017-02-02
Application No.: US14811436
Filing Date: 2015-07-28
Applicant: Futurewei Technologies, Inc.
Inventor: Sushma Wokhlu, Lee McFearin, Alan Gatherer, Ashish Shrivastava, Peter Yifey Yan
IPC: G06F12/08
CPC classification number: G06F12/0879, G06F12/0862, G06F12/0864, G06F12/0895, G06F2212/60, G06F2212/6026, G06F2212/6028
Abstract: Systems and techniques for advance cache allocation are described. A described technique includes selecting a job from a plurality of jobs; selecting a processor core from a plurality of processor cores to execute the selected job; receiving a message which describes future memory accesses that will be generated by the selected job; generating a memory burst request based on the message; performing the memory burst request to load data from a memory to at least a dedicated portion of a cache, the cache corresponding to the selected processor core; and starting the selected job on the selected processor core. The technique can include performing an action indicated by a send message to write one or more values from another dedicated portion of the cache to the memory.
-
Publication No.: US20200050376A1
Publication Date: 2020-02-13
Application No.: US16658899
Filing Date: 2019-10-21
Applicant: Futurewei Technologies, Inc.
Inventor: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
IPC: G06F3/06, G06F12/084, G06F12/0888, G06F12/14, G06F9/50
Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.
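A minimal model of the DOM sitting beside the data plane might look like this. The ownership table keyed by address and the suppress-on-violation behavior are assumptions, and the sketch runs the check sequentially rather than in parallel, for clarity:

```python
class DataOwnershipManager:
    """Toy DOM: verifies access rights off the data plane, separate
    from both the CEs and the memory controller."""

    def __init__(self, owners):
        self.owners = owners  # addr -> owning CE id (assumed table shape)

    def verify(self, ce_id, addr):
        return self.owners.get(addr) == ce_id

def serve_read(dom, memory, ce_id, addr):
    # The memory controller generates the data response while the DOM
    # verifies access; modeled sequentially here for simplicity.
    data = memory[addr]              # controller's data response
    if not dom.verify(ce_id, addr):  # DOM's access determination
        return None                  # response suppressed on violation
    return data
```

Because the DOM only observes read requests on the data plane, the controller's fetch proceeds unimpeded and the latency of the protection check overlaps with the memory access.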
-
Publication No.: US10419501B2
Publication Date: 2019-09-17
Application No.: US14958773
Filing Date: 2015-12-03
Applicant: Futurewei Technologies, Inc.
Inventor: Ashish Rai Shrivastava, Alan Gatherer, Sushma Wokhlu
Abstract: A data streaming unit (DSU) and a method for operating a DSU are disclosed. In an embodiment the DSU includes a memory interface configured to be connected to a storage unit, a compute engine interface configured to be connected to a compute engine (CE) and an address generator configured to manage address data representing address locations in the storage unit. The data streaming unit further includes a data organization unit configured to access data in the storage unit and to reorganize the data to be forwarded to the compute engine, wherein the memory interface is communicatively connected to the address generator and the data organization unit, wherein the address generator is communicatively connected to the data organization unit, and wherein the data organization unit is communicatively connected to the compute engine interface.
-
Publication No.: US09983995B2
Publication Date: 2018-05-29
Application No.: US15131779
Filing Date: 2016-04-18
Applicant: Futurewei Technologies, Inc.
Inventor: Sushma Wokhlu, Alan Gatherer, Ashish Rai Shrivastava
IPC: G06F12/00, G06F12/0831, G06F12/0877, G06F12/0893
CPC classification number: G06F12/0893, G06F12/0877
Abstract: A cache and a method for operating a cache are disclosed. In an embodiment, the cache includes a cache controller, data cache and a delay write through cache (DWTC), wherein the data cache is separate and distinct from the DWTC, wherein cacheable write accesses are split into shareable cacheable write accesses and non-shareable cacheable write accesses, wherein the cacheable shareable write accesses are allocated only to the DWTC, and wherein the non-shareable cacheable write accesses are not allocated to the DWTC.
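The shareable/non-shareable split described above can be sketched as simple write routing; the per-access `shareable` flag and the dictionary-backed caches are assumed for illustration:

```python
class SplitWriteCache:
    """Toy sketch: route shareable cacheable writes to a delay
    write-through cache (DWTC), kept separate and distinct from the
    data cache that holds non-shareable cacheable writes."""

    def __init__(self):
        self.data_cache = {}  # non-shareable cacheable writes only
        self.dwtc = {}        # shareable cacheable writes only

    def write(self, addr, value, shareable):
        if shareable:
            self.dwtc[addr] = value        # allocated only to the DWTC
        else:
            self.data_cache[addr] = value  # never allocated to the DWTC
```

Keeping shareable writes in their own small structure lets coherence traffic be confined to the DWTC instead of the whole data cache.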
-
Publication No.: US20220164115A1
Publication Date: 2022-05-26
Application No.: US17543024
Filing Date: 2021-12-06
Applicant: Futurewei Technologies, Inc.
Inventor: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
IPC: G06F3/06, G06F9/50, G06F12/14, G06F12/0888, G06F12/084
Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.
-
Publication No.: US11194478B2
Publication Date: 2021-12-07
Application No.: US16658899
Filing Date: 2019-10-21
Applicant: Futurewei Technologies, Inc.
Inventor: Sushma Wokhlu, Lee Dobson McFearin, Alan Gatherer, Hao Luan
IPC: G06F3/06, G06F9/50, G06F12/14, G06F12/0888, G06F12/084, G06F12/0815
Abstract: It is possible to reduce the latency attributable to memory protection in shared memory systems by performing access protection at a central Data Ownership Manager (DOM), rather than at distributed memory management units in the central processing unit (CPU) elements (CEs) responsible for parallel thread processing. In particular, the DOM may monitor read requests communicated over a data plane between the CEs and a memory controller, and perform access protection verification in parallel with the memory controller's generation of the data response. The DOM may be separate and distinct from both the CEs and the memory controller, and therefore may generally be able to make the access determination without interfering with data plane processing/generation of the read requests and data responses exchanged between the memory controller and the CEs.