-
Publication No.: US20200334067A1
Publication Date: 2020-10-22
Application No.: US16774047
Filing Date: 2020-01-28
Inventor: Haikun Liu , Xiaofei Liao , Hai Jin , Dang Yang
Abstract: The present invention relates to a hybrid memory system with live page migration for virtual machines. The system comprises a physical machine installed with a virtual machine and configured to: build a shared-memory channel between the virtual machine and a hypervisor; make the hypervisor generate to-be-migrated cold/hot page information and write it into the shared memory; make the virtual machine read the to-be-migrated cold/hot page information from the shared memory; and make the virtual machine, according to the read information, perform a page migration process across its heterogeneous memories without stopping the virtual machine.
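A minimal C++ sketch of the shared-memory channel idea: a single-producer/single-consumer ring through which a hypervisor-side producer publishes cold/hot page hints and a guest-side consumer drains them while running. All names (PageHint, HintRing), the slot count, and the ring layout are illustrative assumptions, not taken from the patent.

    // Hypothetical SPSC ring placed in a region shared between
    // hypervisor (producer) and guest (consumer).
    #include <atomic>
    #include <cstdint>
    #include <cstdio>

    struct PageHint {              // one to-be-migrated page record
        uint64_t pfn;              // guest page frame number
        bool     hot;              // true: promote to DRAM, false: demote to NVM
    };

    struct HintRing {
        static constexpr size_t N = 256;       // slots in the shared region
        std::atomic<size_t> head{0}, tail{0};  // consumer / producer cursors
        PageHint slots[N];

        bool push(PageHint h) {                // hypervisor side
            size_t t = tail.load(std::memory_order_relaxed);
            if (t - head.load(std::memory_order_acquire) == N) return false; // full
            slots[t % N] = h;
            tail.store(t + 1, std::memory_order_release);
            return true;
        }
        bool pop(PageHint& h) {                // guest side
            size_t hd = head.load(std::memory_order_relaxed);
            if (hd == tail.load(std::memory_order_acquire)) return false;    // empty
            h = slots[hd % N];
            head.store(hd + 1, std::memory_order_release);
            return true;
        }
    };

    int main() {
        HintRing ring;                         // would live in the shared page
        ring.push({0x1234, true});             // hypervisor: page is hot
        ring.push({0x5678, false});            // hypervisor: page is cold
        for (PageHint h; ring.pop(h); )        // guest: drain and migrate
            std::printf("migrate pfn=%#llx to %s\n",
                        (unsigned long long)h.pfn, h.hot ? "DRAM" : "NVM");
    }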
-
Publication No.: US11048442B2
Publication Date: 2021-06-29
Application No.: US16774039
Filing Date: 2020-01-28
Inventor: Haikun Liu , Xiaofei Liao , Hai Jin , Zhiwei Li
IPC: G06F3/06
Abstract: The present invention relates to a scalable storage system for in-memory objects using DRAM-NVM hybrid memory devices.
-
Publication No.: US10810037B1
Publication Date: 2020-10-20
Application No.: US16774047
Filing Date: 2020-01-28
Inventor: Haikun Liu , Xiaofei Liao , Hai Jin , Dang Yang
Abstract: The present invention relates to a hybrid memory system with live page migration for virtual machines. The system comprises a physical machine installed with a virtual machine and configured to: build a shared-memory channel between the virtual machine and a hypervisor; make the hypervisor generate to-be-migrated cold/hot page information and write it into the shared memory; make the virtual machine read the to-be-migrated cold/hot page information from the shared memory; and make the virtual machine, according to the read information, perform a page migration process across its heterogeneous memories without stopping the virtual machine.
-
Publication No.: US11568268B2
Publication Date: 2023-01-31
Application No.: US16748284
Filing Date: 2020-01-21
Inventor: Hai Jin , Xiaofei Liao , Long Zheng , Haikun Liu , Xi Ge
Abstract: A deep learning heterogeneous computing method based on layer-wide memory allocation, at least comprising the steps of: traversing a neural network model so as to acquire its training operational sequence and number of layers L; calculating the memory space Ri required by the data involved in operation at the i-th layer of the neural network model under a double-buffer configuration, where 1≤i≤L; altering the layer structure of the i-th layer and updating the training operational sequence; distributing all the data across the memory space of the CPU and that of the GPU according to a data placement method; and performing iterative computation at each layer successively based on the training operational sequence so as to complete neural network training.
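As a rough illustration of the layer-wise accounting, the sketch below computes each layer's double-buffered footprint and greedily assigns layers to GPU memory until an assumed budget is exhausted. The 2x cost model, the layer sizes, and the greedy rule are stand-ins for the patented data placement method.

    // Illustrative layer-wise placement: compute each layer's double-buffered
    // footprint and assign it to GPU memory while the budget lasts.
    #include <cstdio>
    #include <vector>

    struct Layer {
        const char* name;
        size_t input_bytes, output_bytes, weight_bytes;
    };

    // Double buffering keeps two copies of the working set so the transfer of
    // the next batch can overlap with compute on the current one.
    size_t doubleBufferedBytes(const Layer& l) {
        return 2 * (l.input_bytes + l.output_bytes + l.weight_bytes);
    }

    int main() {
        std::vector<Layer> net = {
            {"conv1",  6 << 20, 12 << 20,  1 << 20},
            {"conv2", 12 << 20, 24 << 20,  4 << 20},
            {"fc1",   24 << 20,  1 << 20, 64 << 20},
        };
        size_t gpu_free = 96u << 20;                 // assumed GPU budget
        for (const Layer& l : net) {
            size_t need = doubleBufferedBytes(l);
            bool on_gpu = need <= gpu_free;
            if (on_gpu) gpu_free -= need;
            std::printf("%-6s needs %3zu MiB -> %s\n",
                        l.name, need >> 20, on_gpu ? "GPU" : "CPU");
        }
    }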
-
Publication No.: US20210097221A1
Publication Date: 2021-04-01
Application No.: US16895545
Filing Date: 2020-06-08
Inventor: Xiaofei Liao , Qingxiang Chen , Long Zheng , Hai Jin , Pengcheng Yao
IPC: G06F30/331 , G06F16/901 , G06F12/0806 , G06F9/54 , G06F9/38
Abstract: The present invention relates to an optimization method for graph processing based on heterogeneous FPGA data streams. The method can balance processing loads between the CPU processing module and the FPGA processing module during acceleration of graph data processing.
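One common way to balance such loads is to split the edge set in proportion to each device's measured throughput, as the sketch below does under assumed throughput numbers. This is a generic heuristic, not the patent's balancing algorithm.

    // Throughput-proportional split of graph edges between a CPU worker
    // and an FPGA worker; all numbers are assumptions for illustration.
    #include <cstdio>

    int main() {
        double cpu_meps  = 1.8;          // measured CPU rate, Medges/s (assumed)
        double fpga_meps = 4.2;          // measured FPGA rate, Medges/s (assumed)
        long   edges     = 100'000'000;

        // Give each device a share proportional to its rate, so both
        // finish the superstep at roughly the same time.
        double fpga_share = fpga_meps / (cpu_meps + fpga_meps);
        long   fpga_edges = static_cast<long>(edges * fpga_share);
        long   cpu_edges  = edges - fpga_edges;

        std::printf("FPGA: %ld edges (%.0f%%), CPU: %ld edges (%.0f%%)\n",
                    fpga_edges, 100 * fpga_share,
                    cpu_edges,  100 * (1 - fpga_share));
    }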
-
Publication No.: US20200278795A1
Publication Date: 2020-09-03
Application No.: US16750655
Filing Date: 2020-01-23
Inventor: Haikun Liu , Xiaofei Liao , Hai Jin , Yuanyuan Ye
Abstract: The present disclosure involves a hardware-supported 3D-stacked NVM data compression method and system that set a first identifier to mark the compression state of written-back data. The method at least comprises the steps of: dividing the written-back data into a plurality of sub-blocks and acquiring a plurality of first output results through OR operations among the sub-blocks, or acquiring a plurality of second output results through exclusive-OR operations among the sub-blocks, and determining a compression strategy for the written-back data based on the first or second output results; and setting a second identifier to mark the storage scheme of the written-back data so that the second identifier is paired with the first identifier, and configuring a storage strategy for the written-back data that includes at least rotating the second identifier.
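The sub-block tests can be illustrated as follows: OR-ing all sub-blocks detects an all-zero line, while XOR-ing each sub-block against the first detects repeated content. The 64-byte line, the eight-way split, and the three-way strategy are assumptions for the sketch, not the patented encoding or identifier scheme.

    // Sketch of the sub-block tests: OR detects all-zero data,
    // XOR against the first sub-block detects repeated content.
    #include <cstdint>
    #include <cstdio>
    #include <cstring>

    enum class Strategy { AllZero, Repeated, Uncompressed };

    // Treat a 64-byte written-back line as eight 8-byte sub-blocks.
    Strategy chooseStrategy(const uint8_t line[64]) {
        uint64_t sub[8];
        std::memcpy(sub, line, 64);

        uint64_t orAll = 0, xorVsFirst = 0;
        for (int i = 0; i < 8; ++i) orAll |= sub[i];                // "first output results"
        for (int i = 1; i < 8; ++i) xorVsFirst |= sub[i] ^ sub[0];  // "second output results"

        if (orAll == 0)      return Strategy::AllZero;   // store only the flag
        if (xorVsFirst == 0) return Strategy::Repeated;  // store one sub-block
        return Strategy::Uncompressed;
    }

    int main() {
        uint8_t zeros[64] = {};                          // compresses to a flag
        uint8_t rep[64];
        for (int i = 0; i < 64; ++i) rep[i] = uint8_t(i % 8);  // 8 identical sub-blocks
        std::printf("zeros -> %d, repeated -> %d\n",
                    int(chooseStrategy(zeros)), int(chooseStrategy(rep)));
    }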
-
Publication No.: US20230281157A1
Publication Date: 2023-09-07
Application No.: US17815436
Filing Date: 2022-07-27
Inventor: Yu Zhang , Jin Zhao , Hui Yu , Yun Yang , Xinyu Jiang , Shijun Li , Xiaofei Liao , Hai Jin
IPC: G06F15/80
CPC classification number: G06F15/80
Abstract: The present invention relates to a post-exascale graph computing method, and a corresponding system, storage medium and electronic device. The invention addresses the low computing performance, poor scalability and high communication overhead of large-scale distributed environments, and improves the performance of supercomputers supporting large-scale graph computing.
-
Publication No.: US10248576B2
Publication Date: 2019-04-02
Application No.: US15287022
Filing Date: 2016-10-06
Inventor: Hai Jin , Xiaofei Liao , Haikun Liu , Yujie Chen , Rentong Guo
IPC: G06F12/10 , G06F12/1045 , G06F12/0862
Abstract: The present invention provides a DRAM/NVM hierarchical heterogeneous memory system with software-hardware cooperative management schemes. In the system, NVM is used as large-capacity main memory, and DRAM is used as a cache to the NVM. Reserved bits in the data structures of the TLB and the last-level page table are employed to eliminate the hardware costs of the conventional hardware-managed hierarchical memory architecture, pushing cache management in this heterogeneous memory system to the software level. Moreover, the invention reduces memory access latency in the case of last-level cache misses. Because many applications exhibit poor data locality in big-data environments, the conventional demand-based data fetching policy for a DRAM cache can aggravate cache pollution. The present invention therefore adopts a utility-based data fetching mechanism in the DRAM/NVM hierarchical memory system: it decides whether data in the NVM should be cached in the DRAM according to the current DRAM utilization and the application's memory access patterns, improving the efficiency of the DRAM cache and the bandwidth usage between the NVM main memory and the DRAM cache.
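A minimal sketch of what a utility-based fetch decision could look like: a page is cached in DRAM only when its recent access count clears a bar that rises with DRAM utilization, with demand fetching as the degenerate case. The threshold formula is an assumption, not the patented utility model.

    // Hypothetical utility-based fetch policy for a DRAM cache over NVM.
    #include <cstdio>

    struct FetchPolicy {
        double dram_utilization;   // fraction of the DRAM cache in use

        // The hotter the page, the more it justifies caching; the fuller
        // the DRAM, the higher the bar. Demand fetching is the degenerate
        // case where any single access (accesses >= 1) triggers a fetch.
        bool shouldFetch(unsigned recent_accesses) const {
            unsigned threshold = 1 + static_cast<unsigned>(8 * dram_utilization);
            return recent_accesses >= threshold;
        }
    };

    int main() {
        FetchPolicy nearlyEmpty{0.10}, nearlyFull{0.95};
        std::printf("2 accesses, 10%% full -> %s\n",
                    nearlyEmpty.shouldFetch(2) ? "fetch" : "serve from NVM");
        std::printf("2 accesses, 95%% full -> %s\n",
                    nearlyFull.shouldFetch(2) ? "fetch" : "serve from NVM");
    }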
-
Publication No.: US12189950B2
Publication Date: 2025-01-07
Application No.: US18145552
Filing Date: 2022-12-22
Inventor: Long Zheng , Qinggang Wang , Xiaofei Liao , Zhaozeng An , Hai Jin
IPC: G06F3/06
Abstract: The present invention relates to a dynamic memory management apparatus and method for HLS. The apparatus has several searching-and-caching modules and several modifying-and-writing-back modules, each connected to a DRAM storing module and a BRAM buffer; the BRAM buffer caches information about the nodes on a search path and registers information about modifications made to those nodes. To remedy the low execution efficiency that results when a traditional operating system is directly transplanted to the FPGA, the present invention exploits the large capacity of the DRAM on the FPGA to realize efficient dynamic memory allocation and deallocation and to improve the usability and code reusability of HLS.
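For flavor, the sketch below implements a plain first-fit free-list allocator over a flat array standing in for the DRAM storing module; the node walk in heapAlloc is the kind of search that the patented searching-and-caching modules would keep hot in the BRAM buffer. Deallocation and the write-back path are omitted, and all names are hypothetical.

    // Hypothetical first-fit free-list allocator over a flat "DRAM" array.
    // Node layout per free block: [size | next | payload...].
    #include <cstdint>
    #include <cstdio>

    constexpr uint32_t HEAP_WORDS = 1024;
    constexpr uint32_t NIL = 0xFFFFFFFF;
    static uint32_t heap[HEAP_WORDS];
    static uint32_t free_head = 0;

    void heapInit() {
        heap[0] = HEAP_WORDS - 2;        // whole heap is one free node
        heap[1] = NIL;
        free_head = 0;
    }

    // First-fit search down the free list; this is the walk a hardware
    // searching-and-caching module would buffer on-chip.
    uint32_t heapAlloc(uint32_t words) {
        uint32_t prev = NIL;
        for (uint32_t n = free_head; n != NIL; prev = n, n = heap[n + 1]) {
            if (heap[n] < words) continue;
            uint32_t remaining = heap[n] - words;
            if (remaining > 2) {                          // split the node
                uint32_t rest = n + 2 + words;
                heap[rest] = remaining - 2;
                heap[rest + 1] = heap[n + 1];
                heap[n] = words;
                if (prev == NIL) free_head = rest; else heap[prev + 1] = rest;
            } else {                                      // take the whole node
                if (prev == NIL) free_head = heap[n + 1];
                else             heap[prev + 1] = heap[n + 1];
            }
            return n + 2;                                 // payload offset
        }
        return NIL;                                       // out of memory
    }

    int main() {
        heapInit();
        uint32_t a = heapAlloc(16), b = heapAlloc(32);
        std::printf("a at word %u, b at word %u\n", a, b);
    }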
-
Publication No.: US11609787B2
Publication Date: 2023-03-21
Application No.: US16947055
Filing Date: 2020-07-16
Inventor: Xiaofei Liao , Yicheng Chen , Yu Zhang , Hai Jin , Jin Zhao , Xiang Zhao , Beibei Si
Abstract: The present disclosure relates to an FPGA-based dynamic graph processing method, comprising: where graph mirrors of a dynamic graph with successive timestamps define an increment between them, a pre-processing module divides the graph mirror with the later timestamp into at least one path unit such that incremental computing for any vertex depends only on that vertex's preorder vertex; an FPGA processing module stores at least two such path units into an on-chip memory directly linked to threads so that every thread unit can process its path unit independently; the thread unit determines an increment value between the successive timestamps of the preorder vertex while updating the preorder vertex's state value, and transfers the increment value to the succeeding vertex adjacent to the preorder vertex along the transfer direction determined by the path unit, so as to update the state value of the succeeding vertex.
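A minimal sketch of one thread unit sweeping a path unit: the increment applied at a vertex updates that vertex's state and is then forwarded to its successor, so each update depends only on the preorder vertex. The additive update with a fixed attenuation factor is an illustrative assumption, not the patented computation.

    // One thread unit's sweep over a path unit of the dynamic graph.
    #include <cstdio>
    #include <vector>

    struct Vertex { double state; };

    // Apply an increment at the head of the path and forward the resulting
    // delta vertex-by-vertex in preorder, as one FPGA thread unit would.
    void sweepPathUnit(std::vector<Vertex>& path, double increment) {
        double delta = increment;
        for (Vertex& v : path) {
            v.state += delta;    // update the preorder vertex's state
            delta *= 0.5;        // assumed attenuation passed to the successor
        }
    }

    int main() {
        std::vector<Vertex> path{{1.0}, {1.0}, {1.0}, {1.0}};
        sweepPathUnit(path, 0.8);    // e.g., an edge insertion changed the head
        for (size_t i = 0; i < path.size(); ++i)
            std::printf("v%zu state=%.3f\n", i, path[i].state);
    }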