Tensor-based optimization method for memory management of a deep-learning GPU and system thereof

    Publication No.: US11625320B2

    Publication Date: 2023-04-11

    Application No.: US16946690

    Application Date: 2020-07-01

    Abstract: The present disclosure relates to a tensor-based optimization method for GPU memory management in deep learning, comprising at least the steps of: executing at least one computing operation that takes tensors as input and generates tensors as output; when one such computing operation is executed, tracking access information of the tensors and setting up a memory management optimization decision based on that access information; during a first iteration of training, performing memory swapping operations passively between CPU memory and GPU memory so as to obtain access information about the tensors over a complete iteration; setting up the memory management optimization decision according to the access information obtained over the complete iteration; and, in successive iterations, dynamically adjusting the optimization decision of memory management according to operational feedback.
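
    Below is a minimal Python sketch of the tracking-then-deciding flow the abstract describes: tensor accesses are logged during a profiling iteration, swap decisions are derived from gaps between accesses, and the prefetch point is adjusted from runtime feedback. All names (AccessRecord, SwapDecision, MemoryManager) and the gap threshold are hypothetical illustrations, not the patented implementation.

        # Hypothetical sketch of tensor-access tracking and swap decisions; not the
        # patented implementation.
        from dataclasses import dataclass
        from typing import Dict, List

        @dataclass
        class AccessRecord:
            tensor_id: int
            op_index: int          # which computing operation touched the tensor
            timestamp: float       # wall-clock time of the access

        @dataclass
        class SwapDecision:
            tensor_id: int
            swap_out_after_op: int   # evict to CPU memory after this operation
            swap_in_before_op: int   # prefetch back to GPU memory before this operation

        class MemoryManager:
            def __init__(self):
                self.access_log: List[AccessRecord] = []
                self.decisions: Dict[int, SwapDecision] = {}

            def record_access(self, tensor_id: int, op_index: int, timestamp: float):
                """Track every tensor access during the first (profiling) iteration."""
                self.access_log.append(AccessRecord(tensor_id, op_index, timestamp))

            def build_decisions(self):
                """After one complete iteration, mark tensors with a long gap between
                consecutive accesses as candidates for CPU<->GPU swapping."""
                by_tensor: Dict[int, List[AccessRecord]] = {}
                for rec in self.access_log:
                    by_tensor.setdefault(rec.tensor_id, []).append(rec)
                for tid, recs in by_tensor.items():
                    recs.sort(key=lambda r: r.op_index)
                    for prev, nxt in zip(recs, recs[1:]):
                        if nxt.op_index - prev.op_index > 10:   # illustrative threshold
                            self.decisions[tid] = SwapDecision(tid, prev.op_index, nxt.op_index)
                            break

            def adjust(self, tensor_id: int, swap_in_was_late: bool):
                """In later iterations, move the prefetch point earlier if the swap-in
                stalled the GPU (the operational feedback mentioned in the abstract)."""
                d = self.decisions.get(tensor_id)
                if d and swap_in_was_late and d.swap_in_before_op > d.swap_out_after_op + 1:
                    d.swap_in_before_op -= 1

    In this reading, a training loop would call record_access() during the first iteration, build_decisions() once that iteration completes, and adjust() whenever a swap-in stalls a later iteration.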

    GPU-BASED METHOD FOR OPTIMIZING RICH METADATA MANAGEMENT AND SYSTEM THEREOF

    Publication No.: US20190294643A1

    Publication Date: 2019-09-26

    Application No.: US16284611

    Application Date: 2019-02-25

    Abstract: A GPU-based system for optimizing rich metadata management, and a method thereof, are disclosed. The system includes: a search engine for converting rich metadata information into traversal information and/or search information of a property graph and providing at least one API according to a traversal process and/or a search process; a mapping module for detecting relationships among entity nodes in the property graph by means of mapping; a management module for activating a GPU thread group and allotting video memory blocks so as to store the property graph in the GPU as a mixed graph; and a traversal module for activating a traversal program and performing detection and gathering on the stored property arrays in each iteration, so as to feed the result of the iteration back to the search engine. The system and method provide efficient rich metadata search with good scalability and compatibility.
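
    The sketch below illustrates, under stated assumptions, how rich metadata could be modeled as a property graph and flattened into offset/target arrays of the kind a GPU thread group might traverse in parallel. The class and method names (PropertyGraph, add_entity, to_csr, search) are hypothetical; the patented mixed-graph layout and GPU kernels are not reproduced here.

        # Hypothetical property-graph model for rich metadata; array layout only,
        # no actual GPU kernels.
        from collections import defaultdict

        class PropertyGraph:
            def __init__(self):
                self.node_props = []              # properties per entity node
                self.adj = defaultdict(list)      # node id -> list of (neighbor, edge label)

            def add_entity(self, props: dict) -> int:
                self.node_props.append(props)
                return len(self.node_props) - 1

            def add_relation(self, src: int, dst: int, label: str):
                self.adj[src].append((dst, label))

            def to_csr(self):
                """Flatten adjacency into offset/target arrays, the flat form a GPU
                thread group could scan; real traversal kernels would live in CUDA."""
                offsets, targets, labels = [0], [], []
                for node in range(len(self.node_props)):
                    for dst, lab in self.adj[node]:
                        targets.append(dst)
                        labels.append(lab)
                    offsets.append(len(targets))
                return offsets, targets, labels

            def search(self, **conditions):
                """Simple property-search API: return ids of nodes whose properties match."""
                return [i for i, p in enumerate(self.node_props)
                        if all(p.get(k) == v for k, v in conditions.items())]

        # Usage: model "user owns file" metadata and query it.
        g = PropertyGraph()
        u = g.add_entity({"type": "user", "name": "alice"})
        f = g.add_entity({"type": "file", "path": "/data/a.bin"})
        g.add_relation(u, f, "owns")
        print(g.search(type="file"))     # -> [1]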

    METHOD AND GAME-BASED SYSTEM FOR PARTITIONING OF STREAMING GRAPH

    Publication No.: US20190244402A1

    Publication Date: 2019-08-08

    Application No.: US16237053

    Application Date: 2018-12-31

    CPC classification number: G06T11/206 G06F16/2465 G06F16/9024 G06F2216/03

    Abstract: The present invention relates to a game-based method and system for streaming-graph partitioning. The method comprises partitioning a streaming graph using one or more processors configured to: read an edge stream having a predetermined number of edges from an unpartitioned area of the streaming graph as a sub-graph; based on a first pre-partitioning model, pre-partition the edges of the sub-graph into at least two partition blocks as the initial state of a game process; and sequentially select an optimal partition block for each edge of the sub-graph through the game process until the game process converges. The disclosed method and system can partition a streaming graph using local information only, without loading the whole streaming graph into memory, and therefore offer good scalability, support dynamic graph partitioning, and provide better partitioning results.
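
    As a rough illustration of the game process, the sketch below pre-partitions a batch of edges and then lets each edge repeatedly move to the partition block that minimizes a simple local cost (a cut penalty plus a load-balance penalty) until no edge moves. The cost function, parameters, and names are simplifying assumptions, not the patent's pre-partitioning model or payoff definition.

        # Hypothetical game-style streaming partitioning loop; cost model is illustrative.
        from collections import Counter

        def partition_batch(edges, k, alpha=1.0, max_rounds=20):
            # Pre-partition: simple hash assignment as the initial state of the game.
            assign = {e: hash(e) % k for e in edges}

            def cost(edge, block, loads, vertex_blocks):
                u, v = edge
                # Cut term: endpoints not yet present in this block would be replicated.
                cut = (block not in vertex_blocks[u]) + (block not in vertex_blocks[v])
                # Balance term: penalize blocks that are already heavily loaded.
                return cut + alpha * loads[block] / max(1, len(edges) / k)

            for _ in range(max_rounds):
                loads = Counter(assign.values())
                vertex_blocks = {}
                for (u, v), b in assign.items():
                    vertex_blocks.setdefault(u, set()).add(b)
                    vertex_blocks.setdefault(v, set()).add(b)
                moved = False
                for e in edges:
                    best = min(range(k), key=lambda b: cost(e, b, loads, vertex_blocks))
                    if best != assign[e]:
                        loads[assign[e]] -= 1
                        loads[best] += 1
                        assign[e] = best
                        moved = True
                if not moved:        # convergence: no edge benefits from moving
                    break
            return assign

        # Usage: partition a small edge batch into 2 blocks using only local information.
        print(partition_batch([(1, 2), (2, 3), (3, 4), (4, 1)], k=2))

    Because each edge only consults the loads and vertex placements of the current batch, the loop never needs the whole graph in memory, which mirrors the locality claim in the abstract.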

    Method of memory estimation and configuration optimization for distributed data processing system

    公开(公告)号:US10725754B2

    公开(公告)日:2020-07-28

    申请号:US16216155

    申请日:2018-12-11

    Abstract: The present invention relates to a method of memory estimation and configuration optimization for a distributed data processing system. The method involves matching an application data stream against a data feature library, wherein the application data stream has undergone analysis and processing of the conditional branches and/or loop bodies of the application code in the application's Java archive; estimating a memory limit for at least one stage of the application based on the successful matching result; optimizing configuration parameters of the application accordingly; and acquiring static features and/or dynamic features of the application data from runs of the optimized application and recording them persistently. Unlike machine-learning-based memory estimation, which does not ensure accuracy and fails to provide fine-grained estimation for individual stages, this method uses application analysis and existing data features to estimate overall memory occupation more precisely and to estimate the memory use of individual job stages for more fine-grained configuration optimization.
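
    A minimal sketch of the estimate-then-configure flow, assuming a toy feature library and a made-up per-stage expansion-factor model: static features are matched against the library, per-stage memory is estimated from recorded expansion factors, and a configuration parameter is derived from the peak estimate. The function names, the library format, and the parameter key are illustrative assumptions, not the patented formulas or real framework settings.

        # Hypothetical memory-estimation and configuration flow; not the patented method.
        def match_features(app_features, feature_library):
            """Return the first library entry whose static features match the application."""
            for entry in feature_library:
                if all(app_features.get(k) == v for k, v in entry["static"].items()):
                    return entry
            return None

        def estimate_stage_memory(input_bytes, entry):
            """Estimate per-stage memory as the stage input size times a recorded
            expansion factor from the feature library."""
            size, estimates = input_bytes, []
            for factor in entry["stage_expansion"]:
                size = int(size * factor)
                estimates.append(size)
            return estimates

        def optimize_config(stage_estimates, headroom=1.2):
            """Derive an executor memory setting from the peak stage estimate."""
            peak = max(stage_estimates)
            return {"executor.memory.bytes": int(peak * headroom)}   # illustrative key

        # Usage with a toy one-entry feature library.
        library = [{"static": {"job": "wordcount"}, "stage_expansion": [1.5, 0.4]}]
        entry = match_features({"job": "wordcount"}, library)
        if entry:
            estimates = estimate_stage_memory(2 * 1024**3, entry)   # 2 GiB of input
            print(optimize_config(estimates))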
