Patent search ap:("SambaNova Systems Page Inc.") AND inv:"Etash Kumar GUHA"

1.

Invention publication
Estimating Throughput for Placement Graphs for a Reconfigurable Dataflow Computing System (under examination, published)

Publication (announcement) number: US20230162032A1

Publication (announcement) date: 2023-05-25

Application number: US17990556

Application date: 2022-11-18

Applicant: SambaNova Systems, Inc.

Inventor: Etash Kumar GUHA, Tianxiao JIANG, Andrew DENG, Jian ZHANG, Muthiah ANNAMALAI

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: A method for estimating throughput for placement graphs includes obtaining a set of reference placement graphs for at least one computing task, determining a corresponding throughput value for each reference placement graph, configuring a graph neural network for each reference placement graph and training the graph neural network using each corresponding throughput value as a training target to produce a trained graph neural network. The method further includes configuring the trained graph neural network for a candidate placement graph corresponding to a target computing task, and using the trained graph neural network to estimate a throughput for the target computing task when conducted on a reconfigurable dataflow computing system using the candidate placement graph. The method may also include generating configuration information, configuring the reconfigurable dataflow computing system, and conducting the target computing task. A corresponding system and computer-readable medium are also disclosed herein.

2.

Invention publication
Estimating Resource Costs for Computing Tasks for a Reconfigurable Dataflow Computing System (under examination, published)

Publication (announcement) number: US20240086235A1

Publication (announcement) date: 2024-03-14

Application number: US18367764

Application date: 2023-09-13

Applicant: SambaNova Systems, Inc.

Inventor: Tianxiao JIANG, Jian ZHANG, Etash Kumar GUHA, Andrew DENG, Muthiah ANNAMALAI

IPC: G06F9/48, G06F9/30

CPC classification number: G06F9/4881, G06F9/3005

Abstract: Reconfigurable dataflow architecture is an emerging design for deep learning training accelerators. This architecture maps model operators to an accelerator spatially, enabling pipeline parallelization for high throughput. An essential ingredient in exploiting this throughput advantage is compiler Performance Optimization (PO), which searches for optimal model mappings. The convention in industry-leading dataflow compilation is to guide PO with hand-tuned rules, which require immense engineering cost to develop. This paper challenges this convention and asks whether data-driven, learned performance optimization can reduce the engineering cost while improving training throughput over hand-tuned rules. We present a workflow that guides PO using simple machine learning models trained on throughput observations of randomly generated mappings. We empirically show that developing and integrating these learned models into an industrial compiler can be 10× more efficient than hand-tuned rules in terms of engineering time cost.
