-
1.
Publication No.: US20240144044A1
Publication date: 2024-05-02
Application No.: US18405203
Filing date: 2024-01-05
Applicant: Groq, Inc.
Abstract: A system receives a predictive model and receives one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG. The system packages first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first processor. The system packages the second instructions, the DAG, and the runtime constraints in a second binary.
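The compile, package, and recompile flow described in this abstract can be illustrated with a small Python sketch. It is not taken from the patent: the dictionary-based model, the JSON "binary", the function names, and the `max_instructions` constraint are all hypothetical, and the sketch assumes the DAG and constraints are read back from the first binary when retargeting.

```python
import json
from graphlib import TopologicalSorter


def build_dag(model):
    """model maps each op name to the ops it depends on (hypothetical format)."""
    return {op: sorted(deps) for op, deps in model.items()}


def compile_for(dag, constraints, target):
    """Toy compiler: emit one instruction per op, dependencies first."""
    order = list(TopologicalSorter(dag).static_order())
    # Hypothetical runtime constraint: an upper bound on instruction count.
    limit = constraints.get("max_instructions")
    if limit is not None and len(order) > limit:
        raise ValueError("runtime constraint violated: too many instructions")
    return [f"{target}:{op}" for op in order]


def package(instructions, constraints, dag):
    """Bundle instructions with the constraints and DAG (JSON stands in for a binary)."""
    bundle = {"instructions": instructions, "constraints": constraints, "dag": dag}
    return json.dumps(bundle, sort_keys=True).encode()


def recompile(binary, target):
    """Retarget using only the DAG and constraints carried in the first bundle."""
    bundle = json.loads(binary)
    instructions = compile_for(bundle["dag"], bundle["constraints"], target)
    return package(instructions, bundle["constraints"], bundle["dag"])


model = {"load_a": [], "load_b": [], "matmul": ["load_a", "load_b"], "relu": ["matmul"]}
constraints = {"max_instructions": 8}
dag = build_dag(model)
first_binary = package(compile_for(dag, constraints, "procA"), constraints, dag)
second_binary = recompile(first_binary, "procB")
```

Because the bundle carries the DAG and constraints alongside the instructions, the second compilation in this sketch needs no access to the original model.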
-
2.
Publication No.: US20230121986A1
Publication date: 2023-04-20
Application No.: US18083388
Filing date: 2022-12-16
Applicant: Groq, Inc.
Abstract: A system receives a predictive model and receives one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG. The system packages first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first processor. The system packages the second instructions, the DAG, and the runtime constraints in a second binary.
-
3.
Publication No.: US11625618B1
Publication date: 2023-04-11
Application No.: US17528609
Filing date: 2021-11-17
Applicant: Groq, Inc.
Abstract: A system receives a predictive model and receives one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG. The system packages first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first processor. The system packages the second instructions, the DAG, and the runtime constraints in a second binary.
-
4.
Publication No.: US11568275B1
Publication date: 2023-01-31
Application No.: US16526936
Filing date: 2019-07-30
Applicant: Groq, Inc.
Abstract: A system receives a predictive model and receives one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG. The system packages first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first processor. The system packages the second instructions, the DAG, and the runtime constraints in a second binary.
-
5.
Publication No.: US11114138B2
Publication date: 2021-09-07
Application No.: US16132196
Filing date: 2018-09-14
Applicant: Groq, Inc.
Abstract: A memory structure having 2^m read ports allowing for concurrent access to n data entries can be constructed using three memory structures each having 2^(m-1) read ports. The three memory structures include two structures providing access to half of the n data entries, and a difference structure providing access to difference data between the halves of the n data entries. Each pair of the 2^m ports is connected to a respective port of each of the 2^(m-1)-port data structures, such that each port of the pair can access data entries of a first half of the n data entries either by accessing the structure storing that half directly, or by accessing both the difference structure and the structure containing the second half to reconstruct the data entries of the first half, thus allowing for a pair of ports to concurrently access any of the stored data entries in parallel.
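The smallest case of this construction (m = 1: two read ports built from three single-port structures) can be sketched in Python as follows. The class names are hypothetical, and the use of bitwise XOR as the "difference" is an assumption of this sketch; the abstract does not say how the difference data is encoded.

```python
class OnePortMemory:
    """A store that allows at most one read per cycle."""

    def __init__(self, data):
        self._data = list(data)
        self._reads = 0

    def tick(self):
        # Start a new cycle: the single read port becomes available again.
        self._reads = 0

    def read(self, idx):
        if self._reads >= 1:
            raise RuntimeError("read-port conflict in this cycle")
        self._reads += 1
        return self._data[idx]


class TwoPortMemory:
    """Two concurrent reads using two half stores plus a difference store."""

    def __init__(self, data):
        if len(data) % 2:
            raise ValueError("this sketch needs an even number of entries")
        self.half = len(data) // 2
        lo, hi = data[: self.half], data[self.half :]
        self.lo = OnePortMemory(lo)
        self.hi = OnePortMemory(hi)
        # Assumed encoding: difference data is the XOR of the two halves.
        self.diff = OnePortMemory([a ^ b for a, b in zip(lo, hi)])

    def _direct(self, idx):
        store = self.lo if idx < self.half else self.hi
        return store.read(idx % self.half)

    def read_pair(self, i, j):
        for store in (self.lo, self.hi, self.diff):
            store.tick()
        if (i < self.half) != (j < self.half):
            # Different halves: each port reads its own store directly.
            return self._direct(i), self._direct(j)
        # Same half: port 0 reads directly; port 1 reconstructs its entry
        # from the difference store and the other half.
        first = self._direct(i)
        offset = j % self.half
        other = self.hi if j < self.half else self.lo
        second = self.diff.read(offset) ^ other.read(offset)
        return first, second


data = [3, 1, 4, 1, 5, 9, 2, 6]
mem = TwoPortMemory(data)
pair = mem.read_pair(0, 2)  # both indices fall in the first half
```

In this sketch, no single-port store is read more than once per cycle, even when both requests target the same half.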
-
6.
Publication No.: US11875874B2
Publication date: 2024-01-16
Application No.: US17397158
Filing date: 2021-08-09
Applicant: Groq, Inc.
CPC classification: G11C7/1075, G06F9/30029, G06F9/3887, G11C7/22
Abstract: A memory structure having 2^m read ports allowing for concurrent access to n data entries can be constructed using three memory structures each having 2^(m-1) read ports. The three memory structures include two structures providing access to half of the n data entries, and a difference structure providing access to difference data between the halves of the n data entries. Each pair of the 2^m ports is connected to a respective port of each of the 2^(m-1)-port data structures, such that each port of the pair can access data entries of a first half of the n data entries either by accessing the structure storing that half directly, or by accessing both the difference structure and the structure containing the second half to reconstruct the data entries of the first half, thus allowing for a pair of ports to concurrently access any of the stored data entries in parallel.
-
7.
Publication No.: US11868908B2
Publication date: 2024-01-09
Application No.: US18083388
Filing date: 2022-12-16
Applicant: Groq, Inc.
Abstract: A system receives a predictive model and receives one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG. The system packages first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first processor. The system packages the second instructions, the DAG, and the runtime constraints in a second binary.
-
8.
Publication No.: US11263129B1
Publication date: 2022-03-01
Application No.: US16526966
Filing date: 2019-07-30
Applicant: Groq, Inc.
Abstract: A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor are configured to stream operand data across a first dimension, and receive instructions across a second dimension orthogonal to the first dimension. The timing of data and instruction flows are configured such that corresponding data and instructions are received at each tile with a predetermined temporal relationship, allowing operand data to be transmitted between the slices of the processor without any accompanying metadata. Instead, each slice is able to determine what operations to perform on received data based upon the timing at which the data is received.
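The idea that a tile decides what to do with an operand purely from its arrival time can be illustrated with a toy Python model. Everything here is hypothetical (the `run_tile` function, the cycle-indexed schedule, the two-operation set); it is only a sketch of timing-based dispatch, not the hardware described in the patent.

```python
import operator

# Hypothetical operation set for the toy tile.
OPS = {"add": operator.add, "mul": operator.mul}


def run_tile(schedule, stream, acc=0):
    """Apply scheduled operations to an untagged operand stream.

    schedule maps a cycle number to an operation name.
    stream[c] is the bare operand arriving at cycle c (None if nothing arrives).
    Operands carry no metadata: the arrival cycle alone selects the operation.
    """
    for cycle, operand in enumerate(stream):
        if operand is None:
            continue
        op_name = schedule.get(cycle)
        if op_name is None:
            continue  # nothing scheduled for this cycle; the operand is ignored
        acc = OPS[op_name](acc, operand)
    return acc


schedule = {0: "add", 1: "mul", 3: "add"}
on_time = run_tile(schedule, [2, 5, None, 1])   # (0 + 2) * 5 + 1
shifted = run_tile(schedule, [None, 2, 5, 1])   # same operands, one cycle late
```

The same operand values give a different result when they arrive one cycle late, which is why such a design depends on a predetermined temporal relationship between the data and instruction flows.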
-
9.
Publication No.: US20240176737A1
Publication date: 2024-05-30
Application No.: US18394442
Filing date: 2023-12-22
Applicant: Groq, Inc.
CPC classification: G06F12/0292, G06F3/061, G06F3/064, G06F3/0673, G06F9/3004, G06F9/3009, G06F9/30145, G06F9/3814, G06F13/1689, G06F2212/16
Abstract: A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor are configured to stream operand data across a first dimension, and receive instructions across a second dimension orthogonal to the first dimension. The timing of data and instruction flows are configured such that corresponding data and instructions are received at each tile with a predetermined temporal relationship, allowing operand data to be transmitted between the slices of the processor without any accompanying metadata. Instead, each slice is able to determine what operations to perform on received data based upon the timing at which the data is received.
-
10.
Publication No.: US11868250B1
Publication date: 2024-01-09
Application No.: US17582895
Filing date: 2022-01-24
Applicant: Groq, Inc.
CPC classification: G06F12/0292, G06F3/061, G06F3/064, G06F3/0673, G06F9/3004, G06F9/3009, G06F9/30145, G06F9/3814, G06F13/1689, G06F2212/16
Abstract: A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor are configured to stream operand data across a first dimension, and receive instructions across a second dimension orthogonal to the first dimension. The timing of data and instruction flows are configured such that corresponding data and instructions are received at each tile with a predetermined temporal relationship, allowing operand data to be transmitted between the slices of the processor without any accompanying metadata. Instead, each slice is able to determine what operations to perform on received data based upon the timing at which the data is received.