Resizable scratchpad memory
    1.
    Invention Grant

    Publication Number: US12045475B1

    Publication Date: 2024-07-23

    Application Number: US17457502

    Application Date: 2021-12-03

    Abstract: Techniques for implementing a dynamically resizable memory region for alternative use in a memory are described. The techniques may include using two concurrent address maps corresponding to two address ranges for a memory represented as an array of memory blocks. The first address range can be mapped to the memory with starting addresses of the memory blocks incrementing sequentially along each row. The second address range can be mapped to the memory with starting addresses of the memory blocks incrementing sequentially along each column. When an access request is received having a target address belonging to the first address range, the target address is provided as the memory address to access the memory. When an access request is received having a target address belonging to the second address range, the target address is translated by address translation logic into a memory address to access the memory.
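
    The sketch below is a software model of the dual mapping described in the abstract, assuming an illustrative block size, block-array geometry, and range base addresses (none of these values come from the patent). Requests in the first range pass through unchanged; requests in the second range are translated from a column-major block walk onto the row-major physical layout.

        # Minimal sketch of the two concurrent address maps; geometry and bases are assumed.
        BLOCK_SIZE = 4096          # bytes per memory block (illustrative)
        ROWS, COLS = 4, 8          # memory viewed as a 4 x 8 array of blocks (illustrative)
        RANGE0_BASE = 0x0000_0000  # first address range: row-major layout (illustrative)
        RANGE1_BASE = 0x1000_0000  # second address range: column-major view (illustrative)
        MEM_BYTES = ROWS * COLS * BLOCK_SIZE

        def to_memory_address(target_addr: int) -> int:
            """Map a request's target address onto a physical scratchpad address."""
            if RANGE0_BASE <= target_addr < RANGE0_BASE + MEM_BYTES:
                # First range: the target address already matches the row-major layout.
                return target_addr - RANGE0_BASE
            if RANGE1_BASE <= target_addr < RANGE1_BASE + MEM_BYTES:
                # Second range: block indices increment down each column, so the
                # translation swaps the row/column walk order.
                offset = target_addr - RANGE1_BASE
                block, within = divmod(offset, BLOCK_SIZE)
                col, row = divmod(block, ROWS)           # column-major block index
                return (row * COLS + col) * BLOCK_SIZE + within
            raise ValueError("target address outside both ranges")

        # Example: the second block of the column-major range lands one row down, column 0.
        assert to_memory_address(RANGE1_BASE + BLOCK_SIZE) == 1 * COLS * BLOCK_SIZE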

    Memory access operation in distributed computing system
    2.

    Publication Number: US11467992B1

    Publication Date: 2022-10-11

    Application Number: US17031668

    Application Date: 2020-09-24

    Abstract: In one example, an apparatus comprises: a local on-chip memory; a computation engine configured to generate local data and to store the local data at the local on-chip memory; and a controller. The apparatus is configured to be coupled with a second device via an interconnect, the second device comprising a local memory. The controller is configured to: fetch the local data from the local on-chip memory; fetch remote data generated by another device from a local off-chip memory; generate output data based on combining the local data and the remote data; and store, via the interconnect, the output data at the local memory of the second device.
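
    As a rough illustration of the dataflow in this abstract, the Python model below stands in for the controller: it fetches local data from on-chip memory, fetches remote data from off-chip memory, combines the two (an elementwise add here, which is an assumption rather than anything the patent specifies), and writes the result into the second device's local memory, modeling the transfer over the interconnect.

        # Software model of the controller's fetch / combine / remote-store sequence.
        class Device:
            """Stands in for the second device reachable over the interconnect."""
            def __init__(self, name, size):
                self.name = name
                self.local_memory = [0.0] * size

        class Controller:
            def __init__(self, on_chip, off_chip, peer):
                self.on_chip = on_chip      # local data produced by the computation engine
                self.off_chip = off_chip    # remote data previously delivered by another device
                self.peer = peer            # second device on the other end of the interconnect

            def run(self):
                local = list(self.on_chip)                          # fetch from local on-chip memory
                remote = list(self.off_chip)                        # fetch from local off-chip memory
                output = [a + b for a, b in zip(local, remote)]     # combine local and remote data
                self.peer.local_memory[:len(output)] = output       # store at the peer's local memory
                return output

        peer = Device("device2", size=4)
        Controller(on_chip=[1.0, 2.0, 3.0, 4.0],
                   off_chip=[0.5, 0.5, 0.5, 0.5],
                   peer=peer).run()
        print(peer.local_memory)   # [1.5, 2.5, 3.5, 4.5]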

    Low latency memory notification
    3.
    Invention Grant

    Publication Number: US12056072B1

    Publication Date: 2024-08-06

    Application Number: US17457603

    Application Date: 2021-12-03

    Abstract: Techniques to reduce the latency of data transfer notifications in a computing system are disclosed. The techniques can include receiving, at a memory, a first access request of a set of access requests associated with a data transfer. The first access request has a token and an access count indicating the number of access requests in the set. A counter is initiated to count the number of received access requests having the token, and is incremented for each additional access request of the set that is received. In response to receiving the last access request of the set having the token, a notification is transmitted to an integrated circuit component to notify it that the memory is ready for access.
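
    A minimal sketch of the counting scheme, assuming one counter per token held on the memory side and a notification callback (the class and callback names are made up for illustration): the first request of a set records the expected access count, every matching request increments the counter, and the notification fires as soon as the count is reached.

        from collections import defaultdict

        class NotificationTracker:
            def __init__(self, notify):
                self.notify = notify               # callback toward the integrated circuit component
                self.received = defaultdict(int)   # requests received so far, per token
                self.expected = {}                 # access count carried by the requests

            def on_access_request(self, token, access_count):
                # The access count in the request tells us how large the set is.
                self.expected.setdefault(token, access_count)
                self.received[token] += 1
                if self.received[token] == self.expected[token]:
                    # Last access request of the set: the memory is ready for access.
                    self.notify(token)
                    del self.received[token], self.expected[token]

        tracker = NotificationTracker(notify=lambda t: print(f"token {t}: memory ready"))
        for _ in range(3):
            tracker.on_access_request(token=7, access_count=3)   # notification fires on the third request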

    Error avoidance in memory device
    4.
    Invention Grant

    Publication Number: US11704211B1

    Publication Date: 2023-07-18

    Application Number: US17643292

    Application Date: 2021-12-08

    CPC classification number: G06F11/2094 G06F2201/82

    Abstract: Techniques for avoiding uncorrectable errors in a memory device can include detecting a correctable error pattern of a memory page of the memory device, and determining that the correctable error pattern of the memory page satisfies a page migration condition. Once the page migration condition is satisfied, write accesses to the memory page are prevented from reaching a memory controller of the memory device. The contents of the memory page are then migrated to a reserved page, and a mapping table is updated to replace accesses to the memory page with accesses to the reserved page.
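
    The sketch below reduces the page-migration flow to a small software model, with the correctable-error pattern simplified to a per-page error count against an assumed threshold; the dictionary-backed memory, the threshold value, and all names are illustrative, not taken from the patent.

        from collections import defaultdict

        PAGE_MIGRATION_THRESHOLD = 3   # correctable errors tolerated before migrating (assumed)

        class MemoryDevice:
            def __init__(self, reserved_pages):
                self.pages = {}                        # page number -> contents
                self.remap = {}                        # mapping table: original page -> reserved page
                self.blocked = set()                   # pages whose write accesses are held back
                self.reserved = list(reserved_pages)   # pool of spare pages
                self.errors = defaultdict(int)

            def report_correctable_error(self, page):
                self.errors[page] += 1
                if self.errors[page] >= PAGE_MIGRATION_THRESHOLD:   # page migration condition
                    self.migrate(page)

            def migrate(self, page):
                self.blocked.add(page)                      # keep writes from reaching the controller
                spare = self.reserved.pop()
                self.pages[spare] = self.pages.get(page)    # migrate contents to the reserved page
                self.remap[page] = spare                    # mapping table now redirects accesses
                self.blocked.discard(page)

            def resolve(self, page):
                # Accesses to a migrated page are replaced with accesses to the reserved page.
                return self.remap.get(page, page)

        dev = MemoryDevice(reserved_pages=[100, 101])
        dev.pages[5] = b"payload"
        for _ in range(3):
            dev.report_correctable_error(page=5)
        print(dev.resolve(5))   # 101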

    MULTI-MODEL TRAINING PIPELINE IN DISTRIBUTED SYSTEMS
    5.

    Publication Number: US20210303988A1

    Publication Date: 2021-09-30

    Application Number: US16835161

    Application Date: 2020-03-30

    Abstract: A first worker node of a distributed system computes a first set of gradients using a first neural network model and a first set of weights associated with the first neural network model. The first set of gradients are transmitted from the first worker node to a second worker node of the distributed system. The second worker node computes a first set of synchronized gradients based on the first set of gradients. While the first set of synchronized gradients are being computed, the first worker node computes a second set of gradients using a second neural network model and a second set of weights associated with the second neural network model. The second set of gradients are transmitted from the first worker node to the second worker node. The second worker node computes a second set of synchronized gradients based on the second set of gradients.
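
    To make the overlap concrete, the sketch below models the two worker nodes with the main thread and a single-worker thread pool; the gradient and synchronization functions are placeholders (assumptions, not the patent's math), but the scheduling shows model 1's gradients being synchronized on the second worker while the first worker is already computing model 2's gradients.

        from concurrent.futures import ThreadPoolExecutor

        def compute_gradients(model, weights):
            # Placeholder for a backward pass on the first worker node.
            return [w * 0.1 for w in weights]

        def synchronize_gradients(model, grads):
            # Placeholder for the gradient reduction done on the second worker node.
            return [g / 2 for g in grads]

        weights = {"model1": [1.0, 2.0], "model2": [3.0, 4.0]}

        with ThreadPoolExecutor(max_workers=1) as second_worker:
            grads1 = compute_gradients("model1", weights["model1"])                  # first worker
            sync1 = second_worker.submit(synchronize_gradients, "model1", grads1)    # second worker, async
            grads2 = compute_gradients("model2", weights["model2"])                  # overlaps with sync1
            sync2 = second_worker.submit(synchronize_gradients, "model2", grads2)
            print(sync1.result(), sync2.result())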

    Multi-model training pipeline in distributed systems
    6.

    Publication Number: US11676021B1

    Publication Date: 2023-06-13

    Application Number: US17947355

    Application Date: 2022-09-19

    CPC classification number: G06N3/08 G06N3/045

    Abstract: A first worker node of a distributed system computes a first set of gradients using a first neural network model and a first set of weights associated with the first neural network model. The first set of gradients are transmitted from the first worker node to a second worker node of the distributed system. The second worker node computes a first set of synchronized gradients based on the first set of gradients. While the first set of synchronized gradients are being computed, the first worker node computes a second set of gradients using a second neural network model and a second set of weights associated with the second neural network model. The second set of gradients are transmitted from the first worker node to the second worker node. The second worker node computes a second set of synchronized gradients based on the second set of gradients.

    Multi-model training pipeline in distributed systems
    7.

    Publication Number: US11468325B2

    Publication Date: 2022-10-11

    Application Number: US16835161

    Application Date: 2020-03-30

    Abstract: A first worker node of a distributed system computes a first set of gradients using a first neural network model and a first set of weights associated with the first neural network model. The first set of gradients are transmitted from the first worker node to a second worker node of the distributed system. The second worker node computes a first set of synchronized gradients based on the first set of gradients. While the first set of synchronized gradients are being computed, the first worker node computes a second set of gradients using a second neural network model and a second set of weights associated with the second neural network model. The second set of gradients are transmitted from the first worker node to the second worker node. The second worker node computes a second set of synchronized gradients based on the second set of gradients.

    SPARSE MACHINE LEARNING ACCELERATION
    8.

    Publication Number: US20220318604A1

    Publication Date: 2022-10-06

    Application Number: US17301271

    Application Date: 2021-03-30

    Abstract: To reduce the storage size of weight tensors and speed up loading of weight tensors from system memory, a compression technique can be employed to remove zero values from a weight tensor before storing the weight tensor in system memory. A sparsity threshold can be enforced to achieve a compression ratio target by forcing small weight values to zero during training. When the weight tensor is loaded from system memory, a direct memory access (DMA) engine with an in-line decompression unit can decompress the weight tensor on-the-fly. By performing the decompression in the DMA engine, expansion of the weight values back to the original weight tensor size can be carried out in parallel while other neural network computations are being performed by the processing unit.
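
    A toy version of the compression path, assuming a 1-bit-per-element mask as the compressed layout and an arbitrary sparsity threshold (neither is specified by the abstract): small weights are forced to zero during training, zeros are dropped before the tensor is stored, and the decompress step models the DMA engine's in-line decompression by reinserting zeros from the mask.

        SPARSITY_THRESHOLD = 0.05   # weights below this magnitude are forced to zero (assumed)

        def compress(weights):
            pruned = [w if abs(w) >= SPARSITY_THRESHOLD else 0.0 for w in weights]
            mask = [1 if w != 0.0 else 0 for w in pruned]     # one bit per element
            values = [w for w in pruned if w != 0.0]          # only non-zero values reach system memory
            return mask, values

        def decompress(mask, values):
            # Models the in-line decompression unit: zeros are reinserted on the fly.
            it = iter(values)
            return [next(it) if bit else 0.0 for bit in mask]

        mask, values = compress([0.2, 0.01, -0.3, 0.0, 0.04, 0.5])
        print(decompress(mask, values))   # [0.2, 0.0, -0.3, 0.0, 0.0, 0.5]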

    Sparse machine learning acceleration
    9.

    Publication Number: US12254398B2

    Publication Date: 2025-03-18

    Application Number: US17301271

    Application Date: 2021-03-30

    Abstract: To reduce the storage size of weight tensors and speed up loading of weight tensors from system memory, a compression technique can be employed to remove zero values from a weight tensor before storing the weight tensor in system memory. A sparsity threshold can be enforced to achieve a compression ratio target by forcing small weight values to zero during training. When the weight tensor is loaded from system memory, a direct memory access (DMA) engine with an in-line decompression unit can decompress the weight tensor on-the-fly. By performing the decompression in the DMA engine, expansion of the weight values back to the original weight tensor size can be carried out in parallel while other neural network computations are being performed by the processing unit.
