-
Publication No.: US20240272953A1
Publication Date: 2024-08-15
Application No.: US18642668
Filing Date: 2024-04-22
Applicant: Amazon Technologies, Inc.
Inventor: Ramyanshu Datta , Ishaaq Chandy , Arvind Sowmyan , Wei You , Kunal Mehrotra , Kohen Berith Chia , Andrea Olgiati , Lakshmi Naarayanan Ramakrishnan , Saurabh Gupta
IPC: G06F9/50
CPC classification number: G06F9/5038 , G06F9/5022 , G06F9/5055
Abstract: A post-task-completion retention period for which a computing resource is to be retained, without de-activating the resource, on behalf of a set of requesters of machine learning tasks is determined at a machine learning service. A first task, identified at the service prior to expiration of the retention period at a first computing resource at which a second task has completed, is initiated at the first computing resource. In response to obtaining an indication of a third task and determining that a threshold criterion associated with the retention period is satisfied, the third task is initiated at an additional computing resource. The additional computing resource is de-activated after the third task completes, without waiting for the retention period to expire.
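A minimal Python sketch of the retention logic this abstract describes. The class name, method names, and eviction policy here are illustrative, not the service's actual implementation: a resource that finishes a task stays "warm" for a retention period so a later task can reuse it instead of paying activation cost, and a request that misses the warm pool gets a freshly activated resource.

```python
class WarmPool:
    """Sketch (hypothetical API) of post-task-completion retention: a
    resource that finishes a task is kept active for `retention_s` seconds
    so a new task can reuse it instead of paying activation cost."""

    def __init__(self, retention_s):
        self.retention_s = retention_s
        self.warm = {}  # resource_id -> time the previous task completed

    def task_completed(self, resource_id, now):
        # Retain the resource instead of de-activating it immediately.
        self.warm[resource_id] = now

    def acquire(self, now):
        # Reuse a warm resource whose retention period has not expired.
        for rid, done_at in list(self.warm.items()):
            if now - done_at <= self.retention_s:
                del self.warm[rid]
                return rid, True          # reused a retained resource
        return f"new-{int(now)}", False   # activate an additional resource

pool = WarmPool(retention_s=300)
pool.task_completed("res-1", now=1000)
rid, reused = pool.acquire(now=1100)   # within the retention window
print(rid, reused)                     # res-1 True
```

A resource acquired for a short, out-of-band task (the abstract's "third task") would bypass this pool and be de-activated on completion.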
-
Publication No.: US20240020514A1
Publication Date: 2024-01-18
Application No.: US18143970
Filing Date: 2023-05-05
Applicant: Amazon Technologies, Inc.
Inventor: Randy Renfu Huang , Richard John Heaton , Andrea Olgiati , Ron Diamant
IPC: G06N3/045 , G06N3/04 , G06N3/08 , G06F18/214
CPC classification number: G06N3/045 , G06N3/04 , G06N3/08 , G06F18/214
Abstract: Systems and methods for performing improper input data detection are described. In one example, a system comprises: hardware circuits configured to receive input data and to perform computations of a neural network based on the input data to generate computation outputs; and an improper input detection circuit configured to: determine a relationship between the computation outputs of the hardware circuits and reference outputs; determine that the input data are improper based on the relationship; and perform an action based on determining that the input data are improper.
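One way to realize the "relationship between computation outputs and reference outputs" step is a statistical distance check. The sketch below uses a simple z-score against reference outputs collected on known-good inputs; the thresholding scheme is an assumption for illustration, not the patent's exact detection circuit.

```python
import math

def detect_improper(outputs, reference_outputs, threshold=3.0):
    """Sketch (assumed thresholding scheme): flag the input as improper
    when the computation outputs fall too far outside the distribution of
    reference outputs observed on known-good data."""
    mean = sum(reference_outputs) / len(reference_outputs)
    var = sum((x - mean) ** 2 for x in reference_outputs) / len(reference_outputs)
    std = math.sqrt(var) or 1e-9
    # z-score of the new outputs' mean relative to the reference distribution
    z = abs(sum(outputs) / len(outputs) - mean) / std
    return z > threshold

reference = [0.9, 1.0, 1.1, 0.95, 1.05]         # outputs on proper inputs
print(detect_improper([1.0, 0.98], reference))  # False: consistent
print(detect_improper([5.0, 6.0], reference))   # True: improper input
```

The "perform an action" step would then hang off the boolean result (e.g. reject the input or raise an alert).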
-
Publication No.: US11599821B2
Publication Date: 2023-03-07
Application No.: US16020776
Filing Date: 2018-06-27
Applicant: Amazon Technologies, Inc.
Inventor: Sudipta Sengupta , Poorna Chand Srinivas Perumalla , Dominic Rajeev Divakaruni , Nafea Bshara , Leo Parker Dirac , Bratin Saha , Matthew James Wood , Andrea Olgiati , Swaminathan Sivasubramanian
Abstract: Implementations detailed herein include description of a computer-implemented method. In an implementation, the method at least includes receiving an application instance configuration for an application of the application instance that is to utilize a portion of an attached accelerator during execution of a machine learning model, the application instance configuration including: an indication of the central processing unit (CPU) capability to be used, an arithmetic precision of the machine learning model to be used, an indication of the accelerator capability to be used, a storage location of the application, and an indication of an amount of random access memory to use.
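The configuration fields the abstract enumerates map naturally onto a record type. This sketch names the fields for illustration only; the service's actual schema is not specified here.

```python
from dataclasses import dataclass

@dataclass
class AppInstanceConfig:
    """Sketch of the configuration the abstract enumerates
    (field names are illustrative, not the service's actual schema)."""
    cpu_capability: str          # indication of CPU capability to be used
    arithmetic_precision: str    # e.g. "fp16" / "fp32" for the ML model
    accelerator_capability: str  # indication of accelerator capability
    application_location: str    # storage location of the application
    ram_mb: int                  # amount of random access memory to use

cfg = AppInstanceConfig("4-vcpu", "fp16", "1-gpu-slice",
                        "s3://bucket/app", 8192)
print(cfg.arithmetic_precision)  # fp16
```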
-
Publication No.: US11494621B2
Publication Date: 2022-11-08
Application No.: US16020788
Filing Date: 2018-06-27
Applicant: Amazon Technologies, Inc.
Inventor: Sudipta Sengupta , Poorna Chand Srinivas Perumalla , Dominic Rajeev Divakaruni , Nafea Bshara , Leo Parker Dirac , Bratin Saha , Matthew James Wood , Andrea Olgiati , Swaminathan Sivasubramanian
Abstract: Implementations detailed herein include description of a computer-implemented method. In an implementation, the method at least includes receiving an application instance configuration for an application of the application instance that is to utilize a portion of an attached accelerator during execution of a machine learning model, the application instance configuration including an arithmetic precision of the machine learning model to be used in determining the portion of the accelerator to provision; provisioning the application instance and the portion of the accelerator attached to the application instance, wherein the application instance is implemented using a physical compute instance in a first location and the portion of the accelerator is implemented using a physical accelerator in a second location; loading the machine learning model onto the portion of the accelerator; and performing inference for the application using the machine learning model loaded onto the portion of the attached accelerator.
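The abstract's key idea is that arithmetic precision feeds into sizing the accelerator portion to provision. A toy sizing rule (illustrative only, not the service's actual policy) might derive the memory footprint from parameter count times bytes-per-value at the chosen precision:

```python
PRECISION_BYTES = {"fp32": 4, "fp16": 2, "int8": 1}

def accelerator_portion_mb(model_params, precision):
    """Sketch: size the accelerator portion to provision from the model's
    parameter count and the arithmetic precision it will run at
    (an illustrative sizing rule, not the service's actual policy)."""
    return model_params * PRECISION_BYTES[precision] / 1e6

# A 10M-parameter model at fp16 needs half the memory it would at fp32.
print(accelerator_portion_mb(10_000_000, "fp16"))  # 20.0
print(accelerator_portion_mb(10_000_000, "fp32"))  # 40.0
```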
-
Publication No.: US11422863B2
Publication Date: 2022-08-23
Application No.: US16020810
Filing Date: 2018-06-27
Applicant: Amazon Technologies, Inc.
Inventor: Sudipta Sengupta , Poorna Chand Srinivas Perumalla , Dominic Rajeev Divakaruni , Nafea Bshara , Leo Parker Dirac , Bratin Saha , Matthew James Wood , Andrea Olgiati , Swaminathan Sivasubramanian
Abstract: Implementations detailed herein include description of a computer-implemented method. In an implementation, the method at least includes provisioning an application instance and portions of at least one accelerator attached to the application instance to execute a machine learning model of an application of the application instance; loading the machine learning model onto the portions of the at least one accelerator; receiving scoring data in the application; and utilizing each of the portions of the attached at least one accelerator to perform inference on the scoring data in parallel, using only one response from the portions of the accelerator.
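The "run on all portions in parallel, use only one response" pattern can be sketched with a thread pool that takes the first completed result and discards the rest. The `infer` stub stands in for running the loaded model on one accelerator portion; names and structure are illustrative.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def infer(portion_id, scoring_data):
    """Stand-in for running the loaded model on one accelerator portion."""
    return f"portion-{portion_id}:{sum(scoring_data)}"

def parallel_inference(portions, scoring_data):
    """Sketch: perform inference on every attached accelerator portion in
    parallel and use only one response (the first to complete)."""
    with ThreadPoolExecutor(max_workers=len(portions)) as pool:
        futures = [pool.submit(infer, p, scoring_data) for p in portions]
        done, not_done = wait(futures, return_when=FIRST_COMPLETED)
        for f in not_done:
            f.cancel()                    # best-effort: ignore the others
        return next(iter(done)).result()  # the single response used

result = parallel_inference([0, 1, 2], [1, 2, 3])
print(result.endswith(":6"))  # True: one portion's answer was used
```

Racing identical replicas this way trades extra accelerator work for lower tail latency.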
-
Publication No.: US11210605B1
Publication Date: 2021-12-28
Application No.: US15658005
Filing Date: 2017-07-24
Applicant: Amazon Technologies, Inc.
Inventor: Pracheer Gupta , Andrea Olgiati , Poorna Chand Srinivas Perumalla , Stefano Stefani , Maden Mohan Rao Jampani
Abstract: A processing device receives a dataset comprising a plurality of data points, wherein each data point of the plurality of data points comprises a representative vector for the data point and an associated classification for the data point. The processing device determines, for the dataset, a score representative of a degree of clustering of the plurality of data points. The processing device determines a suitability of the dataset for use in machine learning based on the score.
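A degree-of-clustering score over labeled vectors can be computed silhouette-style: for each point, compare its mean distance to same-class points against its mean distance to other-class points. This simplified stand-in (not necessarily the patent's exact score) yields values near 1 for well-separated classes and near 0 for interleaved ones.

```python
def clustering_score(points):
    """Sketch of a degree-of-clustering score (a simplified stand-in for
    measures like the silhouette coefficient). `points` is a list of
    (vector, classification) pairs."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    scores = []
    for vec, label in points:
        same = [dist(vec, v) for v, l in points if l == label and v is not vec]
        other = [dist(vec, v) for v, l in points if l != label]
        if not same or not other:
            continue
        a, b = sum(same) / len(same), sum(other) / len(other)
        scores.append((b - a) / max(a, b))  # near 1 = tightly clustered
    return sum(scores) / len(scores)

# Two well-separated classes -> high score -> dataset suitable for ML.
data = [((0.0, 0.0), "a"), ((0.1, 0.0), "a"),
        ((5.0, 5.0), "b"), ((5.1, 5.0), "b")]
print(clustering_score(data) > 0.9)  # True
```

Suitability for machine learning would then be decided by thresholding this score.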
-
Publication No.: US10652565B1
Publication Date: 2020-05-12
Application No.: US15782725
Filing Date: 2017-10-12
Applicant: Amazon Technologies, Inc.
Inventor: Jia Bi Zhang , Andrea Olgiati , Meng Wang
IPC: H04N19/463 , G06K9/62 , G06T9/00 , G06N20/00
Abstract: A processing device receives a representation of an image, wherein the image has a first size and the representation has a second size that is smaller than the first size, the representation having been generated from the image by a first portion of a first trained machine learning model. The processing device processes the representation of the image using a second portion of the trained machine learning model to generate a reconstruction of the image and then outputs the reconstruction of the image.
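The two-portion structure described here is an encoder/decoder split: the first portion produces a representation smaller than the image, and the second portion reconstructs an image of the original size. The sketch below substitutes trivial average-pooling and upsampling for the learned model portions, purely to show the size relationship.

```python
def encode(image, factor=2):
    """Sketch: the model's first portion maps an image to a smaller
    representation (naive average-pooling stands in for the learned
    encoder here)."""
    return [sum(image[i:i + factor]) / factor
            for i in range(0, len(image), factor)]

def decode(representation, factor=2):
    """Sketch: the model's second portion reconstructs an image of the
    original (first) size from the smaller representation
    (nearest-neighbour upsampling stands in for the learned decoder)."""
    out = []
    for v in representation:
        out.extend([v] * factor)
    return out

image = [1.0, 1.0, 4.0, 4.0, 9.0, 9.0]  # first size: 6 values
rep = encode(image)                     # second, smaller size: 3 values
recon = decode(rep)                     # reconstruction at the first size
print(len(rep), len(recon))             # 3 6
```

A trained model would learn both portions jointly so the reconstruction error stays small; this stand-in is lossless only when pooled values happen to be equal.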
-
Publication No.: US10579591B1
Publication Date: 2020-03-03
Application No.: US15385740
Filing Date: 2016-12-20
Applicant: Amazon Technologies, Inc.
Inventor: Ron Diamant , Andrea Olgiati , Nathan Binkert
Abstract: Techniques for performing incremental block compression using a processor are described herein. The processor receives a request to compress input data, the request including compression parameters for the compression and a target block size. The processor divides the input data into portions. The processor iteratively compresses the input data to an output block, until compressing another portion of data would increase a file size of the output block over a threshold value that is based at least on the target block size.
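The iterate-until-threshold loop can be sketched in software with `zlib`: divide the input into portions, keep adding portions to the current block while the compressed size stays under the target, and seal the block before the portion that would push it over. The portion size and sealing policy here are illustrative.

```python
import os
import zlib

def incremental_block_compress(data, target_block_size, portion_size=64):
    """Sketch: divide the input into portions and compress portions into
    the current output block until compressing another portion would push
    the block's size past the target, then seal it and start a new block."""
    blocks, current = [], b""
    for i in range(0, len(data), portion_size):
        portion = data[i:i + portion_size]
        candidate = current + portion
        if current and len(zlib.compress(candidate)) > target_block_size:
            blocks.append(zlib.compress(current))  # seal the output block
            candidate = portion                    # start the next block
        current = candidate
    if current:
        blocks.append(zlib.compress(current))
    return blocks

data = os.urandom(2048)  # 2 KiB of incompressible input
blocks = incremental_block_compress(data, target_block_size=300)
recovered = b"".join(zlib.decompress(b) for b in blocks)
print(recovered == data)  # True: blocks concatenate back to the input
```

The hardware version would avoid recompressing from scratch on each iteration; the loop above just makes the stopping condition concrete.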
-
Publication No.: US10460175B1
Publication Date: 2019-10-29
Application No.: US15624258
Filing Date: 2017-06-15
Applicant: Amazon Technologies, Inc.
Inventor: Stephen Gould , Andrea Olgiati
Abstract: A method and system for processing multiple frames of a video by a neural network are provided. Two frames of a video may be analyzed to determine if at least a portion of the layer-by-layer processing by a neural network can be skipped or terminated. Processing of a first frame of the video is performed by the neural network. A next frame of the video is processed by the neural network, such that processing of fewer layers (or sets of operations) of the neural network is performed if the first frame and the second frame are substantially similar.
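The skip decision reduces to an early-exit check before full layer-by-layer processing. In this sketch the similarity test is a mean absolute difference against a tolerance, and a trivial computation stands in for the network; both are assumptions for illustration.

```python
def process_frame(frame, prev_frame=None, prev_result=None, tol=1e-3):
    """Sketch: if the new frame is substantially similar to the previous
    one, skip the remaining layers and reuse the earlier result."""
    if prev_frame is not None:
        diff = sum(abs(a - b) for a, b in zip(frame, prev_frame)) / len(frame)
        if diff < tol:
            return prev_result, True  # skipped: frames nearly identical
    # Stand-in for full layer-by-layer processing by the neural network.
    result = sum(x * 0.5 for x in frame)
    return result, False

frame1 = [0.2, 0.4, 0.6]
r1, skipped1 = process_frame(frame1)
r2, skipped2 = process_frame([0.2, 0.4, 0.6], frame1, r1)
print(skipped1, skipped2)  # False True: second frame reused the result
```

A layer-wise variant would compare intermediate activations instead of raw frames, allowing a partial rather than all-or-nothing skip.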
-
Publication No.: US10366026B1
Publication Date: 2019-07-30
Application No.: US15390250
Filing Date: 2016-12-23
Applicant: Amazon Technologies, Inc.
Inventor: Ron Diamant , Andrea Olgiati , Nathan Binkert
Abstract: A system comprises a data storage, a decompression accelerator configured to decompress compressed data and thereby generate decompressed data, and a direct memory access (DMA) engine coupled to the data storage and the decompression accelerator. The DMA engine comprises a buffer for storage of a plurality of descriptors containing configuration parameters for a block of compressed data to be retrieved from the data storage and decompressed by the decompression accelerator, wherein at least one of the descriptors comprises a threshold value. The DMA engine, in accordance with one or more of the descriptors, is configured to read compressed data from data storage and transmit the threshold value and the compressed data to the decompression accelerator. The decompression accelerator is configured to decompress the compressed data until the threshold value is reached and then to abort further data decompression and to assert a stop transaction signal to the DMA engine.
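The decompress-until-threshold-then-abort behavior has a close software analogue in `zlib`'s `max_length` cap. In this sketch a returned boolean stands in for the stop-transaction signal asserted to the DMA engine; the mapping is an illustration, not the hardware design.

```python
import zlib

def decompress_until_threshold(compressed, threshold):
    """Sketch: decompress only up to `threshold` output bytes, then abort.
    The `aborted` flag stands in for the stop-transaction signal the
    accelerator would assert to the DMA engine."""
    d = zlib.decompressobj()
    out = d.decompress(compressed, threshold)  # max_length caps the output
    aborted = bool(d.unconsumed_tail) or not d.eof
    return out, aborted

blob = zlib.compress(b"A" * 1000)
out, aborted = decompress_until_threshold(blob, threshold=100)
print(len(out), aborted)  # 100 True: decompression stopped at threshold
```

In the described system the threshold travels inside a DMA descriptor along with the block's other configuration parameters, so each block can carry its own limit.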