-
Publication Number: US20200042389A1
Publication Date: 2020-02-06
Application Number: US16054972
Application Date: 2018-08-03
Abstract: Methods, systems, and other aspects for reconstructing data and rebuilding a failed storage device in a storage system using one or more functioning compute resources and/or storage resources of the failed storage device. For example, a method may include, responsive to a detection of a failed storage device in a storage system, locating data and redundancy information in functioning storage device(s) in the storage system for reconstructing data of the failed storage device; issuing peer-to-peer commands to the functioning storage device(s) to obtain the data and the redundancy information from the functioning storage device(s); and reconstructing the data of the failed storage device based on the data and the redundancy information obtained from the functioning storage device(s), wherein a functioning compute resource of the failed storage device at least partially performs one or more of the locating, issuing, and reconstructing.
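To illustrate the kind of rebuild flow this abstract describes, here is a minimal Python sketch that reconstructs a failed device's data from functioning peers using simple XOR parity. The `PeerDevice` class and its `read_stripe_unit` method are hypothetical stand-ins for the peer-to-peer read commands, and the RAID-5-style parity scheme is an assumption for illustration, not the claimed implementation.

```python
from functools import reduce

class PeerDevice:
    """Hypothetical in-memory stand-in for a functioning peer storage device."""
    def __init__(self, stripe_units):
        self.stripe_units = stripe_units  # {stripe_id: bytes}

    def read_stripe_unit(self, stripe_id):
        # Models the peer-to-peer read command issued to a functioning device.
        return self.stripe_units[stripe_id]

def reconstruct_stripe_unit(peers, stripe_id):
    """Rebuild the failed device's unit for one stripe by XOR-ing the data
    and parity units read from the functioning peers (RAID-5 style)."""
    units = [peer.read_stripe_unit(stripe_id) for peer in peers]
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), units)

if __name__ == "__main__":
    # Three surviving devices: two data units and one parity unit per stripe.
    d0 = bytes([1, 2, 3, 4])
    d1 = bytes([5, 6, 7, 8])
    lost = bytes([9, 10, 11, 12])                      # unit on the failed device
    parity = bytes(a ^ b ^ c for a, b, c in zip(d0, d1, lost))
    peers = [PeerDevice({0: d0}), PeerDevice({0: d1}), PeerDevice({0: parity})]
    assert reconstruct_stripe_unit(peers, 0) == lost
    print("reconstructed:", list(reconstruct_stripe_unit(peers, 0)))
```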
-
Publication Number: US20180341606A1
Publication Date: 2018-11-29
Application Number: US15936319
Application Date: 2018-03-26
Inventors: Vladislav Bolkhovitin, Sanjay Subbarao, Brian W. O'Krafka, Anand Kulkarni, Warren Fritz Kruger
Abstract: Data management functions are offloaded from a main controller to individual storage devices in a multi-device storage environment. The main controller receives a data management request from a host system and responds by determining one or more storage devices and one or more data management operations to be performed by those storage devices. The main controller initiates performance of a data management function corresponding to the data management request by sending one or more data management commands to the one or more storage devices and initiating one or more data transfers, such as a direct memory access operation that transfers data between a memory buffer of a storage device and a host memory buffer of the host system, and an internal data transfer between two or more of the storage devices using an internal communication fabric of the data storage subsystem.
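The offloading flow described above can be pictured roughly as below. The `StorageDevice`, `MainController`, and `dma_transfer` names are illustrative assumptions, and the buffer copies merely stand in for real DMA and internal-fabric transfers.

```python
class StorageDevice:
    """Hypothetical stand-in for an individual storage device."""
    def __init__(self, name):
        self.name = name
        self.memory_buffer = bytearray(16)   # device-local buffer

    def execute(self, command, **kwargs):
        # Models a data management command sent by the main controller.
        print(f"{self.name}: executing {command} {kwargs}")

def dma_transfer(src_buf, dst_buf):
    # Stands in for a direct memory access copy between a device buffer and
    # the host memory buffer, or between two device buffers over the internal
    # fabric; a real system would not copy through the CPU like this.
    dst_buf[:len(src_buf)] = src_buf

class MainController:
    def handle_request(self, request, devices, host_buffer):
        # Determine the devices and operations implied by the host request,
        # then offload the work via per-device commands and transfers.
        src, dst = devices[request["src"]], devices[request["dst"]]
        src.execute("read", lba=request["lba"])
        dma_transfer(src.memory_buffer, host_buffer)        # device -> host
        dst.execute("write", lba=request["lba"])
        dma_transfer(src.memory_buffer, dst.memory_buffer)  # device -> device

if __name__ == "__main__":
    devs = {"dev0": StorageDevice("dev0"), "dev1": StorageDevice("dev1")}
    MainController().handle_request({"src": "dev0", "dst": "dev1", "lba": 42},
                                    devs, bytearray(16))
```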
-
Publication Number: US11755683B2
Publication Date: 2023-09-12
Application Number: US16726084
Application Date: 2019-12-23
Inventors: Kiran Gunnam, Anand Kulkarni, Zvonimir Bandic
IPC Classification: G06F17/16, G06F9/38, G06F9/445, G06F9/50, G06N20/00, G06F18/24, G06V10/764, G06V10/82, G06V10/94
CPC Classification: G06F17/16, G06F9/3877, G06F9/3891, G06F9/44578, G06F9/5094, G06F18/24, G06N20/00, G06V10/764, G06V10/82, G06V10/94, G06F9/5044, G06F9/5072
Abstract: An apparatus includes a first tensor compute cluster configured to receive first input feature tensors, a second tensor compute cluster configured to receive second input feature tensors that are more sparse than the first input feature tensors, and a vector accelerator. The apparatus also includes circuitry configured to partition an input feature map into a plurality of input feature tensors based on compression criteria and to assign each of the plurality of input feature tensors to one of the first tensor compute cluster, the second tensor compute cluster, or the vector accelerator based on at least one of a sparsity parameter and an optimization parameter.
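A minimal sketch of the partition-and-assign idea follows, assuming NumPy, fixed square tiles, and made-up sparsity thresholds for routing tiles to the dense cluster, the sparse cluster, or the vector accelerator; the compression criteria and optimization parameter from the claims are not modeled here.

```python
import numpy as np

def sparsity(tile):
    """Fraction of zero elements in a tile."""
    return float(np.mean(tile == 0))

def partition_and_assign(feature_map, tile=4, dense_max=0.3, sparse_max=0.8):
    """Split a 2-D input feature map into tile x tile tensors and route each
    one to a compute target based on its sparsity (hypothetical thresholds)."""
    assignments = []
    h, w = feature_map.shape
    for r in range(0, h, tile):
        for c in range(0, w, tile):
            t = feature_map[r:r + tile, c:c + tile]
            s = sparsity(t)
            if s <= dense_max:
                target = "dense_tensor_cluster"
            elif s <= sparse_max:
                target = "sparse_tensor_cluster"
            else:
                target = "vector_accelerator"
            assignments.append(((r, c), s, target))
    return assignments

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap = rng.integers(0, 4, size=(8, 8)) * (rng.random((8, 8)) > 0.5)
    for pos, s, target in partition_and_assign(fmap):
        print(pos, f"sparsity={s:.2f}", "->", target)
```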
-
Publication Number: US11544547B2
Publication Date: 2023-01-03
Application Number: US16908576
Application Date: 2020-06-22
Inventors: Anand Kulkarni, Won Ho Choi, Martin Lueker-Boden
Abstract: A non-volatile memory device includes an array of non-volatile memory cells that are configured to store weights of a neural network. Associated with the array is a data latch structure that includes a page buffer, which can store weights for a layer of the neural network read out of the array, and a transfer buffer, which can store inputs for the neural network. The memory device can perform multiply-and-accumulate operations between inputs and weights of the neural network within the latch structure, avoiding the need to transfer data out of the array and associated latch structure for portions of an inference operation. By using binary weights and inputs, multiplication can be performed by bit-wise XNOR operations. The results can then be summed and an activation applied, all within the latch structure.
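The binary multiply-accumulate trick referenced in the abstract can be shown in a few lines: with +1/-1 values packed as bits, a dot product becomes an XNOR followed by a popcount. The sketch below is a software illustration of that identity, not the in-latch circuitry.

```python
def xnor_popcount_mac(weight_bits, input_bits, n_bits):
    """Binary MAC: with +1/-1 values encoded as bits 1/0, the dot product
    equals 2 * popcount(XNOR(w, x)) - n_bits."""
    xnor = ~(weight_bits ^ input_bits) & ((1 << n_bits) - 1)
    matches = bin(xnor).count("1")
    return 2 * matches - n_bits

def pack_bits(values):
    """Encode a list of +1/-1 values into an integer bit vector (+1 -> bit 1)."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits

if __name__ == "__main__":
    w = [1, -1, 1, 1, -1, -1, 1, -1]
    x = [1, 1, -1, 1, -1, 1, 1, -1]
    expected = sum(a * b for a, b in zip(w, x))
    got = xnor_popcount_mac(pack_bits(w), pack_bits(x), len(w))
    assert got == expected
    print("binary dot product:", got)
```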
-
Publication Number: US20210397930A1
Publication Date: 2021-12-23
Application Number: US16908576
Application Date: 2020-06-22
Inventors: Anand Kulkarni, Won Ho Choi, Martin Lueker-Boden
Abstract: A non-volatile memory device includes an array of non-volatile memory cells that are configured to store weights of a neural network. Associated with the array is a data latch structure that includes a page buffer, which can store weights for a layer of the neural network read out of the array, and a transfer buffer, which can store inputs for the neural network. The memory device can perform multiply-and-accumulate operations between inputs and weights of the neural network within the latch structure, avoiding the need to transfer data out of the array and associated latch structure for portions of an inference operation. By using binary weights and inputs, multiplication can be performed by bit-wise XNOR operations. The results can then be summed and an activation applied, all within the latch structure.
-
Publication Number: US20240193088A1
Publication Date: 2024-06-13
Application Number: US18231730
Application Date: 2023-08-08
Inventors: Chao Sun, Qingbo Wang, Minghai Qin, Jaco Hofmann, Anand Kulkarni, Dejan Vucinic, Zvonimir Bandic
IPC Classification: G06F12/0862, G06N20/00
CPC Classification: G06F12/0862, G06N20/00
Abstract: A memory device includes a first memory and a second memory that caches data stored in the first memory. At least one controller of the memory device receives page fault information from a host. The page fault information results from a request by the host for data that is stored in the first memory but is not cached in the second memory at the time of the request. The memory device uses the received page fault information as one or more inputs to a prefetch model trained by Machine Learning (ML) to generate at least one inference. Based at least in part on the at least one inference, prefetch data is cached in the second memory. In one aspect, the page fault information is used to train the prefetch model. In another aspect, the page fault information includes at least one virtual address used by the host for the requested data.
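A toy sketch of the prefetch loop this abstract describes, assuming page-fault virtual addresses as the model input. The `NextDeltaPrefetchModel` below is a deliberately simple stand-in (a most-common-stride predictor) rather than the ML-trained model the publication refers to.

```python
from collections import deque

PAGE_SIZE = 4096

class NextDeltaPrefetchModel:
    """Toy stand-in for an ML-trained prefetch model: it tracks the most
    common page-number delta between consecutive faults and predicts that
    delta for the next prefetch."""
    def __init__(self):
        self.delta_counts = {}
        self.last_page = None

    def observe_fault(self, virtual_address):
        page = virtual_address // PAGE_SIZE
        if self.last_page is not None:
            d = page - self.last_page
            self.delta_counts[d] = self.delta_counts.get(d, 0) + 1
        self.last_page = page

    def infer_prefetch(self):
        if not self.delta_counts or self.last_page is None:
            return []
        best_delta = max(self.delta_counts, key=self.delta_counts.get)
        return [(self.last_page + best_delta) * PAGE_SIZE]

class MemoryDevice:
    def __init__(self):
        self.model = NextDeltaPrefetchModel()
        self.cache = deque(maxlen=8)          # stands in for the second memory

    def handle_page_fault(self, virtual_address):
        # Page fault info from the host both updates (trains) the model and
        # serves as input for an inference that drives prefetching.
        self.model.observe_fault(virtual_address)
        for addr in self.model.infer_prefetch():
            self.cache.append(addr)

if __name__ == "__main__":
    dev = MemoryDevice()
    for va in [0x1000, 0x3000, 0x5000, 0x7000]:   # stride of two pages
        dev.handle_page_fault(va)
    print("prefetched addresses:", [hex(a) for a in dev.cache])
```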
-
Publication Number: US10831603B2
Publication Date: 2020-11-10
Application Number: US16054972
Application Date: 2018-08-03
Abstract: Methods, systems, and other aspects for reconstructing data and rebuilding a failed storage device in a storage system using one or more functioning compute resources and/or storage resources of the failed storage device. For example, a method may include, responsive to a detection of a failed storage device in a storage system, locating data and redundancy information in functioning storage device(s) in the storage system for reconstructing data of the failed storage device; issuing peer-to-peer commands to the functioning storage device(s) to obtain the data and the redundancy information from the functioning storage device(s); and reconstructing the data of the failed storage device based on the data and the redundancy information obtained from the functioning storage device(s), wherein a functioning compute resource of the failed storage device at least partially performs one or more of the locating, issuing, and reconstructing.
-
Publication Number: US10725941B2
Publication Date: 2020-07-28
Application Number: US16024738
Application Date: 2018-06-30
Abstract: Example multi-device storage systems, storage devices, and methods provide hosted services on peer storage devices. Storage devices include local memory resources, such as operating memory, remotely addressable memory, or logical mapping memory, and compute resources, such as a processor or coding engine. Each storage device is configured to communicate with a plurality of peer storage devices over an interconnect fabric. The storage devices identify requested hosted services from service host requests received through the interconnect fabric. The storage devices store a plurality of hosted services to enable access to local memory resources and local compute resources for data management operations for the plurality of peer storage devices.
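One way to picture hosted services on a peer storage device is a small registry that maps service names from incoming service host requests to handlers running against local memory and compute resources. The registry API and the `checksum` service below are purely illustrative assumptions.

```python
class StorageDeviceNode:
    """Hypothetical storage device hosting services for its peers."""
    def __init__(self, name):
        self.name = name
        self.local_memory = {}       # stands in for remotely addressable memory
        self.hosted_services = {}

    def register_service(self, service_name, handler):
        self.hosted_services[service_name] = handler

    def handle_service_host_request(self, request):
        # A peer would send this request over the interconnect fabric; the
        # device identifies the requested hosted service and runs it locally.
        service = self.hosted_services[request["service"]]
        return service(self, request.get("args", {}))

def checksum_service(device, args):
    # Example hosted service: compute a checksum over a locally stored object
    # on behalf of a peer, using this device's own compute resources.
    data = device.local_memory[args["key"]]
    return sum(data) % 256

if __name__ == "__main__":
    dev = StorageDeviceNode("dev0")
    dev.local_memory["obj1"] = bytes(range(10))
    dev.register_service("checksum", checksum_service)
    print("checksum:", dev.handle_service_host_request(
        {"service": "checksum", "args": {"key": "obj1"}}))
```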
-
Publication Number: US11462003B2
Publication Date: 2022-10-04
Application Number: US16830167
Application Date: 2020-03-25
Inventors: Kiran Gunnam, Anand Kulkarni, Zvonimir Bandic
Abstract: A system with a multiplication circuit having a plurality of multipliers is disclosed. Each of the plurality of multipliers is configured to receive a data value and a weight value to generate a product value in a convolution operation of a machine learning application. The system also includes an accumulator configured to receive the product value from each of the plurality of multipliers and a register bank configured to store an output of the convolution operation. The accumulator is further configured to receive a portion of the values stored in the register bank and combine the received portion of values with the product values to generate combined values. The register bank is further configured to replace the portion of values with the combined values.
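The accumulate-into-register-bank pattern in this abstract can be sketched as follows, with plain Python lists standing in for the multiplier array, accumulator, and register bank; the sizes and the two-step schedule are arbitrary assumptions.

```python
def mac_step(data_values, weight_values, register_bank, offsets):
    """One accumulation step: each multiplier produces a product, the
    accumulator adds it to the partial sum read from the register bank,
    and the combined value replaces the old one in the bank."""
    for d, w, idx in zip(data_values, weight_values, offsets):
        product = d * w                                      # one multiplier
        register_bank[idx] = register_bank[idx] + product    # accumulate
    return register_bank

if __name__ == "__main__":
    # Four output positions of a convolution, accumulated over two steps.
    register_bank = [0, 0, 0, 0]
    mac_step([1, 2, 3, 4], [2, 2, 2, 2], register_bank, [0, 1, 2, 3])
    mac_step([5, 6, 7, 8], [1, 1, 1, 1], register_bank, [0, 1, 2, 3])
    print("convolution partial sums:", register_bank)  # [7, 10, 13, 16]
```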
-
Publication Number: US20210303976A1
Publication Date: 2021-09-30
Application Number: US16830129
Application Date: 2020-03-25
Inventors: Kiran Gunnam, Anand Kulkarni, Zvonimir Bandic
Abstract: An apparatus includes a tensor compute cluster having a plurality of tensor compute units to process a plurality of sub-feature maps in a machine learning application and a tensor memory cluster having a plurality of tensor feature map memory units to store the plurality of sub-feature maps. The apparatus also includes circuitry to partition an input feature map into the plurality of sub-feature maps such that sparsity in each of the plurality of sub-feature maps satisfies a predetermined threshold, and to assign each of the plurality of sub-feature maps to one of the plurality of tensor compute units and one of the plurality of tensor feature map memory units for processing in parallel.
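A rough sketch of sparsity-driven partitioning follows, assuming NumPy and a recursive quartering rule that keeps splitting until each sub-feature map meets a sparsity threshold or a minimum tile size; both the splitting rule and the round-robin assignment to compute and memory units are assumptions for illustration, not the claimed circuitry.

```python
import numpy as np

def sparsity(m):
    """Fraction of zero elements in a sub-feature map."""
    return float(np.mean(m == 0))

def partition(feature_map, threshold=0.6, min_size=2):
    """Recursively quarter the feature map until every sub-feature map's
    sparsity is at least the threshold (or the tile reaches a minimum size)."""
    h, w = feature_map.shape
    if sparsity(feature_map) >= threshold or min(h, w) <= min_size:
        return [feature_map]
    hh, hw = h // 2, w // 2
    quads = [feature_map[:hh, :hw], feature_map[:hh, hw:],
             feature_map[hh:, :hw], feature_map[hh:, hw:]]
    return [sub for q in quads for sub in partition(q, threshold, min_size)]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    fmap = (rng.random((8, 8)) > 0.5).astype(int)
    subs = partition(fmap)
    n_units = 4   # hypothetical number of tensor compute / memory unit pairs
    for i, sub in enumerate(subs):
        print(f"sub-map {i} ({sub.shape}, sparsity={sparsity(sub):.2f}) "
              f"-> compute unit {i % n_units}, memory unit {i % n_units}")
```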
-