-
Publication No.: US10360496B2
Publication Date: 2019-07-23
Application No.: US15088543
Filing Date: 2016-04-01
Applicant: Gregory K. Chen, Jae-Sun Seo, Thomas C. Chen, Raghavan Kumar
Inventor: Gregory K. Chen, Jae-Sun Seo, Thomas C. Chen, Raghavan Kumar
Abstract: An apparatus and method are described for a neuromorphic processor design in which neuron timing information is duplicated on a neuromorphic core. For example, one embodiment of an apparatus comprises: a first neurosynaptic core comprising a plurality of neurons and a synapse array comprising a plurality of synapses to communicatively couple the plurality of neurons, each synapse connecting two neurons and having a weight associated therewith, wherein a first neuron is to generate an output spike based on the weights of the synapses over which inputs are received from the other neurons; a second neurosynaptic core also comprising a plurality of neurons and having at least one counter to maintain a count value indicative of spike timing for a second neuron, wherein a spike output of the second neuron in the second neurosynaptic core is communicatively coupled over a first synapse to the first neuron in the first neurosynaptic core; and a duplicate counter maintained within the first neurosynaptic core and synchronized with the counter from the second neurosynaptic core, the first neuron to use a first value from the duplicate counter to adjust the weight of the first synapse coupling the second neuron to the first neuron.
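The duplicated-counter idea lends itself to a simple behavioral model: the presynaptic core's spike-timing counter is mirrored on the postsynaptic core, so the weight update reads only local state. The sketch below is a minimal Python illustration under assumed names and a simplified exponential STDP-style rule; it is not the patented circuit.

```python
import math

class Core:
    """One neurosynaptic core: local spike-timing counters plus duplicated
    (mirrored) counters for neurons that live on other cores."""
    def __init__(self):
        self.counters = {}   # local neuron id -> ticks since its last spike
        self.mirrors = {}    # remote neuron id -> duplicated counter value

    def tick(self):
        for nid in self.counters:
            self.counters[nid] += 1

    def record_spike(self, nid):
        self.counters[nid] = 0   # spike just occurred: reset timing counter

    def sync_mirror(self, remote_core, nid):
        # Duplicate the remote counter locally, so the weight update below
        # needs no cross-core read at spike time.
        self.mirrors[nid] = remote_core.counters[nid]

def stdp_update(weight, pre_spike_age, lr=0.01, tau=20.0):
    """Potentiate the synapse more strongly when the presynaptic spike
    (tracked by the duplicated counter) happened recently."""
    return weight + lr * math.exp(-pre_spike_age / tau)

core_a, core_b = Core(), Core()
core_b.record_spike(7)               # neuron 7 on core B just spiked
core_b.tick()                        # one timestep elapses
core_a.sync_mirror(core_b, 7)        # core A mirrors B's counter
w = stdp_update(weight=0.5, pre_spike_age=core_a.mirrors[7])
```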
-
Publication No.: US10825509B2
Publication Date: 2020-11-03
Application No.: US16146932
Filing Date: 2018-09-28
Applicant: Huseyin Ekin Sumbul, Gregory K. Chen, Raghavan Kumar, Phil Knag, Abhishek Sharma, Sasikanth Manipatruni, Amrita Mathuriya, Ram Krishnamurthy, Ian A. Young
Inventor: Huseyin Ekin Sumbul, Gregory K. Chen, Raghavan Kumar, Phil Knag, Abhishek Sharma, Sasikanth Manipatruni, Amrita Mathuriya, Ram Krishnamurthy, Ian A. Young
IPC Classes: G11C11/00, G11C11/419, G11C27/02, G11C7/10, G11C7/12, G11C11/412, G11C11/418
Abstract: A full-rail digital-read CIM circuit enables a weighted read operation on a single row of a memory array. A weighted read operation captures the value of a weight stored in the single memory array row without having to rely on weighted row access. Rather, using full-rail access and a weighted sampling capacitance network, the CIM circuit enables the weighted read operation even under process variation, noise, and mismatch.
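Behaviorally, a weighted sampling capacitance network can be modeled as each bit of the stored row driving a capacitor sized in proportion to that bit's significance, so the summed charge directly encodes the weight. The binary capacitor ratios and ideal rail-to-rail bit values below are illustrative assumptions, not the circuit itself.

```python
def weighted_read(row_bits, c_unit=1.0):
    """Full-rail read of one row: each bit drives a sampling capacitor whose
    size is a power-of-two multiple of c_unit, so the total sampled charge
    encodes the stored weight value."""
    charge = 0.0
    for i, bit in enumerate(row_bits):        # row_bits[0] is the LSB
        cap = c_unit * (2 ** i)               # weighted sampling capacitor
        charge += bit * cap                   # full-rail bit value: 0 or 1
    return charge / c_unit                    # normalized weight value

assert weighted_read([1, 0, 1, 1]) == 13      # 0b1101 stored in the row
```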
-
Publication No.: US20190057050A1
Publication Date: 2019-02-21
Application No.: US16160952
Filing Date: 2018-10-15
Applicant: Amrita Mathuriya, Sasikanth Manipatruni, Victor W. Lee, Abhishek Sharma, Huseyin E. Sumbul, Gregory Chen, Raghavan Kumar, Phil Knag, Ram Krishnamurthy, Ian Young
Inventor: Amrita Mathuriya, Sasikanth Manipatruni, Victor W. Lee, Abhishek Sharma, Huseyin E. Sumbul, Gregory Chen, Raghavan Kumar, Phil Knag, Ram Krishnamurthy, Ian Young
Abstract: Techniques and mechanisms for performing in-memory computations with circuitry having a pipeline architecture. In an embodiment, various stages of a pipeline each include a respective input interface and a respective output interface, distinct from said input interface, to couple to different respective circuitry. These stages each further include a respective array of memory cells and circuitry to perform operations based on data stored by said array. A result of one such in-memory computation may be communicated from one pipeline stage to a respective next pipeline stage for use in further in-memory computations. Control circuitry, interconnect circuitry, configuration circuitry, or other logic of the pipeline precludes operation of the pipeline as a monolithic, general-purpose memory device. In other embodiments, stages of the pipeline each provide a different respective layer of a neural network.
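A software model of this pipeline is a chain of stages, each owning its own memory array and compute, with each stage's output feeding the next stage's input. The stage wiring, matrix dimensions, and ReLU activation below are illustrative assumptions for the neural-network-layer embodiment.

```python
import numpy as np

class PipelineStage:
    """One pipeline stage: a memory array (here, a weight matrix) plus the
    circuitry that computes on data arriving at its input interface."""
    def __init__(self, weights):
        self.weights = weights

    def compute(self, x):
        return np.maximum(x @ self.weights, 0)  # one in-memory layer op

def run_pipeline(stages, x):
    for stage in stages:       # the result of one stage's in-memory
        x = stage.compute(x)   # computation feeds the next stage
    return x

stages = [PipelineStage(np.random.randn(8, 8)) for _ in range(3)]
y = run_pipeline(stages, np.random.randn(1, 8))
```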
-
Publication No.: US20190057300A1
Publication Date: 2019-02-21
Application No.: US16160466
Filing Date: 2018-10-15
Applicant: Amrita Mathuriya, Sasikanth Manipatruni, Victor Lee, Huseyin Sumbul, Gregory Chen, Raghavan Kumar, Phil Knag, Ram Krishnamurthy, Ian Young, Abhishek Sharma
Inventor: Amrita Mathuriya, Sasikanth Manipatruni, Victor Lee, Huseyin Sumbul, Gregory Chen, Raghavan Kumar, Phil Knag, Ram Krishnamurthy, Ian Young, Abhishek Sharma
IPC Classes: G06N3/04, G06F3/06, G06F12/0875
Abstract: The present disclosure is directed to systems and methods of bit-serial, in-memory execution of at least an nth layer of a multi-layer neural network in a first on-chip processor memory circuitry portion, contemporaneous with prefetching and storing layer weights associated with the (n+1)st layer of the multi-layer neural network in a second on-chip processor memory circuitry portion. The storage of layer weights in on-chip processor memory circuitry beneficially decreases the time required to transfer the layer weights upon execution of the (n+1)st layer of the multi-layer neural network by the first on-chip processor memory circuitry portion. In addition, the on-chip processor memory circuitry may include a third on-chip processor memory circuitry portion used to store intermediate and/or final input/output values associated with one or more layers included in the multi-layer neural network.
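The execute-while-prefetching overlap amounts to classic double buffering: layer n runs out of one on-chip buffer while layer n+1's weights fill the other. The thread-based overlap, buffer names, and dense-layer stand-in below are illustrative assumptions, not the disclosed circuitry.

```python
import threading
import numpy as np

def execute_layer(x, weights):
    # Stand-in for bit-serial in-memory execution of one layer.
    return np.maximum(x @ weights, 0)

def run_network(x, dram_weights):
    # Two on-chip buffers: one executes while the other is being filled.
    buf = [dram_weights[0].copy(), None]
    for n in range(len(dram_weights)):
        prefetcher = None
        if n + 1 < len(dram_weights):
            def fetch(dst=n + 1):
                buf[dst % 2] = dram_weights[dst].copy()  # prefetch layer n+1
            prefetcher = threading.Thread(target=fetch)
            prefetcher.start()            # overlaps with the compute below
        x = execute_layer(x, buf[n % 2])  # execute layer n from its buffer
        if prefetcher:
            prefetcher.join()             # layer n+1 weights now resident
    return x

weights = [np.random.randn(8, 8) for _ in range(3)]
y = run_network(np.random.randn(1, 8), weights)
```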
-
Publication No.: US20190057036A1
Publication Date: 2019-02-21
Application No.: US16160270
Filing Date: 2018-10-15
Applicant: Amrita Mathuriya, Sasikanth Manipatruni, Victor Lee, Huseyin Sumbul, Gregory Chen, Raghavan Kumar, Phil Knag, Ram Krishnamurthy, Ian Young, Abhishek Sharma
Inventor: Amrita Mathuriya, Sasikanth Manipatruni, Victor Lee, Huseyin Sumbul, Gregory Chen, Raghavan Kumar, Phil Knag, Ram Krishnamurthy, Ian Young, Abhishek Sharma
IPC Classes: G06F12/0875, G06N3/063, G06N3/04, G06F3/06
Abstract: The present disclosure is directed to systems and methods of implementing a neural network using in-memory mathematical operations performed by pipelined SRAM architecture (PISA) circuitry disposed in on-chip processor memory circuitry. A high-level compiler may be provided to compile data representative of a multi-layer neural network model and one or more neural network data inputs from a first high-level programming language to an intermediate domain-specific language (DSL). A low-level compiler may be provided to compile the representative data from the intermediate DSL to multiple instruction sets in accordance with an instruction set architecture (ISA), such that each of the multiple instruction sets corresponds to a single respective layer of the multi-layer neural network model. Each of the multiple instruction sets may be assigned to a respective SRAM array of the PISA circuitry for in-memory execution. Thus, the systems and methods described herein beneficially leverage the on-chip processor memory circuitry to perform a relatively large number of in-memory vector/tensor calculations in furtherance of neural network processing without burdening the processor circuitry.
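A toy model of the two-stage flow helps fix the ideas: a model description is lowered to an intermediate form, then to one instruction set per layer, and each per-layer instruction set is bound to its own SRAM array. Every instruction mnemonic, the DSL shape, and the binding scheme below are invented for illustration; the actual DSL and ISA are not specified in the abstract.

```python
# A hypothetical two-layer model in "high-level" form.
model = [
    {"layer": 0, "op": "matmul", "shape": (784, 128)},
    {"layer": 1, "op": "matmul", "shape": (128, 10)},
]

def high_level_compile(model):
    # High-level language -> intermediate domain-specific form.
    return [("LAYER", m["layer"], m["op"], m["shape"]) for m in model]

def low_level_compile(dsl):
    # Intermediate DSL -> one ISA instruction set per network layer.
    programs = {}
    for _, layer, op, (rows, cols) in dsl:
        programs[layer] = [
            ("LOAD_WEIGHTS", rows, cols),
            ("EXEC", op),
            ("STORE_RESULT",),
        ]
    return programs

def assign_to_arrays(programs):
    # Each layer's instruction set executes in-memory on its own SRAM array.
    return {f"sram_array_{layer}": prog for layer, prog in programs.items()}

binding = assign_to_arrays(low_level_compile(high_level_compile(model)))
```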
-
Publication No.: US20190056885A1
Publication Date: 2019-02-21
Application No.: US16160482
Filing Date: 2018-10-15
Applicant: Amrita Mathuriya, Sasikanth Manipatruni, Victor Lee, Huseyin Sumbul, Gregory Chen, Raghavan Kumar, Phil Knag, Ram Krishnamurthy, Ian Young, Abhishek Sharma
Inventor: Amrita Mathuriya, Sasikanth Manipatruni, Victor Lee, Huseyin Sumbul, Gregory Chen, Raghavan Kumar, Phil Knag, Ram Krishnamurthy, Ian Young, Abhishek Sharma
IPC Classes: G06F3/06, G06F12/0802, G06F12/1081, G06N3/04
Abstract: The present disclosure is directed to systems and methods of implementing a neural network using in-memory, bit-serial mathematical operations performed by pipelined SRAM architecture (bit-serial PISA) circuitry disposed in on-chip processor memory circuitry. The on-chip processor memory circuitry may include processor last level cache (LLC) circuitry. The bit-serial PISA circuitry is coupled to PISA memory circuitry via a relatively high-bandwidth connection to beneficially facilitate the storage and retrieval of layer weights by the bit-serial PISA circuitry during execution. Direct memory access (DMA) circuitry transfers the neural network model and input data from system memory to the bit-serial PISA memory and also transfers output data from the PISA memory circuitry to system memory circuitry. Thus, the systems and methods described herein beneficially leverage the on-chip processor memory circuitry to perform a relatively large number of vector/tensor calculations without burdening the processor circuitry.
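The arithmetic style at the heart of bit-serial execution can be shown in a few lines: inputs are consumed one bit-plane per cycle, and each partial sum is shifted into place according to its bit significance. The 8-bit unsigned operand format below is an assumption for illustration.

```python
def bit_serial_dot(inputs, weights, bits=8):
    """Dot product computed one input bit-plane per cycle, as a bit-serial
    in-memory stage would."""
    acc = 0
    for b in range(bits):                       # cycle over bit positions
        plane = [(x >> b) & 1 for x in inputs]  # extract one bit-plane
        partial = sum(p * w for p, w in zip(plane, weights))
        acc += partial << b                     # weight by bit significance
    return acc

inputs, weights = [3, 5, 7], [2, 4, 6]
assert bit_serial_dot(inputs, weights) == sum(
    x * w for x, w in zip(inputs, weights)
)
```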
-
Publication No.: US20190057727A1
Publication Date: 2019-02-21
Application No.: US16160955
Filing Date: 2018-10-15
Applicant: Amrita Mathuriya, Sasikanth Manipatruni, Victor W. Lee, Abhishek Sharma, Huseyin E. Sumbul, Gregory Chen, Raghavan Kumar, Phil Knag, Ram Krishnamurthy, Ian Young
Inventor: Amrita Mathuriya, Sasikanth Manipatruni, Victor W. Lee, Abhishek Sharma, Huseyin E. Sumbul, Gregory Chen, Raghavan Kumar, Phil Knag, Ram Krishnamurthy, Ian Young
Abstract: Techniques and mechanisms for configuring a memory device to perform a sequence of in-memory computations. In an embodiment, a memory device includes a memory array and circuitry, coupled thereto, to perform data computations based on the data stored at the memory array. Based on instructions received at the memory device, control circuitry is configured to enable an automatic performance of a sequence of operations. In another embodiment, the memory device is coupled in an in-series arrangement with other memory devices to provide a pipeline circuit architecture. The memory devices each function as a respective stage of the pipeline circuit architecture, where the stages each perform respective in-memory computations. Some or all such stages each provide a different respective layer of a neural network.
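The configure-once, run-automatically behavior can be modeled as a device that stores a small operation sequence and replays the whole sequence on a single trigger, with devices chained in series as pipeline stages. The op names and dense-layer semantics below are invented for illustration.

```python
import numpy as np

class CIMDevice:
    """A memory device with a memory array plus control circuitry that,
    once configured, runs a whole operation sequence automatically."""
    def __init__(self, array):
        self.array = array          # the device's memory array (weights)
        self.program = []           # configured sequence of operations

    def configure(self, ops):
        self.program = list(ops)    # e.g. [("matmul",), ("relu",)]

    def trigger(self, x):
        # One trigger runs the full configured sequence: no per-operation
        # commands are needed from the host.
        for (op,) in self.program:
            if op == "matmul":
                x = x @ self.array
            elif op == "relu":
                x = np.maximum(x, 0)
        return x

chain = [CIMDevice(np.random.randn(8, 8)) for _ in range(3)]
for dev in chain:
    dev.configure([("matmul",), ("relu",)])

x = np.random.randn(1, 8)
for dev in chain:                   # in-series arrangement: each device is
    x = dev.trigger(x)              # one stage of the pipeline
```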
-
Publication No.: US20190057304A1
Publication Date: 2019-02-21
Application No.: US16160800
Filing Date: 2018-10-15
Applicant: Amrita Mathuriya, Sasikanth Manipatruni, Victor Lee, Huseyin Sumbul, Gregory Chen, Raghavan Kumar, Phil Knag, Ram Krishnamurthy, Ian Young, Abhishek Sharma
Inventor: Amrita Mathuriya, Sasikanth Manipatruni, Victor Lee, Huseyin Sumbul, Gregory Chen, Raghavan Kumar, Phil Knag, Ram Krishnamurthy, Ian Young, Abhishek Sharma
Abstract: The present disclosure is directed to systems and methods of implementing an analog neural network using pipelined SRAM architecture ("PISA") circuitry disposed in on-chip processor memory circuitry. The on-chip processor memory circuitry may include processor last level cache (LLC) circuitry. One or more physical parameters, such as a stored charge or voltage, may be used to permit the generation of an in-memory analog output using an SRAM array. The generation of an in-memory analog output using only word-line and bit-line capabilities beneficially increases the computational density of the PISA circuit without increasing power requirements. Thus, the systems and methods described herein beneficially leverage the existing capabilities of on-chip SRAM processor memory circuitry to perform a relatively large number of analog vector/tensor calculations associated with execution of a neural network, such as a recurrent neural network, without burdening the processor circuitry and without significant impact to the processor power requirements.
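An idealized behavioral model of analog in-memory readout: activating several word-lines at once makes each bit-line accumulate charge proportional to a sum of stored values, so a vector-matrix multiply happens in one analog step. The conductance-based cell model and noise-free readout below are simplifying assumptions; the patent's SRAM cells store charge or voltage rather than literal conductances.

```python
import numpy as np

def analog_column_read(cell_conductance, wordline_voltages):
    """Each bit-line current is the dot product of its column's stored
    conductances with the applied word-line voltages (Ohm's and Kirchhoff's
    laws), yielding a vector-matrix multiply in a single analog step."""
    return wordline_voltages @ cell_conductance   # shape: (num_columns,)

G = np.array([[0.2, 0.5],      # stored weights, modeled as conductances
              [0.7, 0.1]])
v = np.array([1.0, 0.5])       # word-line activations (the input vector)
currents = analog_column_read(G, v)               # analog bit-line outputs
```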
-