-
1.
Publication number: US20240233796A1
Publication date: 2024-07-11
Application number: US18505128
Application date: 2023-11-09
Applicant: SHANGHAITECH UNIVERSITY
Inventor: Yuhao SHU, Hongtu ZHANG, Yajun HA
IPC: G11C11/402, G11C11/408, G11C11/4091, G11C11/4096
CPC classification number: G11C11/4023, G11C11/4087, G11C11/4091, G11C11/4096
Abstract: An energy-efficient memory for cryogenic computing is provided. The energy-efficient memory includes a plurality of memory banks, where each of the memory banks includes a cryogenic semi-static, dual-port, boost-free gain cell (CSDB-GC) macro module, a universal address decoder, and a different address decoder. The CSDB-GC macro module includes a plurality of columns of local blocks, and each of the local blocks includes a plurality of CSDB-GC memory cells. A final measurement result of a 16 Kb CSDB-eDRAM shows that the 16 Kb CSDB-eDRAM achieves a data retention time (DRT) of 16.67 seconds, which is 2.6 times longer than the DRT of a state-of-the-art cryogenic eDRAM at a temperature of 4.2 K, and achieves lower refresh power (0.11 pW/Kb). In addition, the 16 Kb CSDB-eDRAM also achieves a shorter access time, namely, 710 ps (1.41 GHz). Compared with the state-of-the-art work, the 16 Kb CSDB-eDRAM has the lowest dynamic power consumption overhead, namely, 49.23 µW/Kb.
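Note: the abstract pairs an access time of 710 ps with a frequency of 1.41 GHz. The short sketch below only checks that the two figures are consistent, under the assumption that the quoted frequency is simply the reciprocal of the access time; the function name is hypothetical and not taken from the patent.

```python
def access_time_to_frequency_ghz(access_time_ps: float) -> float:
    """Convert an access time in picoseconds to a frequency in GHz.

    Assumes one access per cycle, so f = 1 / t.
    1 / (1 ps) = 1 THz = 1000 GHz, hence the factor of 1000.
    """
    if access_time_ps <= 0:
        raise ValueError("access time must be positive")
    return 1000.0 / access_time_ps


# 710 ps is the access time quoted in the abstract.
frequency_ghz = access_time_to_frequency_ghz(710.0)
print(f"{frequency_ghz:.2f} GHz")
```

Rounded to two decimals, the result agrees with the 1.41 GHz stated in the abstract.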
-
2.
Publication number: US20240221811A1
Publication date: 2024-07-04
Application number: US18229698
Application date: 2023-08-03
Applicant: SHANGHAITECH UNIVERSITY
Inventor: Yuhao SHU, Hongtu ZHANG, Yajun HA
IPC: G11C11/405, G06F17/15, G11C11/4091, G11C11/4096, H03K19/20
CPC classification number: G11C11/405, G06F17/153, G11C11/4091, G11C11/4096, H03K19/20
Abstract: An energy-efficient cryogenic-in-memory-computing (CIMC) accelerator includes cryogenic 3T (C3T) macros. Each of the C3T macros comprises a C3T array containing M rows×N columns of bitcells. An input signal is converted into a timing sequence signal of a corresponding pulse width by using a digital timing sequence converter array. A C3T bitcell of a corresponding row in the C3T macro is controlled to perform charging and discharging on a read bit line (RBL) of a corresponding column. A voltage on the RBL of the corresponding column is sampled by a sense amplifier configured in each C3T macro to obtain a final result. With adaptive reference voltage configuration and storage on the chip, this design can achieve fast and low-power boolean/convolutional computing.
-
3.
Publication number: US20230196079A1
Publication date: 2023-06-22
Application number: US18009341
Application date: 2022-08-05
Applicant: SHANGHAITECH UNIVERSITY
Inventor: Hongtu ZHANG, Yuhao SHU, Yajun HA
IPC: G06N3/0464, G06F5/16
CPC classification number: G06N3/0464, G06F5/16
Abstract: An embedded dynamic random access memory (eDRAM)-based computing-in-memory (CIM) convolutional neural network (CNN) accelerator comprises four P2ARAM blocks, where each of the P2ARAM blocks includes a 5T1C ping-pong eDRAM bit cell array composed of 64×16 5T1C ping-pong eDRAM bit cells. In each of the P2ARAM blocks, 64×2 digital time converters convert a 4-bit activation value into different pulse widths from a row direction and input the pulse widths into the 5T1C ping-pong eDRAM bit cell array for calculation. A total of 16×2 convolution results are output in a column direction of the 5T1C ping-pong eDRAM bit cell array. The CNN accelerator uses the 5T1C ping-pong eDRAM bit cells to perform multi-bit storage and convolution in parallel. An S2M-ADC scheme is proposed to allot an area of an input sampling capacitor of an ABL to sign-numerical SAR ADC units of a C-DAC array without adding area overhead.
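Note: the row-wise pulse-width input and column-wise output described in this abstract can be illustrated with an idealized behavioral model. The sketch below is not the patented circuit. It assumes that each 4-bit activation maps linearly to a pulse width, that every cell contributes an amount proportional to pulse width times its stored weight, and that each column sums those contributions. It models a single 64×16 array rather than the ping-pong pair, and the names `T_UNIT`, `activations_to_pulse_widths`, and `column_outputs` are hypothetical.

```python
import numpy as np

ROWS, COLS = 64, 16   # array dimensions taken from the abstract
ACT_BITS = 4          # activation width taken from the abstract
T_UNIT = 1.0          # hypothetical unit pulse width (arbitrary time units)


def activations_to_pulse_widths(activations):
    """Map each 4-bit activation to a pulse width (assumed linear mapping)."""
    acts = np.asarray(activations)
    if acts.shape != (ROWS,):
        raise ValueError(f"expected {ROWS} activations, got shape {acts.shape}")
    if acts.min() < 0 or acts.max() >= 2 ** ACT_BITS:
        raise ValueError("activations must fit in 4 bits (0..15)")
    return acts * T_UNIT


def column_outputs(pulse_widths, weights):
    """Ideal column accumulation: sum over rows of pulse width x stored weight."""
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (ROWS, COLS):
        raise ValueError(f"expected weights of shape ({ROWS}, {COLS})")
    return pulse_widths @ weights


# Example: activations cycle through 0..15, all weights equal to 1.
acts = np.arange(ROWS) % 16
outputs = column_outputs(activations_to_pulse_widths(acts), np.ones((ROWS, COLS)))
print(outputs.shape)
```

In this idealized model the column result is an ordinary dot product; analog effects such as nonlinearity, noise, and ADC quantization are deliberately left out.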
-
4.
Publication number: US20230161627A1
Publication date: 2023-05-25
Application number: US18098746
Application date: 2023-01-19
Applicant: SHANGHAITECH UNIVERSITY
Inventor: Hongtu ZHANG, Yuhao SHU, Yajun HA
CPC classification number: G06F9/5027, G06F7/523, G06F7/50, H03K19/21
Abstract: A high-energy-efficiency binary neural network accelerator applicable to artificial intelligence Internet of Things is provided. 0.3-0.6V sub/near threshold 10T1C multiplication bit units with series capacitors are configured for charge domain binary convolution. An anti-process deviation differential voltage amplification array between bit lines and DACs is configured for robust pre-amplification in 0.3V batch standardized operations. A lazy bit line reset scheme further reduces energy, and inference accuracy losses can be ignored. Therefore, a binary neural network accelerator chip based on in-memory computation achieves peak energy efficiency of 18.5 POPS/W and 6.06 POPS/W, which are respectively improved by 21× and 135× compared with previous macro and system work [9, 11].
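Note: binary convolution in binary neural networks is commonly expressed as an XNOR followed by a population count. The sketch below shows that arithmetic in software only. It assumes the usual encoding in which bit 1 stands for +1 and bit 0 for -1, and it does not model the charge-domain 10T1C circuit in the abstract; `binary_dot` is a hypothetical helper name.

```python
def binary_dot(activation_bits, weight_bits):
    """Dot product of two {-1, +1} vectors encoded as bits (1 -> +1, 0 -> -1).

    XNOR is 1 where the bits match; each match contributes +1 and each
    mismatch contributes -1, so the result is 2 * matches - length.
    """
    if len(activation_bits) != len(weight_bits):
        raise ValueError("vectors must have the same length")
    matches = sum(1 for a, w in zip(activation_bits, weight_bits) if a == w)
    return 2 * matches - len(activation_bits)


def reference_dot(activation_bits, weight_bits):
    """Same product computed directly on the decoded +1/-1 values."""
    decode = lambda b: 1 if b else -1
    return sum(decode(a) * decode(w) for a, w in zip(activation_bits, weight_bits))


a = [1, 0, 1, 1]
w = [1, 1, 0, 1]
print(binary_dot(a, w), reference_dot(a, w))
```

Both functions return the same value for any pair of equal-length bit vectors, which is why the XNOR-popcount form can replace signed multiplication in binary networks.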
-
5.
Publication number: US20240233815A9
Publication date: 2024-07-11
Application number: US18377840
Application date: 2023-10-09
Applicant: SHANGHAITECH UNIVERSITY
Inventor: Hongtu ZHANG, Yuhao SHU, Yajun HA
IPC: G11C11/419, G11C8/16, G11C11/54
CPC classification number: G11C11/419, G11C8/16, G11C11/54
Abstract: A dual-six-transistor (D6T) in-memory computing (IMC) accelerator supporting always-linear discharge and reducing digital steps is provided. In the IMC accelerator, three effective techniques are proposed: (1) A D6T bitcell can reliably run at 0.4 V and enter a standby mode at 0.26 V, to support parallel processing of dual decoupled ports. (2) An always-linear discharge and convolution mechanism (ALDCM) not only reduces a voltage of a bit line (BL), but also keeps linear calculation throughout an entire voltage range of the BL. (3) A bypass of a bias voltage time converter (BVTC) reduces digital steps, but still keeps high energy efficiency and computing density at a low voltage. A measurement result of the IMC accelerator shows that the IMC accelerator achieves an average energy efficiency of 8918 TOPS/W (8b×8b), and an average computing density of 38.6 TOPS/mm² (8b×8b) in a 55 nm CMOS technology.
-
6.
Publication number: US20240135989A1
Publication date: 2024-04-25
Application number: US18377840
Application date: 2023-10-08
Applicant: SHANGHAITECH UNIVERSITY
Inventor: Hongtu ZHANG, Yuhao SHU, Yajun HA
IPC: G11C11/419, G11C8/16, G11C11/54
CPC classification number: G11C11/419, G11C8/16, G11C11/54
Abstract: A dual-six-transistor (D6T) in-memory computing (IMC) accelerator supporting always-linear discharge and reducing digital steps is provided. In the IMC accelerator, three effective techniques are proposed: (1) A D6T bitcell can reliably run at 0.4 V and enter a standby mode at 0.26 V, to support parallel processing of dual decoupled ports. (2) An always-linear discharge and convolution mechanism (ALDCM) not only reduces a voltage of a bit line (BL), but also keeps linear calculation throughout an entire voltage range of the BL. (3) A bypass of a bias voltage time converter (BVTC) reduces digital steps, but still keeps high energy efficiency and computing density at a low voltage. A measurement result of the IMC accelerator shows that the IMC accelerator achieves an average energy efficiency of 8918 TOPS/W (8b×8b), and an average computing density of 38.6 TOPS/mm² (8b×8b) in a 55 nm CMOS technology.
-