-
1.
Publication No.: US11651766B2
Publication date: 2023-05-16
Application No.: US17181908
Filing date: 2021-02-22
Applicant: SOUTHEAST UNIVERSITY
Inventor: Weiwei Shan, Lixuan Zhu, Jun Yang, Longxing Shi
CPC classification number: G10L15/02, G06F17/142, G10L25/24
Abstract: The present invention discloses an ultra-low-power speech feature extraction circuit based on non-overlapping framing and serial fast Fourier transform (FFT), and belongs to the technical field of computation, calculation or counting. The circuit is oriented to the field of intelligence, and is integrally composed of a pre-process module, a windowing module, a Fourier transform module, a Mel filtering module, an adjacent frame merging module, a discrete cosine transform (DCT) module and other modules by optimizing the architecture of a Mel-frequency Cepstral Coefficients (MFCC) algorithm. Large-scale storage caused by framing is avoided in a non-overlapping framing mode, storage contained in the MFCC algorithm is further reduced, and the circuit area and the power consumption are greatly reduced. An FFT algorithm in the feature extraction circuit adopts a serial pipeline mode to process data, makes full use of the characteristics of serial inflow of audio data, and further reduces the storage area and operations of the circuit.
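The storage argument for non-overlapping framing can be illustrated with a small software sketch. It is not the patented circuit: the function names `split_frames` and `carried_samples`, the frame length of 8 and the hop sizes are assumptions chosen for illustration. The sketch only shows that when the hop equals the frame length, no samples have to be held over from one frame to the next.

```python
def split_frames(samples, frame_len, hop):
    """Return frames of frame_len samples, one frame every hop samples."""
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop
    return frames


def carried_samples(frame_len, hop):
    """Samples that must be buffered from one frame to build the next one."""
    return max(frame_len - hop, 0)


samples = list(range(16))  # stand-in for 16 audio samples (illustrative)

# Conventional framing with 50 % overlap: half of each frame is reused.
overlapping = split_frames(samples, frame_len=8, hop=4)

# Non-overlapping framing: hop == frame_len, nothing is carried over.
non_overlapping = split_frames(samples, frame_len=8, hop=8)

print(len(overlapping), carried_samples(8, 4))
print(len(non_overlapping), carried_samples(8, 8))
```

With overlapping frames, each incoming sample belongs to more than one frame and has to be stored until the later frame is complete; with non-overlapping frames, each sample is consumed exactly once as it arrives.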
-
Publication No.: US12141682B2
Publication date: 2024-11-12
Application No.: US17181595
Filing date: 2021-02-22
Applicant: SOUTHEAST UNIVERSITY
Inventor: Weiwei Shan, Ziyu Li, Jun Yang, Longxing Shi
IPC: G06N3/063, G06N3/04, G06F119/12
Abstract: The present invention discloses an ultralow-power negative-margin timing monitoring method for a neural network circuit, relates to adaptive voltage regulation technology based on on-chip timing detection, and belongs to the technical field of low-power integrated circuit design. The present invention provides an ultralow-power operating method for a neural network circuit. By inserting a timing monitoring unit at specific positions on critical paths and setting part of the circuit to operate under “negative margin”, the system can further lower the voltage, compress the timing slack, and obtain a higher power gain.
-
3.
Publication No.: US11664013B2
Publication date: 2023-05-30
Application No.: US17112287
Filing date: 2020-12-04
Applicant: SOUTHEAST UNIVERSITY
Inventor: Weiwei Shan
CPC classification number: G10L15/08, G10L15/02, G10L15/16, G10L2015/088
Abstract: Disclosed is a speech feature reuse-based storing and calculating compression method for a keyword-spotting CNN, which belongs to the technical field of calculating, reckoning or counting. If the number of updated rows of input data equals the convolution step size (stride), then every time new input data arrive, the input layer of the neural network replaces the earliest part of the input data with the new input data and adjusts the addressing sequence of the input data, so that the operation on the input data and the corresponding convolution kernels is performed in the arrival order of the input data. The operation result is stored in an intermediate data memory of the neural network to update the corresponding data.
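The replace-the-oldest-rows idea can be sketched in software as a ring buffer. This is an illustrative model, not the claimed method: the class `RowRingBuffer`, the helper `conv1d_valid`, the scalar "rows" and the kernel values are all assumptions, and the reuse of intermediate results mentioned in the abstract is not modelled.

```python
class RowRingBuffer:
    """Fixed window of input rows; new rows overwrite the oldest in place."""

    def __init__(self, rows):
        self.rows = list(rows)  # physical storage, never shifted
        self.oldest = 0         # physical index of the oldest row

    def push(self, new_rows):
        """Overwrite the oldest rows with new_rows without moving the rest."""
        for row in new_rows:
            self.rows[self.oldest] = row
            self.oldest = (self.oldest + 1) % len(self.rows)

    def in_arrival_order(self):
        """Read rows oldest-first by rotating the address sequence."""
        n = len(self.rows)
        return [self.rows[(self.oldest + i) % n] for i in range(n)]


def conv1d_valid(rows, kernel, stride):
    """1-D convolution along the row axis, with scalar rows for brevity."""
    k = len(kernel)
    return [sum(rows[i + j] * kernel[j] for j in range(k))
            for i in range(0, len(rows) - k + 1, stride)]


buf = RowRingBuffer([1, 2, 3, 4])
buf.push([5, 6])                 # two new rows, equal to the stride of 2
window = buf.in_arrival_order()  # logical order after the update
out = conv1d_valid(window, kernel=[1, 1], stride=2)
print(buf.rows, window, out)
```

Only the two overwritten storage slots change; the remaining rows stay where they are, and the rotated read order restores the arrival sequence for the convolution.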
-
Publication No.: US09600382B2
Publication date: 2017-03-21
Application No.: US14442071
Filing date: 2013-08-30
Applicant: SOUTHEAST UNIVERSITY
Inventor: Weiwei Shan, Chaoxuan Tian, Huafang Sun, Longxing Shi
CPC classification number: G06F11/2035, G06F1/324, G06F1/3296, G06F11/0721, G06F11/0793, G06F11/1402, G06F11/3024, G06F11/3062, G06F11/3093, G06F2201/805, G06F2201/85, Y02D10/126, Y02D10/172
Abstract: Disclosed is an error recovery circuit facing a CPU assembly line, comprising: on-chip monitoring circuits (1), an error signal statistics module (2), a voltage frequency control module (3), an error recovery control module (4), an in-situ error recovery module (5) and an upper-layer error recovery module (6), wherein each of the on-chip monitoring circuits (1) is integrated at the end of each stage of assembly lines of the previous N−1 stages of assembly lines of a CPU kernel with an N-stage assembly line structure, so as to monitor the time sequence information about each clock period of an operating circuit, wherein N is a positive integer which is greater than or equal to 3 and less than 20. The present invention provides the on-line time sequence monitoring on the CPU kernel with N stages of assembly lines to search for the lowest possible operating voltage of the circuit, and to reduce the margin of the operating voltage reserved for the circuit in the design stage, thereby significantly reducing the power consumption of the circuit and improving the energy efficiency of the circuit.
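The search for the lowest workable operating voltage can be pictured as a simple feedback loop driven by error statistics. The sketch below is a hypothetical software model only: `next_voltage`, `simulated_errors`, the 1 % error-rate threshold, the 10 mV step and the 800 mV failure point are invented for illustration, and the in-situ and upper-layer error recovery modules are not modelled.

```python
def next_voltage(voltage_mv, error_count, cycles, *, max_error_rate=0.01,
                 step_mv=10, v_min_mv=500, v_max_mv=1100):
    """Choose the next supply voltage from the last window's error rate."""
    rate = error_count / cycles
    if rate > max_error_rate:
        candidate = voltage_mv + step_mv  # too many timing errors: back off
    else:
        candidate = voltage_mv - step_mv  # margin left: try a lower voltage
    return min(max(candidate, v_min_mv), v_max_mv)


def simulated_errors(voltage_mv):
    """Made-up chip model: timing errors appear below 800 mV."""
    return 0 if voltage_mv >= 800 else 50


voltage = 1000
trace = [voltage]
for _ in range(30):
    errors = simulated_errors(voltage)  # errors seen in a 1000-cycle window
    voltage = next_voltage(voltage, errors, cycles=1000)
    trace.append(voltage)

print(min(trace), trace[-1])
```

In this toy model the loop walks the voltage down until errors appear and then hovers around the failure point, which is the margin that a fixed design-time guard band would otherwise leave unused.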
-
Publication No.: US11139805B1
Publication date: 2021-10-05
Application No.: US16957724
Filing date: 2019-07-09
Applicant: SOUTHEAST UNIVERSITY
Inventor: Weiwei Shan, Jun Yang, Longxing Shi
Abstract: A two-way adaptive clock circuit supporting a wide frequency range is composed of a phase clock generating module, a phase clock selecting module, an adaptive clock stretching or compressing amount regulating circuit module and a control module. The adaptive clock stretching or compressing amount regulating circuit module can monitor delay information of a critical path in a chip in real time and feed the information back into the control module. After receiving a clock stretching or compressing enable signal and a stretching or compressing scale signal, the control module selects a target phase clock from clocks generated by the phase clock generating module to rapidly regulate an adaptive clock in a current cycle. The present invention is applied to an adaptive voltage frequency regulating circuit based on on-line time sequence monitoring.
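One common way to model clock stretching or compressing by phase selection is sketched below. It assumes, without confirmation from the abstract, that the phase clock generating module provides `n_phases` equally spaced phases of one root clock, so that moving forward by `steps` phases lengthens the current cycle by `steps / n_phases` of a period and moving backward shortens it by the same amount. The function names and the numbers are illustrative.

```python
from fractions import Fraction


def select_phase(current, steps, n_phases, stretch=True):
    """Index of the phase clock to switch to for this cycle."""
    delta = steps if stretch else -steps
    return (current + delta) % n_phases


def cycle_length(period_ps, steps, n_phases, stretch=True):
    """Length in picoseconds of the cycle in which the switch happens."""
    delta = Fraction(period_ps * steps, n_phases)
    return period_ps + delta if stretch else period_ps - delta


# Hypothetical 1000 ps root clock with 8 phases, regulated by 2 phase steps.
stretched = cycle_length(1000, steps=2, n_phases=8, stretch=True)
compressed = cycle_length(1000, steps=2, n_phases=8, stretch=False)
print(stretched, compressed)
print(select_phase(6, 3, 8), select_phase(1, 3, 8, stretch=False))
```

Under this assumption the regulation takes effect within a single cycle, because only the selected phase changes while the root clock keeps running.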
-
Publication No.: US10033362B1
Publication date: 2018-07-24
Application No.: US15562893
Filing date: 2017-02-24
Applicant: Southeast University
Inventor: Weiwei Shan, Liang Wan, Longxing Shi
Abstract: A PVTM-based wide voltage range clock stretching circuit is disclosed. The circuit consists of a PVTM circuit module, a phase clock generation module, a clock synchronization selection module and a control module. The PVTM circuit module monitors in real time the delay information of an on-chip delay unit to monitor the operating environment of the circuit, and feeds the delay information back to the control module. Under the control of a clock stretching enable signal and a clock stretching extent signal, the control module selects a target phase clock from the clocks generated by the phase clock generation module in accordance with the feedback from the PVTM, enabling the stretching of the system clock within a single cycle under different PVT conditions. Sophisticated gate devices are not required, and the area and power consumption costs are kept to a minimum.
-
Publication No.: US20170219649A1
Publication date: 2017-08-03
Application No.: US15321111
Filing date: 2014-12-26
Applicant: Southeast University
Inventor: Weiwei Shan, Longxing Shi, Jun Yang
Abstract: A process corner detection circuit based on a self-timing oscillation ring comprises a reset circuit (1), the self-timing oscillation ring (2), and a counting module (3). The self-timing oscillation ring (2) consists of m two-input Miller units and inverters, and a two-input AND gate, m being a positive integer greater than or equal to 3. The circuit can be used for detecting a process corner of a fabricated integrated circuit chip, and reflecting the process corner of the chip according to the number of oscillations of the self-timing oscillation ring (2). The number of oscillations of the self-timing oscillation ring (2) in different process corners is acquired by Hspice simulation before the chip tape-out, and the process corner of the chip after the chip tape-out can be determined according to the actually measured number of oscillations.
-
Publication No.: US11715456B2
Publication date: 2023-08-01
Application No.: US17112246
Filing date: 2020-12-04
Applicant: SOUTHEAST UNIVERSITY
Inventor: Weiwei Shan, Lixuan Zhu
CPC classification number: G10L15/02, G06F17/141, G10L25/27, G10L25/45
Abstract: Disclosed is a serial FFT-based low-power MFCC speech feature extraction circuit, which belongs to the technical field of calculation, reckoning or counting. The circuit is oriented toward the field of intelligence and adapts an optimized MFCC algorithm to hardware circuit design; it makes full use of a serial FFT algorithm and an approximate multiplication operation, thereby greatly reducing circuit area and power. The entire circuit includes a preprocessing module, a framing and windowing module, an FFT module, a Mel filtering module, and a logarithm and DCT module. The improved FFT algorithm processes data in a serial pipeline manner and makes effective use of the duration of an audio frame, thereby reducing the storage area and operating frequency of the circuit while still meeting the output requirement.
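For readers unfamiliar with the FFT stage, the following is a plain textbook radix-2 FFT in software, checked against a direct DFT. It is not the serial pipelined hardware algorithm of the patent and does not include the approximate multiplication; the function names and the test signal are assumptions made for illustration.

```python
import cmath


def fft(x):
    """Recursive radix-2 decimation-in-time FFT (power-of-two length)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle           # butterfly, upper output
        out[k + n // 2] = even[k] - twiddle  # butterfly, lower output
    return out


def dft(x):
    """Direct O(n^2) DFT, used only as a reference for checking."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n)
                for m in range(n))
            for k in range(n)]


signal = [1, 2, 3, 4, 0, 0, 0, 0]  # illustrative 8-sample frame
spectrum = fft(signal)
max_error = max(abs(a - b) for a, b in zip(spectrum, dft(signal)))
print(max_error < 1e-9)
```

A hardware implementation that consumes samples one at a time as they arrive would organize the same butterflies as a pipeline rather than as recursion, which is where the storage and clock-frequency savings described in the abstract would come from.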
-
Publication No.: US11335387B2
Publication date: 2022-05-17
Application No.: US17042921
Filing date: 2019-10-30
Applicant: SOUTHEAST UNIVERSITY
Inventor: Weiwei Shan, Tao Wang
Abstract: An in-memory computing circuit for a fully connected binary neural network includes an input latch circuit, a counting addressing module, an address selector, a decoding and word line drive circuit, a memory array, a pre-charge circuit, a writing bit line drive circuit, a replica bit line column cell, a timing control circuit, a sensitive amplifier and a NAND gate array, an output latch circuit and an analog delay chain. A parallel XNOR operation is performed in the circuit on the SRAM bit line, and the accumulation operation, activation operation and other operations are performed by the delay chain in the time domain. Partial calculation is completed while reading the data, and the delay chain with a small area occupation can be integrated with SRAM, thus reducing the energy consumption of the memory access process. Multi-column parallel computing also improves system throughput.
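The arithmetic that a fully connected binary layer performs can be written as XNOR followed by a population count. The sketch below is a software illustration under the usual convention that bit 1 encodes +1 and bit 0 encodes -1; the function names, the 4-bit width and the sign threshold at zero are assumptions. In the circuit described above, the accumulation and activation are carried out in the time domain by the delay chain instead.

```python
def xnor_popcount(a_bits, w_bits, n):
    """Number of bit positions (out of n) where activation and weight agree."""
    mask = (1 << n) - 1
    return bin(~(a_bits ^ w_bits) & mask).count("1")


def binary_dot(a_bits, w_bits, n):
    """Dot product of two {-1, +1} vectors packed as n-bit integers."""
    return 2 * xnor_popcount(a_bits, w_bits, n) - n


def binary_fc_layer(a_bits, weight_rows, n):
    """One output bit per weight row, using a sign activation at zero."""
    return [1 if binary_dot(a_bits, w, n) >= 0 else 0 for w in weight_rows]


activations = 0b1011           # encodes +1, -1, +1, +1
weights = [0b1001, 0b0100]     # two illustrative 4-input neurons
outputs = binary_fc_layer(activations, weights, n=4)
print(binary_dot(activations, weights[0], 4), outputs)
```

Each agreement contributes +1 and each disagreement -1 to the dot product, which is why it equals twice the popcount minus the vector length.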
-
Publication No.: US10422830B2
Publication date: 2019-09-24
Application No.: US15321111
Filing date: 2014-12-26
Applicant: Southeast University
Inventor: Weiwei Shan, Longxing Shi, Jun Yang
Abstract: A process corner detection circuit based on a self-timing ring oscillator comprises a reset circuit (1), the self-timing ring oscillator (2), and a counting module (3). The self-timing ring oscillator (2) consists of m two-input Muller C-elements and inverters, and a two-input AND gate, m being a positive integer greater than or equal to 3. The circuit can be used for detecting a process corner of a fabricated integrated circuit chip, and reflecting the process corner of the chip according to the number of oscillations of the self-timing ring oscillator (2). The number of oscillations of the self-timing ring oscillator (2) in different process corners is acquired by Hspice simulation before the chip tape-out, and the process corner of the chip after the chip tape-out can be determined according to the actually measured number of oscillations.
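The final lookup step, matching a measured oscillation count against pre-simulated counts, can be sketched as a nearest-value search. The corner names and counts in `SIMULATED_COUNTS` are made-up placeholders, not Hspice results, and the nearest-match rule in `classify_corner` is an assumption about how the comparison might be done.

```python
# Hypothetical oscillation counts per process corner (placeholder values).
SIMULATED_COUNTS = {"SS": 820, "TT": 1000, "FF": 1190}


def classify_corner(measured_count, simulated_counts):
    """Return the corner whose simulated count is closest to the measurement."""
    return min(simulated_counts,
               key=lambda corner: abs(simulated_counts[corner] - measured_count))


for measured in (880, 1040, 1100):
    print(measured, classify_corner(measured, SIMULATED_COUNTS))
```

A slow chip oscillates fewer times in the same counting window and a fast chip more, so the count serves as a one-number summary of where the fabricated chip sits between the simulated corners.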