Patent search ap:("SHANGHAITECH UNIVERSITY") AND inv:"Yajun Ha" Page 1

1.

Granted invention patent
Max-flow/min-cut solution algorithm for early terminating push-relabel algorithm [Status: in force]

Publication (announcement) No.: US12223691B2

Publication (announcement) date: 2025-02-11

Application No.: US17798898

Application date: 2021-09-22

Applicant: SHANGHAITECH UNIVERSITY

Inventor: Xinzhe Liu, Guangyao Yan, Yajun Ha

IPC: G06V10/762, G06V10/764, G06V10/96

Abstract: A max-flow/min-cut solution algorithm for early terminating a push-relabel algorithm is provided. The max-flow/min-cut solution algorithm is used for an application that does not require an exact maximum flow, and includes: defining an early termination condition of the push-relabel algorithm by a separation condition and a stable condition; determining that the separation condition is satisfied if there is no source node s, s∈S, in the set T at any time in an operation process of the push-relabel algorithm; determining that the stable condition is satisfied if there is no active node in the set T; and terminating the push-relabel algorithm if both the separation condition and the stable condition are satisfied. The early termination technique is proposed to greatly reduce redundant computations and ensure that the algorithm terminates correctly in all cases.
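
The two conditions named in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the set T is constructed, so T is simply passed in as an argument. The function names, the use of "positive excess" as the definition of an active node (the usual push-relabel convention), and the exclusion of the sink from the active nodes are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable, Set

Node = Hashable


def separation_holds(sources: Iterable[Node], T: Set[Node]) -> bool:
    """Separation condition: no source node lies in the set T."""
    return all(s not in T for s in sources)


def stable_holds(excess: Dict[Node, int], T: Set[Node], sink: Node) -> bool:
    """Stable condition: no active node lies in the set T.

    Assumption: a node is active when its excess is positive, and the
    sink is never counted as active.
    """
    return all(excess.get(v, 0) <= 0 for v in T if v != sink)


def can_terminate_early(
    sources: Iterable[Node],
    T: Set[Node],
    excess: Dict[Node, int],
    sink: Node,
) -> bool:
    """Terminate only when both conditions hold at the same time."""
    return separation_holds(sources, T) and stable_holds(excess, T, sink)
```

A push-relabel loop could call can_terminate_early after each batch of push and relabel operations and stop as soon as it returns True.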

2.

Granted invention patent
Adaptive stereo matching optimization method and apparatus, device and storage medium [Status: in force]

Publication (announcement) No.: US11875523B2

Publication (announcement) date: 2024-01-16

Application No.: US17286488

Application date: 2019-09-20

Applicant: ShanghaiTech University

Inventor: Fupeng Chen, Heng Yu, Yajun Ha

IPC: G06K9/00, G06T7/593, G06F17/12

CPC classification number: G06T7/593, G06F17/12, G06T2207/10012, G06T2207/20004, G06T2207/20081

Abstract: The present disclosure provides an adaptive stereo matching optimization method, apparatus, and device, and a storage medium. The method includes: acquiring images of at least two perspectives of the same target scene, accordingly obtaining, through calculation, disparity value ranges corresponding to pixels in the target scene; and obtaining optimized depth value ranges by adjusting the disparity value ranges of the pixels in the target scene in real time through an adaptive stereo matching model; adjusting an execution cycle in the adaptive stereo matching model in real time through a DVFS algorithm according to a resource constraint condition of the processing system; and/or training on a plurality of scene image data sets through a convolutional neural network, so that the specific function parameters in the adaptive stereo matching model are correspondingly adjusted in real time according to the acquired different scene images.

3.

Granted invention patent
Enhanced dynamic random access memory (eDRAM)-based computing-in-memory (CIM) convolutional neural network (CNN) accelerator [Status: in force]

Publication (announcement) No.: US11875244B2

Publication (announcement) date: 2024-01-16

Application No.: US18009341

Application date: 2022-08-05

Applicant: SHANGHAITECH UNIVERSITY

Inventor: Hongtu Zhang, Yuhao Shu, Yajun Ha

IPC: G06N3/0464, G06F5/16

CPC classification number: G06N3/0464, G06F5/16

Abstract: An enhanced dynamic random access memory (eDRAM)-based computing-in-memory (CIM) convolutional neural network (CNN) accelerator comprises four P2ARAM blocks, where each of the P2ARAM blocks includes a 5T1C ping-pong eDRAM bit cell array composed of 64×16 5T1C ping-pong eDRAM bit cells. In each of the P2ARAM blocks, 64×2 digital time converters convert a 4-bit activation value into different pulse widths from a row direction and input the pulse widths into the 5T1C ping-pong eDRAM bit cell array for calculation. A total of 16×2 convolution results are output in a column direction of the 5T1C ping-pong eDRAM bit cell array. The CNN accelerator uses the 5T1C ping-pong eDRAM bit cells to perform multi-bit storage and convolution in parallel. An S2M-ADC scheme is proposed to allot an area of an input sampling capacitor of an ABL to sign-numerical SAR ADC units of a C-DAC array without adding area overhead.

4.

Granted invention patent
Full-path circuit delay measurement device for field-programmable gate array (FPGA) and measurement method [Status: in force]

Publication (announcement) No.: US11762015B2

Publication (announcement) date: 2023-09-19

Application No.: US17801266

Application date: 2021-09-22

Applicant: SHANGHAITECH UNIVERSITY

Inventor: Weixiong Jiang, Yajun Ha

IPC: G01R31/317

CPC classification number: G01R31/31725

Abstract: A full-path circuit delay measurement device for a field-programmable gate array (FPGA) and a measurement method are provided. The measurement device includes two shadow registers and a phase-shifted clock, where the two shadow registers take an output of a measured combinational logic circuit as a clock and sample the phase-shifted clock SCLK as data; the two shadow registers are respectively triggered on rising and falling edges of the output of the measured combinational logic circuit to sample the phase-shifted clock; outputs of the two shadow registers are delivered by an OR gate as an input into a synchronization register; a clock of the synchronization register serves as a clock MCLK of the measured combinational logic circuit; an output of the synchronization register serves as that of the circuit delay measurement device; the phase-shifted clock SCLK and the clock MCLK of the measured combinational logic circuit have the same frequency.

5.

Granted invention patent
Low-power SRAM memory cell and application structure thereof [Status: in force]

Publication (announcement) No.: US11100979B1

Publication (announcement) date: 2021-08-24

Application No.: US17051783

Application date: 2020-06-17

Applicant: SHANGHAITECH UNIVERSITY

Inventor: Yuqi Wang, Yajun Ha

IPC: G11C11/412, H01L27/11, G11C11/419

Abstract: A low-power SRAM memory cell includes five word lines and four bit lines. The five word lines are a first word line, a second word line, a third word line, a fourth word line and a fifth word line. The four bit lines are a first bit line, a second bit line, a third bit line, and a fourth bit line. During the operation process of calculating a binary 10×11, the first word line is 1, the second word line is 0, the third word line is 0, the fourth word line is 1, the high bit stored in the bit cell is 1, and the low bit is 1. The voltage value of the fifth word line is 0.73 volt. At this time, the first bit line, the second bit line, and the third bit line do not discharge, while the fourth bit line discharges.

6.

Granted invention patent
Normal distributions transform (NDT) method for LiDAR point cloud localization in unmanned driving [Status: in force]

Publication (announcement) No.: US11845466B2

Publication (announcement) date: 2023-12-19

Application No.: US17802148

Application date: 2021-09-22

Applicant: SHANGHAITECH UNIVERSITY

Inventor: Qi Deng, Hao Sun, Yajun Ha, Hui Wang

IPC: B60W60/00

CPC classification number: B60W60/001, B60W2420/52, B60W2554/4049

Abstract: A normal distributions transform (NDT) method for LiDAR point cloud localization in unmanned driving is provided. The method proposes a non-recursive, memory-efficient data structure occupation-aware-voxel-structure (OAVS), which speeds up each search operation. Compared with a tree-based structure, the proposed data structure OAVS is easy to parallelize and consumes only about 1/10 of memory. Based on the data structure OAVS, the method proposes a semantic-assisted OAVS-based (SEO)-NDT algorithm, which significantly reduces the number of search operations, redefines a parameter affecting the number of search operations, and removes a redundant search operation. In addition, the method proposes a streaming field-programmable gate array (FPGA) accelerator architecture, which further improves the real-time and energy-saving performance of the SEO-NDT algorithm. The method meets the real-time and high-precision requirements of smart vehicles for three-dimensional (3D) lidar localization.

7.

Granted invention patent
Optimized reconfiguration algorithm based on dynamic voltage and frequency scaling [Status: in force]

Publication (announcement) No.: US11537774B2

Publication (announcement) date: 2022-12-27

Application No.: US17595194

Application date: 2021-06-09

Applicant: SHANGHAITECH UNIVERSITY

Inventor: Rui Li, Yajun Ha

IPC: G06F30/34, G06F30/337, G06F30/3323

Abstract: An optimized reconfiguration algorithm based on dynamic voltage and frequency scaling (DVFS) is provided, which mainly has the following contributions. The optimized reconfiguration algorithm based on DVFS proposes a DVFS-based reconfiguration method, which schedules user tasks according to a degree of parallelism (DOP) of the user tasks so as to reconfigure more parallel user tasks, thereby achieving higher reliability. The optimized reconfiguration algorithm based on DVFS proposes a K-means-based heuristic approximation algorithm, which minimizes the delay of the DVFS-based reconfiguration scheduling algorithm. The optimized reconfiguration algorithm based on DVFS proposes a K-means-based method, which reduces memory overhead caused by DVFS-based reconfiguration scheduling. The optimized reconfiguration algorithm based on DVFS improves the reliability of a field programmable gate array (FPGA) system and minimizes the area overhead of a hardware circuit.

8.

Granted invention patent
Efficient K-nearest neighbor search algorithm for three-dimensional (3D) lidar point cloud in unmanned driving [Status: in force]

Publication (announcement) No.: US11430200B2

Publication (announcement) date: 2022-08-30

Application No.: US17593852

Application date: 2021-06-09

Applicant: SHANGHAITECH UNIVERSITY

Inventor: Hao Sun, Yajun Ha

IPC: G06V10/22, G06V20/50, G06T7/10

Abstract: An efficient K-nearest neighbor search algorithm for three-dimensional (3D) lidar point cloud in unmanned driving and a use of the foregoing K-nearest neighbor search algorithm in a point cloud map matching process in the unmanned driving are provided. A novel data structure for fast K-nearest neighbor search is used, such that each voxel or sub-voxel includes a proper quantity of points to reduce redundant search. The novel K-nearest neighbor search algorithm is based on a double segmentation voxel structure (DSVS) and a field programmable gate array (FPGA). By means of the novel K-nearest neighbor search algorithm, nearest neighbors are searched for only in a neighboring expected area near a search point, thereby reducing search of redundant points. In addition, an optimized data transmission and access policy is used, which makes the algorithm better suited to the characteristics of the FPGA.
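
The idea of searching for nearest neighbors only in an area near the search point can be sketched in Python with a plain uniform voxel grid. This is a simplified stand-in, not the DSVS described in the abstract: there is no second segmentation level and no FPGA mapping, and the function names, the voxel size parameter, and the choice of the 27 surrounding voxels as the "neighboring area" are assumptions made for illustration. Because only nearby voxels are inspected, the result can differ from an exact K-nearest-neighbor search when true neighbors lie farther away.

```python
import math
from collections import defaultdict
from itertools import product


def voxel_key(point, size):
    """Map a 3D point to the integer index of the voxel that contains it."""
    return tuple(math.floor(c / size) for c in point)


def build_grid(points, size):
    """Bucket every point by voxel index (a flat hash map, no tree)."""
    grid = defaultdict(list)
    for p in points:
        grid[voxel_key(p, size)].append(p)
    return grid


def knn_neighbouring_voxels(grid, size, query, k):
    """Return up to k closest points found in the query voxel and the
    26 voxels around it; points elsewhere are never inspected."""
    cx, cy, cz = voxel_key(query, size)
    candidates = []
    for dx, dy, dz in product((-1, 0, 1), repeat=3):
        # .get avoids creating empty buckets in the defaultdict
        candidates.extend(grid.get((cx + dx, cy + dy, cz + dz), ()))
    candidates.sort(key=lambda p: math.dist(p, query))
    return candidates[:k]
```

With a voxel size chosen so that each bucket holds a moderate number of points, the candidate list stays small regardless of the total size of the point cloud.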

9.

Granted invention patent
Method for implementing formal verification of optimized multiplier via SCA-SAT synergy [Status: in force]

Publication (announcement) No.: US12292946B1

Publication (announcement) date: 2025-05-06

Application No.: US18967676

Application date: 2024-12-04

Applicant: SHANGHAITECH UNIVERSITY

Inventor: Rui Li, Lin Li, Yajun Ha

IPC: G06F17/16

Abstract: A method for implementing formal verification of an optimized multiplier via symbolic computer algebra (SCA)-satisfiability (SAT) synergy includes: 1) systematically recovering, by a reverse engineering algorithm, an adder tree from an optimized multiplier; 2) generating, by a constraint satisfaction algorithm, a reference multiplier only by using an adder based on a constraint condition; and 3) combining, by an SCA-based and SAT-based verification method, complementary advantages of SCA and SAT. In the verification framework, the method introduces a reference multiplier generator for generating a correct reference multiplier. The correct reference multiplier has both a structure similar to a structure of the optimized multiplier and a clear adder boundary. The clear adder boundary allows proving correctness of the correct reference multiplier through SCA-based verification. With a structural similarity between the reference multiplier and the optimized multiplier, the reference multiplier is used as a known correct model for SAT-based verification of the optimized multiplier.

10.

Granted invention patent
Stream processing-based non-blocking ORB feature extraction accelerator implemented by FPGA [Status: in force]

Publication (announcement) No.: US12217475B1

Publication (announcement) date: 2025-02-04

Application No.: US18813094

Application date: 2024-08-23

Applicant: SHANGHAITECH UNIVERSITY

Inventor: Qixing Zhang, Yajun Ha

IPC: G06V10/46, G06T1/60

Abstract: Provided is a stream processing-based non-blocking oriented FAST and rotated BRIEF (ORB) feature extraction accelerator implemented by a field programmable gate array (FPGA), which mainly includes two innovations. First, a stream processing-based non-blocking hardware architecture and a cache management algorithm are provided. The accelerator precisely controls and buffers each column of an rBRIEF descriptor computation window by using an algorithm, allowing it to receive a new input pixel stream while computing a descriptor, thereby achieving non-blocking processing. Second, an efficient hardware sorting design embedded in the accelerator is provided. Based on a count sorting algorithm, minimal resources are used to implement rBRIEF sorting on hardware, and the rBRIEF sorting is embedded in the accelerator. The accelerator ensures quality of a feature point while achieving high-speed feature point extraction, without significantly reducing accuracy of ORB_SLAM and other algorithms.
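
The count sorting algorithm mentioned in this abstract can be illustrated with a generic software counting sort in Python. This is only a sketch of the textbook technique: the abstract does not state what the accelerator sorts, what the key range is, or how the hardware is organized, so the function name, the key callback, and the max_key parameter are assumptions made for illustration.

```python
def counting_sort(items, key, max_key):
    """Stable counting sort for items whose key is an int in [0, max_key].

    Runs in O(n + max_key) time without comparing items to each other,
    which is why the technique maps well onto small fixed-size counters.
    """
    # Pass 1: histogram of keys.
    counts = [0] * (max_key + 1)
    for it in items:
        counts[key(it)] += 1

    # Pass 2: turn the histogram into starting offsets (exclusive prefix sum).
    start = 0
    for k in range(max_key + 1):
        counts[k], start = start, start + counts[k]

    # Pass 3: place each item at its offset; input order is kept for ties.
    out = [None] * len(items)
    for it in items:
        k = key(it)
        out[counts[k]] = it
        counts[k] += 1
    return out
```

For example, sorting hypothetical (name, score) pairs by a small integer score keeps pairs with equal scores in their original order.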
