NORMAL DISTRIBUTIONS TRANSFORM (NDT) METHOD FOR LIDAR POINT CLOUD LOCALIZATION IN UNMANNED DRIVING

    Publication number: US20230192123A1

    Publication date: 2023-06-22

    Application number: US17802148

    Application date: 2021-09-22

    CPC classification number: B60W60/001 B60W2420/52 B60W2554/4049

    Abstract: A normal distributions transform (NDT) method for LiDAR point cloud localization in unmanned driving is provided. The method proposes a non-recursive, memory-efficient data structure, the occupation-aware voxel structure (OAVS), which speeds up each search operation. Compared with a tree-based structure, OAVS is easy to parallelize and consumes only about 1/10 of the memory. Based on OAVS, the method proposes a semantic-assisted OAVS-based (SEO)-NDT algorithm, which significantly reduces the number of search operations, redefines a parameter affecting the number of search operations, and removes a redundant search operation. In addition, the method proposes a streaming field-programmable gate array (FPGA) accelerator architecture, which further improves the real-time performance and energy efficiency of the SEO-NDT algorithm. The method meets the real-time and high-precision requirements of smart vehicles for three-dimensional (3D) LiDAR localization.
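
The abstract does not describe the internal layout of OAVS, so the following is only a generic illustration of the idea it contrasts with tree-based search: a flat, non-recursive map that stores NDT statistics (point count, mean, covariance) only for occupied voxels and answers a lookup in a single hash access. The names `voxel_key`, `build_voxel_map`, and `lookup` are hypothetical, and the hash map is an assumption, not the patented structure.

```python
import math
from collections import defaultdict


def voxel_key(point, voxel_size):
    """Map a 3D point to the integer index of the voxel that contains it."""
    return tuple(math.floor(c / voxel_size) for c in point)


def build_voxel_map(points, voxel_size):
    """Group points by voxel and keep count, mean, and covariance per occupied voxel."""
    buckets = defaultdict(list)
    for p in points:
        buckets[voxel_key(p, voxel_size)].append(p)

    voxels = {}
    for key, pts in buckets.items():
        n = len(pts)
        mean = tuple(sum(p[i] for p in pts) / n for i in range(3))
        # Sample covariance; it stays all-zero for voxels holding a single point.
        cov = [[0.0] * 3 for _ in range(3)]
        if n > 1:
            for p in pts:
                d = [p[i] - mean[i] for i in range(3)]
                for i in range(3):
                    for j in range(3):
                        cov[i][j] += d[i] * d[j] / (n - 1)
        voxels[key] = {"count": n, "mean": mean, "cov": cov}
    return voxels


def lookup(voxels, point, voxel_size):
    """Return the statistics of the occupied voxel containing `point`, or None.

    This is one dictionary access: no tree traversal and no recursion.
    """
    return voxels.get(voxel_key(point, voxel_size))


points = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (1.5, 0.2, 0.2)]
voxels = build_voxel_map(points, voxel_size=1.0)
hit = lookup(voxels, (0.9, 0.9, 0.9), voxel_size=1.0)   # falls in voxel (0, 0, 0)
miss = lookup(voxels, (5.0, 5.0, 5.0), voxel_size=1.0)  # unoccupied voxel
```

Because empty voxels are never stored, memory grows with the number of occupied voxels rather than with the volume of the map, and independent lookups can run in parallel.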

    OPTIMIZED RECONFIGURATION ALGORITHM BASED ON DYNAMIC VOLTAGE AND FREQUENCY SCALING

    Publication number: US20220309217A1

    Publication date: 2022-09-29

    Application number: US17595194

    Application date: 2021-06-09

    Inventor: Rui LI Yajun HA

    Abstract: An optimized reconfiguration algorithm based on dynamic voltage and frequency scaling (DVFS) is provided, with the following main contributions. First, it proposes a DVFS-based reconfiguration method, which schedules user tasks according to their degree of parallelism (DOP) so as to reconfigure more parallel user tasks, thereby achieving higher reliability. Second, it proposes a K-means-based heuristic approximation algorithm, which minimizes the delay of the DVFS-based reconfiguration scheduling algorithm. Third, it proposes a K-means-based method, which reduces the memory overhead caused by DVFS-based reconfiguration scheduling. The algorithm improves the reliability of a field-programmable gate array (FPGA) system and minimizes the area overhead of the hardware circuit.

    ADAPTIVE STEREO MATCHING OPTIMIZATION METHOD AND APPARATUS, DEVICE AND STORAGE MEDIUM

    Publication number: US20210390725A1

    Publication date: 2021-12-16

    Application number: US17286488

    Application date: 2019-09-20

    Abstract: The present disclosure provides an adaptive stereo matching optimization method, apparatus, and device, and a storage medium. The method includes: acquiring images of at least two perspectives of the same target scene and calculating from them the disparity value ranges corresponding to pixels in the target scene; obtaining optimized depth value ranges by adjusting the disparity value ranges of the pixels in the target scene in real time through an adaptive stereo matching model; adjusting an execution cycle in the adaptive stereo matching model in real time through a DVFS algorithm according to a resource constraint condition of the processing system; and/or training on a plurality of scene image data sets through a convolutional neural network, so that the specific function parameters in the adaptive stereo matching model are adjusted in real time according to the different scene images acquired.

    LOW-POWER SRAM MEMORY CELL AND APPLICATION STRUCTURE THEREOF

    Publication number: US20210249069A1

    Publication date: 2021-08-12

    Application number: US17051783

    Application date: 2020-06-17

    Inventor: Yuqi WANG Yajun HA

    Abstract: A low-power SRAM memory cell includes five word lines and four bit lines. The five word lines are a first word line, a second word line, a third word line, a fourth word line and a fifth word line. The four bit lines are a first bit line, a second bit line, a third bit line, and a fourth bit line. During the operation process of calculating a binary 10×11, the first word line is 1, the second word line is 0, the third word line is 0, the fourth word line is 1, the high bit stored in the bit cell is 1, and the low bit is 1. The voltage value of the fifth word line is 0.73 volt. At this time, the first bit line, the second bit line, and the third bit line do not discharge, while the fourth bit line discharges.

    GRAPHICS PROCESSING UNIT (GPU)-BASED LOGIC REWRITING ACCELERATION METHOD

    Publication number: US20240289914A1

    Publication date: 2024-08-29

    Application number: US18537836

    Application date: 2023-12-13

    Inventor: Lin LI Yajun HA

    CPC classification number: G06T1/20 G06F9/3851

    Abstract: A graphics processing unit (GPU)-based logic rewriting acceleration method comprising parallelizing sub-procedures of And-Inverter Graph (AIG)-based logic rewriting. A recursive sub-procedure of the AIG-based logic rewriting is redesigned to be non-recursive, to provide sufficient parallelism for a GPU. In order to parallelize a replacement step on the GPU, the present disclosure uses a lock to ensure mutually exclusive access, which inevitably damages scalability of inter-node parallelism. In order to fully utilize the inter-node parallelism on a large scale, the present disclosure proposes a work scheduler that adds nodes with non-overlapping maximum fan-out-free cones (MFFCs) to a group, such that nodes in an MFFC can be deleted simultaneously without a conflict. In order to simultaneously create and delete a same node, the present disclosure also proposes a GPU-friendly graphical data structure to support these concurrent operations.
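
Two ideas in this abstract can be sketched on a toy AIG: computing a node's maximum fan-out-free cone (MFFC) with an explicit stack instead of recursion, and placing nodes whose MFFCs do not overlap into the same group. The sketch below is an illustration under stated assumptions, not the patented scheduler: the function names, the dictionary-based graph encoding, and the greedy first-fit grouping policy are all hypothetical, and the code runs sequentially on a CPU rather than on a GPU.

```python
def mffc(root, fanins, fanout_count):
    """Return the MFFC of `root` as a set of AND nodes, using an explicit stack.

    fanins:       dict node -> list of fanin nodes (primary inputs map to [])
    fanout_count: dict node -> number of fanouts, counting primary outputs
    """
    refs = dict(fanout_count)  # local copy so the caller's counts are untouched
    cone = {root}
    stack = [root]
    while stack:
        node = stack.pop()
        for fi in fanins.get(node, ()):
            refs[fi] -= 1
            # A fanin joins the cone once nothing outside the cone references it.
            # Primary inputs (no fanins of their own) are excluded.
            if refs[fi] == 0 and fanins.get(fi):
                cone.add(fi)
                stack.append(fi)
    return cone


def group_disjoint(nodes, fanins, fanout_count):
    """Greedy first-fit grouping: all MFFCs inside one group are pairwise disjoint."""
    groups = []  # list of (covered node set, member roots)
    for n in nodes:
        cone = mffc(n, fanins, fanout_count)
        for covered, members in groups:
            if covered.isdisjoint(cone):
                covered |= cone
                members.append(n)
                break
        else:
            groups.append((set(cone), [n]))
    return [members for _, members in groups]


# Toy AIG: n1 = a & b, n2 = b & c, n3 = n1 & n2, n4 = n2 & d; outputs are n3 and n4.
fanins = {"a": [], "b": [], "c": [], "d": [],
          "n1": ["a", "b"], "n2": ["b", "c"],
          "n3": ["n1", "n2"], "n4": ["n2", "d"]}
fanout_count = {"a": 1, "b": 2, "c": 1, "d": 1,
                "n1": 1, "n2": 2, "n3": 1, "n4": 1}

cone_n3 = mffc("n3", fanins, fanout_count)
groups = group_disjoint(["n3", "n4", "n1", "n2"], fanins, fanout_count)
```

In this toy graph, `n1` is used only by `n3`, so it belongs to the MFFC of `n3`, whereas `n2` is shared with `n4` and does not. Roots in the same group touch disjoint node sets, which is what would let their cones be removed concurrently.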

    ENERGY-EFFICIENT MEMORY FOR CRYOGENIC COMPUTING

    Publication number: US20240233796A1

    Publication date: 2024-07-11

    Application number: US18505128

    Application date: 2023-11-09

    CPC classification number: G11C11/4023 G11C11/4087 G11C11/4091 G11C11/4096

    Abstract: An energy-efficient memory for cryogenic computing is provided. The energy-efficient memory includes a plurality of memory banks, where each of the memory banks includes a cryogenic semi-static, dual-port, boost-free gain cell (CSDB-GC) macro module, a universal address decoder, and a different address decoder. The CSDB-GC macro module includes a plurality of columns of local blocks, and each of the local blocks includes a plurality of CSDB-GC memory cells. Final measurement results of a 16 Kb CSDB-eDRAM show a data retention time (DRT) of 16.67 seconds, which is 2.6 times longer than the DRT of a state-of-the-art cryogenic eDRAM at a temperature of 4.2 K, and a lower refresh power (0.11 pW/Kb). In addition, the 16 Kb CSDB-eDRAM achieves a shorter access time of 710 ps (1.41 GHz). Compared with state-of-the-art work, the 16 Kb CSDB-eDRAM has the lowest dynamic power consumption overhead, namely 49.23 µW/Kb.
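
The frequency quoted next to the access time can be checked with one line of arithmetic. The short sketch below assumes one access per clock cycle, which the abstract does not state explicitly; the variable names are illustrative.

```python
# Assumption: one memory access per clock cycle, so frequency = 1 / access time.
access_time_s = 710e-12                   # 710 ps, as quoted in the abstract
freq_ghz = 1.0 / access_time_s / 1e9      # convert Hz to GHz
print(round(freq_ghz, 2))                 # 1.41
```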

    LAYOUT METHOD AND APPLICATION OF SCALABLE MULTI-DIE NETWORK-ON-CHIP FPGA ARCHITECTURE

    Publication number: US20240143883A1

    Publication date: 2024-05-02

    Application number: US18203662

    Application date: 2023-05-31

    CPC classification number: G06F30/347 G06F30/31

    Abstract: A layout method for a scalable multi-die network-on-chip FPGA architecture is provided, together with an application of that layout method. A scalable multi-die FPGA architecture based on network-on-chip and a corresponding hierarchical recursive layout algorithm are provided, aiming to directly map a register-transfer-level dataflow design generated by existing high-level synthesis onto the provided interconnection architecture. The layout method can exploit the potential of the hierarchical topology and make more efficient use of dedicated interconnection resources, such as cross-die nets, networks-on-chip, and high-speed transceivers.

    RIPPLE PUSH METHOD FOR GRAPH CUT

    Publication number: US20230195793A1

    Publication date: 2023-06-22

    Application number: US17799278

    Application date: 2021-09-22

    CPC classification number: G06F16/9024

    Abstract: A ripple push method for a graph cut includes: obtaining an excess flow ef(v) of a current node v; traversing four edges connecting the current node v in top, bottom, left and right directions, and determining whether each of the four edges is a pushable edge; calculating, according to different weight functions, a maximum push value of each of the four edges by efw=ef(v)*W, where W denotes a weight function; and traversing the four edges, recording a pushable flow of each of the four edges, and pushing out a calculated flow. The ripple push method explores different push weight functions, and significantly improves the actual parallelism of the push-relabel algorithm.
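
The four steps listed in the abstract can be sketched for a single grid node as follows. Several details are assumptions because the abstract does not give them: an edge is treated as pushable under the usual push-relabel admissibility rule (positive residual capacity and a height difference of one), each push is capped at the edge's residual capacity, and the example weight function `equal_split` simply divides the excess equally among pushable edges. All function and variable names are hypothetical.

```python
def ripple_push(v, excess, residual, height, neighbors, weight):
    """Push excess flow from node `v` along its pushable grid edges.

    excess:    dict node -> excess flow ef(node)
    residual:  dict (u, w) -> residual capacity of edge u -> w
    height:    dict node -> push-relabel height label
    neighbors: dict direction -> neighbor node, or None at the grid border
    weight:    function(direction, pushable_directions) -> W
    """
    ef_v = excess[v]  # step 1: excess flow of the current node

    # Step 2: check the four edges (assumed admissibility rule).
    pushable = [d for d, u in neighbors.items()
                if u is not None
                and residual.get((v, u), 0) > 0
                and height[v] == height[u] + 1]

    # Step 3: maximum push value per edge, efw = ef(v) * W, capped by capacity.
    pushes = {}
    for d in pushable:
        efw = ef_v * weight(d, pushable)
        pushes[d] = min(efw, residual[(v, neighbors[d])])

    # Step 4: push the recorded flows and update residuals and excesses.
    for d, flow in pushes.items():
        u = neighbors[d]
        residual[(v, u)] -= flow
        residual[(u, v)] = residual.get((u, v), 0) + flow
        excess[v] -= flow
        excess[u] = excess.get(u, 0) + flow
    return pushes


def equal_split(direction, pushable):
    """Example weight function (an assumption): share the excess equally."""
    return 1.0 / len(pushable)


v = (1, 1)
neighbors = {"top": (0, 1), "bottom": (2, 1), "left": (1, 0), "right": (1, 2)}
height = {v: 1, (0, 1): 0, (2, 1): 0, (1, 0): 1, (1, 2): 0}
residual = {(v, (0, 1)): 10, (v, (2, 1)): 2, (v, (1, 0)): 5, (v, (1, 2)): 0}
excess = {v: 8}

pushes = ripple_push(v, excess, residual, height, neighbors, equal_split)
```

Here only the top and bottom edges are pushable: the left neighbor has the same height and the right edge has no residual capacity. Each pushable edge is offered half of the excess, and the bottom edge is limited by its capacity of 2, so the node keeps a remainder of 2 for a later step. A weight function whose values sum to at most 1 over the pushable edges keeps the total push within the available excess.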
