Systems and methods for robust self-relocalization in a visual map

    Publication No.: US11788845B2

    Publication date: 2023-10-17

    Application No.: US16770614

    Filing date: 2018-06-29

    IPC classification: G01C21/30 G05D1/00

    CPC classification: G01C21/30 G05D1/0088

    Abstract: Described herein are systems and methods that improve the success rate of relocalization and eliminate the ambiguity of false relocalization by exploiting motions of the sensor system. In one or more embodiments, during a relocalization process, a snapshot is taken using one or more visual sensors and a single-shot relocalization in a visual map is implemented to establish candidate hypotheses. In one or more embodiments, the sensors move in the environment, with a movement trajectory tracked, to capture visual representations of the environment in one or more new poses. As the visual sensors move, the relocalization system tracks various estimated localization hypotheses and removes false ones until one winning hypothesis remains. Once the process is finished, the relocalization system outputs a localization result with respect to the visual map.
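
    The hypothesis-pruning loop in this abstract can be illustrated with a minimal sketch. It assumes 2D positions only, a tracked motion given as a simple offset, and a distance-threshold consistency test against fresh single-shot candidates; none of these details come from the patent, and `prune_hypotheses` is a hypothetical name.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


def prune_hypotheses(
    hypotheses: List[Point],
    motion: Point,
    observations: List[Point],
    tol: float,
) -> List[Point]:
    """Advance each hypothesis by the tracked motion and keep the consistent ones.

    Assumption (not from the patent): a hypothesis survives if, after being
    moved by `motion`, it lies within `tol` of at least one new single-shot
    relocalization candidate in `observations`.
    """
    dx, dy = motion
    survivors = []
    for x, y in hypotheses:
        px, py = x + dx, y + dy  # predicted position after the tracked movement
        if any(hypot(px - ox, py - oy) <= tol for ox, oy in observations):
            survivors.append((px, py))
    return survivors


def relocalize(hypotheses, steps, tol):
    """Repeat pruning over (motion, observations) steps until one hypothesis is left."""
    for motion, observations in steps:
        hypotheses = prune_hypotheses(hypotheses, motion, observations, tol)
        if len(hypotheses) <= 1:
            break
    return hypotheses
```

    In this sketch, two initial candidates at (0, 0) and (10, 10) followed by a tracked motion of (1, 0) and a new candidate near (1, 0) would leave only the first hypothesis standing.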

    Hardware-software co-design for accelerating deep learning inference

    Publication No.: US11443173B2

    Publication date: 2022-09-13

    Application No.: US16393247

    Filing date: 2019-04-24

    Applicant: Baidu USA LLC

    IPC classification: G06N3/063 G06N3/04 G06F17/15

    Abstract: Embodiments disclose an artificial intelligence chip comprising a processor, at least one parallel computing unit, and a pooling computation unit, and a convolutional neural network applied to the artificial intelligence chip. The method comprises: dividing a convolution task into convolution subtasks and corresponding pooling subtasks; executing the convolution subtasks at different parallel computing units, and performing convolution, batch normalization, and non-linear computing operations in a same parallel computing unit; sending an execution result of each parallel computing unit from executing the convolution subtask to the pooling computation unit for executing the corresponding pooling subtask; and merging the execution results of the pooling computation unit, obtained by performing pooling operations on the execution results outputted by the respective convolution subtasks, to obtain an execution result of the convolution task. This can reduce data transport, such that operations of the convolutional neural network may be accomplished with lower power consumption and less time in an edge device.
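
    The split, fuse, pool, and merge flow in this abstract can be sketched in software as follows. This is only an illustration under assumptions not stated in the source: a 1D signal, one subtask per kernel, batch normalization reduced to a fixed scale and shift, ReLU as the non-linearity, and a sequential loop standing in for the parallel computing units. All function names are hypothetical.

```python
from typing import List


def conv_bn_relu(signal: List[float], kernel: List[float],
                 scale: float = 1.0, shift: float = 0.0) -> List[float]:
    """One convolution subtask: convolution, scale/shift, and ReLU fused in one pass."""
    n = len(signal) - len(kernel) + 1
    out = []
    for i in range(n):
        acc = sum(signal[i + k] * kernel[k] for k in range(len(kernel)))
        # Batch normalization is approximated here by a fixed scale and shift.
        out.append(max(0.0, scale * acc + shift))
    return out


def max_pool(values: List[float], window: int) -> List[float]:
    """The matching pooling subtask: non-overlapping max pooling."""
    return [max(values[i:i + window])
            for i in range(0, len(values) - window + 1, window)]


def run_convolution_task(signal: List[float], kernels: List[List[float]],
                         window: int) -> List[List[float]]:
    """Divide the task by kernel, run each subtask, pool it, and merge the results."""
    merged = []
    for kernel in kernels:  # each iteration stands in for one parallel computing unit
        fused = conv_bn_relu(signal, kernel)
        merged.append(max_pool(fused, window))
    return merged
```

    Fusing the three operations inside `conv_bn_relu` means the intermediate convolution output is never written out and read back, which is the data-transport saving the abstract points to.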

    METHOD, ELECTRONIC DEVICE AND COMPUTER READABLE MEDIUM FOR INFORMATION PROCESSING FOR ACCELERATING NEURAL NETWORK TRAINING

    Publication No.: US20210117776A1

    Publication date: 2021-04-22

    Application No.: US16660259

    Filing date: 2019-10-22

    Applicant: Baidu USA LLC

    IPC classification: G06N3/08 G06N3/04 G06N20/00

    Abstract: A method for information processing for accelerating neural network training. The method includes: acquiring a neural network corresponding to a deep learning task; and performing iterations of iterative training on the neural network based on a training data set. The training data set includes task data corresponding to the deep learning task. The iterative training includes: processing the task data in the training data set using a current neural network, and determining, based on a processing result of the neural network on the task data in a current iterative training, a prediction loss of the current iterative training; determining a learning rate and a momentum in the current iterative training; and updating weight parameters of the current neural network by gradient descent based on a preset weight decay, the learning rate, the momentum, and the prediction loss in the current iterative training. This method achieves efficient and low-cost deep learning-based neural network training.
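
    The weight update named in this abstract can be illustrated with a minimal sketch. It assumes the common SGD-with-momentum form with L2 weight decay; the abstract does not give the exact update rule or say how the learning rate and momentum are chosen per iteration, so `sgd_momentum_step` and its formula are assumptions.

```python
from typing import List, Tuple


def sgd_momentum_step(
    weights: List[float],
    grads: List[float],
    velocity: List[float],
    lr: float,
    momentum: float,
    weight_decay: float,
) -> Tuple[List[float], List[float]]:
    """One gradient-descent update with momentum and L2 weight decay.

    Assumed rule (not taken from the patent):
        g = grad + weight_decay * w
        v = momentum * v + g
        w = w - lr * v
    """
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g_total = g + weight_decay * w      # gradient of loss plus decay term
        v_next = momentum * v + g_total     # accumulate velocity
        new_velocity.append(v_next)
        new_weights.append(w - lr * v_next)
    return new_weights, new_velocity
```

    Because `lr` and `momentum` are passed in on every call, a training loop can supply different values at each iteration, matching the abstract's step of determining both in the current iterative training.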

    Method and apparatus for determining a target object, and human-computer interaction system

    Publication No.: US11087133B2

    Publication date: 2021-08-10

    Application No.: US16528134

    Filing date: 2019-07-31

    Applicant: Baidu USA LLC

    Inventors: Le Kang, Yingze Bao

    IPC classification: G06K9/00 H04W4/35

    Abstract: Embodiments of the present disclosure disclose a method and an apparatus for determining a target object, and a human-computer interaction system. The method according to one embodiment of the present disclosure comprises: in response to detecting a position change of an item, determining a to-be-detected image frame sequence based on a detection moment when the position change is detected; performing human body key point detection on a to-be-detected image frame in the to-be-detected image frame sequence; and determining, based on a detection result of the human body key point detection, a target object that performs a target operation action on the item. This embodiment improves the accuracy of the determined target object.

    Video action segmentation by mixed temporal domain adaption

    Publication No.: US11138441B2

    Publication date: 2021-10-05

    Application No.: US16706590

    Filing date: 2019-12-06

    Applicant: Baidu USA, LLC

    Abstract: Embodiments herein treat action segmentation as a domain adaptation (DA) problem and reduce the domain discrepancy by performing unsupervised DA with auxiliary unlabeled videos. In one or more embodiments, to reduce domain discrepancy in both the spatial and temporal directions, embodiments of a Mixed Temporal Domain Adaptation (MTDA) approach are presented to jointly align frame-level and video-level embedded feature spaces across domains, and, in one or more embodiments, further integrate with a domain attention mechanism to focus on aligning the frame-level features with higher domain discrepancy, leading to more effective domain adaptation. Comprehensive experiment results validate that embodiments outperform previous state-of-the-art methods. Embodiments can adapt models effectively by using auxiliary unlabeled videos, enabling further applications to large-scale problems, such as video surveillance and human activity analysis.

    Systems and methods for simultaneous capture of two or more sets of light images

    Publication No.: US10834341B2

    Publication date: 2020-11-10

    Application No.: US15844174

    Filing date: 2017-12-15

    Applicant: Baidu USA, LLC

    Inventor: Yingze Bao

    IPC classification: H04N5/33 H04N9/04 G02B5/20

    Abstract: Described herein are systems and methods that provide an effective way to simultaneously capture infrared and non-infrared images from a single camera. In embodiments, a filter comprises at least two types of filter elements: (1) an infrared filter type that allows infrared light to pass through the filter element; and (2) at least one non-infrared filter type that allows light in a visible spectrum range or ranges to pass through the filter element. In embodiments, the filter elements form a pattern of the infrared filter elements and the non-infrared filter elements and are positioned relative to a camera's array of sensor cells to form a correspondence between sensor cells and filter elements. In embodiments, signals captured at the camera's sensor cells may be divided to form an infrared image and a visible light image that were captured simultaneously, which images may be used to determine depth information.
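
    Dividing the sensor signals into two images can be sketched as below. The checkerboard layout, the use of `None` for cells not covered by a given filter type, and the names `checkerboard_is_infrared` and `split_mosaic` are all assumptions for illustration; the abstract does not specify the filter pattern or how missing cells are filled.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[Optional[float]]]


def checkerboard_is_infrared(row: int, col: int) -> bool:
    """Assumed pattern: cells with an even (row + col) sit under an infrared filter."""
    return (row + col) % 2 == 0


def split_mosaic(
    raw: List[List[float]],
    is_infrared: Callable[[int, int], bool] = checkerboard_is_infrared,
) -> Tuple[Image, Image]:
    """Split one raw frame into a sparse infrared image and a sparse visible image.

    Cells belonging to the other filter type are left as None; a real pipeline
    would interpolate them from neighboring cells.
    """
    infrared: Image = []
    visible: Image = []
    for r, row in enumerate(raw):
        ir_row: List[Optional[float]] = []
        vis_row: List[Optional[float]] = []
        for c, value in enumerate(row):
            if is_infrared(r, c):
                ir_row.append(value)
                vis_row.append(None)
            else:
                ir_row.append(None)
                vis_row.append(value)
        infrared.append(ir_row)
        visible.append(vis_row)
    return infrared, visible
```

    Both output images come from the same exposure, so they are captured at the same instant by construction.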

    Systems and methods to improve visual feature detection using motion-related data

    Publication No.: US10776652B2

    Publication date: 2020-09-15

    Application No.: US16102642

    Filing date: 2018-08-13

    Applicant: Baidu USA, LLC

    Abstract: Described herein are systems and methods that use motion-related data combined with image data to improve the speed and accuracy of detecting visual features by predicting the locations of features using the motion-related data. In embodiments, given a set of features in a previous image frame and given a next image frame, localization of the same set of features in the next image frame is attempted. In embodiments, motion-related data is used to compute the relative pose transformation between the two image frames, and the image locations of the features may then be transformed to obtain their predicted locations in the next frame. Such a process greatly reduces the search space of the features in the next image frame, and thereby accelerates and improves feature detection.
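
    The location prediction described here can be sketched with a pinhole camera model. The sketch assumes a known depth for each feature, known intrinsics (`fx`, `fy`, `cx`, `cy`), and a relative pose already computed from the motion-related data as a rotation matrix and translation vector; these inputs and the name `predict_pixel` are assumptions, not details from the patent.

```python
from typing import Sequence, Tuple


def predict_pixel(
    pixel: Tuple[float, float],
    depth: float,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
) -> Tuple[float, float]:
    """Predict where a feature from the previous frame appears in the next frame."""
    u, v = pixel
    # Back-project the pixel to a 3D point in the previous camera frame.
    point = ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)
    # Apply the relative pose: p_next = R * p_prev + t.
    moved = [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if moved[2] <= 0:
        raise ValueError("feature falls behind the camera in the next frame")
    # Project back onto the image plane of the next frame.
    return (fx * moved[0] / moved[2] + cx, fy * moved[1] / moved[2] + cy)
```

    A detector would then search only a small window around the returned pixel rather than the whole image, which is the reduction in search space the abstract refers to.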