Hardware-software co-design for accelerating deep learning inference

    Publication No.: US11443173B2

    Publication date: 2022-09-13

    Application No.: US16393247

    Filing date: 2019-04-24

    Applicant: Baidu USA LLC

    IPC classification: G06N3/063 G06N3/04 G06F17/15

    Abstract: Embodiments disclose an artificial intelligence chip and a convolutional neural network applied to the artificial intelligence chip, which comprises a processor, at least one parallel computing unit, and a pooling computation unit. The method comprises: dividing a convolution task into convolution subtasks and corresponding pooling subtasks; executing the convolution subtasks at different parallel computing units, with the convolution, batch normalization, and non-linear computing operations performed in the same parallel computing unit; sending the execution result of each parallel computing unit's convolution subtask to the pooling computation unit, which executes the corresponding pooling subtask; and merging the pooling computation unit's results for the respective convolution subtasks to obtain an execution result of the convolution task. This reduces data transport, so that the operations of the convolutional neural network may be accomplished with lower power consumption and in less time on an edge device.
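
    The split, fuse, pool, and merge flow described in this abstract can be illustrated with a minimal one-dimensional sketch. Everything here is an assumption made for illustration: the function names, the 1-D signal, the folded batch-norm parameters (scale, shift), ReLU as the non-linear operation, max pooling as the pooling operation, and the sequential loop standing in for units that would run in parallel. It is not the patented hardware design.

```python
def conv_bn_relu(signal, kernel, scale, shift):
    """One hypothetical 'parallel computing unit': a valid 1-D convolution
    (cross-correlation, as in CNNs), a folded batch norm (scale, shift),
    and ReLU, all applied before the data leaves the unit."""
    k = len(kernel)
    out = []
    for i in range(len(signal) - k + 1):
        acc = sum(signal[i + j] * kernel[j] for j in range(k))
        out.append(max(0.0, acc * scale + shift))
    return out


def max_pool(values, window):
    """Hypothetical 'pooling computation unit': non-overlapping max pooling."""
    return [max(values[i:i + window])
            for i in range(0, len(values) - window + 1, window)]


def run_split(signal, kernel, scale, shift, window, outputs_per_subtask):
    """Divide the convolution task into subtasks, pool each result, merge."""
    # Subtask boundaries are aligned to pooling windows so that no window
    # straddles two subtasks.
    assert outputs_per_subtask % window == 0
    k = len(kernel)
    total_outputs = len(signal) - k + 1
    merged = []
    for start in range(0, total_outputs, outputs_per_subtask):
        stop = min(start + outputs_per_subtask, total_outputs)
        tile = signal[start:stop + k - 1]        # input slice plus halo
        activations = conv_bn_relu(tile, kernel, scale, shift)
        merged.extend(max_pool(activations, window))  # corresponding pooling subtask
    return merged


signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0]
kernel = [1.0, -1.0, 0.5]
split_result = run_split(signal, kernel, 2.0, 0.1, window=2, outputs_per_subtask=4)
whole_result = max_pool(conv_bn_relu(signal, kernel, 2.0, 0.1), 2)
```

    In this sketch the split result matches the unsplit computation, while only the pooled values of each subtask need to be carried to the merge step.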

    Method, electronic device and computer readable medium for information processing for accelerating neural network training

    Publication No.: US20210117776A1

    Publication date: 2021-04-22

    Application No.: US16660259

    Filing date: 2019-10-22

    Applicant: Baidu USA LLC

    IPC classification: G06N3/08 G06N3/04 G06N20/00

    Abstract: A method for information processing for accelerating neural network training. The method includes: acquiring a neural network corresponding to a deep learning task; and performing multiple iterations of training on the neural network based on a training data set. The training data set includes task data corresponding to the deep learning task. Each training iteration includes: processing the task data in the training data set using the current neural network, and determining, based on the network's processing result for the task data in the current iteration, the prediction loss of the current iteration; determining a learning rate and a momentum for the current iteration; and updating the weight parameters of the current neural network by gradient descent based on a preset weight decay together with the learning rate, the momentum, and the prediction loss of the current iteration. This method achieves efficient and low-cost deep learning-based neural network training.
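
    A per-iteration update of this kind can be sketched as gradient descent with momentum and L2 weight decay. The update rule below is the common textbook form, not necessarily the one claimed in the patent, and the names `sgd_momentum_step` and `schedule`, the constant schedule values, and the toy quadratic loss are all hypothetical.

```python
def sgd_momentum_step(weights, grads, velocity, lr, momentum, weight_decay):
    """One update: weight decay is folded into the gradient, then a
    momentum (heavy-ball) step is taken. Returns new weights and velocity."""
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + weight_decay * w          # preset weight decay (L2 form)
        v = momentum * v - lr * g         # momentum-smoothed step
        new_velocity.append(v)
        new_weights.append(w + v)
    return new_weights, new_velocity


def schedule(step):
    """Hypothetical placeholder: returns (learning rate, momentum) for the
    current iteration. A real schedule would vary these with the step."""
    return 0.1, 0.9


# Toy 'prediction loss' 0.5 * (w - 3)^2, whose gradient is (w - 3).
weights, velocity = [0.0], [0.0]
weight_decay = 1e-4
for step in range(300):
    grads = [weights[0] - 3.0]
    lr, momentum = schedule(step)
    weights, velocity = sgd_momentum_step(
        weights, grads, velocity, lr, momentum, weight_decay)
```

    With weight decay included, the toy problem settles near 3 / (1 + weight_decay) rather than exactly 3, which shows how the decay term pulls weights toward zero.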

    Method and apparatus for determining a target object, and human-computer interaction system

    Publication No.: US11087133B2

    Publication date: 2021-08-10

    Application No.: US16528134

    Filing date: 2019-07-31

    Applicant: Baidu USA LLC

    Inventors: Le Kang, Yingze Bao

    IPC classification: G06K9/00 H04W4/35

    Abstract: Embodiments of the present disclosure disclose a method and an apparatus for determining a target object, and a human-computer interaction system. The method according to one embodiment of the present disclosure comprises: in response to detecting a position change of an item, determining a to-be-detected image frame sequence based on the detection moment at which the position change is detected; performing human body key point detection on a to-be-detected image frame in the to-be-detected image frame sequence; and determining, based on a detection result of the human body key point detection, a target object that performs a target operation action on the item. This embodiment improves the accuracy of the determined target object.
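
    One way to picture these steps is the sketch below. The abstract does not say how the key point result is turned into a decision, so the nearest-wrist rule, the time window, the joint names, and the function names `select_frames` and `pick_target` are all assumptions made for illustration.

```python
import math


def select_frames(timestamps, detection_moment, before=1.0, after=1.0):
    """Indices of frames inside a hypothetical window around the moment
    the item's position change was detected (times in seconds)."""
    return [i for i, t in enumerate(timestamps)
            if detection_moment - before <= t <= detection_moment + after]


def pick_target(item_xy, people, hand_joints=("left_wrist", "right_wrist")):
    """Return the id of the person whose wrist key point lies closest to
    the item. `people` maps a person id to {joint name: (x, y)}."""
    best_id, best_dist = None, math.inf
    for person_id, keypoints in people.items():
        for joint in hand_joints:
            if joint not in keypoints:
                continue  # joint not detected in this frame
            dist = math.dist(item_xy, keypoints[joint])
            if dist < best_dist:
                best_id, best_dist = person_id, dist
    return best_id


frames = select_frames([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0], detection_moment=1.5)
people = {
    "a": {"left_wrist": (100.0, 100.0), "right_wrist": (140.0, 100.0)},
    "b": {"right_wrist": (52.0, 48.0)},
}
target = pick_target((50.0, 50.0), people)
```

    A fuller version would aggregate the decision over every selected frame instead of a single one.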

    Video action segmentation by mixed temporal domain adaption

    Publication No.: US11138441B2

    Publication date: 2021-10-05

    Application No.: US16706590

    Filing date: 2019-12-06

    Applicant: Baidu USA, LLC

    Abstract: Embodiments herein treat action segmentation as a domain adaption (DA) problem and reduce the domain discrepancy by performing unsupervised DA with auxiliary unlabeled videos. In one or more embodiments, to reduce domain discrepancy in both the spatial and temporal directions, embodiments of a Mixed Temporal Domain Adaptation (MTDA) approach are presented to jointly align frame-level and video-level embedded feature spaces across domains and, in one or more embodiments, are further integrated with a domain attention mechanism that focuses on aligning the frame-level features with higher domain discrepancy, leading to more effective domain adaptation. Comprehensive experimental results validate that embodiments outperform previous state-of-the-art methods. Embodiments can adapt models effectively by using auxiliary unlabeled videos, enabling further applications to large-scale problems such as video surveillance and human activity analysis.

    Systems and methods for simultaneous capture of two or more sets of light images

    Publication No.: US10834341B2

    Publication date: 2020-11-10

    Application No.: US15844174

    Filing date: 2017-12-15

    Applicant: Baidu USA, LLC

    Inventor: Yingze Bao

    IPC classification: H04N5/33 H04N9/04 G02B5/20

    Abstract: Described herein are systems and methods that provide an effective way to simultaneously capture infrared and non-infrared images from a single camera. In embodiments, a filter comprises at least two types of filter elements: (1) an infrared filter type that allows infrared light to pass through the filter element; and (2) at least one non-infrared filter type that allows light in a visible spectrum range or ranges to pass through the filter element. In embodiments, the filter elements form a pattern of infrared filter elements and non-infrared filter elements and are positioned relative to a camera's array of sensor cells to form a correspondence between sensor cells and filter elements. In embodiments, signals captured at the camera's sensor cells may be divided to form an infrared image and a visible light image that were captured simultaneously, and these images may be used to determine depth information.
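
    Dividing the sensor signals by filter type can be sketched as follows. The 2x2 tile layout in `PATTERN`, the channel names, and the function `split_mosaic` are hypothetical; the abstract only states that the filter elements form a pattern aligned with the sensor cells. The sketch also assumes each filter type appears once per tile.

```python
# Hypothetical repeating filter tile: one infrared element and three
# visible-light elements per 2x2 block of sensor cells.
PATTERN = (("R", "G"),
           ("IR", "B"))


def split_mosaic(raw, pattern=PATTERN):
    """Split raw sensor readings into one sub-sampled plane per filter type.

    `raw` is a list of rows of sensor values. Each plane keeps only the
    cells that sit under the matching filter element."""
    tile_h, tile_w = len(pattern), len(pattern[0])
    height, width = len(raw), len(raw[0])
    planes = {}
    for py in range(tile_h):
        for px in range(tile_w):
            planes[pattern[py][px]] = [
                [raw[y][x] for x in range(px, width, tile_w)]
                for y in range(py, height, tile_h)
            ]
    return planes


raw = [
    [10, 20, 11, 21],
    [90, 30, 91, 31],
    [12, 22, 13, 23],
    [92, 32, 93, 33],
]
planes = split_mosaic(raw)
infrared_image = planes["IR"]
visible_image = {name: planes[name] for name in ("R", "G", "B")}
```

    Because both images come from the same exposure of the same sensor, they are aligned in time, which is the property the abstract relies on for depth estimation.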

    Systems and methods to improve visual feature detection using motion-related data

    Publication No.: US10776652B2

    Publication date: 2020-09-15

    Application No.: US16102642

    Filing date: 2018-08-13

    Applicant: Baidu USA, LLC

    Abstract: Described herein are systems and methods that combine motion-related data with image data to improve the speed and accuracy of visual feature detection by predicting the locations of features from the motion-related data. In embodiments, given a set of features in a previous image frame and given a next image frame, localization of the same set of features in the next image frame is attempted. In embodiments, motion-related data is used to compute the relative pose transformation between the two image frames, and the image locations of the features may then be transformed to obtain their predicted locations in the next frame. Such a process greatly reduces the search space for the features in the next image frame, and thereby accelerates and improves feature detection.
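
    The prediction step can be sketched for a simplified case. This example assumes a pinhole camera with known intrinsics (fx, fy, cx, cy) and a rotation-only relative pose, as might be integrated from a gyroscope, which is a reasonable approximation only when translation is small relative to scene depth. The function names, the convention that `rotation` maps points from the previous camera frame into the next one, and the fixed search radius are all assumptions.

```python
import math


def predict_pixel(pixel, rotation, fx, fy, cx, cy):
    """Predict where a feature seen at `pixel` in the previous frame
    appears in the next frame, given a 3x3 rotation between the frames."""
    u, v = pixel
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)       # back-project to a ray
    x, y, z = (sum(rotation[r][c] * ray[c] for c in range(3))
               for r in range(3))                    # rotate the ray
    if z <= 0:
        return None                                  # behind the camera
    return (fx * x / z + cx, fy * y / z + cy)        # re-project


def yaw(theta):
    """Rotation about the camera's vertical (y) axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, 0.0, s), (0.0, 1.0, 0.0), (-s, 0.0, c))


def search_window(center, radius):
    """Small box around the prediction to search, instead of the full image."""
    u, v = center
    return (u - radius, v - radius, u + radius, v + radius)


fx = fy = 500.0
cx, cy = 320.0, 240.0
predicted = predict_pixel((320.0, 240.0), yaw(0.01), fx, fy, cx, cy)
window = search_window(predicted, radius=8.0)
```

    Searching only inside `window` rather than across the whole frame is what shrinks the search space in this sketch.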

    Method and device for recognizing product

    Publication No.: US11488384B2

    Publication date: 2022-11-01

    Application No.: US17029978

    Filing date: 2020-09-23

    Applicant: BAIDU USA LLC

    Inventors: Le Kang, Yingze Bao

    IPC classification: G06V20/40 G06Q30/06 G06V10/75

    Abstract: The present disclosure provides a method and a device for recognizing a product, an electronic device, and a non-transitory computer readable storage medium, relating to the field of unmanned retail product recognition. The method includes the following steps. A video taken by each camera in a store is acquired. Recognition is performed on each video to obtain a video segment in which a product delivery is recognized and to obtain the participated users. The participated users include a delivery initiation user and a delivery reception user. The video segment is inputted into a preset delivery recognition model to obtain a recognition result. The recognition result includes the product delivered and a delivery probability. The product information of the products carried by the participated users is updated based on the recognition result.
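
    The final update step can be sketched as simple bookkeeping. The data layout (a `Counter` of carried products per user), the function name `apply_delivery`, and the probability threshold are hypothetical; the abstract does not state how the delivery probability is used when updating product information.

```python
from collections import Counter


def apply_delivery(carried, initiator, receiver, product, probability,
                   threshold=0.5):
    """Move one unit of `product` from the delivery initiation user to the
    delivery reception user when the recognized delivery is likely enough.

    `carried` maps a user id to a Counter of products the user carries.
    Returns True if the records were updated."""
    if probability < threshold:
        return False                      # recognition too uncertain
    if carried.get(initiator, Counter())[product] <= 0:
        return False                      # initiator does not hold the product
    carried[initiator][product] -= 1
    if carried[initiator][product] == 0:
        del carried[initiator][product]   # drop empty entries
    carried.setdefault(receiver, Counter())[product] += 1
    return True


carried = {"alice": Counter({"soda": 1, "chips": 2}), "bob": Counter()}
updated = apply_delivery(carried, "alice", "bob", "soda", probability=0.92)
ignored = apply_delivery(carried, "alice", "bob", "chips", probability=0.20)
```

    After these two calls the soda is recorded against the reception user, while the low-probability chips hand-off leaves the records unchanged.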

    Deep learning model embodiments and training embodiments for faster training

    Publication No.: US11144790B2

    Publication date: 2021-10-12

    Application No.: US16600148

    Filing date: 2019-10-11

    Applicant: Baidu USA, LLC

    Abstract: Presented herein are embodiments for training deep learning models. In one or more embodiments, a compact deep learning model comprises fewer layers, which require fewer floating-point operations (FLOPs). Also presented herein are embodiments of a new learning rate function, which can adaptively change the learning rate between two linear functions. In one or more embodiments, half-precision floating-point training combined with a larger batch size may also be employed to aid the training process.