-
Publication No.: US20210176035A1
Publication date: 2021-06-10
Application No.: US16315998
Filing date: 2019-01-04
Inventors: Yueqiang CHENG, Yong LIU, Tao WEI, Jian OUYANG
Abstract: According to one embodiment, a system receives, at a host system from a data processing (DP) accelerator, an accelerator identifier (ID) that uniquely identifies the DP accelerator, wherein the host system is coupled to the DP accelerator over a bus. The system transmits the accelerator ID to a predetermined trusted server over a network. The system receives a certificate from the predetermined trusted server over the network, the certificate certifying the DP accelerator. The system extracts a public root key (PK_RK) from the certificate for verification, the PK_RK corresponding to a private root key (SK_RK) associated with the DP accelerator. The system establishes a secure channel with the DP accelerator using the PK_RK based on the verification to exchange data securely between the host system and the DP accelerator.
-
Publication No.: US20210173934A1
Publication date: 2021-06-10
Application No.: US16315957
Filing date: 2019-01-04
Inventors: Yong LIU, Yueqiang CHENG, Jian OUYANG, Tao WEI
Abstract: According to one embodiment, a system performs a secure boot using a security module such as a trusted platform module (TPM) of a host system. The system establishes a trusted execution environment (TEE) associated with one or more processors of the host system. The system launches a memory manager within the TEE, where the memory manager is configured to manage memory resources of a data processing (DP) accelerator coupled to the host system over a bus, including maintaining memory usage information of global memory of the DP accelerator. In response to a request received from an application running within the TEE for accessing a memory location of the DP accelerator, the system allows or denies the request based on the memory usage information.
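The allow-or-deny decision described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the memory usage information is a table of allocated address ranges and their owning applications; the class `MemoryManager` and its methods are hypothetical names, not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryManager:
    """Hypothetical tracker of accelerator global-memory usage."""
    total_size: int
    # Maps half-open (start, end) address ranges to the owning application ID.
    allocations: dict = field(default_factory=dict)

    def allocate(self, app_id: str, start: int, size: int) -> bool:
        """Record a new range for app_id; refuse invalid or overlapping ranges."""
        end = start + size
        if size <= 0 or start < 0 or end > self.total_size:
            return False
        for (s, e) in self.allocations:
            if start < e and s < end:  # the two ranges overlap
                return False
        self.allocations[(start, end)] = app_id
        return True

    def check_access(self, app_id: str, address: int) -> bool:
        """Allow access only if the address lies in a range owned by app_id."""
        for (s, e), owner in self.allocations.items():
            if s <= address < e:
                return owner == app_id
        return False  # unallocated memory is denied by default


manager = MemoryManager(total_size=1024)
manager.allocate("app_a", start=0, size=256)
manager.allocate("app_b", start=256, size=256)
```

With these two allocations, `manager.check_access("app_a", 100)` is allowed, while the same address requested by `"app_b"` is denied. Denying unallocated addresses by default is a design choice of this sketch, not something the abstract specifies.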
-
Publication No.: US20210173428A1
Publication date: 2021-06-10
Application No.: US16315924
Filing date: 2019-01-04
Inventors: Yong LIU, Yueqiang CHENG, Jian OUYANG, Tao WEI
Abstract: According to one embodiment, a DP accelerator includes one or more execution units (EUs) configured to perform data processing operations in response to an instruction received from a host system coupled over a bus. The DP accelerator includes a security unit (SU) configured to establish and maintain a secure channel with the host system to exchange commands and data associated with the data processing operations. The DP accelerator includes a time unit (TU) coupled to the security unit to provide timestamp services to the security unit, where the time unit includes a clock generator to generate clock signals locally without having to derive the clock signals from an external source. The TU includes a timestamp generator coupled to the clock generator to generate a timestamp based on the clock signals, and a power supply to provide power to the clock generator and the timestamp generator.
-
Publication No.: US20210163037A1
Publication date: 2021-06-03
Application No.: US16067556
Filing date: 2018-04-18
Inventors: Fan ZHU, Qi KONG, Yuchang PAN, Feiyi JIANG, Xin XU, Xiaoxin FU, Zhongpu XIA, Chunming ZHAO, Liangliang ZHANG, Weicheng ZHU, Li ZHUANG, Haoyang FAN, Hui JIANG, Jiaming TAO
Abstract: In one embodiment, instead of using map data, a relative coordinate system is utilized to assist perception of the driving environment surrounding an ADV for some driving situations. One such driving situation is driving on a highway. Typically, a highway has fewer intersections and exits. The relative coordinate system is utilized based on the relative lane configuration and relative obstacle information to control the ADV to simply follow the lane and avoid potential collision with any obstacles discovered within the road, without having to use map data. Once the relative lane configuration and obstacle information have been determined, regular path and speed planning and optimization can be performed to generate a trajectory to drive the ADV. Such a perception system is referred to as a relative perception system based on a relative coordinate system.
-
Publication No.: US20200218821A1
Publication date: 2020-07-09
Application No.: US16751665
Filing date: 2020-01-24
Inventors: Yong LIU, Yueqiang CHENG, Jian OUYANG, Tao WEI
Abstract: According to one embodiment, a system establishes a secure connection between a host system and a data processing (DP) accelerator over a bus, the secure connection including one or more data channels. The system transmits a first instruction from the host system to the DP accelerator over a command channel, the first instruction requesting the DP accelerator to perform a data preparation operation. The system receives a first request to read a first data from a first memory location of the host system from the DP accelerator over one data channel. In response to the request, the system transmits the first data to the DP accelerator over the data channel, where the first data is utilized for a computation or a configuration operation. The system transmits a second instruction from the host system to the DP accelerator over the command channel to perform the computation or the configuration operation.
-
Publication No.: US10365649B2
Publication date: 2019-07-30
Application No.: US15522218
Filing date: 2017-04-19
Inventors: Fan Zhu, Qi Kong, Qi Luo, Xiang Yu, Sen Hu, Zhenguang Zhu, Xiaoxin Fu, Jiarui He, Hongye Li, Yuchang Pan, Zhongpu Xia, Chunming Zhao, Guang Yang, Jingao Wang
Abstract: In one embodiment, a lane departure detection system detects at a first point in time that a wheel of an ADV rolls onto a lane curb disposed on an edge of a lane in which the ADV is moving. The system detects at a second point in time that the wheel of the ADV rolls off the lane curb of the lane. The system calculates an angle between a moving direction of the ADV and a lane direction of the lane based on the time difference between the first point in time and the second point in time in view of a current speed of the ADV. The system then generates a control command based on the angle to adjust the moving direction of the ADV in order to prevent the ADV from further drifting off the lane direction of the lane.
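One way to picture the angle calculation is the sketch below. The abstract gives no formula, so the geometry here is an assumption: the curb is treated as a strip of known width measured perpendicular to the lane direction, and the wheel is assumed to cross it in a straight line at constant speed. The wheel then travels `speed * dt` while on the curb, and the sine of the departure angle is the curb width divided by that distance. The function name and the `curb_width` parameter are hypothetical.

```python
import math


def departure_angle(t_on: float, t_off: float, speed: float, curb_width: float) -> float:
    """Estimate the angle (radians) between the moving direction and the lane direction.

    Assumes a straight crossing at constant speed over a curb of known width
    (measured perpendicular to the lane); this model is not from the patent.
    """
    dt = t_off - t_on
    if dt <= 0 or speed <= 0 or curb_width <= 0:
        raise ValueError("times must be ordered; speed and curb width must be positive")
    distance = speed * dt  # path length travelled while the wheel is on the curb
    # Clamp to 1.0 so measurement noise cannot push asin outside its domain.
    ratio = min(1.0, curb_width / distance)
    return math.asin(ratio)


# Example: 0.1 s on a 0.2 m wide curb at 20 m/s gives a path of 2 m on the curb.
angle = departure_angle(t_on=0.0, t_off=0.1, speed=20.0, curb_width=0.2)
```

Under this model, a longer time on the curb at the same speed implies a shallower angle, since the wheel needs a longer path to clear the same width.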
-
Publication No.: US20180183873A1
Publication date: 2018-06-28
Application No.: US15115249
Filing date: 2016-07-21
Inventors: Quan Wang, Liming Xia, Jingchao Feng, Ning Qu, James Peng
CPC classification: H04L67/12, G05D1/0088, H04L67/02
Abstract: A first request is received from a first processing node to produce data blocks of a first data stream representing a first communication topic. The first processing node is one of the processing nodes handling a specific function of operating an autonomous vehicle. Each of the processing nodes is executed within a specific node container having a specific operating environment. A global memory segment is allocated from a global memory to store the data blocks of the first data stream. A first local memory segment is mapped to the global memory segment. The first local memory segment is allocated from a first local memory of a first node container containing the first processing node. The first processing node directly accesses the data blocks of the first data stream stored in the global memory segment by accessing the mapped first local memory segment within the first node container.
-
Publication No.: US12039427B2
Publication date: 2024-07-16
Application No.: US16966834
Filing date: 2019-09-24
Inventors: Baopu Li, Yanwen Fan, Zhiyu Cheng, Yingze Bao
Abstract: Deep neural network (DNN) model quantization may be used to reduce storage and computation burdens by decreasing the bit width. Presented herein are novel cursor-based adaptive quantization embodiments. In embodiments, a multiple-bit quantization mechanism is formulated as a differentiable architecture search (DAS) process with a continuous cursor that represents a possible quantization bit. In embodiments, the cursor-based DAS adaptively searches for a quantization bit for each layer. The DAS process may be accelerated via an alternative approximate optimization process, which is designed for a mixed quantization scheme of a DNN model. In embodiments, a new loss function is used in the search process to simultaneously optimize accuracy and parameter size of the model. In a quantization step, the closest two integers to the cursor may be adopted as the bits to quantize the DNN together to reduce the quantization noise and avoid the local convergence problem.
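The final quantization step, using the two integers closest to the cursor, might look roughly like the sketch below. The abstract does not say how the two quantized versions are combined, so the linear blend weighted by the cursor's distance to each integer is an assumption, as is the simple uniform min-max quantizer. The names `quantize_uniform` and `cursor_quantize` are hypothetical.

```python
import math


def quantize_uniform(weights, bits):
    """Uniformly quantize weights onto 2**bits levels spanning [min, max]."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return list(weights)  # a constant tensor needs no quantization
    levels = 2 ** bits - 1
    step = (hi - lo) / levels
    return [lo + round((w - lo) / step) * step for w in weights]


def cursor_quantize(weights, cursor):
    """Blend the quantizations at the two integer bit widths nearest the cursor.

    The blend weights (1 - frac, frac) are an illustrative assumption.
    """
    lower = math.floor(cursor)
    if lower < 1:
        raise ValueError("cursor must be at least 1 bit")
    upper = lower + 1
    frac = cursor - lower  # 0 at the lower integer, approaching 1 at the upper
    q_lower = quantize_uniform(weights, lower)
    q_upper = quantize_uniform(weights, upper)
    return [(1 - frac) * a + frac * b for a, b in zip(q_lower, q_upper)]


layer_weights = [0.0, 0.3, 1.0]
blended = cursor_quantize(layer_weights, cursor=1.5)
```

Because the blend varies continuously with the cursor, the output changes smoothly as the cursor moves between integer bit widths, which is the property a gradient-based search over the cursor would rely on.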
-
Publication No.: US11799651B2
Publication date: 2023-10-24
Application No.: US16315867
Filing date: 2019-01-04
Applicants: Baidu USA LLC, Baidu.com Times Technology (Beijing) Co., Ltd., KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
Inventors: Yong Liu, Yueqiang Cheng, Jian Ouyang, Tao Wei
IPC classification: H04L9/32, G06F7/58, H04L9/30, H04L67/141
CPC classification: H04L9/32, G06F7/588, H04L9/30, H04L9/3265, H04L67/141
Abstract: According to one embodiment, a DP accelerator includes one or more execution units (EUs) configured to perform data processing operations in response to an instruction received from a host system coupled over a bus. The DP accelerator includes a time unit (TU) coupled to the security unit to provide timestamp services. The DP accelerator includes a security unit (SU) configured to establish and maintain a secure channel with the host system to exchange commands and data associated with the data processing operations, where the security unit includes a secure storage area to store a private root key associated with the DP accelerator, where the private root key is utilized for authentication. The SU includes a random number generator to generate a random number, and a cryptographic engine to perform cryptographic operations on data exchanged with the host system over the bus using a session key derived based on the random number.
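As a rough illustration of deriving a session key from a random number, the sketch below uses HMAC-SHA256 from the Python standard library as a stand-in derivation. The abstract does not name a derivation function, so this choice is an assumption. It also assumes both sides already hold a shared secret from the authentication step. The function name, the `label` parameter, and its default value are hypothetical.

```python
import hashlib
import hmac
import secrets


def derive_session_key(shared_secret: bytes, random_number: bytes,
                       label: bytes = b"dp-accel-session") -> bytes:
    """Derive a 32-byte session key by keying HMAC-SHA256 with the shared secret.

    The fresh random number makes each session key different even though
    the shared secret stays the same.
    """
    return hmac.new(shared_secret, label + random_number, hashlib.sha256).digest()


# Placeholder secret; in the patent's setting it would come from authentication.
shared_secret = secrets.token_bytes(32)
random_number = secrets.token_bytes(16)  # stands in for the SU's random number generator
session_key = derive_session_key(shared_secret, random_number)
```

A real design would more likely use a standardized key-derivation function and include context from both parties; the point here is only that a fresh random input produces a fresh key per session.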
-
Publication No.: US11741568B2
Publication date: 2023-08-29
Application No.: US16770063
Filing date: 2018-06-29
Inventors: Haofeng Kou, Kuipeng Wang, Le Kang, Xuejun Wang, Yingze Bao
CPC classification: G06T1/20, G06F1/329, G06F9/4893, G06F9/5038, G06F18/214, G06N3/08
Abstract: Described herein are systems and methods for object detection to achieve hard real-time performance with low latency. Real-time object detection frameworks are disclosed. In one or more embodiments, a framework comprises a first CPU core, a second CPU core, and a plurality of shaves. In one or more embodiments, the first CPU core handles general CPU tasks, while the second CPU core handles the image frames from a camera sensor and computation task scheduling. In one or more embodiments, the scheduled computation tasks are implemented by the plurality of shaves using at least one object-detection model to detect an object in an image frame. In one or more embodiments, computation results from the object-detection model with a higher detection probability are used to form an output for object detection. In one or more embodiments, the object-detection models share some parameters for smaller size and higher implementing speed.
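The selection of results by detection probability can be sketched as below. It assumes that each model reports one candidate detection per frame with a scalar probability; the `Detection` record and `select_output` function are hypothetical names, and scheduling across CPU cores and shaves is not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Detection:
    """Hypothetical per-model result for one image frame."""
    model_name: str
    label: str
    probability: float


def select_output(results: Sequence[Detection]) -> Optional[Detection]:
    """Return the detection with the highest probability, or None if there are none."""
    if not results:
        return None
    return max(results, key=lambda d: d.probability)


frame_results = [
    Detection("model_small", "car", 0.62),
    Detection("model_large", "car", 0.91),
]
chosen = select_output(frame_results)
```

Here `chosen` is the result reported by `model_large`, since its probability is the higher of the two.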
-