METHODS AND SYSTEMS FOR VISUAL RECOGNITION USING TRIPLET LOSS

    Publication No.: US20200234088A1

    Publication Date: 2020-07-23

    Application No.: US16254344

    Filing Date: 2019-01-22

    Abstract: Methods, systems, and computer-readable media storing computer-executable code for visual recognition implementing a triplet loss function are provided. The method includes receiving an image generated from an image source associated with a vehicle. The method may also include analyzing the image based on a convolutional neural network. The convolutional neural network may apply both a triplet loss function and a softmax loss function to the image to determine classification logits. The method may also include classifying the image into a predetermined class distribution based upon the determined classification logits. The method may also include instructing the vehicle to perform a specific task based upon the classified image.
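The combination of a triplet loss and a softmax (cross-entropy) loss described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the patent's implementation; the function names, the margin value, and the weighting factor `alpha` are illustrative assumptions.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss over rows of embedding vectors.

    Pulls anchor toward positive and pushes it from negative by a margin.
    The margin value here is an illustrative choice.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()

def softmax_loss(logits, labels):
    """Cross-entropy over softmax of classification logits."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def combined_loss(anchor, positive, negative, logits, labels, alpha=0.5):
    """Apply both losses, as the abstract describes; alpha is a hypothetical weight."""
    return alpha * triplet_loss(anchor, positive, negative) + softmax_loss(logits, labels)
```

In training, the embedding branch would feed the triplet term while the classification head feeds the softmax term, and the summed objective is backpropagated through the shared convolutional trunk.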

    DRIVING SCENARIO UNDERSTANDING

    Publication No.: US20230154195A1

    Publication Date: 2023-05-18

    Application No.: US17855745

    Filing Date: 2022-06-30

    CPC classification number: G06V20/58 G06V10/764 G06V10/82 G06V10/806

    Abstract: According to one aspect, intersection scenario description may be implemented by receiving a video stream of a surrounding environment of an ego-vehicle, extracting tracklets and appearance features associated with dynamic objects from the surrounding environment, extracting motion features associated with dynamic objects from the surrounding environment based on the corresponding tracklets, passing the appearance features through an appearance neural network to generate an appearance model, passing the motion features through a motion neural network to generate a motion model, passing the appearance model and the motion model through a fusion network to generate a fusion output, passing the fusion output through a classifier to generate a classifier output, and passing the classifier output through a loss function to generate a multi-label classification output associated with the ego-vehicle, dynamic objects, and corresponding motion paths.

    SYSTEMS AND METHODS FOR BIRDS EYE VIEW SEGMENTATION

    Publication No.: US20220414887A1

    Publication Date: 2022-12-29

    Application No.: US17710807

    Filing Date: 2022-03-31

    Abstract: Systems and methods for bird's eye view (BEV) segmentation are provided. In one embodiment, a method includes receiving an input image from an image sensor on an agent. The input image is a perspective space image defined relative to the position and viewing direction of the agent. The method includes extracting features from the input image. The method includes estimating a depth map that includes depth values for the pixels of the input image. The method includes generating a 3D point map including points corresponding to the pixels of the input image. The method includes generating a voxel grid by voxelizing the 3D point map into a plurality of voxels. The method includes generating a feature map by extracting feature vectors for pixels based on the points included in the voxels of the plurality of voxels and generating a BEV segmentation based on the feature map.
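The geometric core of the abstract (depth map to 3D point map to voxel grid on the ground plane) can be sketched with a pinhole back-projection. This is a minimal sketch assuming a standard pinhole camera model; the intrinsics, the cell size, the grid extent, and the use of a simple point count as the per-cell feature are illustrative assumptions, not the patent's method.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Lift each pixel of a depth map to a 3D point in the camera frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

def voxelize_to_bev(points, cell=0.5, extent=10.0):
    """Bin points into a ground-plane (x, z) grid.

    A real system would pool learned feature vectors per voxel; a point
    count per cell stands in as the simplest possible BEV feature.
    """
    n = int(2 * extent / cell)
    grid = np.zeros((n, n))
    ix = ((points[:, 0] + extent) / cell).astype(int)   # lateral bins
    iz = (points[:, 2] / cell).astype(int)              # forward bins
    mask = (ix >= 0) & (ix < n) & (iz >= 0) & (iz < n)
    np.add.at(grid, (iz[mask], ix[mask]), 1.0)
    return grid
```

A segmentation head would then operate on the resulting grid of per-voxel features to produce the BEV segmentation described in the abstract.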

    SYSTEM AND METHOD FOR PROVIDING UNSUPERVISED DOMAIN ADAPTATION FOR SPATIO-TEMPORAL ACTION LOCALIZATION

    Publication No.: US20220215661A1

    Publication Date: 2022-07-07

    Application No.: US17704324

    Filing Date: 2022-03-25

    Abstract: A system and method for providing unsupervised domain adaptation for spatio-temporal action localization that includes receiving video data associated with a source domain and a target domain that are associated with a surrounding environment of a vehicle. The system and method also include analyzing the video data associated with the source domain and the target domain and determining a key frame of the source domain and a key frame of the target domain. The system and method additionally include completing an action localization model to model a temporal context of actions occurring within the key frame of the source domain and the key frame of the target domain and completing an action adaptation model to localize individuals and their actions and to classify the actions based on the video data. The system and method further include combining losses to complete spatio-temporal action localization of individuals and actions.
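The final step of the abstract, combining losses from the localization and adaptation models, can be sketched as a weighted sum, with key-frame selection shown as a simple argmax over per-frame scores. The scoring scheme, function names, and weight `lam` are illustrative assumptions; the patent does not specify these details.

```python
import numpy as np

def pick_key_frame(frame_scores):
    """Select the key frame as the frame with the highest score.

    How scores are produced (e.g., an actionness estimate) is an assumption here.
    """
    return int(np.argmax(frame_scores))

def total_loss(localization_loss, classification_loss, domain_loss, lam=0.1):
    """Combine per-task losses into one training objective, as the abstract's
    final step describes; lam is a hypothetical domain-alignment weight."""
    return localization_loss + classification_loss + lam * domain_loss
```

Minimizing the combined objective jointly trains the localization and classification heads while the weighted domain term encourages source and target features to align.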
