Navigation trajectory using reinforcement learning for an ego vehicle in a navigation network
Abstract:
An ego vehicle includes decider modules and a grader module, each coupled to a resolver module. The decider modules generate trajectory decisions at a current time, generate a current two-dimensional slice of a flat space around the ego vehicle, generate future two-dimensional slices of the flat space by projecting the current two-dimensional slice forward in time, and generate a three-dimensional state space by stacking the current two-dimensional slice and the future two-dimensional slices. The grader module generates rewards for the trajectory decisions based on recent behavior of the ego vehicle. The resolver module selects a final trajectory decision for the ego vehicle from the trajectory decisions based on the three-dimensional state space and the rewards. The current two-dimensional slice includes a current ego vehicle location and current neighboring vehicle locations. The future two-dimensional slices include future ego vehicle locations and future neighboring vehicle locations.
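The stacking of slices and the resolver's selection can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the flat space is a small square occupancy grid, vehicles are projected forward at constant velocity in cells per time step, rewards are supplied as fixed numbers rather than computed from recent ego behavior, and the names `make_slice`, `build_state_space`, and `resolve` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]      # (row, col) in the grid; also used for velocities
Slice = List[List[int]]     # one two-dimensional slice of the flat space

EMPTY, NEIGHBOR, EGO = 0, 1, 2   # assumed cell encoding


def make_slice(size: int, ego: Cell, neighbors: List[Cell]) -> Slice:
    """Build one 2D slice holding the ego and neighboring vehicle locations."""
    grid = [[EMPTY] * size for _ in range(size)]
    r, c = ego
    if 0 <= r < size and 0 <= c < size:
        grid[r][c] = EGO
    # Neighbors are written last so a predicted overlap stays visible as NEIGHBOR.
    for r, c in neighbors:
        if 0 <= r < size and 0 <= c < size:
            grid[r][c] = NEIGHBOR
    return grid


def step(cell: Cell, velocity: Cell) -> Cell:
    """Advance a location by one time step (assumed constant velocity)."""
    return (cell[0] + velocity[0], cell[1] + velocity[1])


def build_state_space(size: int, ego: Cell, ego_velocity: Cell,
                      neighbors: List[Cell], neighbor_velocities: List[Cell],
                      horizon: int) -> List[Slice]:
    """Stack the current slice and `horizon` projected slices into a 3D state space."""
    slices = []
    for _ in range(horizon + 1):
        slices.append(make_slice(size, ego, neighbors))
        ego = step(ego, ego_velocity)
        neighbors = [step(n, v) for n, v in zip(neighbors, neighbor_velocities)]
    return slices


def resolve(decisions: Dict[str, Cell], rewards: Dict[str, float],
            state_space: List[Slice], ego: Cell) -> Optional[str]:
    """Pick the highest-reward decision whose path avoids NEIGHBOR cells."""
    best_name, best_reward = None, float("-inf")
    for name, velocity in decisions.items():
        cell, safe = ego, True
        for future in state_space[1:]:
            cell = step(cell, velocity)
            r, c = cell
            in_bounds = 0 <= r < len(future) and 0 <= c < len(future)
            if not in_bounds or future[r][c] == NEIGHBOR:
                safe = False
                break
        if safe and rewards[name] > best_reward:
            best_name, best_reward = name, rewards[name]
    return best_name


# Ego at (2, 0) heading right; one stationary neighbor ahead at (2, 2).
ego_start = (2, 0)
state_space = build_state_space(5, ego_start, (0, 1), [(2, 2)], [(0, 0)], horizon=2)

decisions = {"keep_lane": (0, 1), "change_left": (-1, 1)}   # hypothetical decisions
rewards = {"keep_lane": 1.0, "change_left": 0.6}            # hypothetical grader output
choice = resolve(decisions, rewards, state_space, ego_start)
print(choice)
```

In this toy setup, `keep_lane` carries the higher reward but reaches the neighbor's cell in the second projected slice, so the sketch falls back to `change_left`. A learned resolver of the kind the title suggests would replace the hand-written collision rule with a trained policy over the same stacked state space.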