Patent search ap:("Daniel Mark GRAVES" OR "Jun JIN" OR "Jun LUO") AND inv:"Daniel Mark GRAVES" Page 1

1.

Invention application
METHODS AND SYSTEMS FOR SUPPORT POLICY LEARNING (Status: granted)

Publication No.: US20210357782A1

Publication date: 2021-11-18

Application No.: US16875741

Filing date: 2020-05-15

Applicant: Daniel Mark GRAVES, Jun JIN, Jun LUO

Inventor: Daniel Mark GRAVES, Jun JIN, Jun LUO

IPC classification: G06N5/04, G06N20/00, G06N3/00

Abstract: Methods and systems are described for support policy learning in an agent of a robot. A general value function (GVF) is learned for a main policy, where the GVF represents future performance of the agent executing the main policy for a given state of the environment. A master policy selects an action based on the predicted accumulated success value received from the general value function. When the predicted accumulated success value is an acceptable value, the action selected by the master policy is execution of the main policy. When the predicted accumulated success value is not an acceptable value, the action selected by the master policy causes a support policy to be learned. The support policy generates a support action to be performed, which causes the robot to transition to a new state where the predicted accumulated success value has an acceptable value.
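The gating step in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the threshold test for "acceptable", the toy one-dimensional state, and all function names (`toy_gvf`, `main_policy`, `support_policy`, `select_action`) are assumptions made for illustration, and the learning of the GVF and of the support policy is omitted.

```python
from typing import Callable, Tuple

def master_policy(predicted_success: float, threshold: float) -> str:
    """Choose 'main' when the GVF prediction is acceptable, else 'support'.

    Assumption: 'acceptable' is modelled as a simple threshold comparison.
    """
    return "main" if predicted_success >= threshold else "support"

def select_action(
    state: float,
    gvf: Callable[[float], float],
    main: Callable[[float], str],
    support: Callable[[float], str],
    threshold: float = 0.5,
) -> Tuple[str, str]:
    """Return (chosen policy name, action produced by that policy)."""
    choice = master_policy(gvf(state), threshold)
    policy = main if choice == "main" else support
    return choice, policy(state)

# Hypothetical toy problem: the state is a lateral offset from a target.
def toy_gvf(state: float) -> float:
    # Predicted success drops as the offset grows (illustrative only).
    return max(0.0, 1.0 - abs(state))

def main_policy(state: float) -> str:
    return "forward"

def support_policy(state: float) -> str:
    # Moves the robot back toward states where the prediction is acceptable.
    return "recenter"

near = select_action(0.1, toy_gvf, main_policy, support_policy)
far = select_action(0.9, toy_gvf, main_policy, support_policy)
```

In this toy setup a small offset keeps the main policy in control, while a large offset hands control to the support policy until the predicted success value is acceptable again.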

2.

Invention application
METHOD AND SYSTEM FOR PREDICTIVE CONTROL OF VEHICLE USING DIGITAL IMAGES (Status: granted)

Publication No.: US20210004006A1

Publication date: 2021-01-07

Application No.: US16921523

Filing date: 2020-07-06

Applicant: Daniel Mark GRAVES

Inventor: Daniel Mark GRAVES

IPC classification: G05D1/02, B60W30/095, B60W50/00

Abstract: Methods and systems for predictive control of an autonomous vehicle are described. Predictions of lane centeredness and road angle are generated based on data collected by sensors on the autonomous vehicle and are combined to determine a state of the vehicle, which is then used to generate vehicle actions for steering control and speed control of the autonomous vehicle.

3.

Invention application
METHOD AND SYSTEM FOR CONTROLLING SAFETY OF EGO AND SOCIAL OBJECTS (Status: under examination, published)

Publication No.: US20200276988A1

Publication date: 2020-09-03

Application No.: US16803386

Filing date: 2020-02-27

Applicant: Daniel Mark GRAVES

Inventor: Daniel Mark GRAVES

IPC classification: B60W60/00, G06N3/08

Abstract: A method or system for controlling safety of both an ego vehicle and social objects in an environment of the ego vehicle, comprising: receiving data representative of at least one social object and determining a current state of the ego vehicle based on sensor data; predicting, based on the current state, an ego safety value corresponding to the ego vehicle for each possible behavior action in a set of possible behavior actions; predicting, based on the current state, a social safety value corresponding to the at least one social object in the environment of the ego vehicle for each possible behavior action; and selecting a next behavior action for the ego vehicle based on the ego safety values, the social safety values, and one or more target objectives for the ego vehicle.
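The selection step can be sketched in Python as follows. The abstract does not say how the safety values and target objectives are combined, so the rule used here (keep actions whose ego and social safety values both meet a minimum, pick the one with the best objective score, otherwise fall back to the safest action) is an assumption, as are the function name `select_behavior`, the thresholds, and the example values.

```python
from typing import Dict, List

def select_behavior(
    actions: List[str],
    ego_safety: Dict[str, float],
    social_safety: Dict[str, float],
    objective: Dict[str, float],
    min_ego: float = 0.8,
    min_social: float = 0.8,
) -> str:
    """Pick the next behavior action from per-action predictions.

    Assumed rule: among actions that meet both safety minimums, maximize
    the objective score; if none qualify, maximize the worse of the two
    safety values.
    """
    safe = [
        a for a in actions
        if ego_safety[a] >= min_ego and social_safety[a] >= min_social
    ]
    if safe:
        return max(safe, key=lambda a: objective[a])
    return max(actions, key=lambda a: min(ego_safety[a], social_safety[a]))

# Hypothetical predictions for three behavior actions.
actions = ["keep_lane", "change_left", "slow_down"]
ego = {"keep_lane": 0.90, "change_left": 0.95, "slow_down": 0.99}
social = {"keep_lane": 0.85, "change_left": 0.40, "slow_down": 0.97}
goal = {"keep_lane": 1.0, "change_left": 1.2, "slow_down": 0.5}

chosen = select_behavior(actions, ego, social, goal)
```

Here `change_left` scores best on the objective but is excluded because its predicted social safety value is too low, so `keep_lane` is selected.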

4.

Invention application
SYSTEMS AND METHODS FOR LEARNING REUSABLE OPTIONS TO TRANSFER KNOWLEDGE BETWEEN TASKS (Status: granted)

Publication No.: US20210387330A1

Publication date: 2021-12-16

Application No.: US16900291

Filing date: 2020-06-12

Applicant: Borislav MAVRIN, Daniel Mark GRAVES

Inventor: Borislav MAVRIN, Daniel Mark GRAVES

IPC classification: B25J9/16, G06N3/08

Abstract: A robot includes an RL agent that is configured to learn a policy that maximizes the cumulative reward of a task and to determine one or more features that are minimally correlated with each other. The features are then used as pseudo-rewards, called feature rewards, where each feature reward corresponds to an option policy, or skill, that the RL agent learns to maximize. In an example, the RL agent is configured to select the most relevant features from which to learn respective option policies. The RL agent is configured to, for each of the selected features, learn the respective option policy that maximizes the respective feature reward. Using the learned option policies, the RL agent is configured to learn a new (second) policy for a new (second) task that can choose from any of the learned option policies or actions available to the RL agent.
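One way to picture "features that are minimally correlated with each other" is a greedy selection over recorded feature values, sketched below in Python. The abstract does not specify how the features are determined, so the greedy rule, the use of Pearson correlation, the function names (`pearson`, `select_features`), and the sample data are all assumptions for illustration. Learning the option policies themselves is not shown.

```python
import math
from typing import Dict, List, Sequence

def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def select_features(features: Dict[str, List[float]], k: int) -> List[str]:
    """Greedily pick up to k feature names with low mutual correlation.

    Assumed rule: start from the first feature, then repeatedly add the
    feature whose largest absolute correlation with the chosen set is
    smallest.
    """
    names = list(features)
    chosen = [names[0]]
    while len(chosen) < min(k, len(names)):
        rest = [n for n in names if n not in chosen]
        best = min(
            rest,
            key=lambda n: max(
                abs(pearson(features[n], features[c])) for c in chosen
            ),
        )
        chosen.append(best)
    return chosen

# Hypothetical feature traces collected while running the task policy.
traces = {
    "x": [1.0, 2.0, 3.0, 4.0],
    "x_scaled": [2.0, 4.0, 6.0, 8.0],  # redundant with "x"
    "y": [1.0, -1.0, -1.0, 1.0],       # uncorrelated with "x"
}
selected = select_features(traces, 2)
```

Each selected feature would then serve as a feature reward for its own option policy; in this toy data the redundant `x_scaled` trace is skipped in favour of `y`.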