-
Publication number: US20230143937A1
Publication date: 2023-05-11
Application number: US17523553
Filing date: 2021-11-10
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Akifumi Wachi , Songtao Lu
CPC classification number: B60W60/0016 , B60W60/0011 , G06N3/0481 , G06N3/0454 , G06K9/6297
Abstract: Methods and systems for training a model and automated motion include learning Markov decision processes using reinforcement learning in respective training environments. Logic rules are extracted from the Markov decision processes. A reward logic neural network (LNN) and a safety LNN are trained using the logic rules extracted from the Markov decision processes. The reward LNN and the safety LNN each take a state-action pair as an input and output a corresponding score for the state-action pair.
-
Publication number: US20230113168A1
Publication date: 2023-04-13
Application number: US17499815
Filing date: 2021-10-12
Inventor: Songtao Lu , Lior Horesh , Pin-Yu Chen , Sijia Liu , Tianyi Chen
Abstract: A reinforcement learning system includes a plurality of agents, each agent having an individual reward function and one or more safety constraints that involve joint actions of the agents, wherein each agent maximizes a team-average long-term return in performing the joint actions, subject to the safety constraints, and participates in operating a physical system. A peer-to-peer communication network is configured to connect the plurality of agents. A distributed constrained Markov decision process (D-CMDP) model is implemented over the peer-to-peer communication network and is configured to perform policy optimization using a decentralized policy gradient (PG) method, wherein the participation of each agent in operating the physical system is based on the D-CMDP model.
-
Publication number: US20220261626A1
Publication date: 2022-08-18
Application number: US17170343
Filing date: 2021-02-08
Applicant: International Business Machines Corporation
Inventor: Sijia Liu , Gaoyuan ZHANG , Pin-Yu Chen , Chuang Gan , Songtao Lu
Abstract: Scalable distributed adversarial training techniques for robust deep neural networks are provided. In one aspect, a method for adversarial training of a deep neural network-based model by distributed computing machines M includes, by distributed computing machines M: obtaining adversarial perturbation-modified training examples for samples in a local dataset D(i); computing gradients of a local cost function fi with respect to parameters θ of the deep neural network-based model using the adversarial perturbation-modified training examples; transmitting the gradients of the local cost function fi to a server which aggregates the gradients of the local cost function fi and transmits an aggregated gradient to the distributed computing machines M; and updating the parameters θ of the deep neural network-based model stored at each of the distributed computing machines M based on the aggregated gradient received from the server. A method for distributed adversarial training of a deep neural network-based model by the server is also provided.
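The loop described in this abstract (perturb local samples, compute local gradients, aggregate at a server, update every copy of θ) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the patented method: the logistic-regression model, the FGSM-style sign perturbation, plain gradient averaging, and all function names, data, and hyperparameters are hypothetical choices that the abstract does not specify.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    return (v > 0) - (v < 0)


def predict(theta, x):
    # Logistic-regression model (an assumption; the abstract covers deep networks).
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def perturb(theta, x, y, eps):
    # FGSM-style step (an assumption; the abstract does not name an attack).
    # For the logistic loss, d(loss)/dx = (p - y) * theta.
    p = predict(theta, x)
    return [xi + eps * sign((p - y) * t) for xi, t in zip(x, theta)]


def local_gradient(theta, dataset, eps):
    # Gradient of the local cost f_i w.r.t. theta on perturbed samples of D(i).
    grad = [0.0] * len(theta)
    for x, y in dataset:
        x_adv = perturb(theta, x, y, eps)
        p = predict(theta, x_adv)
        for j, xj in enumerate(x_adv):
            grad[j] += (p - y) * xj / len(dataset)
    return grad


def aggregate(gradients):
    # Server side: plain averaging (an assumed aggregation rule).
    n = len(gradients)
    return [sum(g[j] for g in gradients) / n for j in range(len(gradients[0]))]


def train(local_datasets, dim, steps=200, lr=0.5, eps=0.1):
    # A single theta stands in for the identical copies held by each machine.
    theta = [0.0] * dim
    for _ in range(steps):
        grads = [local_gradient(theta, d, eps) for d in local_datasets]
        agg = aggregate(grads)
        theta = [t - lr * g for t, g in zip(theta, agg)]
    return theta


# Hypothetical toy data: two machines, two labelled samples each.
local_datasets = [
    [([2.0, 0.5], 1), ([-2.0, -0.5], 0)],
    [([1.5, 1.0], 1), ([-1.5, -1.0], 0)],
]
theta = train(local_datasets, dim=2)
```

Because every machine applies the same aggregated gradient, the local copies of θ stay synchronized, which is why the sketch keeps only one copy.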
-
Publication number: US20250005324A1
Publication date: 2025-01-02
Application number: US18217081
Filing date: 2023-06-30
Inventor: Siliang Zeng , Songtao Lu , Xiaodong Cui , Mark S. Squillante , Lior Horesh , Brian E. D. Kingsbury , Mingyi Hong
IPC: G06N3/045
Abstract: A computer-implemented method of decentralized multi-agent learning for use in a system having a plurality of intelligent agents each having a personal portion and a shared portion, is provided. The method includes iteratively, until each of a personal goal and a network goal are optimized: determining a feedback associated with an action relative to a personal goal and a degree of similarity relative to a shared goal; adjusting a policy based on the feedback to gain a superior feedback from a next action; broadcasting the shared policy; receiving the at least one of the one or more other intelligent agents' shared policy; generating a combined policy by combining the personal policy and the at least one of the one or more other intelligent agents' shared policy; estimating, using the combined policy, a network value function; and conducting the next action in accordance with the combined policy.
-
Publication number: US20250124314A1
Publication date: 2025-04-17
Application number: US18488239
Filing date: 2023-10-17
Applicant: International Business Machines Corporation
Inventor: Songtao Lu , Tian GAO
Abstract: Systems/techniques that facilitate meta causal learning over multiple directed acyclic graphs (DAG) are provided. In various embodiments, a system can structurally decompose multiple DAGs of different domains into a shared DAG with private DAGs for each respective domain. In various aspects, the system can formulate the DAG causal structure learning as a functional constrained bilevel optimization problem. In various instances, the system can implement a bilevel primal dual method that extracts the shared DAG structure while learning the individual DAG model for personalization.
-
Publication number: US20250068635A1
Publication date: 2025-02-27
Application number: US18453127
Filing date: 2023-08-21
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Hui Wan , Xiaodong Cui , Songtao Lu , Marina Danilevsky Hailpern
IPC: G06F16/2457
Abstract: A method, computer system, and computer program product are provided for context-aware relevancy modelling in conversational systems. A user query q is received. A latent static content d is selected from a corpus of content D. A latent set of context C is also selected from a set of external contexts Cu. A result is generated using a scoring function, the latent static content d, and the latent set of context C, so as to provide a most relevant context-based search response to the user query q. A response to the user query q is then generated based on that result.
-
Publication number: US20240095515A1
Publication date: 2024-03-21
Application number: US17943839
Filing date: 2022-09-13
Applicant: International Business Machines Corporation
Inventor: Songtao Lu , Xiaodong Cui , Mark S. Squillante , Brian E.D. Kingsbury , Lior Horesh
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Decentralized bilevel optimization techniques for personalized learning over a heterogeneous network are provided. In one aspect, a decentralized learning system includes: a distributed machine learning network with multiple nodes, and datasets associated with the nodes; and a bilevel learning structure at each of the nodes for optimizing one or more features from each of the datasets using a decentralized bilevel optimization solver, while maintaining distinct features from each of the datasets. A method for decentralized learning is also provided.
-
Publication number: US20220253714A1
Publication date: 2022-08-11
Application number: US17157077
Filing date: 2021-01-25
Inventor: Pin-Yu Chen , Chia-Yi Hsu , Songtao Lu , Sijia Liu , Chuang Gan , Chia-Mu Yu
Abstract: A trained machine learning model and a training dataset used to train the trained machine learning model can be received. Based on the training dataset, unsupervised adversarial examples can be generated. Robustness of the trained machine learning model can be determined using the generated unsupervised adversarial examples. The training dataset can be augmented with the generated unsupervised adversarial examples. The trained machine learning model can be retrained using the augmented training dataset.