Patent search ap:("INTERNATIONAL BUSINESS MACHINES CORPORATION") AND inv:"Songtao Lu" Page 1

1.

Invention publication
REINFORCEMENT LEARNING WITH INDUCTIVE LOGIC PROGRAMMING [Under examination, published]

Publication (announcement) number: US20230143937A1

Publication (announcement) date: 2023-05-11

Application number: US17523553

Application date: 2021-11-10

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Akifumi Wachi, Songtao Lu

IPC: B60W60/00, G06N3/04, G06K9/62

CPC classification number: B60W60/0016, B60W60/0011, G06N3/0481, G06N3/0454, G06K9/6297

Abstract: Methods and systems for training a model and automated motion include learning Markov decision processes using reinforcement learning in respective training environments. Logic rules are extracted from the Markov decision processes. A reward logic neural network (LNN) and a safety LNN are trained using the logic rules extracted from the Markov decision processes. The reward LNN and the safety LNN each take a state-action pair as an input and output a corresponding score for the state-action pair.

2.

Invention application
DECENTRALIZED POLICY GRADIENT DESCENT AND ASCENT FOR SAFE MULTI-AGENT REINFORCEMENT LEARNING [In force]

Publication (announcement) number: US20230113168A1

Publication (announcement) date: 2023-04-13

Application number: US17499815

Application date: 2021-10-12

Applicant: International Business Machines Corporation, RENSSELAER POLYTECHNIC INSTITUTE

Inventor: Songtao Lu, Lior Horesh, Pin-Yu Chen, Sijia Liu, Tianyi Chen

IPC: G06N20/00, G05B13/04, B60W50/00

Abstract: A reinforcement learning system includes a plurality of agents, each agent having an individual reward function and one or more safety constraints that involve joint actions of the agents, wherein each agent maximizes a team-average long-term return in performing the joint actions, subject to the safety constraints, and participates in operating a physical system. A peer-to-peer communication network is configured to connect the plurality of agents. A distributed constrained Markov decision process (D-CMDP) model is implemented over the peer-to-peer communication network and is configured to perform policy optimization using a decentralized policy gradient (PG) method, wherein the participation of each agent in operating the physical system is based on the D-CMDP model.

3.

Invention application
Distributed Adversarial Training for Robust Deep Neural Networks [In force]

Publication (announcement) number: US20220261626A1

Publication (announcement) date: 2022-08-18

Application number: US17170343

Application date: 2021-02-08

Applicant: International Business Machines Corporation

Inventor: Sijia Liu, Gaoyuan ZHANG, Pin-Yu Chen, Chuang Gan, Songtao Lu

IPC: G06N3/08, G06N3/04

Abstract: Scalable distributed adversarial training techniques for robust deep neural networks are provided. In one aspect, a method for adversarial training of a deep neural network-based model by distributed computing machines M includes, by distributed computing machines M: obtaining adversarial perturbation-modified training examples for samples in a local dataset D(i); computing gradients of a local cost function fi with respect to parameters θ of the deep neural network-based model using the adversarial perturbation-modified training examples; transmitting the gradients of the local cost function fi to a server which aggregates the gradients of the local cost function fi and transmits an aggregated gradient to the distributed computing machines M; and updating the parameters θ of the deep neural network-based model stored at each of the distributed computing machines M based on the aggregated gradient received from the server. A method for distributed adversarial training of a deep neural network-based model by the server is also provided.
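
The worker/server loop described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: a logistic-regression model stands in for the deep neural network, the adversarial perturbations are crafted with a single FGSM (sign-of-gradient) step, the server aggregates by plain averaging, and the function names (`local_gradient`, `server_aggregate`, `distributed_adversarial_training`, `make_toy_datasets`) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_gradient(theta, X, y, eps):
    """One worker: perturb its local samples, then differentiate the local cost f_i."""
    # Gradient of the logistic loss w.r.t. each input sample is (p - y) * theta.
    p = sigmoid(X @ theta)
    grad_x = (p - y)[:, None] * theta[None, :]
    # Assumed attack: one FGSM step of size eps (the abstract does not name the attack).
    X_adv = X + eps * np.sign(grad_x)
    p_adv = sigmoid(X_adv @ theta)
    # Gradient of the local cost w.r.t. theta, evaluated on the perturbed examples.
    return X_adv.T @ (p_adv - y) / len(y)

def server_aggregate(grads):
    """Server: aggregate worker gradients (assumed here to be a simple average)."""
    return np.mean(grads, axis=0)

def distributed_adversarial_training(datasets, dim, eps=0.1, lr=0.5, rounds=200):
    # Each worker stores its own copy of theta; identical updates keep the copies in sync.
    thetas = [np.zeros(dim) for _ in datasets]
    for _ in range(rounds):
        grads = [local_gradient(th, X, y, eps) for th, (X, y) in zip(thetas, datasets)]
        g = server_aggregate(grads)              # server sends the aggregate back
        thetas = [th - lr * g for th in thetas]  # every worker applies the same update
    return thetas

def make_toy_datasets(num_workers=4, n=50, seed=0):
    """Hypothetical toy data: each worker holds its own linearly separable 2-D set."""
    rng = np.random.default_rng(seed)
    w_true = np.array([2.0, -1.0])
    datasets = []
    for _ in range(num_workers):
        X = rng.normal(size=(n, 2))
        y = (X @ w_true > 0).astype(float)
        datasets.append((X, y))
    return datasets
```

Calling `distributed_adversarial_training(make_toy_datasets(), dim=2)` returns one parameter vector per worker. Because every worker applies the same aggregated gradient, the copies stay identical, which is the property the final updating step in the abstract relies on.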

4.

Invention application
BILEVEL DECENTRALIZED MULTI-AGENT LEARNING [In force]

Publication (announcement) number: US20250005324A1

Publication (announcement) date: 2025-01-02

Application number: US18217081

Application date: 2023-06-30

Applicant: International Business Machines Corporation, Regents of the University of Minnesota

Inventor: Siliang Zeng, Songtao Lu, Xiaodong Cui, Mark S. Squillante, Lior Horesh, Brian E. D. Kingsbury, Mingyi Hong

IPC: G06N3/045

Abstract: A computer-implemented method of decentralized multi-agent learning for use in a system having a plurality of intelligent agents each having a personal portion and a shared portion, is provided. The method includes iteratively, until each of a personal goal and a network goal are optimized: determining a feedback associated with an action relative to a personal goal and a degree of similarity relative to a shared goal; adjusting a policy based on the feedback to gain a superior feedback from a next action; broadcasting the shared policy; receiving the at least one of the one or more other intelligent agents' shared policy; generating a combined policy by combining the personal policy and the at least one of the one or more other intelligent agents' shared policy; estimating, using the combined policy, a network value function; and conducting the next action in accordance with the combined policy.

5.

Invention application
META CAUSAL LEARNING OVER MULTIPLE DIRECTED ACYCLIC GRAPHS [In force]

Publication (announcement) number: US20250124314A1

Publication (announcement) date: 2025-04-17

Application number: US18488239

Application date: 2023-10-17

Applicant: International Business Machines Corporation

Inventor: Songtao Lu, Tian GAO

IPC: G06N7/01, G06N20/00

Abstract: Systems/techniques that facilitate meta causal learning over multiple directed acyclic graphs (DAG) are provided. In various embodiments, a system can structurally decompose multiple DAGs of different domains into a shared DAG with private DAGs for each respective domain. In various aspects, the system can formulate the DAG causal structure learning as a functional constrained bilevel optimization problem. In various instances, the system can implement a bilevel primal dual method that extracts the shared DAG structure while learning the individual DAG model for personalization.

6.

Invention application
CONTEXT-AWARE RELEVANCE MODELING IN CONVERSATIONAL SYSTEMS [In force]

Publication (announcement) number: US20250068635A1

Publication (announcement) date: 2025-02-27

Application number: US18453127

Application date: 2023-08-21

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Hui Wan, Xiaodong Cui, Songtao Lu, Marina Danilevsky Hailpern

IPC: G06F16/2457

Abstract: A method, a computer system, and a computer program product are provided for context-aware relevance modeling in conversational systems. A user query q is received. A latent static content d is selected from a corpus of content D. A latent set of context C is also selected from a set of external contexts CU. A result is generated using a scoring function, the latent static content d, and the latent set of context C, so as to provide the most relevant context-based search response to the user query q. A response to the user query q is then generated based on that result.
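
The selection step in this abstract (choose a content item d and a context subset C, then score them against the query q) can be sketched as a brute-force search. This is only an illustration under stated assumptions: the abstract does not define the scoring function, so the token-overlap `score` below is hypothetical, as are the helper names `tokens` and `best_response` and the `max_context` limit on the size of C.

```python
from itertools import combinations

def tokens(text):
    """Lower-cased whitespace tokens of a string, as a set."""
    return set(text.lower().split())

def score(query, content, context):
    """Hypothetical scoring function: how many content tokens are covered
    by the query once it is enriched with the chosen context snippets."""
    enriched = tokens(query).union(*(tokens(c) for c in context))
    return len(enriched & tokens(content))

def best_response(query, corpus, external_context, max_context=2):
    """Enumerate every (d, C) pair and return the highest-scoring triple
    (score, d, C). Ties go to the first pair enumerated."""
    candidates = []
    for d in corpus:
        for k in range(max_context + 1):
            for C in combinations(external_context, k):
                candidates.append((score(query, d, C), d, C))
    return max(candidates, key=lambda t: t[0])

# Usage with made-up data: the context snippet about the router
# is what separates the two otherwise equally relevant documents.
corpus = ["reset your router password", "reset your email password"]
external_context = ["user owns a router", "user is on mobile"]
result = best_response("how do I reset my password", corpus, external_context)
```

Exhaustive enumeration is exponential in the size of the context set, so it is only meant to make the roles of d, C, and the scoring function concrete. A practical system would need a learned scorer and a cheaper way to choose C.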

7.

Invention publication
Bilevel Optimization Based Decentralized Framework for Personalized Client Learning [Under examination, published]

Publication (announcement) number: US20240095515A1

Publication (announcement) date: 2024-03-21

Application number: US17943839

Application date: 2022-09-13

Applicant: International Business Machines Corporation

Inventor: Songtao Lu, Xiaodong Cui, Mark S. Squillante, Brian E.D. Kingsbury, Lior Horesh

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: Decentralized bilevel optimization techniques for personalized learning over a heterogeneous network are provided. In one aspect, a decentralized learning system includes: a distributed machine learning network with multiple nodes, and datasets associated with the nodes; and a bilevel learning structure at each of the nodes for optimizing one or more features from each of the datasets using a decentralized bilevel optimization solver, while maintaining distinct features from each of the datasets. A method for decentralized learning is also provided.

8.

Invention application
GENERATING UNSUPERVISED ADVERSARIAL EXAMPLES FOR MACHINE LEARNING [In force]

Publication (announcement) number: US20220253714A1

Publication (announcement) date: 2022-08-11

Application number: US17157077

Application date: 2021-01-25

Applicant: International Business Machines Corporation, National Chung Hsing University

Inventor: Pin-Yu Chen, Chia-Yi Hsu, Songtao Lu, Sijia Liu, Chuang Gan, Chia-Mu Yu

IPC: G06N3/08, G06N3/04

Abstract: A trained machine learning model and a training dataset used to train the trained machine learning model can be received. Based on the training dataset, unsupervised adversarial examples can be generated. Robustness of the trained machine learning model can be determined using the generated unsupervised adversarial examples. The training dataset can be augmented with the generated unsupervised adversarial examples. The trained machine learning model can be retrained using the augmented training dataset.
