Training a question-answer dialog system to avoid adversarial attacks
Abstract:
A method, computer program product, and/or computer system protects a question-answer dialog system from being attacked by adversarial statements that incorrectly answer a question. A computing device accesses a plurality of adversarial statements that are capable of making an adversarial attack on a question-answer dialog system, which is trained to provide a correct answer to a specific type of question. The computing device utilizes the plurality of adversarial statements to train a machine learning model for the question-answer dialog system. The computing device then reinforces the trained machine learning model by bootstrapping adversarial policies that identify multiple types of adversarial statements onto the trained machine learning model. The computing device then utilizes the trained and bootstrapped machine learning model to avoid adversarial attacks when responding to questions submitted to the question-answer dialog system.
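The three stages described in the abstract (training on adversarial statements, layering adversarial policies onto the trained model, and using the result when answering) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the source: the perceptron-style bag-of-words scorer, the names `StatementFilter`, `add_policy`, `override_phrase_policy`, `appeal_to_consensus_policy`, and `answer`, and the toy training data. In particular, the rule-based policy layer is only a simplified stand-in for the bootstrapping step, whose actual mechanism the abstract does not specify.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

Policy = Callable[[str], bool]  # hypothetical: returns True if a statement looks adversarial


def tokenize(text: str) -> List[str]:
    return text.lower().split()


class StatementFilter:
    """Toy detector: a perceptron over word counts plus rule-based policies."""

    def __init__(self) -> None:
        self.weights: Counter = Counter()
        self.policies: List[Policy] = []

    def _model_flags(self, text: str) -> bool:
        # A positive total weight means the learned model considers the text adversarial.
        return sum(self.weights[tok] for tok in tokenize(text)) > 0

    def train(self, examples: List[Tuple[str, bool]], epochs: int = 5) -> None:
        # Stage 1: fit the model on labelled statements (True = adversarial).
        for _ in range(epochs):
            for text, is_adv in examples:
                if self._model_flags(text) != is_adv:
                    step = 1 if is_adv else -1
                    for tok in tokenize(text):
                        self.weights[tok] += step

    def add_policy(self, policy: Policy) -> None:
        # Stage 2 (stand-in for bootstrapping): attach a policy that
        # recognizes one type of adversarial statement.
        self.policies.append(policy)

    def is_adversarial(self, text: str) -> bool:
        return self._model_flags(text) or any(p(text) for p in self.policies)


def override_phrase_policy(text: str) -> bool:
    # Hypothetical policy: statements that try to redirect the system.
    return "ignore the question" in text.lower()


def appeal_to_consensus_policy(text: str) -> bool:
    # Hypothetical policy: unsupported appeals to common knowledge.
    return text.lower().startswith("everyone knows")


def answer(flt: StatementFilter, candidates: List[str]) -> Optional[str]:
    # Stage 3: return the first candidate statement that is not flagged.
    for candidate in candidates:
        if not flt.is_adversarial(candidate):
            return candidate
    return None


# Toy data, invented for this sketch.
training_data = [
    ("the capital of france is paris", False),
    ("the capital of france is berlin", True),
    ("water boils at 100 degrees celsius", False),
    ("water boils at 10 degrees celsius", True),
]

qa_filter = StatementFilter()
qa_filter.train(training_data)
qa_filter.add_policy(override_phrase_policy)
qa_filter.add_policy(appeal_to_consensus_policy)

best = answer(
    qa_filter,
    ["the capital of france is berlin", "the capital of france is paris"],
)
```

In this sketch the learned model catches statements resembling its training data, while the attached policies cover statement types the model never saw, which is one plausible reading of why the abstract describes reinforcing the trained model rather than relying on training alone.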