Publication Number: US20240428783A1
Publication Date: 2024-12-26
Application Number: US18341412
Filing Date: 2023-06-26
Applicant: Amazon Technologies, Inc.
Inventor: Rahul Gupta, Charith Peris, Palash Goyal, Lisa Bauer, Ninareh Mehrabi
Abstract: Systems and techniques for moderating responses of a generative language model are described herein. Some user inputs to a generative language model may include biases, misinformation, and other references to moderated content. To prevent the generative language model from generating responses that promote these forms of moderated content, the techniques described determine a policy corresponding to the determined moderated content category of the user input. The determined policy may correspond to a template of instructions for how the generative language model is to respond to such moderated content. The output of the generative language model may also be moderated before being presented to the user.
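
The flow summarized in the abstract (categorize the user input, select the policy template for that category, prompt the model, then moderate the output) can be illustrated with a minimal Python sketch. This is not the claimed implementation: the keyword-based `classify` function, the category names, the template wording, the `generate` callable, and the fallback message are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Optional

# Hypothetical keyword lists standing in for a real moderated-content classifier.
CATEGORY_KEYWORDS = {
    "misinformation": ("hoax", "fake cure"),
    "bias": ("stereotype", "inferior"),
}

# Hypothetical policy templates: one set of response instructions per category.
POLICY_TEMPLATES = {
    "misinformation": (
        "Correct the false claim politely and do not repeat it as fact.\n"
        "User: {user_input}"
    ),
    "bias": (
        "Respond neutrally and do not endorse the biased premise.\n"
        "User: {user_input}"
    ),
}

# Hypothetical message returned when the generated output itself is flagged.
FALLBACK_RESPONSE = "I can't help with that request."


def classify(text: str) -> Optional[str]:
    """Return the first moderated-content category matched in text, or None."""
    lowered = text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return None


def build_prompt(user_input: str) -> str:
    """Wrap the input in the policy template for its category, if it has one."""
    category = classify(user_input)
    if category is None:
        return user_input  # no moderated content detected: pass through unchanged
    return POLICY_TEMPLATES[category].format(user_input=user_input)


def moderated_reply(user_input: str, generate: Callable[[str], str]) -> str:
    """Generate a response under the selected policy, then moderate the output."""
    response = generate(build_prompt(user_input))
    if classify(response) is not None:
        return FALLBACK_RESPONSE  # output-side moderation before presentation
    return response
```

Here `generate` is any function that maps a prompt string to a model response, so the sketch can be exercised with a stub in place of a real generative language model. A production system would presumably replace the keyword lookup with a trained classifier on both the input and output sides.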