-
Publication No.: US20250111220A1
Publication Date: 2025-04-03
Application No.: US18374905
Filing Date: 2023-09-29
Applicant: Amazon Technologies, Inc.
IPC: G06N3/08, G06N3/0475
Abstract: Generative pre-trained large language models (LLMs) can create domain-specific text answers in various formats like JSON, XML, HTML, SQL, or programming languages. However, LLMs may “hallucinate,” generating incorrect or nonsensical answers that diverge from reality, thus eroding trust in their outputs or worse. Disclosed techniques use a sampling-based approach and an equivalence checker. Multiple answers (samples) to a prompt are generated by the LLM; if they are equivalent, the LLM is likely answering correctly. If the samples disagree or contradict, it's more likely that the LLM is hallucinating, or the prompt is ambiguous. An automated reasoning equivalence checker is utilized to verify the samples' functional equivalency, providing a method to detect and possibly rectify hallucination issues in LLM-generated answers.
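The sampling-and-equivalence idea in this abstract can be illustrated with a minimal sketch. It assumes the answers are JSON text and uses parsed-value equality as a stand-in for the automated reasoning equivalence checker, which the abstract does not specify. The names `json_equivalent` and `likely_hallucinating` are hypothetical, and the samples are passed in as plain strings rather than drawn from a real LLM.

```python
import json
from typing import Callable, List


def json_equivalent(a: str, b: str) -> bool:
    """Stand-in equivalence check (an assumption, not the patented checker):
    two JSON texts are equivalent if they parse to equal values."""
    try:
        return json.loads(a) == json.loads(b)
    except json.JSONDecodeError:
        # An unparsable sample cannot be shown equivalent to anything.
        return False


def likely_hallucinating(
    samples: List[str],
    equivalent: Callable[[str, str], bool] = json_equivalent,
) -> bool:
    """Return True when the samples disagree, i.e. at least one sample
    is not equivalent to the first one."""
    if not samples:
        raise ValueError("at least one sample is required")
    first = samples[0]
    return not all(equivalent(first, other) for other in samples[1:])


# Same content, different key order and whitespace: treated as agreement.
agreeing = ['{"a": 1, "b": [1, 2]}', '{"b":[1,2],"a":1}']
# Different values for the same key: treated as disagreement.
conflicting = ['{"a": 1}', '{"a": 2}']

print(likely_hallucinating(agreeing))
print(likely_hallucinating(conflicting))
```

For formats such as SQL or program code, textual or parse-level comparison would not be enough, which is presumably why the abstract calls for a functional equivalence checker; the `equivalent` parameter is left pluggable for that reason.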
-
Publication No.: US20240202545A1
Publication Date: 2024-06-20
Application No.: US18066881
Filing Date: 2022-12-15
Applicant: Amazon Technologies, Inc.
Inventor: Kevin LOTZ, Bruno DUTERTRE, John Byron COOK, Amit GOEL, Robert JONES, Benjamin KIESL-REITER, Soon Ho KONG, Rupak MAJUMDAR
Abstract: Techniques are described for providing a SAT-based solver for a quantifier-free theory of strings and bit vectors. The solver can be used by an automated reasoning service of a cloud provider network to analyze policies and the consequences of policies. The solver reduces an input formula to a Boolean satisfiability problem by encoding the input formula into an equisatisfiable propositional formula, where the satisfiability of the equisatisfiable propositional formula is determined by a SAT solver. Rather than using a traditional DPLL(T) style algorithm, the solver described herein bounds the length of variables in an input formula and reduces the problem to a single formula, which can then be solved using incremental SAT solving. The solver can be used independently or as part of a portfolio of solvers used to determine the satisfiability or unsatisfiability of certain formulas corresponding, e.g., to questions about users' policies within a cloud provider network.
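The bounded-length reduction described in this abstract can be illustrated with a toy sketch. It is not the patented solver. It assumes a two-letter alphabet {a, b}, one Boolean variable per character, and a hypothetical constraint ("x contains the substring ab and ends with a"). A brute-force search stands in for a real SAT solver, and re-encoding at each larger bound stands in for incremental SAT solving.

```python
from itertools import product
from typing import List, Optional, Tuple

Clause = List[int]  # DIMACS-style literals: +v means variable v is true, -v false


def encode(n: int) -> Tuple[int, List[Clause]]:
    """Encode 'x has length n, contains "ab", and ends with "a"' as CNF.

    Variables 1..n: variable i+1 is true iff x[i] == 'b'.
    Variables n+1..2n-1: auxiliary t_i, true only if "ab" starts at index i.
    """
    clauses: List[Clause] = [[-n]]           # last character is 'a'
    starts: Clause = []
    for i in range(n - 1):
        t = n + 1 + i
        starts.append(t)
        clauses.append([-t, -(i + 1)])       # t_i implies x[i] == 'a'
        clauses.append([-t, i + 2])          # t_i implies x[i+1] == 'b'
    clauses.append(starts)                   # some t_i holds (empty clause if n == 1)
    return 2 * n - 1, clauses


def brute_force_sat(num_vars: int, clauses: List[Clause]) -> Optional[List[bool]]:
    """Tiny stand-in for a SAT solver: try every assignment."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return list(bits)
    return None


def solve_with_growing_bound(max_bound: int) -> Optional[str]:
    """Raise the length bound until the propositional formula is satisfiable."""
    for n in range(1, max_bound + 1):
        num_vars, clauses = encode(n)
        model = brute_force_sat(num_vars, clauses)
        if model is not None:
            # Decode the first n variables back into a string.
            return "".join("b" if model[i] else "a" for i in range(n))
    return None


print(solve_with_growing_bound(5))
```

The auxiliary variables only need the one-directional implications shown, which is enough for the propositional formula to be equisatisfiable with the bounded string constraint.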
-