ACCELERATING INFERENCING IN GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

    Publication No.: US20250021761A1

    Publication Date: 2025-01-16

    Application No.: US18545804

    Application Date: 2023-12-19

    Abstract: Techniques and apparatus are provided for generating a response to a query input into a generative artificial intelligence model. An example method generally includes generating, based on an input query and a first generative artificial intelligence model, a sequence of tokens corresponding to a candidate response to the input query. The sequence of tokens and the input query are output to a second generative artificial intelligence model for verification. One or more first guidance signals for the generated sequence of tokens are received from the second generative artificial intelligence model. The candidate response is then revised based on the generated sequence of tokens and the one or more first guidance signals, and the revised candidate response is output as the response to the input query.
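
The draft, verify, and revise flow described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the abstract does not say what a guidance signal contains, so this sketch assumes one in the style of speculative decoding (a count of accepted draft tokens plus an optional replacement token), and both models are toy stand-ins with hard-coded outputs rather than real generative models.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Guidance:
    """Hypothetical guidance signal (its real form is not given in the abstract)."""
    accepted: int              # number of leading draft tokens the verifier accepts
    correction: Optional[str]  # token the verifier proposes at the first rejected position


def draft_model(query: str) -> List[str]:
    """Toy stand-in for the first (drafting) generative model."""
    return ["the", "capital", "is", "Lyon"]


def verifier_model(query: str, tokens: Sequence[str]) -> Guidance:
    """Toy stand-in for the second (verifying) generative model."""
    reference = ["the", "capital", "is", "Paris"]  # assumed verifier output
    accepted = 0
    for got, want in zip(tokens, reference):
        if got != want:
            break
        accepted += 1
    # Propose a replacement only if the verifier has a token for that position.
    correction = reference[accepted] if accepted < len(reference) else None
    return Guidance(accepted=accepted, correction=correction)


def revise(tokens: Sequence[str], guidance: Guidance) -> List[str]:
    """Keep the accepted prefix and append the verifier's correction, if any."""
    revised = list(tokens[: guidance.accepted])
    if guidance.correction is not None:
        revised.append(guidance.correction)
    return revised


def respond(query: str) -> str:
    tokens = draft_model(query)                # 1. generate candidate tokens
    guidance = verifier_model(query, tokens)   # 2. send query and tokens for verification
    return " ".join(revise(tokens, guidance))  # 3. revise and output the response


print(respond("What is the capital of France?"))
```

In this sketch the verifier accepts the first three draft tokens and replaces the fourth, so the drafting model's work is reused wherever the two models agree. A real system would likely repeat the loop until the response is complete; the abstract's reference to "first" guidance signals suggests further rounds, but that is an inference rather than something the abstract states.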
