Speculative decoding in autoregressive generative artificial intelligence models

    Publication No.: US12229192B2

    Publication date: 2025-02-18

    Application No.: US18538965

    Filing date: 2023-12-13

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
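The abstract does not spell out the selection rule, so the following is only a minimal sketch of one way "recursive adjustment of a target distribution" could work, reduced to a single token position. It assumes the multi-candidate rejection-sampling scheme commonly used in speculative decoding: each candidate is accepted with probability min(1, q/p), and after each rejection the target q is replaced by the normalized residual max(q - p, 0). The function names `select_token` and `normalize`, and the list-of-probabilities representation, are hypothetical and not taken from the patent.

```python
import random


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def select_token(candidates, draft_probs, target_probs, rng=random):
    """Pick one token from the draft candidates, else sample the adjusted target.

    candidates   -- token ids proposed by the draft (first) model
    draft_probs  -- draft model probability for every token id
    target_probs -- target (second) model probability for every token id
    Returns (token_id, accepted_from_candidates).
    """
    target = list(target_probs)
    for token in candidates:
        p, q = draft_probs[token], target[token]
        # Accept the candidate with probability min(1, q / p).
        if p > 0 and rng.random() < min(1.0, q / p):
            return token, True
        # Rejected: adjust the target to the normalized residual max(q - p, 0).
        residual = [max(t - d, 0.0) for t, d in zip(target, draft_probs)]
        if sum(residual) == 0.0:
            break  # guard against a degenerate residual; keep current target
        target = normalize(residual)
    # No candidate was accepted: sample from the adjusted target distribution.
    token = rng.choices(range(len(target)), weights=target, k=1)[0]
    return token, False
```

When the draft and target distributions agree, the first candidate is always accepted; when they disagree completely, every candidate is rejected and the output is drawn from the residual. The claimed method operates on whole sets of tokens, so a fuller version would apply a step like this along each candidate sequence.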
