ADAPTIVE TOKEN DEPTH ADJUSTMENT IN TRANSFORMER NEURAL NETWORKS

    Publication No.: US20230186077A1

    Publication date: 2023-06-15

    Application No.: US17841577

    Filing date: 2022-06-15

    CPC classification numbers: G06N3/08, G06N3/0481

Abstract: One embodiment of the present invention sets forth a technique for executing a transformer neural network. The technique includes computing a first set of halting scores for a first set of tokens that has been input into a first layer of the transformer neural network. The technique also includes determining that a first halting score included in the first set of halting scores exceeds a threshold value. The technique further includes, in response to the first halting score exceeding the threshold value, causing a first token that is included in the first set of tokens and is associated with the first halting score not to be processed by one or more layers within the transformer neural network that are subsequent to the first layer.
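
The control flow described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the function name `run_with_halting`, the representation of a token as a list of floats, the toy layer, and the toy halting-score function are all hypothetical, and the abstract does not say how halting scores are computed. The sketch only shows tokens being dropped from later layers once their halting score exceeds a threshold.

```python
from typing import Callable, List, Sequence, Tuple

Token = List[float]  # assumption: a token is a plain list of floats


def run_with_halting(
    tokens: Sequence[Token],
    layers: Sequence[Callable[[Token], Token]],
    halting_score: Callable[[Token], float],
    threshold: float,
) -> Tuple[List[Token], List[int]]:
    """Apply layers in order; a token whose halting score exceeds
    the threshold is not processed by any subsequent layer."""
    state = [list(t) for t in tokens]
    active = [True] * len(state)
    depth = [0] * len(state)  # layers actually applied to each token

    for layer in layers:
        for i in range(len(state)):
            if not active[i]:
                continue  # halted earlier: skip this and all later layers
            state[i] = layer(state[i])
            depth[i] += 1
            # Assumption: the score is computed from the layer output.
            if halting_score(state[i]) > threshold:
                active[i] = False
    return state, depth


# Toy setup (hypothetical): each layer adds 1.0 to every component,
# and the halting score grows with the first component.
def toy_layer(t: Token) -> Token:
    return [x + 1.0 for x in t]


def toy_score(t: Token) -> float:
    return min(1.0, t[0] / 4.0)


final_state, depths = run_with_halting(
    tokens=[[0.0], [2.0]],
    layers=[toy_layer] * 4,
    halting_score=toy_score,
    threshold=0.6,
)
print(depths)  # number of layers each token passed through
```

In this toy run the second token crosses the threshold after the first layer and skips the remaining three, so the per-token depth adapts to the input rather than being fixed by the network.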
