SPARSE ATTENTION NEURAL NETWORKS
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a network input to generate a network output. In one aspect, one of the systems includes a neural network configured to perform the machine learning task, the neural network including one or more sparse attention layers.
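As an illustration only, a sparse attention layer can be sketched as ordinary scaled dot-product attention in which each query position attends to a restricted subset of key positions. The abstract does not specify the sparsity pattern, so the sliding-window mask below is an assumption, and the names `local_window_mask` and `sparse_attention` are hypothetical helpers rather than anything defined by the patent.

```python
import numpy as np

def local_window_mask(seq_len, window):
    """Boolean mask: position i may attend to j only if |i - j| <= window.

    Assumed pattern for illustration; the source does not name one.
    """
    idx = np.arange(seq_len)
    return np.abs(idx[:, None] - idx[None, :]) <= window

def sparse_attention(q, k, v, mask):
    """Scaled dot-product attention restricted to positions where mask is True.

    q, k, v: arrays of shape (seq_len, d). mask: (seq_len, seq_len) booleans.
    Every row of mask must contain at least one True entry.
    Returns the attended values and the attention weights.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Masked-out positions get -inf so they receive zero weight after softmax.
    scores = np.where(mask, scores, -np.inf)
    # Subtract the row maximum for numerical stability before exponentiating.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Usage: 6 positions, feature size 4, each position sees itself and one neighbor per side.
rng = np.random.default_rng(0)
q = rng.normal(size=(6, 4))
k = rng.normal(size=(6, 4))
v = rng.normal(size=(6, 4))
mask = local_window_mask(6, window=1)
out, weights = sparse_attention(q, k, v, mask)
```

This dense-mask version still computes the full score matrix, so it shows only the attention pattern; a practical sparse layer would avoid computing the masked entries at all.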