UNIVERSAL TRANSFORMERS
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a sequence-to-sequence model that is recurrent in depth while employing self-attention to combine information from different parts of sequences.
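As a rough illustration of what "recurrent in depth" can mean, the sketch below applies one shared self-attention block to every position of a sequence for a fixed number of steps, so depth comes from reusing the same weights rather than from stacking distinct layers. Everything beyond that idea is an assumption made for the example and is not taken from the abstract: the single attention head, the residual connections, the parameter-free layer normalization, the two-matrix ReLU transition, the fixed step count, and all names (`init_params`, `recurrent_in_depth_encode`, `num_steps`).

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def layer_norm(x, eps=1e-6):
    # Parameter-free normalization over the feature axis (illustrative choice).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def self_attention(h, wq, wk, wv):
    # Single-head scaled dot-product attention: every position reads from
    # every other position of the same sequence.
    q, k, v = h @ wq, h @ wk, h @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def init_params(d_model, seed=0):
    # Hypothetical helper: ONE set of weights, reused at every depth step.
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(d_model)
    names = ("wq", "wk", "wv", "w1", "w2")
    return {n: rng.normal(0.0, scale, (d_model, d_model)) for n in names}


def recurrent_in_depth_encode(x, params, num_steps):
    # x has shape (seq_len, d_model). The loop runs over depth, not over
    # sequence positions, and the same parameters are used in each iteration.
    h = x
    for _ in range(num_steps):
        a = self_attention(h, params["wq"], params["wk"], params["wv"])
        h = layer_norm(h + a)  # residual connection + normalization
        t = np.maximum(h @ params["w1"], 0.0) @ params["w2"]  # ReLU transition
        h = layer_norm(h + t)
    return h


if __name__ == "__main__":
    params = init_params(d_model=8)
    x = np.random.default_rng(1).normal(size=(5, 8))  # 5 positions, 8 features
    print(recurrent_in_depth_encode(x, params, num_steps=4).shape)
```

Because the weights are shared and this sketch adds no per-step signal, running two steps gives the same result as running one step twice. A fuller model would typically add position and step information and might vary the number of steps per position; neither is shown here.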