Publication number: US20150058573A1
Publication date: 2015-02-26
Application number: US13971811
Filing date: 2013-08-20
Applicant: Apple Inc.
Inventor: James S. Blomgren, Terence M. Potter, Timothy A. Olson, Andrew M. Havlir
CPC classification number: G06F12/0875, G06F9/30043, G06F9/30138, G06F9/30145, G06F9/30185
Abstract: Instructions may require one or more operands to be executed, which may be provided from a register file. In the context of a GPU, however, a register file may be a relatively large structure, and reading from the register file may be energy and/or time intensive. An operand cache may be used to store a subset of operands, and may use less power and have quicker access times than the register file. Selectors (e.g., multiplexers) may be used to read operands from the operand cache. Power savings may be achieved in some embodiments by activating only a subset of the selectors, which may be done by activators (e.g., flip-flops). Operands may also be concurrently provided to two or more locations via forwarding, which may be accomplished via a source selection unit in some embodiments. Operand forwarding may also reduce power and/or speed execution in one or more embodiments.
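Illustrative sketch (not part of the patent record): the behavior summarized in the abstract can be modeled in a few lines of Python. This is a toy software model under stated assumptions, not the claimed hardware design. The class name `OperandCache`, the function `execute`, the LRU replacement policy, the write-through update of the register file, and the entry and selector counts are all hypothetical choices made only for illustration.

```python
from collections import OrderedDict
from operator import add, mul


class OperandCache:
    """Toy model of a small operand cache in front of a large register file.

    Assumptions (not from the source): LRU replacement, write-through
    updates, 4 entries, and 3 read selectors (multiplexers).
    """

    def __init__(self, register_file, num_entries=4, num_selectors=3):
        self.register_file = register_file
        self.num_entries = num_entries
        self.num_selectors = num_selectors
        self.entries = OrderedDict()      # register index -> cached value
        self.register_file_reads = 0      # expensive accesses
        self.selector_activations = 0     # selectors actually switched on

    def _fill(self, reg):
        """Ensure `reg` is cached, reading the register file only on a miss."""
        if reg in self.entries:
            self.entries.move_to_end(reg)
            return
        self.register_file_reads += 1
        if len(self.entries) >= self.num_entries:
            self.entries.popitem(last=False)  # evict least recently used
        self.entries[reg] = self.register_file[reg]

    def read_operands(self, regs):
        """Read operands, activating one selector per requested operand."""
        if len(regs) > self.num_selectors:
            raise ValueError("more operands requested than selectors available")
        values = []
        for reg in regs:
            self._fill(reg)
            values.append(self.entries[reg])
        # Only len(regs) of the num_selectors selectors are activated.
        self.selector_activations += len(regs)
        return values

    def write(self, reg, value):
        """Store a result in the cache and (write-through) the register file."""
        self.register_file[reg] = value
        if reg not in self.entries and len(self.entries) >= self.num_entries:
            self.entries.popitem(last=False)
        self.entries[reg] = value
        self.entries.move_to_end(reg)


def execute(cache, op, dest, srcs, forwarded=None):
    """Run one instruction; source selection prefers forwarded values."""
    forwarded = forwarded or {}
    from_cache = [r for r in srcs if r not in forwarded]
    cached = dict(zip(from_cache, cache.read_operands(from_cache)))
    operands = [forwarded[r] if r in forwarded else cached[r] for r in srcs]
    result = op(*operands)
    cache.write(dest, result)   # the result goes to the operand cache...
    return {dest: result}       # ...and is also forwarded to the next instruction


def demo():
    register_file = list(range(100, 132))   # r0..r31 hold 100..131
    cache = OperandCache(register_file)
    fwd = execute(cache, add, dest=3, srcs=[1, 2])                 # r3 = r1 + r2
    execute(cache, mul, dest=4, srcs=[3, 1], forwarded=fwd)        # r4 = r3 * r1
    return register_file, cache
```

In `demo()`, the second instruction takes `r3` from the forwarded result and `r1` from the operand cache, so it activates a single selector and performs no register file read. The counters `register_file_reads` and `selector_activations` stand in for the energy costs the abstract refers to; real hardware would account for these very differently.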