INSTRUCTION SOURCE SPECIFICATION
    1.
    Patent application

    Publication number: US20160350113A1

    Publication date: 2016-12-01

    Application number: US15233496

    Filing date: 2016-08-10

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to specification of instruction operands. In some embodiments, this may involve assigning operands to source inputs. In one embodiment, an instruction includes one or more mapping values, each of which corresponds to a source of the instruction and each of which specifies a location value. In this embodiment, the instruction includes one or more location values that are each usable to identify an operand for the instruction. In this embodiment, a method may include accessing operands using the location values and assigning accessed operands to sources using the mapping values. In one embodiment, the sources may correspond to inputs of an execution block. In one embodiment, a destination mapping value in the instruction may specify a location value that indicates a destination for storing an instruction result.
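The operand-assignment scheme in this abstract can be illustrated with a minimal Python sketch. The encoding is an assumption made for illustration: `Instruction`, `locations`, `source_map`, and `dest_map` are hypothetical names, registers are modelled as a dictionary, and each mapping value is treated as an index into the instruction's list of location values.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Instruction:
    # Hypothetical encoding: none of these field names come from the patent.
    op: Callable[..., int]   # stands in for the execution block
    locations: List[str]     # location values, each identifying an operand
    source_map: List[int]    # one mapping value per source input
    dest_map: int            # destination mapping value


def execute(instr: Instruction, regs: Dict[str, int]) -> None:
    # Access each referenced operand once, using the location values.
    operands = {m: regs[instr.locations[m]] for m in set(instr.source_map)}
    # Assign the accessed operands to source inputs using the mapping values.
    sources = [operands[m] for m in instr.source_map]
    # The destination mapping value selects the location that receives the result.
    regs[instr.locations[instr.dest_map]] = instr.op(*sources)


regs = {"r0": 3, "r1": 4, "r2": 0}
# Computes r2 = r0 * r0 + r1: location 0 is read once but feeds two sources.
fma = Instruction(op=lambda a, b, c: a * b + c,
                  locations=["r0", "r1", "r2"],
                  source_map=[0, 0, 1],
                  dest_map=2)
execute(fma, regs)
print(regs["r2"])  # 13
```

In this sketch, separating location values from mapping values lets one operand feed several source inputs without being named or read twice.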

    Texture state cache
    2.
    Granted patent

    Publication number: US09811875B2

    Publication date: 2017-11-07

    Application number: US14482828

    Filing date: 2014-09-10

    Applicant: Apple Inc.

    CPC classification number: G06T1/60 G06T15/04

    Abstract: Techniques are disclosed relating to a cache configured to store state information for texture mapping. In one embodiment, a texture state cache includes a plurality of entries configured to store state information relating to one or more stored textures. In this embodiment, the texture state cache also includes texture processing circuitry configured to retrieve state information for one of the stored textures from one of the entries in the texture state cache and determine pixel attributes based on the texture and the retrieved state information. The state information may include texture state information and sampler state information, in some embodiments. The texture state cache may allow for reduced rendering times and power consumption, in some embodiments.
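A rough software analogy of the texture state cache, written as a Python sketch. All names (`State`, `TextureStateCache`, `sample`), the state fields, and the oldest-first eviction policy are assumptions made for illustration; the abstract names only the two kinds of state and does not specify a replacement policy.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class State:
    # Hypothetical fields; the abstract only names the two kinds of state.
    texture_state: dict   # e.g. width and height
    sampler_state: dict   # e.g. wrap mode


class TextureStateCache:
    def __init__(self, num_entries, backing_store):
        self.num_entries = num_entries
        self.backing_store = backing_store   # stands in for state held in memory
        self.entries = OrderedDict()         # texture id -> State
        self.misses = 0

    def lookup(self, tex_id):
        if tex_id not in self.entries:
            self.misses += 1
            if len(self.entries) >= self.num_entries:
                self.entries.popitem(last=False)   # assumed policy: evict oldest entry
            self.entries[tex_id] = self.backing_store[tex_id]
        return self.entries[tex_id]


def sample(cache, textures, tex_id, u, v):
    """Determine a pixel attribute from the texture and its cached state."""
    state = cache.lookup(tex_id)
    w, h = state.texture_state["width"], state.texture_state["height"]
    if state.sampler_state["wrap"] == "repeat":
        u, v = u % w, v % h
    else:  # clamp to edge
        u, v = min(max(u, 0), w - 1), min(max(v, 0), h - 1)
    return textures[tex_id][v][u]


textures = {7: [[10, 20], [30, 40]]}
store = {7: State({"width": 2, "height": 2}, {"wrap": "repeat"})}
cache = TextureStateCache(num_entries=4, backing_store=store)
print(sample(cache, textures, 7, 3, 0), sample(cache, textures, 7, 0, 1), cache.misses)  # 20 30 1
```

The second sample reuses the cached entry, so the state is fetched from the backing store only once.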

    Clock routing techniques
    3.
    Granted patent
    Legal status: in force

    Publication number: US09594395B2

    Publication date: 2017-03-14

    Application number: US14160179

    Filing date: 2014-01-21

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to clock routing techniques in processors with both pipelined and non-pipelined circuitry. In some embodiments, an apparatus includes execution units that are non-pipelined and configured to perform instructions without receiving a clock signal. In these embodiments, one or more clock lines routed throughout the apparatus do not extend into the one or more execution units in each pipeline, reducing the length of the clock lines. In some embodiments, the apparatus includes multiple such pipelines arranged in an array, with the execution units located on an outer portion of the array and clocked control circuitry located on an inner portion of the array. In some embodiments, clock lines do not extend into the outer portion of the array. In some embodiments, the array includes one or more rows of execution units. These arrangements may further reduce the length of clock lines.

    Instruction source specification
    4.
    Granted patent
    Legal status: in force

    Publication number: US09442730B2

    Publication date: 2016-09-13

    Application number: US13956291

    Filing date: 2013-07-31

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to specification of instruction operands. In some embodiments, this may involve assigning operands to source inputs. In one embodiment, an instruction includes one or more mapping values, each of which corresponds to a source of the instruction and each of which specifies a location value. In this embodiment, the instruction includes one or more location values that are each usable to identify an operand for the instruction. In this embodiment, a method may include accessing operands using the location values and assigning accessed operands to sources using the mapping values. In one embodiment, the sources may correspond to inputs of an execution block. In one embodiment, a destination mapping value in the instruction may specify a location value that indicates a destination for storing an instruction result.

    Extended multiply
    5.
    Granted patent
    Legal status: in force

    Publication number: US09417843B2

    Publication date: 2016-08-16

    Application number: US13971753

    Filing date: 2013-08-20

    Applicant: Apple Inc.

    CPC classification number: G06F7/525

    Abstract: Techniques are disclosed relating to performing extended multiplies without a carry flag. In one embodiment, an apparatus includes a multiply unit configured to perform multiplications of operands having a particular width. In this embodiment, the apparatus also includes multiple storage elements configured to store operands for the multiply unit. In this embodiment, each of the storage elements is configured to provide a portion of a stored operand that is less than an entirety of the stored operand in response to a control signal from the apparatus. In one embodiment, the apparatus is configured to perform a multiplication of given first and second operands having a width greater than the particular width by performing a sequence of multiply operations using the multiply unit, using portions of the stored operands and without using a carry flag between any of the sequence of multiply operations.
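To make the idea concrete, here is a Python sketch of a carry-flag-free wide multiply. It uses the well-known half-word decomposition, swapped in for illustration because the abstract does not give the actual operation sequence. The widths are assumptions: a multiplier taking 16-bit portions of 32-bit stored operands, with 32-bit registers. `fits32` and `mul32x32` are hypothetical names.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def fits32(x):
    # Every intermediate must fit in a 32-bit register, so no carry-out is ever produced.
    assert 0 <= x <= MASK32
    return x


def mul32x32(u, v):
    """Return (hi, lo) 32-bit words of the 64-bit product of two 32-bit operands."""
    u0, u1 = u & MASK16, u >> 16   # low/high portions of the first operand
    v0, v1 = v & MASK16, v >> 16   # low/high portions of the second operand
    w0 = fits32(u0 * v0)
    t = fits32(u1 * v0 + (w0 >> 16))
    w1 = fits32(u0 * v1 + (t & MASK16))
    hi = fits32(u1 * v1 + (t >> 16) + (w1 >> 16))
    lo = fits32(((w1 << 16) & MASK32) + (w0 & MASK16))
    return hi, lo


hi, lo = mul32x32(0xFFFFFFFF, 0xFFFFFFFF)
print(hex(hi), hex(lo))  # 0xfffffffe 0x1
```

Because each step adds a 16-bit-by-16-bit product to at most 16-bit leftovers, every sum stays below 2^32, which is why this particular decomposition never needs a carry bit between steps.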

    TEXTURE STATE CACHE
    6.
    Patent application
    Legal status: in force

    Publication number: US20160071232A1

    Publication date: 2016-03-10

    Application number: US14482828

    Filing date: 2014-09-10

    Applicant: Apple Inc.

    CPC classification number: G06T1/60 G06T15/04

    Abstract: Techniques are disclosed relating to a cache configured to store state information for texture mapping. In one embodiment, a texture state cache includes a plurality of entries configured to store state information relating to one or more stored textures. In this embodiment, the texture state cache also includes texture processing circuitry configured to retrieve state information for one of the stored textures from one of the entries in the texture state cache and determine pixel attributes based on the texture and the retrieved state information. The state information may include texture state information and sampler state information, in some embodiments. The texture state cache may allow for reduced rendering times and power consumption, in some embodiments.

    CLOCK ROUTING TECHNIQUES
    7.
    Patent application
    Legal status: in force

    Publication number: US20150205324A1

    Publication date: 2015-07-23

    Application number: US14160179

    Filing date: 2014-01-21

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to clock routing techniques in processors with both pipelined and non-pipelined circuitry. In some embodiments, an apparatus includes execution units that are non-pipelined and configured to perform instructions without receiving a clock signal. In these embodiments, one or more clock lines routed throughout the apparatus do not extend into the one or more execution units in each pipeline, reducing the length of the clock lines. In some embodiments, the apparatus includes multiple such pipelines arranged in an array, with the execution units located on an outer portion of the array and clocked control circuitry located on an inner portion of the array. In some embodiments, clock lines do not extend into the outer portion of the array. In some embodiments, the array includes one or more rows of execution units. These arrangements may further reduce the length of clock lines.

    MULTI-THREADED GPU PIPELINE
    8.
    Patent application
    Legal status: in force

    Publication number: US20150035841A1

    Publication date: 2015-02-05

    Application number: US13956299

    Filing date: 2013-07-31

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to a multithreaded execution pipeline. In some embodiments, an apparatus is configured to assign a number of threads to an execution pipeline that is an integer multiple of a minimum number of cycles that an execution unit is configured to use to generate an execution result from a given set of input operands. In one embodiment, the apparatus is configured to require strict ordering of the threads. In one embodiment, the apparatus is configured so that the same thread accesses (e.g., reads and writes) a register file in a given cycle. In one embodiment, the apparatus is configured so that the same thread does not write back an operand and a result to an operand cache in a given cycle.
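The thread-count rule and strict ordering can be sketched in Python as a simple round-robin issue model. This is an illustrative assumption, not the patented design: `schedule` is a hypothetical name, one instruction issues per cycle, and each result becomes available `min_cycles` after issue.

```python
def schedule(num_threads, min_cycles, total_cycles):
    """Strict round-robin issue: return (cycle, thread) pairs, checking no stall is needed."""
    # Rule from the abstract: thread count is an integer multiple of the unit's latency.
    assert num_threads % min_cycles == 0
    result_ready = {}   # thread -> cycle at which its previous result is available
    issued = []
    for cycle in range(total_cycles):
        thread = cycle % num_threads   # strict ordering of the threads
        # The thread's previous result is already available, so it never has to stall.
        assert result_ready.get(thread, 0) <= cycle
        result_ready[thread] = cycle + min_cycles
        issued.append((cycle, thread))
    return issued


print(schedule(num_threads=4, min_cycles=2, total_cycles=8)[:5])
# [(0, 0), (1, 1), (2, 2), (3, 3), (4, 0)]
```

With four threads and a two-cycle unit, each thread returns to the pipeline four cycles after its previous issue, by which time its earlier result has been produced.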

    Operand cache design
    9.
    Granted patent
    Legal status: in force

    Publication number: US09378146B2

    Publication date: 2016-06-28

    Application number: US13971811

    Filing date: 2013-08-20

    Applicant: Apple Inc.

    Abstract: Instructions may require one or more operands to be executed, which may be provided from a register file. In the context of a GPU, however, a register file may be a relatively large structure, and reading from a register file may be energy and/or time intensive. An operand cache may be used to store a subset of operands, and may use less power and have quicker access times than the register file. Selectors (e.g., multiplexers) may be used to read operands from the operand cache. Power savings may be achieved in some embodiments by activating only a subset of the selectors, which may be done by activators (e.g., flip-flops). Operands may also be concurrently provided to two or more locations via forwarding, which may be accomplished via a source selection unit in some embodiments. Operand forwarding may also reduce power consumption and/or speed up execution in one or more embodiments.
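The basic benefit of an operand cache, fewer register-file reads, can be shown with a small Python model. It covers only the caching behaviour, not the selectors, activators, or forwarding paths. `OperandCache`, `rf_reads`, and the least-recently-used replacement policy are assumptions made for illustration.

```python
from collections import OrderedDict


class OperandCache:
    def __init__(self, register_file, capacity):
        self.register_file = register_file   # large, expensive-to-read structure
        self.capacity = capacity
        self.lines = OrderedDict()           # register index -> cached operand
        self.rf_reads = 0

    def read(self, reg):
        if reg in self.lines:
            self.lines.move_to_end(reg)      # assumed LRU replacement
            return self.lines[reg]
        self.rf_reads += 1                   # miss: fall back to the register file
        value = self.register_file[reg]
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)   # evict the least recently used operand
        self.lines[reg] = value
        return value


register_file = list(range(100, 164))        # 64 registers holding dummy values
cache = OperandCache(register_file, capacity=4)
for reg in [1, 2, 1, 1, 3, 2]:
    cache.read(reg)
print(cache.rf_reads)  # 3
```

Six operand reads cost only three register-file accesses here because the repeated operands are served from the cache.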

    HINT VALUES FOR USE WITH AN OPERAND CACHE
    10.
    Patent application
    Legal status: in force

    Publication number: US20150058571A1

    Publication date: 2015-02-26

    Application number: US13971782

    Filing date: 2013-08-20

    Applicant: Apple Inc.

    Abstract: Instructions may require one or more operands to be executed, which may be provided from a register file. In the context of a GPU, however, a register file may be a relatively large structure, and reading from the register file may be energy and/or time intensive. An operand cache may be used to store a subset of operands, and may use less power and have quicker access times than the register file. Hint values may be used in some embodiments to suggest that a particular operand should be stored in the operand cache (so that it is available for current or future use). In one embodiment, a hint value indicates that an operand should be cached whenever possible. Hint values may be determined by software, such as a compiler, in some embodiments. One or more criteria may be used to determine hint values, such as how soon in the future or how frequently an operand will be used again.
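One way a compiler might derive such hints is sketched below in Python, using the "how soon will it be used again" criterion. This is a simplified assumption, not the patented method: `compute_hints` and `max_distance` are hypothetical, the hint is a single boolean per operand read, and register writes between uses are ignored.

```python
def compute_hints(instructions, max_distance=2):
    """For each operand read, hint True if the same register is read again soon.

    `instructions` is a list of tuples naming the registers each instruction reads.
    """
    next_use = {}                        # register -> index of its next read
    hints = [None] * len(instructions)
    for i in range(len(instructions) - 1, -1, -1):   # walk the program backwards
        regs = instructions[i]
        hints[i] = tuple(
            reg in next_use and next_use[reg] - i <= max_distance
            for reg in regs
        )
        for reg in regs:
            next_use[reg] = i
    return hints


program = [("r0", "r1"), ("r0", "r2"), ("r3",), ("r1", "r3")]
print(compute_hints(program))
# [(True, False), (False, False), (True,), (False, False)]
```

In this example `r0` and `r3` are hinted for caching where they are reused within two instructions, while `r1` is not, because its next read is three instructions away.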
