SOURCE OPERAND READ SUPPRESSION FOR GRAPHICS PROCESSORS
    71.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20160350112A1

    Publication date: 2016-12-01

    Application No.: US14726349

    Filing date: 2015-05-29

    Applicant: Intel Corporation

    IPC classification: G06F9/30 G06F15/82

    Abstract: Techniques to suppress redundant reads to register addresses and to replicate read data are disclosed. The redundant reads are suppressed when multiple source operands specify the same register address to read. Additionally, the read data is replicated to a data stream or data location corresponding to the source operands where the data read was suppressed.
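
The read-suppression idea in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware described in the application: the register file is modeled as a dictionary, and the names `read_operands`, `register_file`, and `source_addrs` are hypothetical.

```python
from typing import Dict, List, Tuple


def read_operands(register_file: Dict[int, int],
                  source_addrs: List[int]) -> Tuple[List[int], int]:
    """Return (operand values, number of register reads actually issued)."""
    fetched: Dict[int, int] = {}  # address -> data already read for this instruction
    reads_issued = 0
    values = []
    for addr in source_addrs:
        if addr not in fetched:
            # First operand naming this address: perform the real read.
            fetched[addr] = register_file[addr]
            reads_issued += 1
        # Later operands with the same address reuse (replicate) the read data.
        values.append(fetched[addr])
    return values, reads_issued


regs = {0: 10, 1: 20, 2: 30}
values, reads = read_operands(regs, [1, 1, 2])
print(values, reads)  # [20, 20, 30] 2
```

Three source operands are served with two reads here because two of them name register 1.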

    Programmable Vision Accelerator
    73.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20160321074A1

    Publication date: 2016-11-03

    Application No.: US15141703

    Filing date: 2016-04-28

    IPC classification: G06F9/30 G06F15/82

    Abstract: In one embodiment of the present invention, a programmable vision accelerator enables applications to collapse multi-dimensional loops into one dimensional loops. In general, configurable components included in the programmable vision accelerator work together to facilitate such loop collapsing. The configurable elements include multi-dimensional address generators, vector units, and load/store units. Each multi-dimensional address generator generates a different address pattern. Each address pattern represents an overall addressing sequence associated with an object accessed within the collapsed loop. The vector units and the load store units provide execution functionality typically associated with multi-dimensional loops based on the address pattern. Advantageously, collapsing multi-dimensional loops in a flexible manner dramatically reduces the overhead associated with implementing a wide range of computer vision algorithms. Consequently, the overall performance of many computer vision applications may be optimized.
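
As a rough software analogy for the loop collapsing described in this abstract, the sketch below lets an address generator produce the whole multi-dimensional address sequence so that the consumer runs a single flat loop. The function `address_pattern` and its parameters (`base`, `counts`, `strides`) are assumptions made for illustration and are not taken from the application.

```python
from itertools import product
from typing import Iterator, Sequence


def address_pattern(base: int,
                    counts: Sequence[int],
                    strides: Sequence[int]) -> Iterator[int]:
    """Yield the addresses a nested loop would visit, outermost dimension first."""
    for idx in product(*(range(n) for n in counts)):
        yield base + sum(i * s for i, s in zip(idx, strides))


# Dummy row-major image with 5 rows and 4 columns.
image = list(range(100, 120))

# One flat loop visits a 2x3 window starting at element 5 (row stride 4, column stride 1).
window = [image[a] for a in address_pattern(base=5, counts=(2, 3), strides=(4, 1))]
print(window)  # [105, 106, 107, 109, 110, 111]
```

The nested row and column loops are gone from the consumer; the addressing sequence alone carries the two-dimensional structure.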

    Dividing, scheduling, and parallel processing compiled sub-tasks on an asynchronous multi-core processor
    74.
    Type: Granted invention patent
    Status: Granted

    Publication No.: US09400685B1

    Publication date: 2016-07-26

    Application No.: US14610351

    Filing date: 2015-01-30

    Applicants: Yiqun Ge, Wuxian Shi

    Inventors: Yiqun Ge, Wuxian Shi

    Abstract: An asynchronous multiple-core processor may be adapted for carrying out sets of known tasks, such as the tasks in the LAPACK and BLAS packages. Conveniently, the known tasks may be handled by the asynchronous multiple-core processor in a manner that may be considered to be more power efficient than carrying out the same known tasks on a single-core processor. Indeed, some of the power savings are realized through the use of token-based single core processors. Use of such token-based single core processors may be considered to be power efficient due to the lack of a global clock tree.

    COMPARISON-BASED SORT IN AN ARRAY PROCESSOR
    76.
    Type: Invention application
    Status: Granted

    Publication No.: US20160124900A1

    Publication date: 2016-05-05

    Application No.: US14729281

    Filing date: 2015-06-03

    IPC classification: G06F15/82 G06F9/30 G06F15/80

    Abstract: A method for sorting data in an array processor. Each of a first tier of processing elements in the array processor receives data inputs from a load streaming unit. Each of the first tier processing elements compares input data portions received from the load streaming unit, wherein the input data portions are stored for processing in respective queues. The first tier processing elements select one of the input data portions to be an output data portion based on the comparison, and in response to the selection, remove a corresponding queue entry and request next input data from the load streaming unit. Each of the first tier processing elements further provides the output data portion as an input data portion to a second tier processing element that generates output data based on a comparison of output data received from at least two first tier processing elements.
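
The two-tier compare-and-select flow in this abstract resembles a merge tree, which can be modeled with Python generators. This is a sketch under assumptions the abstract does not state: each input stream is already sorted in ascending order, and each processing element selects the smaller of its two queue heads. The name `processing_element` is hypothetical.

```python
from typing import Iterable, Iterator

_END = object()  # sentinel marking an exhausted input stream


def processing_element(left: Iterable[int], right: Iterable[int]) -> Iterator[int]:
    """Compare the heads of two input streams and emit the smaller one."""
    left_it, right_it = iter(left), iter(right)
    a, b = next(left_it, _END), next(right_it, _END)
    while a is not _END or b is not _END:
        if b is _END or (a is not _END and a <= b):
            yield a
            a = next(left_it, _END)   # drop the queue entry, request the next input
        else:
            yield b
            b = next(right_it, _END)  # drop the queue entry, request the next input


# Four sorted runs stand in for data supplied by the load streaming unit.
runs = [[1, 8], [3, 5], [2, 9], [4, 7]]
tier1 = [processing_element(runs[0], runs[1]),
         processing_element(runs[2], runs[3])]
tier2 = processing_element(tier1[0], tier1[1])
result = list(tier2)
print(result)  # [1, 2, 3, 4, 5, 7, 8, 9]
```

Because generators are lazy, each first-tier element pulls a new input only after its previous output has been consumed by the second tier, loosely mirroring the request-on-select behavior.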

    Method For Projecting Out Irreducible Representations From a Quantum State of n Particles with d Colors
    77.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20160110311A1

    Publication date: 2016-04-21

    Application No.: US14941464

    Filing date: 2015-11-13

    IPC classification: G06F15/82 G06N99/00

    CPC classification: G06N99/002

    Abstract: We describe a method for using a classical computer to generate a particular sequence of elementary operations (SEO), an instruction set for a quantum computer. Such a SEO will induce a quantum computer to perform a unitary transformation U that we call an Irreps Gen U. This U simultaneously diagonalizes a set of operators Hμ called HYPs (Hermitian Young Projectors) for n particles with d colors or, equivalently, for n qu(d)its. Hμ projects out n particle irrep μ of U(d).

    VLIW PROCESSOR
    78.
    Type: Invention application
    Status: Granted

    Publication No.: US20150277909A1

    Publication date: 2015-10-01

    Application No.: US14660057

    Filing date: 2015-03-17

    IPC classification: G06F9/30 G06F15/82

    Abstract: A very long instruction word (VLIW) processor performs efficient processing including extended bits operations, such as processing performed in response to instructions commonly used in image processing, image recognition, and other processing, while preventing scaling up of the circuit. The VLIW processor includes an instruction control unit, a register file unit, and an instruction execution unit. The instruction execution unit includes a plurality of slots, and a state register arranged between the second slot and the third slot to transfer N-bit data between the second and third slots. The VLIW processor stores data output from the third slot into the state register and uses the data, and thus achieves efficient processing including bit-expanded operations, such as processing performed in response to instructions commonly used in image processing, image recognition, and other processing, while preventing scaling up of the circuit.

    NORMALIZING DATA FOR FAST SUPERSCALAR PROCESSING
    79.
    Type: Invention application
    Status: Granted

    Publication No.: US20150234778A1

    Publication date: 2015-08-20

    Application No.: US14702749

    Filing date: 2015-05-03

    IPC classification: G06F15/82 G06F17/30

    Abstract: A data normalization system is described herein that represents multiple data types that are common within database systems in a normalized form that can be processed uniformly to achieve faster processing of data on superscalar CPU architectures. The data normalization system includes changes to internal data representations of a database system as well as functional processing changes that leverage normalized internal data representations for a high density of independently executable CPU instructions. Because most data in a database is small, a majority of data can be represented by the normalized format. Thus, the data normalization system allows for fast superscalar processing in a database system in a variety of common cases, while maintaining compatibility with existing data sets.
