Profiler for optimizing processor architecture and application
    1.
    Type: Patent application
    Status: In force

    Publication No.: US20080120493A1

    Publication date: 2008-05-22

    Application No.: US11730170

    Filing date: 2007-03-29

    IPC classification: G06F7/38

    Abstract: A profiler which provides information to optimize an application specific architecture processor and a program for the processor is provided. The profiler includes: an architecture analyzer which analyzes an architecture description, and generates architecture analysis information, the architecture description describing an architecture of an application specific architecture processor which comprises a plurality of processing elements; a static analyzer which analyzes program static information that describes static information of a program, and generates static analysis information; a dynamic analyzer which analyzes program dynamic information that describes dynamic information of the program, and generates dynamic analysis information, the dynamic information of the program being generated by simulating the program; and a cross profiling analyzer which generates information for optimizing the application specific architecture processor to implement the program based on at least one of the architecture analysis information, the static analysis information, and the dynamic analysis information.


    Profiler for optimizing processor architecture and application
    2.
    Type: Granted patent
    Status: In force

    Publication No.: US08490066B2

    Publication date: 2013-07-16

    Application No.: US11730170

    Filing date: 2007-03-29

    IPC classification: G06F15/76 G06F9/44

    Abstract: A profiler which provides information to optimize an application specific architecture processor and a program for the processor is provided. The profiler includes: an architecture analyzer which analyzes an architecture description, and generates architecture analysis information, the architecture description describing an architecture of an application specific architecture processor which comprises a plurality of processing elements; a static analyzer which analyzes program static information that describes static information of a program, and generates static analysis information; a dynamic analyzer which analyzes program dynamic information that describes dynamic information of the program, and generates dynamic analysis information, the dynamic information of the program being generated by simulating the program; and a cross profiling analyzer which generates information for optimizing the application specific architecture processor to implement the program based on at least one of the architecture analysis information, the static analysis information, and the dynamic analysis information.


    Loop accelerator and data processing system having the same
    3.
    Type: Granted patent
    Status: In force

    Publication No.: US07590831B2

    Publication date: 2009-09-15

    Application No.: US11514889

    Filing date: 2006-09-05

    IPC classification: G06F9/44

    Abstract: Provided are a loop accelerator and a data processing system having the loop accelerator. The data processing system includes a loop accelerator which executes a loop part of a program, a processor core which processes a remaining part of the program except the loop part, and a central register file which transmits data between the processor core and the loop accelerator. The loop accelerator includes a plurality of processing elements (PEs) each of which performs an operation on each word to execute the program, a configuration memory which stores configuration bits indicating operations, states, etc. of the PEs, and a plurality of context memories, installed in a column or row direction of the PEs, which transmit the configuration bits along a direction toward which the PEs are arrayed. Thus, a connection structure between the configuration memory and the PEs can be simplified to easily modify a structure of the loop accelerator so as to extend the loop accelerator.


    Apparatus and method for optimizing loop buffer in reconfigurable processor
    4.
    Type: Patent application
    Status: In force

    Publication No.: US20070150710A1

    Publication date: 2007-06-28

    Application No.: US11525913

    Filing date: 2006-09-25

    IPC classification: G06F9/44

    Abstract: A reconfigurable processor comprising a configuration memory for storing a configuration bit for at least one loop configuration; a valid information memory for storing bit information indicating whether an operation in a loop is a delay operation; and at least one processing unit for determining whether an operation in a next cycle is the delay operation by referring to the bit information transmitted from the valid information memory, and selectively performing a change and an implementation of a configuration according to the configuration bit from the configuration memory based on the determined results.
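
The valid-bit idea in this abstract can be illustrated with a minimal Python sketch. It assumes, beyond what the abstract states, that a processing unit simply keeps its previous configuration during a delay operation and only reads the configuration memory otherwise. The names `run_loop`, `config_bits`, and `is_delay` are hypothetical.

```python
def run_loop(config_bits, is_delay):
    """Simulate one processing unit stepping through a loop.

    config_bits: per-cycle configuration entries (ignored on delay cycles).
    is_delay:    per-cycle valid-information bits; True marks a delay operation.
    Returns the configuration active in each cycle and the number of
    configuration-memory reads that were actually needed.
    """
    active = []      # configuration in effect during each cycle
    current = None   # configuration currently held by the unit
    fetches = 0      # reads of the configuration memory
    for config, delay in zip(config_bits, is_delay):
        if not delay:
            # Assumption: only non-delay cycles trigger a configuration change.
            current = config
            fetches += 1
        active.append(current)
    return active, fetches


active, fetches = run_loop(["A", None, None, "B"], [False, True, True, False])
print(active, fetches)
```

Under this assumption the delay cycles need no entry in the configuration memory at all, which is one plausible reading of how the loop buffer is optimized.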


    Memory access method using three dimensional address mapping
    5.
    Type: Granted patent
    Status: In force

    Publication No.: US07779225B2

    Publication date: 2010-08-17

    Application No.: US11828440

    Filing date: 2007-07-26

    IPC classification: G06F12/00

    Abstract: A memory access method includes: obtaining a, b, and c from a program code for accessing a memory with a triple loop in a program, a being a number of values which an inner-most loop variable of the triple loop may have, b being a number of values which a middle loop variable of the triple loop may have, and c being a number of values which an outer-most loop variable of the triple loop may have; obtaining a starting address of the memory accessed by the triple loop; and obtaining a×b×c addresses of the memory accessed by the triple loop using the starting address and a function.
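
As a rough illustration of generating a×b×c addresses from a starting address, the following Python sketch assumes the mapping function is affine with one stride per loop level. The abstract does not say what the function is, so the strides and the name `triple_loop_addresses` are assumptions made for illustration only.

```python
from itertools import product


def triple_loop_addresses(start, a, b, c,
                          stride_inner=1, stride_middle=None, stride_outer=None):
    """List the a*b*c addresses a triple loop would touch, in access order.

    a, b, c are the trip counts of the inner-most, middle, and outer-most
    loops. Strides default to a contiguous layout (assumption).
    """
    if stride_middle is None:
        stride_middle = a * stride_inner
    if stride_outer is None:
        stride_outer = b * stride_middle
    # product() varies its last argument fastest, so i is the inner-most index.
    return [
        start + k * stride_outer + j * stride_middle + i * stride_inner
        for k, j, i in product(range(c), range(b), range(a))
    ]


addrs = triple_loop_addresses(0, 2, 2, 2,
                              stride_inner=4, stride_middle=64, stride_outer=1024)
print(addrs)
```

With the default strides the result is simply a contiguous run of a×b×c addresses beginning at the starting address.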


    Apparatus and method for optimizing loop buffer in reconfigurable processor
    6.
    Type: Granted patent
    Status: In force

    Publication No.: US07478227B2

    Publication date: 2009-01-13

    Application No.: US11525913

    Filing date: 2006-09-25

    IPC classification: G06F9/40

    Abstract: A reconfigurable processor comprising a configuration memory for storing a configuration bit for at least one loop configuration; a valid information memory for storing bit information indicating whether an operation in a loop is a delay operation; and at least one processing unit for determining whether an operation in a next cycle is the delay operation by referring to the bit information transmitted from the valid information memory, and selectively performing a change and an implementation of a configuration according to the configuration bit from the configuration memory based on the determined results.


    Processor and method of performing speculative load operations of the processor
    7.
    Type: Granted patent
    Status: In force

    Publication No.: US08443174B2

    Publication date: 2013-05-14

    Application No.: US11838488

    Filing date: 2007-08-14

    IPC classification: G06F9/30 G06F9/312

    CPC classification: G06F9/3842

    Abstract: Provided are a processor and a method of performing speculative load instructions of the processor, in which a load instruction is performed only in the case where the load instruction substantially accesses a memory. In all other cases, a load instruction for canceling operations is performed, so that problems caused by accessing an input/output (I/O) mapped memory area and the like at the time of performing speculative load instructions can be prevented using only a software-like method, thereby improving the performance of a processor.
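
One way to picture the behaviour described here is the Python sketch below. It assumes that "substantially accesses a memory" can be approximated by checking that the address lies outside every I/O-mapped range; the abstract does not give the actual test, and `speculative_load`, `memory`, and `io_ranges` are hypothetical names.

```python
def speculative_load(address, memory, io_ranges):
    """Return (performed, value) for a speculative load at `address`.

    memory:    dict mapping ordinary memory addresses to values.
    io_ranges: list of (low, high) half-open I/O-mapped address ranges.
    """
    # Assumption: a load that would touch an I/O-mapped area is cancelled,
    # because reading a device register speculatively may have side effects.
    in_io_area = any(low <= address < high for low, high in io_ranges)
    if in_io_area:
        return (False, None)
    # Otherwise the load is performed against ordinary memory.
    return (True, memory.get(address, 0))


memory = {0x100: 42}
io_ranges = [(0xF000, 0x10000)]
print(speculative_load(0x100, memory, io_ranges))
print(speculative_load(0xF004, memory, io_ranges))
```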


    PROCESSOR AND METHOD OF PERFORMING SPECULATIVE LOAD OPERATIONS OF THE PROCESSOR
    8.
    Type: Patent application
    Status: In force

    Publication No.: US20080209188A1

    Publication date: 2008-08-28

    Application No.: US11838488

    Filing date: 2007-08-14

    IPC classification: G06F9/38

    CPC classification: G06F9/3842

    Abstract: Provided are a processor and a method of performing speculative load instructions of the processor, in which a load instruction is performed only in the case where the load instruction substantially accesses a memory. In all other cases, a load instruction for canceling operations is performed, so that problems caused by accessing an input/output (I/O) mapped memory area and the like at the time of performing speculative load instructions can be prevented using only a software-like method, thereby improving the performance of a processor.


    Loop coalescing method and loop coalescing device
    9.
    Type: Granted patent
    Status: In force

    Publication No.: US08549507B2

    Publication date: 2013-10-01

    Application No.: US11843357

    Filing date: 2007-08-22

    IPC classification: G06F9/45

    CPC classification: G06F8/4441

    Abstract: A loop coalescing method and a loop coalescing device are disclosed. The loop coalescing method comprises removing an inner-most loop from among nested loops, so that an outer operation provided outside of the inner-most loop is performed when a condition of a conditional statement is satisfied, generating a guard code by applying an if-conversion method to the conditional statement, and converting a guard by using an instruction calculating the guard of the guard code, the instruction calculating the guard using a register where information related to a period of time corresponding to the number of iterations of the inner-most loop is stored.
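
The general shape of loop coalescing can be sketched in Python as follows. This is an illustration under assumptions, not the patented method: the counter `j` stands in for the register that tracks the inner-loop period, the Python `if` stands in for predicated (if-converted) execution, and the inner trip count is assumed to be at least 1.

```python
def nested(n_outer, n_inner):
    """Reference version: an outer operation followed by an inner loop."""
    trace = []
    for i in range(n_outer):
        trace.append(("outer", i))
        for j in range(n_inner):
            trace.append(("inner", i, j))
    return trace


def coalesced(n_outer, n_inner):
    """Single flat loop; the outer operation runs only when the guard is set."""
    trace = []
    i = -1
    j = n_inner - 1  # stand-in for the period register (assumes n_inner >= 1)
    for _ in range(n_outer * n_inner):
        guard = (j == n_inner - 1)  # previous inner pass has just finished
        if guard:
            i += 1
            j = 0
            trace.append(("outer", i))
        else:
            j += 1
        trace.append(("inner", i, j))
    return trace


print(coalesced(2, 3) == nested(2, 3))
```

Both functions produce the same sequence of operations, which is the property a coalescing transformation has to preserve.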


    Apparatus for compressing instruction word for parallel processing VLIW computer and method for the same
    10.
    Type: Granted patent
    Status: In force

    Publication No.: US07774581B2

    Publication date: 2010-08-10

    Application No.: US11838511

    Filing date: 2007-08-14

    IPC classification: G06F9/30

    Abstract: An apparatus and a method are provided for a parallel processing very long instruction word (VLIW) computer. The apparatus includes: an index code generation unit sequentially generating an index code, which is associated with the number of no-operation (NOP) instruction words between effective instruction words, with respect to each of the instruction word groups to be executed in a VLIW computer; an instruction compression unit sequentially deleting the NOP instruction words which correspond to the index code with respect to each of the instruction word groups; and an instruction word conversion unit converting the effective instruction words to include the index code, the effective instruction words corresponding to the NOP instruction words.
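
A simplified Python sketch of NOP compression is given below. It assumes that the index code attached to each effective instruction word is the count of NOPs immediately preceding it within its group, and that the group width is known when decompressing. The abstract does not fix either detail, and the names `compress_group` and `decompress_group` are hypothetical.

```python
NOP = "nop"


def compress_group(group):
    """Drop NOPs from one instruction word group.

    Returns a list of (index_code, word) pairs, where index_code is the
    number of NOPs removed directly before that effective word (assumption).
    """
    packed = []
    nops = 0
    for word in group:
        if word == NOP:
            nops += 1
        else:
            packed.append((nops, word))
            nops = 0
    return packed


def decompress_group(packed, width):
    """Rebuild the original group; trailing slots are refilled with NOPs."""
    group = []
    for nops, word in packed:
        group.extend([NOP] * nops)
        group.append(word)
    group.extend([NOP] * (width - len(group)))
    return group


group = ["add", NOP, NOP, "mul"]
packed = compress_group(group)
print(packed)
print(decompress_group(packed, len(group)) == group)
```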
