FUSED MULTIPLY-ADD APPARATUS AND METHOD
    12.
    Patent application
    Status: Active

    Publication No.: US20120124117A1

    Publication date: 2012-05-17

    Application No.: US13153885

    Filing date: 2011-06-06

    IPC classification: G06F7/487 G06F7/485 G06F5/01

    CPC classification: G06F7/483 G06F7/5443

    Abstract: A fused multiply-add (FMA) apparatus and method are provided. The FMA apparatus includes a partial product generator configured to generate a partial sum and a partial carry, a carry save adder configured to generate a partial sum having a first bit size and a partial carry having the first bit size by adding the partial sum and the partial carry to least significant bits (LSBs) of the mantissa of a third floating-point number, a carry select adder configured to generate a mantissa having a second bit size by adding the first bit-size partial sum and the first bit-size partial carry to most significant bits (MSBs) of the third floating-point number, and a selector configured to transmit the first bit-size partial sum and the first bit-size partial carry to the carry save adder or the carry select adder according to whether the mantissa of the third floating-point number is zero.
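The carry save adder named in the abstract can be illustrated with a minimal sketch of a generic 3:2 carry-save step on plain integers. This is the textbook technique only, not the claimed datapath; the function name `carry_save_add` and the operand values are assumptions made for illustration.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """Generic 3:2 carry-save step (illustrative, not the patented circuit).

    Reduces three operands to a partial sum and a partial carry without
    propagating carries; a final ordinary addition resolves them.
    """
    partial_sum = a ^ b ^ c                           # bitwise sum, carries ignored
    partial_carry = ((a & b) | (a & c) | (b & c)) << 1  # majority bits, shifted left
    return partial_sum, partial_carry


# Hypothetical operands standing in for partial sum, partial carry, and addend bits.
s, cy = carry_save_add(5, 6, 7)
total = s + cy  # a carry-propagating adder finishes the job
```

The point of the split is that the carry-free step has constant depth regardless of word width, which is why multiplier trees defer carry propagation to a single final adder.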

    Method and system for early Z test in title-based three-dimensional rendering
    13.
    Granted patent
    Status: Active

    Publication No.: US08154547B2

    Publication date: 2012-04-10

    Application No.: US13090924

    Filing date: 2011-04-20

    IPC classification: G06T15/40

    CPC classification: G06T15/405

    Abstract: A method and system for an early Z test in tile-based three-dimensional rendering are provided. In the early Z test, a portion that is not displayed to a user is removed before the rasterization process is performed, so that the 3D rendering is performed efficiently. The method includes segmenting a scene into tiles for performing a rendering with respect to a triangle; selecting a first tile of the tiles, which has a tile Z value less than a minimum Z value of the triangle; and performing the rendering with respect to the triangle in the remaining tiles, excluding the selected first tile.
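The tile selection step in the abstract can be sketched as follows. This is a rough illustration under stated assumptions: the `Tile` type, the field `z_value`, and the function `tiles_to_render` are hypothetical names, and the sketch assumes each tile already carries a single Z value to compare against the triangle's minimum Z.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    index: int
    z_value: float  # assumed: one representative Z value stored per tile


def tiles_to_render(tiles: list[Tile], triangle_z: tuple[float, float, float]) -> list[Tile]:
    """Return the tiles in which the triangle still has to be rendered.

    A tile whose Z value is less than the triangle's minimum Z is the
    "first tile" of the abstract and is excluded before rasterization.
    """
    tri_min_z = min(triangle_z)
    return [t for t in tiles if not (t.z_value < tri_min_z)]


# Hypothetical scene: three tiles and one triangle given by its vertex depths.
tiles = [Tile(0, 0.2), Tile(1, 0.9), Tile(2, 0.5)]
kept = tiles_to_render(tiles, (0.4, 0.6, 0.7))
kept_indices = [t.index for t in kept]
```

Here tile 0 is skipped because its Z value (0.2) is below the triangle's minimum Z (0.4), so no per-pixel work is spent on it.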

    APPARATUS AND METHOD FOR THREAD PROGRESS TRACKING USING DETERMINISTIC PROGRESS INDEX
    14.
    Patent application
    Status: Active

    Publication No.: US20120005679A1

    Publication date: 2012-01-05

    Application No.: US13156492

    Filing date: 2011-06-09

    IPC classification: G06F9/46

    Abstract: Provided are a method and apparatus for measuring the performance or progress state of an application program that performs data processing and executes particular functions in a computing environment using a micro architecture. A thread progress tracking apparatus may include a selector to select at least one thread constituting an application program; a determination unit to determine, based on a predetermined criterion and for each of at least one instruction constituting a corresponding thread, whether the instruction execution scheme is a deterministic execution scheme having a regular cycle or a nondeterministic execution scheme having an irregular delay cycle; and a deterministic progress counter to generate a deterministic progress index with respect to instructions executed by the deterministic execution scheme, excluding instructions executed by the nondeterministic execution scheme.
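The counting rule described in the abstract can be sketched as below. The abstract does not say whether the index counts instructions or cycles, so this sketch assumes it accumulates nominal cycles; the `Instruction` type, its fields, and the sample thread are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    name: str
    cycles: int          # nominal cycle count (hypothetical field)
    deterministic: bool  # True if the instruction has a regular cycle count


def deterministic_progress_index(thread: list[Instruction]) -> int:
    """Accumulate cycles of deterministic instructions only.

    Instructions classified as nondeterministic are excluded, so the
    index does not depend on irregular delays.
    """
    return sum(ins.cycles for ins in thread if ins.deterministic)


# Hypothetical thread: the load is treated as nondeterministic.
thread = [
    Instruction("add", 1, True),
    Instruction("mul", 3, True),
    Instruction("load", 7, False),
    Instruction("add", 1, True),
]
dpi = deterministic_progress_index(thread)
```

Because the load is skipped, the index is the same whether that load takes 7 cycles or 700, which is what makes it usable as a reproducible progress measure.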

    MULTIPORT DATA CACHE APPARATUS AND METHOD OF CONTROLLING THE SAME
    15.
    Patent application
    Status: Active

    Publication No.: US20110225369A1

    Publication date: 2011-09-15

    Application No.: US13036102

    Filing date: 2011-02-28

    IPC classification: G06F12/08

    CPC classification: G06F12/0846 G06F12/0857

    Abstract: A multiport data cache apparatus and a method of controlling the same are provided. The multiport data cache apparatus includes a plurality of cache banks configured to share a cache line, and a data cache controller configured to receive cache requests for the cache banks, each of which includes a cache bank identifier, transfer the received cache requests to the respective cache banks according to the cache bank identifiers, and process the cache requests independently of one another.
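The controller's dispatch step can be sketched as a simple routing function. This is only a software illustration of the idea; `CacheRequest`, its fields, and `route_requests` are hypothetical names, and real hardware would handle the per-bank queues in parallel rather than in a loop.

```python
from typing import NamedTuple


class CacheRequest(NamedTuple):
    bank_id: int    # cache bank identifier carried by the request
    address: int
    is_write: bool


def route_requests(requests: list[CacheRequest], num_banks: int) -> dict[int, list[CacheRequest]]:
    """Group requests into one queue per bank, keyed by bank identifier.

    Each queue can then be served without reference to the others.
    """
    queues: dict[int, list[CacheRequest]] = {bank: [] for bank in range(num_banks)}
    for req in requests:
        if not 0 <= req.bank_id < num_banks:
            raise ValueError(f"unknown bank identifier: {req.bank_id}")
        queues[req.bank_id].append(req)
    return queues


# Hypothetical request stream for a two-bank cache.
queues = route_requests(
    [
        CacheRequest(0, 0x100, False),
        CacheRequest(1, 0x104, True),
        CacheRequest(0, 0x108, False),
    ],
    num_banks=2,
)
```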

    Loop data processing system and method for dividing a loop into phases
    16.
    Granted patent
    Status: Active

    Publication No.: US08019982B2

    Publication date: 2011-09-13

    Application No.: US11542118

    Filing date: 2006-10-04

    IPC classification: G06F9/30

    CPC classification: G06F9/325 G06F9/3879

    Abstract: A data processing system and method are provided. The data processing system includes a processor core that executes a program; a loop accelerator that has an array of data processing cells and executes a loop in the program by configuring the array according to a set of configuration bits; and a central register file that allows data used in program execution to be shared by the processor core and the loop accelerator. The loop accelerator divides the configuration of the array into at least three phases according to whether data is exchanged with the central register file during loop execution. This avoids unnecessary occupation of the routing resources used for data exchange between the loop accelerator and the central register file during loop execution.

    Register allocation method and system for program compiling
    17.
    Granted patent
    Status: Active

    Publication No.: US07660970B2

    Publication date: 2010-02-09

    Application No.: US11506887

    Filing date: 2006-08-21

    IPC classification: G06F9/30 G06F9/34

    Abstract: Disclosed are a data processing system and method. The data processing method determines the number of static registers and the number of rotating registers for assigning a register to a variable contained in a program, assigns the register to the variable based on the number of static registers and the number of rotating registers, and compiles the program. Further, the method stores in a special register a value corresponding to the number of rotating registers determined during compilation, and obtains a physical address from a logical address of the register based on that value. Accordingly, register files are used efficiently by dynamically controlling the number of rotating registers and the number of static registers for a software-pipelined loop, and the generation of unnecessary spill/fill code during program execution is reduced to a minimum.
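The logical-to-physical mapping mentioned in the abstract might look like the sketch below. The abstract does not give the layout or the formula, so everything here is an assumption for illustration: rotating registers are taken to occupy logical indices `0..num_rotating-1`, static registers the indices above them, and `rotation_base` stands for a hypothetical per-iteration offset.

```python
def physical_register(logical: int, num_rotating: int, rotation_base: int) -> int:
    """Map a logical register index to a physical one (assumed layout).

    num_rotating plays the role of the value held in the special register.
    Rotating registers wrap around within their own region; static
    registers map to themselves.
    """
    if logical < num_rotating:
        return (logical + rotation_base) % num_rotating
    return logical


# Hypothetical configuration: 4 rotating registers, rotation offset of 3.
rotating_example = physical_register(1, num_rotating=4, rotation_base=3)
static_example = physical_register(5, num_rotating=4, rotation_base=3)
```

Changing `num_rotating` per loop moves the boundary between the two regions, which is the knob the abstract describes for trading rotating registers against static ones.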

    Pipeline synchronisation device
    18.
    Granted patent
    Status: Lapsed

    Publication No.: US07519759B2

    Publication date: 2009-04-14

    Application No.: US10542906

    Filing date: 2004-01-14

    IPC classification: G06F13/36 G06F5/00

    CPC classification: G06F9/3869 H04L7/02

    Abstract: A pipeline synchronization device for transferring data between clocked devices having different clock frequencies. The pipeline synchronization device comprises a mousetrap buffer for exchanging data with one of said external devices, said mousetrap buffer having a signalling output for coordinating the data exchange with the external device. The pipeline synchronization device further comprises a synchronizer adapted to synchronize the change in a signalling output with the clock of the external device.

    Apparatus and method of avoiding bank conflict in single-port multi-bank memory system
    19.
    Patent application
    Status: Active

    Publication No.: US20090089551A1

    Publication date: 2009-04-02

    Application No.: US12071910

    Filing date: 2008-02-27

    IPC classification: G06F9/40 G06F9/30

    Abstract: Provided are a method and apparatus for avoiding bank conflict. A first instruction, which is one of the access instructions predicted to cause the bank conflict, is replaced with a second instruction by changing the execute timing of the first instruction to an earlier timing, so that the access instructions do not cause the bank conflict. Next, the load/store unit that is scheduled to access the bank according to the first instruction accesses the bank and reads out data from the bank at the execute timing of the second instruction, and after that, the load/store unit is allowed to receive the read data at the execute timing of the first instruction. Accordingly, although the access instructions that are predicted to cause the bank conflict are allocated to the load/store units, the bank conflict can be prevented, so that deterioration in performance due to the occurrence of the bank conflict can be avoided.
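The prediction and rescheduling idea can be sketched in software as follows. The abstract does not describe how banks are addressed, so this sketch assumes word-interleaved banks (`bank = (address // word_bytes) % num_banks`); the function `find_bank_conflicts` and the sample schedules are hypothetical.

```python
from collections import defaultdict


def find_bank_conflicts(accesses, num_banks, word_bytes=4):
    """Return sorted (cycle, bank) pairs hit by more than one access.

    accesses is a list of (cycle, address) pairs. Word-interleaved
    banking is an assumption of this sketch.
    """
    hits = defaultdict(int)
    for cycle, address in accesses:
        bank = (address // word_bytes) % num_banks
        hits[(cycle, bank)] += 1
    return sorted(key for key, count in hits.items() if count > 1)


# Two accesses reach bank 0 in cycle 1, so a conflict is predicted.
original = [(1, 0x00), (1, 0x10), (2, 0x04)]
conflicts_before = find_bank_conflicts(original, num_banks=4)

# Moving one bank read a cycle earlier, as the abstract describes for the
# second instruction, removes the predicted conflict.
hoisted = [(1, 0x00), (0, 0x10), (2, 0x04)]
conflicts_after = find_bank_conflicts(hoisted, num_banks=4)
```

In the scheme of the abstract the data read early is held and only delivered to the load/store unit at the original execute timing, so program-visible timing is unchanged.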

    PROCESSOR AND METHOD OF PERFORMING SPECULATIVE LOAD OPERATIONS OF THE PROCESSOR
    20.
    Patent application
    Status: Active

    Publication No.: US20080209188A1

    Publication date: 2008-08-28

    Application No.: US11838488

    Filing date: 2007-08-14

    IPC classification: G06F9/38

    CPC classification: G06F9/3842

    Abstract: Provided are a processor and a method of performing speculative load instructions of the processor, in which a load instruction is performed only in the case where the load instruction substantially accesses a memory. In all other cases, a load instruction for canceling operations is performed, so that problems caused by accessing an input/output (I/O) mapped memory area and the like when performing speculative load instructions can be prevented using only a software-like method, thereby improving the performance of the processor.
