Low Overhead Access to Shared On-Chip Hardware Accelerator With Memory-Based Interfaces
    1.
    Patent application
    Legal status: in force

    Publication No.: US20080222396A1

    Publication date: 2008-09-11

    Application No.: US11684348

    Filing date: 2007-03-09

    IPC classification: G06F9/50

    Abstract: In one embodiment, a method is contemplated. Access to a hardware accelerator is requested by a user-privileged thread. Access to the hardware accelerator is granted to the user-privileged thread by a higher-privileged thread responsive to the requesting. One or more commands are communicated to the hardware accelerator by the user-privileged thread without intervention by higher-privileged threads and responsive to the grant of access. The one or more commands cause the hardware accelerator to perform one or more tasks. Computer readable media comprising instructions which, when executed, implement portions of the method are also contemplated in various embodiments, as are a hardware accelerator and a processor coupled to the hardware accelerator.

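The request/grant/submit flow in this abstract can be illustrated with a small behavioral model. This is only a sketch under assumptions: the class names `Accelerator` and `Supervisor`, the grant-everything policy, and the command strings are hypothetical and do not come from the patent.

```python
class Accelerator:
    """Toy model: accepts commands only from threads that hold a grant."""

    def __init__(self):
        self.granted = set()    # ids of threads allowed to submit directly
        self.completed = []     # (thread_id, command) pairs in submission order

    def submit(self, thread_id, command):
        # User-level path: no call into the supervisor once access is granted.
        if thread_id not in self.granted:
            raise PermissionError(f"thread {thread_id} has no grant")
        self.completed.append((thread_id, command))


class Supervisor:
    """Stands in for the higher-privileged thread that grants access."""

    def __init__(self, accelerator):
        self.accelerator = accelerator

    def request_access(self, thread_id):
        # Assumed policy: every request is granted.
        self.accelerator.granted.add(thread_id)
        return True


accel = Accelerator()
supervisor = Supervisor(accel)
supervisor.request_access(7)    # the only privileged interaction
accel.submit(7, "task A")       # later commands bypass the supervisor
accel.submit(7, "task B")
```

The point of the sketch is that the supervisor appears once, at grant time, and is absent from every later `submit` call.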

    Efficient On-Chip Accelerator Interfaces to Reduce Software Overhead
    2.
    Patent application
    Legal status: in force

    Publication No.: US20080222383A1

    Publication date: 2008-09-11

    Application No.: US11684358

    Filing date: 2007-03-09

    IPC classification: G06F9/34

    Abstract: In one embodiment, a processor comprises execution circuitry and a translation lookaside buffer (TLB) coupled to the execution circuitry. The execution circuitry is configured to execute a store instruction having a data operand; and the execution circuitry is configured to generate a virtual address as part of executing the store instruction. The TLB is coupled to receive the virtual address and configured to translate the virtual address to a first physical address. Additionally, the TLB is coupled to receive the data operand and to translate the data operand to a second physical address. A hardware accelerator is also contemplated in various embodiments, as is a processor coupled to the hardware accelerator, a method, and a computer readable medium storing instructions which, when executed, implement a portion of the method.

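A rough model of the double translation described in this abstract follows. It assumes that the store's data operand itself holds a virtual address, which is one reading of the abstract; the page size, the `TLB` class, and the function name are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size


class TLB:
    def __init__(self, mapping):
        self.mapping = dict(mapping)  # virtual page number -> physical page number

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        # A KeyError here stands in for a TLB miss.
        return self.mapping[vpn] * PAGE_SIZE + offset


def execute_accelerator_store(tlb, store_vaddr, data_operand):
    """Translate both the store's address and its data operand."""
    first_pa = tlb.translate(store_vaddr)    # where the store is sent
    second_pa = tlb.translate(data_operand)  # operand treated as a virtual address
    return first_pa, second_pa


tlb = TLB({0x10: 0x80, 0x22: 0x3F})
first_pa, second_pa = execute_accelerator_store(tlb, 0x10008, 0x22040)
```

Under this reading, the receiver of the store sees a physical address in the data, so software does not need a separate translation step before issuing the store.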

    Method and apparatus for resolving multiple branches
    3.
    Granted patent
    Legal status: lapsed

    Publication No.: US06256729B1

    Publication date: 2001-07-03

    Application No.: US09004971

    Filing date: 1998-01-09

    IPC classification: G06F15/00

    CPC classification: G06F9/3861 G06F9/3806

    Abstract: A method for repairing a pipeline in response to a branch instruction having a branch includes the steps of providing a branch repair table having a plurality of entries, allocating an entry in the branch repair table for the branch instruction, storing a target address, a fall-through address, and repair information in the entry in the branch repair table, processing the branch instruction to determine whether the branch was taken, and repairing the pipeline in response to the repair information and the fall-through address in the entry in the branch repair table when the branch was not taken.

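The allocate/store/resolve steps of this abstract can be sketched as a small table. This is an illustrative model only: it assumes every branch is fetched along its taken path (so a not-taken outcome needs repair), and the class name, field names, and the string used as repair information are hypothetical.

```python
class BranchRepairTable:
    def __init__(self, size):
        self.entries = [None] * size  # None marks a free entry

    def allocate(self, target, fall_through, repair_info):
        """Claim a free entry for a branch and record its repair data."""
        for index, entry in enumerate(self.entries):
            if entry is None:
                self.entries[index] = {
                    "target": target,
                    "fall_through": fall_through,
                    "repair_info": repair_info,
                }
                return index
        raise RuntimeError("branch repair table full")

    def resolve(self, index, taken):
        """Free the entry; return (next fetch address, repair info or None)."""
        entry = self.entries[index]
        self.entries[index] = None
        if taken:
            return entry["target"], None  # assumed path was right, nothing to repair
        return entry["fall_through"], entry["repair_info"]


table = BranchRepairTable(2)
slot = table.allocate(target=0x2000, fall_through=0x1004, repair_info="checkpoint-3")
next_pc, repair = table.resolve(slot, taken=False)
```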

    Method and apparatus for branch target prediction
    4.
    Granted patent
    Legal status: lapsed

    Publication No.: US5938761A

    Publication date: 1999-08-17

    Application No.: US976826

    Filing date: 1997-11-24

    IPC classification: G06F9/38 G06F9/32

    CPC classification: G06F9/3806

    Abstract: One embodiment of the present invention provides a method and an apparatus for predicting the target of a branch instruction. This method and apparatus operate by using a translation lookaside buffer (TLB) to store page numbers for predicted branch target addresses. In this embodiment, a branch target address table stores a small index to a location in the translation lookaside buffer, and this index is used to retrieve a page number from the location in the translation lookaside buffer. This page number is used as the page number portion of a predicted branch target address. Thus, a small index into a translation lookaside buffer can be stored in a predicted branch target address table instead of a larger page number for the predicted branch target address. This technique effectively reduces the size of a predicted branch target table by eliminating much of the space that is presently wasted storing redundant page numbers. Another embodiment maintains coherence between the branch target address table and the translation lookaside buffer. This makes it possible to detect a miss in the translation lookaside buffer at least one cycle earlier by examining the branch target address table.

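The space saving described in this abstract comes from storing a TLB slot index plus a page offset instead of a full target address. A minimal sketch, assuming 4 KiB pages and hypothetical class names (`TinyTLB`, `BranchTargetTable`), might look like this:

```python
PAGE_BITS = 12  # assumed 4 KiB pages
OFFSET_MASK = (1 << PAGE_BITS) - 1


class TinyTLB:
    def __init__(self, page_numbers):
        self.page_numbers = list(page_numbers)  # slot index -> page number


class BranchTargetTable:
    def __init__(self, tlb):
        self.tlb = tlb
        self.entries = {}  # branch pc -> (TLB slot index, page offset)

    def record(self, branch_pc, target):
        page, offset = target >> PAGE_BITS, target & OFFSET_MASK
        # ValueError here means the target page is not resident in the TLB.
        slot = self.tlb.page_numbers.index(page)
        self.entries[branch_pc] = (slot, offset)

    def predict(self, branch_pc):
        slot, offset = self.entries[branch_pc]
        # Rebuild the full target from the TLB's page number and the stored offset.
        return (self.tlb.page_numbers[slot] << PAGE_BITS) | offset


tlb = TinyTLB([0x400, 0x7FF, 0x123])
btt = BranchTargetTable(tlb)
btt.record(0x1000, 0x7FF0A4)
predicted = btt.predict(0x1000)
```

With a three-entry TLB, each table entry needs only two bits for the slot index rather than a full page number.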

    Selection from multiple fetch addresses generated concurrently including predicted and actual target by control-flow instructions in current and previous instruction bundles
    5.
    Granted patent
    Legal status: lapsed

    Publication No.: US5935238A

    Publication date: 1999-08-10

    Application No.: US878759

    Filing date: 1997-06-19

    IPC classification: G06F9/38 G06F9/32

    Abstract: A microprocessor is provided with an instruction fetch mechanism that simultaneously predicts multiple control-flow instructions. The instruction fetch unit further is capable of handling multiple types of control-flow instructions. The instruction fetch unit uses predecode data and branch prediction data to select the next instruction fetch bundle address. If a branch misprediction is detected, a corrected branch target address is selected as the next fetch bundle address. If no branch misprediction occurs and the current fetch bundle includes a taken control-flow instruction, then the next fetch bundle address is selected based on the type of control-flow instruction detected. If the first taken control-flow instruction is a return instruction, a return address from the return address stack is selected as the next fetch bundle address. If the first taken control-flow instruction is an unconditional branch or predicted taken conditional branch, a predicted branch target address is selected as the next fetch bundle address. If no branch misprediction is detected and the current fetch bundle does not include a taken control-flow instruction, then a sequential address is selected as the next fetch bundle address.

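The selection rules in this abstract form a priority chain, which can be written out as a single function. The function name, parameter names, and the string labels for instruction kinds are hypothetical; only the ordering of the cases is taken from the abstract.

```python
def next_fetch_address(mispredicted, first_taken_kind, *, corrected_target,
                       return_stack_top, predicted_target, sequential_address):
    """Pick the next fetch bundle address.

    first_taken_kind is "return", "unconditional", "conditional_taken",
    or None when the bundle holds no taken control-flow instruction.
    """
    if mispredicted:
        return corrected_target        # misprediction overrides everything else
    if first_taken_kind == "return":
        return return_stack_top        # top of the return address stack
    if first_taken_kind in ("unconditional", "conditional_taken"):
        return predicted_target        # predicted branch target
    return sequential_address          # no taken control-flow instruction


candidates = dict(corrected_target=0x900, return_stack_top=0x500,
                  predicted_target=0x700, sequential_address=0x120)
chosen = next_fetch_address(False, "return", **candidates)
```

All four candidate addresses are passed in together to reflect that they are generated concurrently and only the selection is sequentially prioritized.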

    Cache memory array which stores two-way set associative data
    6.
    Granted patent
    Legal status: lapsed

    Publication No.: US5854761A

    Publication date: 1998-12-29

    Application No.: US883544

    Filing date: 1997-06-26

    IPC classification: G06F12/08 G11C7/00 G11C7/10

    Abstract: A cache memory array stores two-way set associative data. An odd set data bank stores odd number sets of the two-way set associative data, where the two ways of each odd number set are aligned horizontally within the odd set data bank. An even set data bank stores even number sets of the two-way set associative data, where the two ways of each even number set are aligned horizontally within the even set data bank. Also, the odd set data bank is aligned horizontally with the even set data bank such that each odd number set is aligned horizontally with a next even number set. The horizontally aligned ways are interleaved for data path width reduction. Set and way selection circuits extract lines of data from the array. The array may be structurally implemented by single-ported RAM cells.


    Instruction sampling in a multi-threaded processor
    7.
    Granted patent
    Legal status: in force

    Publication No.: US08826241B2

    Publication date: 2014-09-02

    Application No.: US10780264

    Filing date: 2004-02-16

    IPC classification: G06F9/44 G06F9/45

    CPC classification: G06F9/3851

    Abstract: A method of sampling instructions executing in a multi-threaded processor includes selecting an instruction for sampling, storing information relating to the instruction, determining whether the instruction includes an event of interest, and reporting the instruction if the instruction includes an event of interest on a per-thread basis. The event of interest includes information relating to a thread to which the instruction is bound.

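The select/store/check/report steps of this abstract can be sketched in software. The selection policy used here (every Nth instruction), the record layout, and the function name are assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict


def sample_instructions(stream, interval, is_event_of_interest):
    """Sample every `interval`-th instruction and group reports by thread."""
    reports = defaultdict(list)
    for position, (thread_id, instruction) in enumerate(stream):
        if position % interval != 0:
            continue                                   # not selected for sampling
        record = {"thread": thread_id, "instruction": instruction}  # stored info
        if is_event_of_interest(record):
            reports[thread_id].append(record)          # reported per thread
    return dict(reports)


stream = [(0, "load"), (1, "add"), (1, "load"), (0, "add"),
          (0, "store"), (1, "load"), (1, "load")]
reports = sample_instructions(stream, 2, lambda r: r["instruction"] == "load")
```

Each record carries the thread id, so a report can always be attributed to the thread the instruction was bound to.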

    Efficient on-chip accelerator interfaces to reduce software overhead
    8.
    Granted patent
    Legal status: in force

    Publication No.: US07827383B2

    Publication date: 2010-11-02

    Application No.: US11684358

    Filing date: 2007-03-09

    IPC classification: G06F9/34 G06F12/08

    Abstract: In one embodiment, a processor comprises execution circuitry and a translation lookaside buffer (TLB) coupled to the execution circuitry. The execution circuitry is configured to execute a store instruction having a data operand; and the execution circuitry is configured to generate a virtual address as part of executing the store instruction. The TLB is coupled to receive the virtual address and configured to translate the virtual address to a first physical address. Additionally, the TLB is coupled to receive the data operand and to translate the data operand to a second physical address. A hardware accelerator is also contemplated in various embodiments, as is a processor coupled to the hardware accelerator, a method, and a computer readable medium storing instructions which, when executed, implement a portion of the method.


    Method and apparatus for reducing register file access times in pipelined processors
    9.
    Granted patent
    Legal status: in force

    Publication No.: US06934830B2

    Publication date: 2005-08-23

    Application No.: US10259721

    Filing date: 2002-09-26

    IPC classification: G06F9/30 G06F9/38

    Abstract: One embodiment of the present invention provides a system that reduces the time required to access registers from a register file within a processor. During operation, the system receives an instruction to be executed, wherein the instruction identifies at least one operand to be accessed from the register file. Next, the system looks up the operands in a register pane, wherein the register pane is smaller and faster than the register file and contains copies of a subset of registers from the register file. If the lookup is successful, the system retrieves the operands from the register pane to execute the instruction. Otherwise, if the lookup is not successful, the system retrieves the operands from the register file, and stores the operands into the register pane. This triggers the system to reissue the instruction to be executed again, so that the re-issued instruction retrieves the operands from the register pane.

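The lookup, fill, and reissue sequence in this abstract behaves like a small cache in front of the register file. The sketch below assumes the pane can hold at least as many registers as one instruction names, and it picks a victim that the current instruction does not need; the replacement policy, class name, and function name are all hypothetical.

```python
class RegisterPane:
    def __init__(self, register_file, capacity):
        self.register_file = register_file  # full register file: name -> value
        self.capacity = capacity            # assumed >= operands per instruction
        self.pane = {}                      # small, fast copy of a subset

    def lookup(self, names):
        """Return operand values on a hit, or None if any operand is missing."""
        if all(name in self.pane for name in names):
            return [self.pane[name] for name in names]
        return None

    def fill(self, names):
        """Copy missing operands from the register file into the pane."""
        for name in names:
            if name in self.pane:
                continue
            if len(self.pane) >= self.capacity:
                # Evict the oldest entry the current instruction does not use.
                victim = next(n for n in self.pane if n not in names)
                del self.pane[victim]
            self.pane[name] = self.register_file[name]


def issue(pane, names):
    """Return (operand values, number of issue attempts)."""
    attempts = 1
    values = pane.lookup(names)
    if values is None:
        pane.fill(names)              # miss: fetch from the register file
        attempts += 1                 # the instruction is reissued
        values = pane.lookup(names)   # the reissued instruction hits the pane
    return values, attempts


pane = RegisterPane({"r1": 10, "r2": 20, "r3": 30}, capacity=2)
first = issue(pane, ["r1", "r2"])
second = issue(pane, ["r1", "r2"])
third = issue(pane, ["r2", "r3"])
```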

    Mechanism for delivering precise exceptions in an out-of-order processor with speculative execution
    10.
    Granted patent
    Legal status: in force

    Publication No.: US06615343B1

    Publication date: 2003-09-02

    Application No.: US09599227

    Filing date: 2000-06-22

    IPC classification: G06F9/38

    CPC classification: G06F9/3861 G06F9/3842

    Abstract: A method of handling an exception in a processor includes setting a state upon detection of an exception, signaling a trap for the exception if the state is set, and based on a class of the exception, processing the exception differently before signaling the trap. The method may include replaying an instruction causing the exception before signaling the trap for the exception based on the class of the exception. The method may include replaying the instruction causing the exception after the instruction causing the exception becomes an oldest, unretired instruction. The method may include signaling the trap for the exception after an instruction causing the exception becomes an oldest, unretired instruction. The method may include marking an instruction causing the exception as complete without issuing the instruction causing the exception. An apparatus for handling exceptions in a processor includes an instruction scheduler for setting a state upon detection of an exception and signaling a trap for the exception if the state is set. The instruction scheduler, based on a class of the exception, processes the exception differently before signaling the trap.

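The class-dependent handling in this abstract can be pictured as a dispatch that always sets the exception state first and always ends by signaling the trap, with different intermediate steps per class. The three class names below are hypothetical labels for the variants the abstract lists, and the step strings are descriptive only.

```python
def handle_exception(exception_class):
    """Return the ordered steps taken for one excepting instruction."""
    steps = ["set exception state"]  # done on detection, for every class
    if exception_class == "replay_when_oldest":
        steps += ["wait until oldest unretired", "replay instruction"]
    elif exception_class == "trap_when_oldest":
        steps.append("wait until oldest unretired")
    elif exception_class == "complete_without_issue":
        steps.append("mark complete without issuing")
    else:
        raise ValueError(f"unknown exception class: {exception_class}")
    steps.append("signal trap")      # the state is set, so the trap is signaled
    return steps


replay_steps = handle_exception("replay_when_oldest")
```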