Apparatus and method for a hybrid latency-throughput processor

    Publication No.: US10255077B2

    Publication date: 2019-04-09

    Application No.: US15226875

    Application date: 2016-08-02

    Abstract: An apparatus and method are described for executing both latency-optimized execution logic and throughput-optimized execution logic on a processing device. For example, a processor according to one embodiment comprises: latency-optimized execution logic to execute a first type of program code; throughput-optimized execution logic to execute a second type of program code, wherein the first type of program code and the second type of program code are designed for the same instruction set architecture; logic to identify the first type of program code and the second type of program code within a process and to distribute the first type of program code for execution on the latency-optimized execution logic and the second type of program code for execution on the throughput-optimized execution logic.
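
The distribution step in this abstract can be illustrated with a minimal sketch. It assumes each code segment of a process already carries a tag naming its type; the abstract does not say how the identification is done, so `CodeType`, `Segment`, and `distribute` are hypothetical names, and the "serial" versus "data-parallel" characterization in the comments is an assumption rather than a statement from the patent.

```python
from dataclasses import dataclass
from enum import Enum


class CodeType(Enum):
    LATENCY = "latency"        # assumed: serial, latency-sensitive code
    THROUGHPUT = "throughput"  # assumed: data-parallel code


@dataclass
class Segment:
    name: str
    code_type: CodeType  # hypothetical tag standing in for the identification logic


def distribute(process):
    """Split the segments of one process into two queues, one per kind of execution logic."""
    queues = {CodeType.LATENCY: [], CodeType.THROUGHPUT: []}
    for seg in process:
        queues[seg.code_type].append(seg.name)
    return queues


# Both kinds of segment belong to the same process and the same instruction set.
process = [
    Segment("parse_input", CodeType.LATENCY),
    Segment("vector_add", CodeType.THROUGHPUT),
    Segment("update_state", CodeType.LATENCY),
]
queues = distribute(process)
print(queues[CodeType.LATENCY])
print(queues[CodeType.THROUGHPUT])
```

The sketch models only the routing decision. It says nothing about how the two kinds of execution logic are built or how they share state.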

    Processing core having shared front end unit

    Publication No.: US10140129B2

    Publication date: 2018-11-27

    Application No.: US13730719

    Application date: 2012-12-28

    Abstract: A processor having one or more processing cores is described. Each of the one or more processing cores has front end logic circuitry and a plurality of processing units. The front end logic circuitry is to fetch respective instructions of threads and decode the instructions into respective micro-code and input operand and resultant addresses of the instructions. Each of the plurality of processing units is to be assigned at least one of the threads, is coupled to said front end unit, and has a respective buffer to receive and store microcode of its assigned at least one of the threads. Each of the plurality of processing units also comprises: i) at least one set of functional units corresponding to a complete instruction set offered by the processor, the at least one set of functional units to execute its respective processing unit's received microcode; ii) registers coupled to the at least one set of functional units to store operands and resultants of the received microcode; iii) data fetch circuitry to fetch input operands for the at least one functional units' execution of the received microcode.
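
The arrangement in this abstract, one front end feeding several processing units that each own a buffer and registers, can be modeled roughly as follows. This is a toy software sketch, not the claimed circuitry: the instruction text format, the two-operation instruction set, and the names `SharedFrontEnd`, `ProcessingUnit`, `issue`, and `run` are all assumptions made for illustration.

```python
import operator
from collections import deque


class ProcessingUnit:
    """Owns a micro-op buffer, a register file, and a full set of functional units."""

    FUNCTIONAL_UNITS = {"add": operator.add, "mul": operator.mul}  # toy instruction set

    def __init__(self, threads, registers):
        self.threads = set(threads)       # threads assigned to this unit
        self.buffer = deque()             # receives decoded micro-ops
        self.registers = dict(registers)  # operands and results

    def run(self):
        while self.buffer:
            op, dst, src1, src2 = self.buffer.popleft()
            a, b = self.registers[src1], self.registers[src2]  # operand fetch
            self.registers[dst] = self.FUNCTIONAL_UNITS[op](a, b)


class SharedFrontEnd:
    """Fetches and decodes for all threads, then forwards to the owning unit."""

    def __init__(self, units):
        self.units = units

    def issue(self, thread_id, instruction):
        # Decode into an operation plus result and operand register names.
        op, dst, src1, src2 = instruction.split()
        for unit in self.units:
            if thread_id in unit.threads:
                unit.buffer.append((op, dst, src1, src2))
                return
        raise ValueError(f"thread {thread_id} is not assigned to any unit")


pu0 = ProcessingUnit(threads=[0], registers={"r0": 2, "r1": 3})
pu1 = ProcessingUnit(threads=[1], registers={"r0": 4, "r1": 5})
front_end = SharedFrontEnd([pu0, pu1])

front_end.issue(0, "add r2 r0 r1")
front_end.issue(1, "mul r2 r0 r1")
pu0.run()
pu1.run()
print(pu0.registers["r2"], pu1.registers["r2"])
```

Decoding happens once in the shared front end, while each unit executes from its own buffer against its own registers, which is the split the abstract describes.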

    Apparatus and method for low-latency invocation of accelerators

    Publication No.: US10095521B2

    Publication date: 2018-10-09

    Application No.: US15145748

    Application date: 2016-05-03

    Abstract: An apparatus and method are described for providing low-latency invocation of accelerators. For example, a processor according to one embodiment comprises execution logic to execute a plurality of instructions including an accelerator invocation instruction to invoke one or more accelerator commands. The accelerator invocation instruction stores command data specifying the command within a command register. One or more accelerators read the command data from the command register and responsively attempt to execute the command identified by the command data. Upon a switch from a first context to a second context, an accelerator context save/restore pointer identifies a region within system memory where the accelerator is to save its state and later the accelerator context save/restore pointer aids in restoring its state upon returning to the first context.
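
The invocation and context-switch flow in this abstract can be sketched as a small simulation. It assumes a dictionary stands in for both the command register and system memory; the addresses, the command names, and the `invoke`, `poll`, `save`, and `restore` helpers are hypothetical and are not taken from the patent.

```python
class Accelerator:
    def __init__(self):
        self.state = {"completed": 0}

    def poll(self, command_register):
        """Read command data from the command register and attempt to execute it."""
        cmd = command_register.get("command")
        if cmd is None:
            return False
        self.state["completed"] += 1
        self.state["last"] = cmd
        command_register["command"] = None  # mark the command as consumed
        return True

    def save(self, memory, pointer):
        # The save/restore pointer identifies the memory region for this context.
        memory[pointer] = dict(self.state)

    def restore(self, memory, pointer):
        self.state = dict(memory[pointer])


def invoke(command_register, command):
    """Stand-in for the accelerator invocation instruction."""
    command_register["command"] = command


memory = {}
command_register = {"command": None}
accel = Accelerator()
CTX_A_PTR, CTX_B_PTR = 0x1000, 0x2000  # assumed save/restore pointers

# First context issues a command.
invoke(command_register, "compress")
accel.poll(command_register)

# Switch to the second context: save the first context's accelerator state.
accel.save(memory, CTX_A_PTR)
accel.state = {"completed": 0}
invoke(command_register, "hash")
accel.poll(command_register)

# Return to the first context: save the second, restore the first.
accel.save(memory, CTX_B_PTR)
accel.restore(memory, CTX_A_PTR)
print(accel.state)
```

After the final restore, the accelerator sees the state it had before the switch, while the second context's state remains in its own region of memory.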

    Apparatus and Method for a Hybrid Latency-Throughput Processor

    Invention application; status: under examination, published

    Publication No.: US20160342419A1

    Publication date: 2016-11-24

    Application No.: US15226875

    Application date: 2016-08-02

    Abstract: An apparatus and method are described for executing both latency-optimized execution logic and throughput-optimized execution logic on a processing device. For example, a processor according to one embodiment comprises: latency-optimized execution logic to execute a first type of program code; throughput-optimized execution logic to execute a second type of program code, wherein the first type of program code and the second type of program code are designed for the same instruction set architecture; logic to identify the first type of program code and the second type of program code within a process and to distribute the first type of program code for execution on the latency-optimized execution logic and the second type of program code for execution on the throughput-optimized execution logic.

