-
Publication No.: US11803389B2
Publication date: 2023-10-31
Application No.: US16738362
Filing date: 2020-01-09
IPC classification: G06F9/38
CPC classification: G06F9/3838, G06F9/3836
Abstract: A reach matrix scheduler circuit for scheduling instructions to be executed in a processor is disclosed. The scheduler circuit includes an N×R matrix wake-up circuit, where ‘N’ is the instruction window size of the scheduler circuit, and ‘R’ is the “reach” within the instruction window of the matrix wake-up circuit, with ‘R’ being less than ‘N’. A grant line associated with each instruction request entry in the N×R matrix wake-up circuit is coupled to ‘R’ other instruction entries among the ‘N’ instruction entries. When a producer instruction in an instruction request entry is ready for issuance, the grant line associated with the instruction request entry is activated so that any other instruction entries coupled to the grant line (i.e., within the “reach” of the instruction request entry) that consume the produced value generated by the producer instruction are “woken up” and subsequently indicated as ready to be issued.
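The wake-up behavior described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented circuit: the `Entry` fields, the helper names `reach_targets` and `grant`, and the choice that an entry's grant line reaches the R slots that follow it (wrapping around the window) are all hypothetical. The abstract only says that each grant line is coupled to R other entries.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Entry:
    """One instruction request entry (field names are hypothetical)."""
    dest: Optional[int]   # register this instruction produces, if any
    srcs: Set[int]        # source registers still awaited
    issued: bool = False


def reach_targets(i: int, n: int, r: int) -> List[int]:
    """Entries coupled to entry i's grant line.

    Assumption: the R following slots, wrapping around the N-entry window.
    """
    return [(i + k) % n for k in range(1, r + 1)]


def grant(window: List[Entry], i: int, r: int) -> List[int]:
    """Activate entry i's grant line; return the entries that become ready."""
    producer = window[i]
    producer.issued = True
    ready = []
    for j in reach_targets(i, len(window), r):
        consumer = window[j]
        if producer.dest in consumer.srcs:
            # The consumer no longer waits on the produced value.
            consumer.srcs.discard(producer.dest)
            if not consumer.srcs and not consumer.issued:
                ready.append(j)
    return ready


# N = 4, R = 2: entry 0 produces r1; entries 1 and 3 both consume r1.
window = [Entry(1, set()), Entry(2, {1}), Entry(None, {2}), Entry(None, {1})]
woken = grant(window, 0, 2)
```

In this toy window only entry 1 is woken, because entry 3 lies outside the reach of entry 0. How the disclosed scheduler handles consumers outside the reach is not stated in the abstract, so the sketch leaves them untouched.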
-
Publication No.: US11144820B2
Publication date: 2021-10-12
Application No.: US15637426
Filing date: 2017-06-29
Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers
Abstract: Processors and methods for neural network processing are provided. A method in a processor including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the matrix vector unit, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes decoding a chain of instructions received via an input queue, where the chain of instructions comprises a first instruction that can only be processed by the matrix vector unit and a sequence of instructions that can only be processed by a multifunction unit. The method includes processing the first instruction using the MVU and processing each instruction in the sequence of instructions depending upon the position of that instruction in the sequence.
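A rough Python sketch of the chain processing described in this abstract follows. It assumes, purely for illustration, that the instruction at position k of the sequence is handled by the k-th multifunction unit in the pipeline. The abstract says only that processing depends on position. The names `MFU`, `mvu`, `run_chain`, `relu`, and `double` are hypothetical.

```python
from typing import Callable, List, Sequence

Vector = List[int]


def mvu(matrix: Sequence[Sequence[int]], x: Vector) -> Vector:
    """Matrix vector unit: handles the first instruction of the chain."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


class MFU:
    """Multifunction unit that records which operations it executed."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.log: List[str] = []

    def execute(self, op: Callable[[Vector], Vector], v: Vector) -> Vector:
        self.log.append(op.__name__)
        return op(v)


def run_chain(matrix, ops, x: Vector, mfus: Sequence[MFU]) -> Vector:
    """Run the matrix-vector step, then route each later op by its position."""
    if len(ops) > len(mfus):
        raise ValueError("chain is longer than the multifunction pipeline")
    y = mvu(matrix, x)
    for position, op in enumerate(ops):
        # Assumption: position k in the sequence maps to the k-th MFU.
        y = mfus[position].execute(op, y)
    return y


def relu(v: Vector) -> Vector:
    return [max(0, e) for e in v]


def double(v: Vector) -> Vector:
    return [2 * e for e in v]


mfus = [MFU("mfu0"), MFU("mfu1"), MFU("mfu2")]
result = run_chain([[1, 0], [0, -1]], [relu, double], [2, 3], mfus)
```

Here the matrix-vector step yields `[2, -3]`, the first multifunction unit applies `relu`, and the second applies `double`. The third unit stays idle because the chain has only two follow-on instructions.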
-
Publication No.: US11099906B2
Publication date: 2021-08-24
Application No.: US16128224
Filing date: 2018-09-11
Abstract: A service mapping component (SMC) is described herein for processing requests by instances of tenant functionality that execute on software-driven host components (or some other components) in a data processing system. The SMC is configured to apply at least one rule to determine whether a service requested by an instance of tenant functionality is to be satisfied by at least one of: a local host component, a local hardware acceleration component which is locally coupled to the local host component, and/or at least one remote hardware acceleration component that is indirectly accessible to the local host component via the local hardware acceleration component. In performing its analysis, the SMC can take into account various factors, such as whether or not the service corresponds to a line-rate service, latency-related considerations, security-related considerations, and so on.
-
Publication No.: US11016770B2
Publication date: 2021-05-25
Application No.: US15044045
Filing date: 2016-02-15
Inventors: Douglas C. Burger, Aaron L. Smith
IPC classification: G06F9/30, G06F9/38, G06F9/46, G06F9/52, G06F11/36, G06F15/78, G06F9/26, G06F9/32, G06F9/345, G06F9/35, G06F12/0806, G06F12/0862, G06F12/1009, G06F13/42, G06F15/80, G06F9/355, G06F12/0811, G06F12/0875
Abstract: Distinct system registers for logical processors are disclosed. In one example of the disclosed technology, a processor includes a plurality of block-based physical processor cores for executing a program comprising a plurality of instruction blocks. The processor also includes a thread scheduler configured to schedule a thread of the program for execution, the thread using the one or more instruction blocks. The processor further includes at least one system register. The at least one system register stores data indicating a number and placement of the plurality of physical processor cores to form a logical processor. The logical processor executes the scheduled thread. The logical processor is configured to execute the thread in a continuous instruction window.
-
Publication No.: US10819657B2
Publication date: 2020-10-27
Application No.: US16283878
Filing date: 2019-02-25
Inventors: Douglas C. Burger, Andrew R. Putnam, Stephen F. Heil, Michael David Haselman, Sitaram V. Lanka, Yi Xiao
IPC classification: H04L12/911, G06F9/48, G06F9/50, H04L12/26
Abstract: Aspects extend to methods, systems, and computer program products for allocating acceleration component functionality for supporting services. A service manager uses a finite number of acceleration components to accelerate services. Acceleration components can be allocated in a manner that balances load in a hardware acceleration plane, minimizes role switching, and adapts to demand changes. When role switching is appropriate, less extensive mechanisms (e.g., based on configuration data versus image files) can be used to switch roles to the extent possible.
-
Publication No.: US10776115B2
Publication date: 2020-09-15
Application No.: US14942557
Filing date: 2015-11-16
Inventors: Douglas C. Burger, Aaron L. Smith
IPC classification: G06F9/30, G06F9/38, G06F9/46, G06F9/52, G06F11/36, G06F9/26, G06F9/32, G06F9/345, G06F9/35, G06F12/0806, G06F12/0862, G06F12/1009, G06F13/42, G06F15/80, G06F15/78, G06F9/355, G06F12/0811, G06F12/0875
Abstract: Systems and methods are disclosed for supporting debugging of programs in block-based processor architectures. In one example of the disclosed technology, a processor includes a block-based processor core for executing an instruction block comprising an instruction header and a plurality of instructions. The block-based processor core includes execution control logic and core state access logic. The execution control logic can be configured to schedule respective instructions of the plurality of instructions for execution in a dynamic order during a default execution mode and to schedule the respective instructions for execution in a static order during a debug mode. The core state access logic can be configured to read intermediate states of the block-based processor core and to provide the intermediate states outside of the block-based processor core during the debug mode.
-
Publication No.: US10511478B2
Publication date: 2019-12-17
Application No.: US14752807
Filing date: 2015-06-26
Inventors: Andrew R. Putnam, Douglas C. Burger, Michael David Haselman, Stephen F. Heil, Yi Xiao, Sitaram V. Lanka
Abstract: Aspects extend to methods, systems, and computer program products for changing between different roles at acceleration components. Changing roles at an acceleration component can be facilitated without loading an image file to configure or partially reconfigure the acceleration component. At configuration time, an acceleration component can be configured with a framework and a plurality of selectable roles. The framework also provides a mechanism for loading different selectable roles for execution at the acceleration component (e.g., the framework can include a superset of instructions for providing any of a plurality of different roles). The framework can receive requests for specified roles from other components and switch to a subset of instructions for the specified roles. Switching between subsets of instructions at an acceleration component is a lower overhead operation relative to reconfiguring or partially reconfiguring an acceleration component by loading an image file.
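The role-switching idea in this abstract can be illustrated with a small Python model. It is a sketch under assumptions: the class `RoleFramework`, its `image_loads` counter, and the example role and instruction names are hypothetical and stand in for hardware behavior that the abstract describes only in general terms.

```python
from typing import Dict, List, Optional


class RoleFramework:
    """Holds a superset of instructions, loaded once at configuration time."""

    def __init__(self, roles: Dict[str, List[str]]) -> None:
        self._roles = dict(roles)          # role name -> subset of instructions
        self.image_loads = 1               # the single configuration-time load
        self.active: Optional[str] = None  # no role selected yet

    def switch_role(self, name: str) -> List[str]:
        """Select the instruction subset for a role without loading an image."""
        if name not in self._roles:
            raise KeyError(f"role {name!r} was not configured")
        self.active = name
        return self._roles[name]


# Hypothetical roles and instruction names, for illustration only.
framework = RoleFramework({
    "compress": ["load", "lz_match", "store"],
    "encrypt": ["load", "aes_round", "store"],
})
first = framework.switch_role("compress")
second = framework.switch_role("encrypt")
```

After both switches `image_loads` is still 1, which mirrors the abstract's point that changing between preconfigured roles avoids the cost of reconfiguring the component from an image file.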
-
Publication No.: US10445097B2
Publication date: 2019-10-15
Application No.: US15073365
Filing date: 2016-03-17
Inventors: Douglas C. Burger, Aaron L. Smith
IPC classification: G06F9/30, G06F9/38, G06F15/80, G06F9/32, G06F9/26, G06F11/36, G06F12/0862, G06F9/35, G06F12/1009, G06F13/42, G06F12/0806, G06F15/78, G06F9/46, G06F9/52, G06F9/345, G06F9/355, G06F12/0875, G06F12/0811
Abstract: Apparatus and methods are disclosed for decoding targets from an instruction and transmitting data to those targets in accordance with a current instruction. Multimodal target hardware is used in conjunction with one or more of the routers so as to route data to an appropriate target. The data can be one or more operands or a predicate and the targets can include operand buffers, broadcast channels, and general registers. In this way, operands, for example, can be directed for use with multiple subsequent instructions, and there are multiple modes for distributing the operands to the multiple instructions.
-
Publication No.: US20190310852A1
Publication date: 2019-10-10
Application No.: US16450172
Filing date: 2019-06-24
Inventors: Douglas C. Burger, Aaron Smith, Jan Gray
IPC classification: G06F9/30, G06F9/38, G06F15/80, G06F12/0875, G06F12/0842
Abstract: A processor core in an instruction block-based microarchitecture is configured so that an instruction window and operand buffers are decoupled for independent operation in which instructions in the block are not tied to resources such as control bits and operands that are maintained in the operand buffers. Instead, pointers are established among instructions in the block and the resources so that control state can be established for a refreshed instruction block (i.e., an instruction block that is reused without re-fetching it from an instruction cache) by following the pointers. Such decoupling of the instruction window from the operand space can provide greater processor efficiency, particularly in multiple core arrays where refreshing is utilized (for example when executing program code that uses tight loops), because the operands and control bits are pre-validated.
-
Publication No.: US10409606B2
Publication date: 2019-09-10
Application No.: US14752356
Filing date: 2015-06-26
Inventors: Douglas C. Burger, Aaron L. Smith, Jan S. Gray
Abstract: Apparatus and methods are disclosed for implementing bad jump detection in block-based processor architectures. In one example of the disclosed technology, a block-based processor includes one or more block-based processing cores configured to fetch and execute atomic blocks of instructions and a control unit configured to, based at least in part on receiving from one of the instruction blocks a branch signal indicating a target location, verify that the target location is a valid branch target.
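A minimal Python sketch of the check described in this abstract is given below. It assumes, for illustration only, that a target is valid when it equals the start address of a known instruction block. The abstract does not define what makes a target valid, and the names `ControlUnit`, `BadJumpError`, and `on_branch` are hypothetical.

```python
from typing import Iterable


class BadJumpError(Exception):
    """Raised when a branch targets a location that is not a valid target."""


class ControlUnit:
    def __init__(self, block_starts: Iterable[int]) -> None:
        # Assumption: valid branch targets are instruction-block start addresses.
        self.valid_targets = frozenset(block_starts)

    def on_branch(self, target: int) -> int:
        """Verify the target carried by a branch signal before fetching from it."""
        if target not in self.valid_targets:
            raise BadJumpError(f"invalid branch target 0x{target:x}")
        return target


unit = ControlUnit([0x000, 0x080, 0x100])
next_block = unit.on_branch(0x080)  # accepted: 0x080 starts a block

try:
    unit.on_branch(0x084)           # rejected: lands inside a block
    bad_jump_detected = False
except BadJumpError:
    bad_jump_detected = True
```

A set lookup keeps the check constant-time per branch in this model. A hardware implementation would use whatever validity criterion the patent actually claims.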