-
Publication number: US20220398100A1
Publication date: 2022-12-15
Application number: US17343442
Filing date: 2021-06-09
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yusuf Cagatay Tekmen, Rodney Wayne Smith, Shivam Priyadarshi, Milind A. Choudhary, Kiran Ravi Seth
Abstract: Processors employing memory bypassing in memory data dependent instructions as a store data forwarding mechanism, and related methods. To reduce stalls of memory data dependent, load-based instructions, a memory data dependency detection circuit is configured to detect a memory hazard between a store-based instruction and a load-based instruction based on their opcodes and designation/source operands. Some store-based and load-based instructions have opcodes identifying these instructions as having respective store and load address operand types that can be compared without resolution of their respective store and load addresses. For these detected types of instructions, the memory data dependency detection circuit is configured to determine if a source operand of a load-based instruction matches a target operand of a store-based instruction to detect a memory hazard earlier in the instruction pipeline. Identifying memory hazards earlier in an instruction pipeline can allow memory dependent instructions to be processed with avoided or reduced stalls.
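As a rough illustration of the operand comparison described in this abstract, the following is a minimal sketch, not the patented circuit. The opcode names, the `MemInstr` record, and the assumption that the base register is not rewritten between the store and the load are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical opcodes whose address operands can be compared
# without first resolving the effective address.
COMPARABLE_STORES = {"STR_SP_IMM"}
COMPARABLE_LOADS = {"LDR_SP_IMM"}


@dataclass(frozen=True)
class MemInstr:
    opcode: str
    base_reg: str   # address base register (assumed unchanged in between)
    offset: int     # immediate offset


def detect_memory_hazard(store: MemInstr, load: MemInstr) -> bool:
    """Return True if the load reads the location the store writes,
    judged only from opcodes and address operands."""
    if store.opcode not in COMPARABLE_STORES:
        return False
    if load.opcode not in COMPARABLE_LOADS:
        return False
    return (store.base_reg, store.offset) == (load.base_reg, load.offset)
```

A detected match could then let the load take its data from the store's source register instead of waiting for address resolution.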
-
Publication number: US11061677B1
Publication date: 2021-07-13
Application number: US16887827
Filing date: 2020-05-29
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kiran Ravi Seth, Yusuf Cagatay Tekmen, Rodney Wayne Smith, Shivam Priyadarshi, Vignyan Reddy Kothinti Naresh
Abstract: A register mapping circuit for recovering a register mapping state associated with a flushed instruction by traversing ROB entries from a snapshot of another register mapping state. The register mapping circuit includes a ROB control circuit, a snapshot circuit, and a register rename recovery circuit (RRRC). The ROB control circuit allocates ROB entries to instructions entering a processor pipeline, including a target ROB entry allocated to a target instruction and other ROB entries allocated to other instructions. The snapshot circuit captures snapshots of logical register-to-physical register mapping states in the rename map table in association with a subset of instructions that could be flushed. If the target instruction is flushed, the RRRC restores the rename map table register mapping state corresponding to the target instruction based on a snapshot in a ROB entry allocated to another instruction, and traverses register mapping updates in the intervening ROB entries.
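The recovery flow in this abstract can be sketched in software as follows. This is a simplified model under stated assumptions: the `RobEntry` fields, the choice to snapshot the map state after an instruction's rename, and recovery to the state just after the target instruction are all illustrative, not taken from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RobEntry:
    dest_lrn: Optional[int]                    # logical register written, if any
    new_prn: Optional[int]                     # physical register assigned
    snapshot: Optional[Dict[int, int]] = None  # copy of the map after this rename


def rename(rob: List[RobEntry], rmt: Dict[int, int],
           dest_lrn: Optional[int], new_prn: Optional[int],
           take_snapshot: bool = False) -> None:
    """Update the rename map and allocate a ROB entry, optionally with a snapshot."""
    if dest_lrn is not None:
        rmt[dest_lrn] = new_prn
    rob.append(RobEntry(dest_lrn, new_prn, dict(rmt) if take_snapshot else None))


def recover(rob: List[RobEntry], target_idx: int) -> Dict[int, int]:
    """Rebuild the map state as of rob[target_idx] from the nearest older snapshot."""
    for s in range(target_idx, -1, -1):
        if rob[s].snapshot is not None:
            break
    else:
        raise ValueError("no snapshot at or before the target entry")
    rmt = dict(rob[s].snapshot)
    # Replay the mapping updates of the intervening ROB entries.
    for entry in rob[s + 1:target_idx + 1]:
        if entry.dest_lrn is not None:
            rmt[entry.dest_lrn] = entry.new_prn
    return rmt
```

Snapshotting only a subset of instructions trades storage for a short replay walk on recovery.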
-
Publication number: US11327763B2
Publication date: 2022-05-10
Application number: US16898938
Filing date: 2020-06-11
Applicant: Microsoft Technology Licensing, LLC
Inventor: Arthur Perais, Shivam Priyadarshi, Yusuf Cagatay Tekmen, Rami Mohammad Al Sheikh, Vignyan Reddy Kothinti Naresh
Abstract: Opportunistic consumer instruction steering based on producer instruction value prediction in a multi-cluster processor is disclosed. A processor provides producer instructions and consumer instructions to a steering circuit that steers the program instructions to clusters of instruction execution circuits. An input value provided to a consumer instruction may be a produced value of a producer instruction, creating a dependency. The steering circuit steers a producer instruction to a first cluster and, in response to receiving the consumer instruction and the predicted value of the producer instruction, provides the predicted value to at least a second cluster and steers the consumer instruction to the second cluster for execution with the predicted value as the input value. A consumer instruction can be executed in a different cluster than a producer instruction without a cluster-to-cluster latency penalty, which allows the instruction loads to be better balanced among the clusters for higher processor throughput.
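A toy version of the steering decision might look like the sketch below. The least-loaded policy and the function signature are assumptions made for illustration; the abstract only says that a consumer with a predicted input value can be steered to a different cluster than its producer.

```python
from typing import List, Optional


def steer_consumer(producer_cluster: int,
                   cluster_loads: List[int],
                   predicted_value: Optional[int]) -> int:
    """Pick a cluster for a consumer instruction.

    Without a predicted value the consumer stays with its producer, which
    avoids cluster-to-cluster forwarding latency. With a predicted value it
    can go to the least-loaded cluster (a hypothetical balancing policy).
    """
    if predicted_value is None:
        return producer_cluster
    return min(range(len(cluster_loads)), key=lambda c: cluster_loads[c])
```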
-
Publication number: US11113068B1
Publication date: 2021-09-07
Application number: US16986650
Filing date: 2020-08-06
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yusuf Cagatay Tekmen, Rodney Wayne Smith, Kiran Ravi Seth, Shivam Priyadarshi
IPC: G06F9/38
Abstract: Performing flush recovery using parallel walks of sliced reorder buffers (SROBs) is disclosed herein. In one exemplary embodiment, a register mapping circuit provides a rename mapping table (RMT) comprising RMT entries representing logical register number (LRN) to physical register number (PRN) mappings. The register mapping circuit also provides an SROB comprising multiple SROB slices that each corresponds to a respective LRN. Each SROB slice tracks uncommitted instructions that write to the LRN corresponding to that SROB slice, and maintains those instructions in program order with respect to each other. Upon detecting an uncommitted instruction writing to an LRN, the register mapping circuit allocates an SROB slice entry in the SROB slice corresponding to the LRN. When a pipeline flush from a target instruction occurs, the register mapping circuit restores RMT entries of the RMT to their prior mapping states based on parallel walks of the SROB slices of the SROB.
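The per-LRN slices and their independent walks can be modeled roughly as below. This is a sketch only: the sequence numbers, the stored previous PRN, and the convention that the flush removes every instruction at or after the target are assumptions. The loop visits slices one after another, but each walk touches only its own slice and RMT entry, which is what would let hardware run them in parallel.

```python
from typing import Dict, List, Tuple

# Each slice entry holds (sequence number, PRN that was mapped before the write).
Slice = List[Tuple[int, int]]


def rename(rmt: Dict[int, int], srob: Dict[int, Slice],
           seq: int, lrn: int, new_prn: int) -> None:
    """Record the old mapping in the slice for this LRN, then update the RMT."""
    srob.setdefault(lrn, []).append((seq, rmt[lrn]))
    rmt[lrn] = new_prn


def flush_from(rmt: Dict[int, int], srob: Dict[int, Slice], target_seq: int) -> None:
    """Undo every uncommitted write with seq >= target_seq, slice by slice."""
    for lrn, entries in srob.items():
        # Walk this slice from youngest to oldest; slices are independent.
        while entries and entries[-1][0] >= target_seq:
            _, prev_prn = entries.pop()
            rmt[lrn] = prev_prn
```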
-
Publication number: US10956162B2
Publication date: 2021-03-23
Application number: US16456836
Filing date: 2019-06-28
Applicant: Microsoft Technology Licensing, LLC
Inventor: Robert Douglas Clancy, Melinda Joyce Brown, Yusuf Cagatay Tekmen, Brian Michael Stempel, Michael Scott Mcilvaine, Thomas Philip Speier, Rodney Wayne Smith, Gagan Gupta, David Tennyson Harper, III
IPC: G06F9/38
Abstract: Operand-based reach explicit dataflow processors, and related methods and computer-readable media are disclosed. The operand-based reach explicit dataflow processors support execution of a producer instruction that explicitly names a target consumer operand of a consumer instruction in a consumer operand encoding namespace of the producer instruction. The produced value from execution of the producer instruction is provided or otherwise made available as an input to the named target consumer operand of the consumer instruction as a result of processing the producer instruction. The target consumer operand is encoded in the producer instruction as an operand target distance relative to the producer instruction. Instructions in an instruction stream between the producer instruction and the targeted consumer instruction that have no operands do not consume an operand reach namespace in the producer instructions. This provides for a deeper explicit consumer naming reach for a given bit size of the operand reach namespace.
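To make the operand-distance idea concrete, here is a small sketch of resolving a producer's encoded distance to a consumer operand. The counting convention (distance 0 means the first operand of the next instruction that has operands) and the list-of-operand-counts representation are assumptions for illustration.

```python
from typing import List, Optional, Tuple


def resolve_target(operand_counts: List[int], producer_idx: int,
                   distance: int) -> Optional[Tuple[int, int]]:
    """Map an operand distance to (instruction index, operand slot).

    operand_counts[i] is the number of consumable operands of instruction i.
    Instructions with zero operands are skipped without using up any of
    the distance, so the same encoding width reaches further.
    """
    remaining = distance
    for idx in range(producer_idx + 1, len(operand_counts)):
        count = operand_counts[idx]
        if remaining < count:
            return idx, remaining
        remaining -= count
    return None  # target lies beyond the instructions provided
```

With operand counts `[1, 2, 0, 0, 2]` and the producer at index 0, a distance of 2 lands on the first operand of instruction 4, because the two operand-free instructions in between consume none of the distance.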
-
Publication number: US11803389B2
Publication date: 2023-10-31
Application number: US16738362
Filing date: 2020-01-09
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yusuf Cagatay Tekmen, Rodney Wayne Smith, Douglas C. Burger, Gagan Gupta, Kiran Ravi Seth
IPC: G06F9/38
CPC classification number: G06F9/3838, G06F9/3836
Abstract: A reach matrix scheduler circuit for scheduling instructions to be executed in a processor is disclosed. The scheduler circuit includes an N×R matrix wake-up circuit, where ‘N’ is the instruction window size of the scheduler circuit, and ‘R’ is the “reach” within the instruction window of the matrix wake-up circuit, with ‘R’ being less than ‘N’. A grant line associated with each instruction request entry in the N×R matrix wake-up circuit is coupled to ‘R’ other instruction entries among the ‘N’ instruction entries. When a producer instruction in an instruction request entry is ready for issuance, the grant line associated with the instruction request entry is activated so that any other instruction entries coupled to the grant line (i.e., within the “reach” of the instruction request entry) that consume the produced value generated by the producer instruction are “woken-up” and subsequently indicated as ready to be issued.
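The limited-reach wake-up can be approximated in software as follows. This is a behavioral sketch, not the matrix circuit: the circular "next R entries" coupling and the per-entry dependency sets are assumptions.

```python
from typing import List, Set


def grant(producer: int, waiting_on: List[Set[int]], n: int, reach: int) -> List[int]:
    """Activate the producer's grant line and return the entries that become ready.

    waiting_on[j] is the set of producer entries that entry j still waits for.
    Only the `reach` entries following the producer (wrapping around the
    n-entry window) are coupled to its grant line.
    """
    woken = []
    for d in range(1, reach + 1):
        consumer = (producer + d) % n
        if producer in waiting_on[consumer]:
            waiting_on[consumer].discard(producer)
            if not waiting_on[consumer]:
                woken.append(consumer)
    return woken
```

Because each grant line drives only R entries rather than all N, an entry outside the reach is never signaled in this model; how a real design handles such consumers is not stated in the abstract.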
-
Publication number: US11023243B2
Publication date: 2021-06-01
Application number: US16518341
Filing date: 2019-07-22
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yusuf Cagatay Tekmen, Shivam Priyadarshi, Rodney Wayne Smith
IPC: G06F9/38
Abstract: Latency-based instruction reservation clustering in a scheduler circuit in a processor is disclosed. The scheduler circuit includes a plurality of latency-based reservation circuits each having an assigned producer instruction cycle latency. Producer instructions with the same cycle latency can be clustered in the same latency-based reservation circuit. Thus, the number of reservation entries is distributed among the plurality of latency-based reservation circuits to avoid or reduce an increase in the number of scheduling path connections and complexity in each reservation circuit to avoid or reduce an increase in scheduling latency. The scheduling path connections are reduced for a given number of reservation entries over a non-clustered pick circuit, because signals (e.g., wake-up signals, pick-up signals) used for scheduling instructions in each latency-based reservation circuit do not have to have the same clock cycle latency so as to not impact performance.
-
Publication number: US10896041B1
Publication date: 2021-01-19
Application number: US16582008
Filing date: 2019-09-25
Applicant: Microsoft Technology Licensing, LLC
Inventor: Shivam Priyadarshi, Arthur Perais, Vignyan Reddy Kothinti Naresh, Yusuf Cagatay Tekmen, Rami Mohammad Al Sheikh, Rodney Wayne Smith
Abstract: Enabling early execution of move-immediate instructions having variable immediate value sizes in processor-based devices is disclosed. In one exemplary embodiment, a processor-based device provides a move-immediate logic circuit that detects a move-immediate instruction comprising an immediate value and a destination register. For frequently encountered immediate values, the move-immediate logic circuit allocates a physical register from an immediate physical register file (IPRF), and writes an IPRF tag corresponding to the allocated IPRF register into a most-recent mapping table (MRT) entry for the destination register. Subsequent move-immediate instructions embedding the same immediate value, as well as other dependent instructions, may then obtain the immediate value from the IPRF register by accessing the MRT entry. Additionally, the PE provides a frequent immediate table (FIT) for tracking occurrences of immediate values, and allocates IPRF registers for a given immediate value only when a count of occurrences of that immediate value exceeds a FIT threshold.
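The interplay of the FIT, IPRF, and MRT described in this abstract might be modeled as in the sketch below. The class and method names, the table sizes, and the handling of a full IPRF are hypothetical; only the "allocate once the count exceeds the FIT threshold" rule comes from the abstract.

```python
from typing import Dict, List, Optional


class MoveImmediateLogic:
    def __init__(self, fit_threshold: int, iprf_size: int) -> None:
        self.fit_threshold = fit_threshold
        self.iprf_size = iprf_size
        self.fit: Dict[int, int] = {}     # immediate value -> occurrence count
        self.iprf: List[int] = []         # index is the IPRF tag
        self.tag_of: Dict[int, int] = {}  # immediate value -> IPRF tag
        self.mrt: Dict[str, int] = {}     # destination register -> IPRF tag

    def on_move_immediate(self, dest_reg: str, imm: int) -> Optional[int]:
        """Process one move-immediate; return the IPRF tag used, or None."""
        tag = self.tag_of.get(imm)
        if tag is None:
            self.fit[imm] = self.fit.get(imm, 0) + 1
            # Allocate only when the count exceeds the threshold and a
            # register is free (no replacement policy in this sketch).
            if self.fit[imm] > self.fit_threshold and len(self.iprf) < self.iprf_size:
                tag = len(self.iprf)
                self.iprf.append(imm)
                self.tag_of[imm] = tag
        if tag is None:
            self.mrt.pop(dest_reg, None)  # destination no longer maps to the IPRF
        else:
            self.mrt[dest_reg] = tag
        return tag
```

With a threshold of 2, the first two moves of a given immediate return `None` and the third allocates tag 0; later moves of the same value reuse that tag through the MRT.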