Accelerating predicated instruction execution in vector processors

    Publication Number: US12164923B2

    Publication Date: 2024-12-10

    Application Number: US17853790

    Application Date: 2022-06-29

    Abstract: Methods and systems are disclosed for processing a vector by a vector processor. Techniques disclosed include receiving predicated instructions by a scheduler, each of which is associated with an opcode, a vector of elements, and a predicate. The techniques further include executing the predicated instructions. Executing a predicated instruction includes compressing elements in the instruction's vector based on an index derived from the instruction's predicate, so that the selected elements are contiguously mapped; then, after the mapped elements are processed, decompressing the processed elements, which are reverse mapped to their original positions based on the index.
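
    A minimal software sketch of the compress/process/decompress flow described above follows; the helper names and the doubling "opcode" are illustrative assumptions rather than details taken from the patent.

        // Hypothetical C++ sketch of predicate-driven compression: active lanes
        // are gathered contiguously, processed, and scattered back using the
        // same index. Names and the sample operation are assumptions.
        #include <cstddef>
        #include <vector>

        // Build the index: positions of the vector whose predicate bit is set.
        std::vector<std::size_t> build_index(const std::vector<bool>& predicate) {
            std::vector<std::size_t> index;
            for (std::size_t i = 0; i < predicate.size(); ++i)
                if (predicate[i]) index.push_back(i);
            return index;
        }

        int main() {
            std::vector<float> vec  = {1.f, 2.f, 3.f, 4.f, 5.f, 6.f, 7.f, 8.f};
            std::vector<bool>  pred = {true, false, true, true, false, false, true, false};

            // Compress: map the predicated-on elements contiguously via the index.
            std::vector<std::size_t> index = build_index(pred);
            std::vector<float> packed(index.size());
            for (std::size_t j = 0; j < index.size(); ++j)
                packed[j] = vec[index[j]];

            // Process only the packed (active) lanes; the "opcode" here is a doubling.
            for (float& x : packed)
                x *= 2.f;

            // Decompress: reverse-map the processed elements to their original lanes;
            // lanes whose predicate was off are left untouched.
            for (std::size_t j = 0; j < index.size(); ++j)
                vec[index[j]] = packed[j];

            return 0;
        }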

    INCREASING SYSTEM POWER EFFICIENCY BY OPTICAL COMPUTING

    Publication Number: US20240111355A1

    Publication Date: 2024-04-04

    Application Number: US17956606

    Application Date: 2022-09-29

    CPC classification number: G06F1/329

    Abstract: Methods and systems are disclosed for reducing power consumption by a system including a digital unit and an optical unit. Techniques disclosed comprise generating a workload signature of an incoming workload to be executed by the system and, based on the generated signature, matching the incoming workload with one of a set of stored workload profiles. The workload profiles are generated by a trace capture unit. Based on the matched profile, a task submission transaction, representing a request to execute the incoming workload on the optical unit, is sent to the optical unit of the system.
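
    The signature-matching step described above can be pictured with the following C++ sketch; the signature fields, the exact-match profile lookup, and the submit_to_optical_unit() hook are assumptions made for illustration, not the patented mechanism.

        // Illustrative only: an incoming workload's signature is matched against
        // profiles built by a trace capture unit, and optical-friendly work is
        // submitted to the optical unit. Exact-match lookup is a simplification.
        #include <cstdint>
        #include <map>
        #include <string>
        #include <tuple>

        struct WorkloadSignature {
            std::uint64_t compute_intensity;   // e.g., operations per byte
            std::uint64_t memory_footprint;    // bytes touched
            bool operator<(const WorkloadSignature& o) const {
                return std::tie(compute_intensity, memory_footprint) <
                       std::tie(o.compute_intensity, o.memory_footprint);
            }
        };

        struct WorkloadProfile {
            std::string name;
            bool optical_friendly;             // learned offline by the trace capture unit
        };

        void submit_to_optical_unit(const WorkloadProfile&) { /* task submission transaction */ }
        void run_on_digital_unit()                          { /* default execution path */ }

        void dispatch(const WorkloadSignature& sig,
                      const std::map<WorkloadSignature, WorkloadProfile>& profiles) {
            auto it = profiles.find(sig);      // match the signature to a stored profile
            if (it != profiles.end() && it->second.optical_friendly)
                submit_to_optical_unit(it->second);
            else
                run_on_digital_unit();
        }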

    BIGNUM ADDITION AND/OR SUBTRACTION WITH CARRY PROPAGATION

    Publication Number: US20240111489A1

    Publication Date: 2024-04-04

    Application Number: US17955634

    Application Date: 2022-09-29

    CPC classification number: G06F7/4981 G06F7/506

    Abstract: A processing unit includes a plurality of adders and a plurality of carry bit generation circuits. The adders add first and second X-bit binary portion values of a first Y-bit binary value and a second Y-bit binary value, where Y is a multiple of X, and generate first carry bits. The carry bit generation circuits are coupled to the adders, respectively, receive the first carry bits, and generate second carry bits based on them. The adders use the second carry bits to add the first and second X-bit binary portions of the first and second Y-bit binary values, respectively.
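
    A software analogue of the two-pass addition described above is sketched below, with 32-bit limbs standing in for the X-bit adders and the second loop standing in for the carry bit generation circuits; the limb width and function names are assumptions.

        // Two-pass bignum addition: limbs are added independently first,
        // then carries are generated and folded in. Illustrative sketch only.
        #include <cstddef>
        #include <cstdint>
        #include <vector>

        // Add two bignums stored as little-endian 32-bit limbs (X = 32 bits).
        std::vector<std::uint32_t> bignum_add(const std::vector<std::uint32_t>& a,
                                              const std::vector<std::uint32_t>& b) {
            const std::size_t n = a.size();           // Y / X limbs, same for both inputs
            std::vector<std::uint32_t> sum(n);
            std::vector<std::uint8_t>  first_carry(n);

            // Pass 1: each "adder" adds its limbs independently and records a carry.
            for (std::size_t i = 0; i < n; ++i) {
                std::uint64_t s = std::uint64_t(a[i]) + b[i];
                sum[i]         = std::uint32_t(s);
                first_carry[i] = std::uint8_t(s >> 32);
            }

            // Pass 2: generate the second carries and fold them into the next limb,
            // mirroring the carry bit generation circuits feeding the adders.
            std::uint8_t carry_in = 0;
            for (std::size_t i = 0; i < n; ++i) {
                std::uint64_t s = std::uint64_t(sum[i]) + carry_in;
                sum[i]   = std::uint32_t(s);
                carry_in = std::uint8_t((s >> 32) | first_carry[i]);
            }
            return sum;   // a final nonzero carry_in would be the overflow out of the top limb
        }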

    ACCELERATING PREDICATED INSTRUCTION EXECUTION IN VECTOR PROCESSORS

    Publication Number: US20240004656A1

    Publication Date: 2024-01-04

    Application Number: US17853790

    Application Date: 2022-06-29

    CPC classification number: G06F9/30145 G06F9/3851 G06F9/3887

    Abstract: Methods and systems are disclosed for processing a vector by a vector processor. Techniques disclosed include receiving predicated instructions by a scheduler, each of which is associated with an opcode, a vector of elements, and a predicate. The techniques further include executing the predicated instructions. Executing a predicated instruction includes compressing elements in the instruction's vector based on an index derived from the instruction's predicate, so that the selected elements are contiguously mapped; then, after the mapped elements are processed, decompressing the processed elements, which are reverse mapped to their original positions based on the index.

    IOMMU COLLOCATED RESOURCE MANAGER
    Invention Publication

    Publication Number: US20230205539A1

    Publication Date: 2023-06-29

    Application Number: US17565336

    Application Date: 2021-12-29

    CPC classification number: G06F9/3842 G06F1/26 G06N3/04

    Abstract: Devices, methods and systems for managing resources in a computing device. Information regarding resource usage is captured. A prediction is generated, based on the information, that resource usage by a processor will exceed a threshold during an upcoming time. An operating parameter of the processor is adjusted, based on the prediction. In some implementations, information regarding memory bandwidth is captured. A prediction is generated, based on the information, that a memory region stored in a first memory device will be addressed by a memory intensive instruction during an upcoming time period. Data stored in the memory region is moved to a second memory device, based on the prediction.
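
    As a rough illustration of the predict-then-adjust loop described above, the C++ sketch below uses a moving-average heuristic and a placeholder frequency hook; both are assumptions and not the mechanism claimed by the patent.

        // Illustrative resource manager: capture usage samples, predict the
        // upcoming interval, and adjust an operating parameter when the
        // prediction exceeds a threshold. Heuristic and hooks are assumed.
        #include <cstddef>
        #include <deque>
        #include <numeric>

        class ResourceManager {
        public:
            void record_usage(double utilization) {          // captured usage sample
                samples_.push_back(utilization);
                if (samples_.size() > kWindow) samples_.pop_front();
            }

            // Predict usage for the upcoming interval as the window average.
            double predict_usage() const {
                if (samples_.empty()) return 0.0;
                return std::accumulate(samples_.begin(), samples_.end(), 0.0) /
                       samples_.size();
            }

            // Adjust an operating parameter based on the prediction.
            void maybe_adjust(double threshold) {
                set_processor_frequency(/*boost=*/predict_usage() > threshold);
            }

        private:
            static constexpr std::size_t kWindow = 16;
            std::deque<double> samples_;
            void set_processor_frequency(bool /*boost*/) { /* platform-specific hook */ }
        };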

    Prefetch disable of memory requests targeting data lacking locality

    Publication Number: US11645207B2

    Publication Date: 2023-05-09

    Application Number: US17132769

    Application Date: 2020-12-23

    CPC classification number: G06F12/0862 G06F2212/6028

    Abstract: A system and method for efficiently processing memory requests are described. A processing unit includes at least a processor core, a cache, and a non-cache storage buffer capable of storing data that is prevented from being stored in the cache. While processing a memory request targeting the non-cache storage buffer, the processor core inspects a flag stored in a tag of the memory request. If the flag specifies that no data should be prefetched into the non-cache storage buffer or the cache using the request's target address, the processor core prevents such prefetching while processing this instance of the memory request. While processing a prefetch hint instruction, the processor core likewise determines from the tag whether to prevent prefetching.
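
    The flag check described above might look like the following C++ sketch; the tag bit layout and the prefetch hooks are assumptions for illustration only.

        // Illustrative only: a memory request carries a tag whose flag bits
        // can suppress prefetching into the cache and/or the non-cache buffer.
        #include <cstdint>

        struct MemoryRequest {
            std::uint64_t target_address;
            std::uint32_t tag;          // carries the no-prefetch flags among other bits
        };

        constexpr std::uint32_t kNoPrefetchIntoCache  = 1u << 0;   // assumed bit positions
        constexpr std::uint32_t kNoPrefetchIntoBuffer = 1u << 1;

        void prefetch_into_cache(std::uint64_t /*addr*/)  { /* issue cache prefetch */ }
        void prefetch_into_buffer(std::uint64_t /*addr*/) { /* issue buffer prefetch */ }

        // Handle one memory request: prefetch only where the tag flags allow it.
        void handle_request(const MemoryRequest& req) {
            if ((req.tag & kNoPrefetchIntoCache) == 0)
                prefetch_into_cache(req.target_address);
            if ((req.tag & kNoPrefetchIntoBuffer) == 0)
                prefetch_into_buffer(req.target_address);
            // ... service the demand access itself regardless of the flags ...
        }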

    PREFETCH DISABLE OF MEMORY REQUESTS TARGETING DATA LACKING LOCALITY

    Publication Number: US20220100664A1

    Publication Date: 2022-03-31

    Application Number: US17132769

    Application Date: 2020-12-23

    Abstract: A system and method for efficiently processing memory requests are described. A processing unit includes at least a processor core, a cache, and a non-cache storage buffer capable of storing data that is prevented from being stored in the cache. While processing a memory request targeting the non-cache storage buffer, the processor core inspects a flag stored in a tag of the memory request. If the flag specifies that no data should be prefetched into the non-cache storage buffer or the cache using the request's target address, the processor core prevents such prefetching while processing this instance of the memory request. While processing a prefetch hint instruction, the processor core likewise determines from the tag whether to prevent prefetching.
