GENERATING VECTORIZED CONTROL FLOW USING RECONVERGING CONTROL FLOW GRAPHS

    Publication No.: US20200310766A1

    Publication Date: 2020-10-01

    Application No.: US16370079

    Application Date: 2019-03-29

    Inventor: Nicolai Haehnle

    Abstract: A reconverging control flow graph is generated by receiving an input control flow graph including a plurality of basic code blocks, determining an order of the basic code blocks, and traversing the input control flow graph. The input control flow graph is traversed by visiting, according to the determined order, each basic code block B prior to visiting any subsequent block C and, based on determining that the basic code block B has a prior block A with an open edge AC to the subsequent block C, creating in the reconverging control flow graph an edge AF1 between the prior block A and a flow block F1 and an edge F1C between the flow block F1 and the subsequent block C.
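The traversal described in the abstract can be illustrated with a small sketch. The following choices are assumptions, not the patented algorithm: blocks are strings, edges are pairs, a new flow block is created only when more than one open edge crosses the block currently being visited, and each flow block is placed at the visit point in the order.

```python
def insert_flow_blocks(order, edges):
    """Reroute edges that 'jump over' the block being visited through a
    fresh flow block, so control flow reconverges at one point.

    A sketch only; the patent's edge-rewiring rules are more involved.
    """
    edges = set(edges)
    pos = {b: i for i, b in enumerate(order)}
    flow_id = 0
    for i, b in enumerate(order):
        # "Open" edges at B: source already visited, target still ahead.
        crossing = [(a, c) for (a, c) in edges if pos[a] < i and pos[c] > i]
        if len(crossing) > 1:  # merge only when several edges cross B
            flow_id += 1
            f = f"F{flow_id}"
            pos[f] = i  # flow block takes the current visit position
            for (a, c) in crossing:
                edges.discard((a, c))
                edges.add((a, f))  # edge A -> F
                edges.add((f, c))  # edge F -> C
    return edges
```

On a diamond-with-shortcuts graph, the two edges into D that bypass C get funneled through one flow block, so D has a single reconvergence point.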

    DYNAMIC INSTANCES SEMANTICS
    Invention Application

    Publication No.: US20200301681A1

    Publication Date: 2020-09-24

    Application No.: US16544796

    Application Date: 2019-08-19

    Inventor: Nicolai Haehnle

    Abstract: A computing system includes a processor and a memory storing instructions for a compiler that, when executed by the processor, cause the processor to generate a control flow graph of program source code by receiving the program source code in the compiler, generating a structure point representation based on the received program source code by inserting into the program source code a set of structure points including an anchor structure point and a join structure point associated with the anchor structure point, and based on the structure point representation, generating the control flow graph including a plurality of blocks each representing a portion of the program source code. In the control flow graph, a block A between the anchor structure point and the join structure point post-dominates each of one or more divergent branches between the anchor structure point and the join structure point.
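The post-dominance property stated at the end of the abstract can be checked with a standard iterative dataflow computation. This is the generic textbook method, not the patent's compiler machinery:

```python
def postdominators(cfg, exit_block):
    """Compute post-dominator sets for a CFG given as {block: [successors]}.
    A block P post-dominates B if every path from B to the exit passes
    through P."""
    blocks = list(cfg)
    pdom = {b: set(blocks) for b in blocks}  # start from the full set
    pdom[exit_block] = {exit_block}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for b in blocks:
            if b == exit_block or not cfg[b]:
                continue
            new = {b} | set.intersection(*(pdom[s] for s in cfg[b]))
            if new != pdom[b]:
                pdom[b] = new
                changed = True
    return pdom
```

For a divergent branch A with successors B and C that rejoin at J, the computation confirms that J post-dominates both branches, which is the property the structure-point representation guarantees by construction.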

    Memory with expandable row width
    Granted Patent

    Publication No.: US10783953B2

    Publication Date: 2020-09-22

    Application No.: US15830176

    Application Date: 2017-12-04

    Abstract: A method for operating a memory device includes initiating an access operation to a corresponding row of an array of bit cells of the memory device. Responsive to an expansion mode signal having a first state, the method further includes dynamically operating each column of a plurality of columns of the array to access each bit cell of a corresponding row within the plurality of columns during the access operation. Alternatively, responsive to the expansion mode signal having a second state different than the first state, the method includes dynamically operating each column of a first subset of columns of the plurality of columns to access each bit cell of a corresponding row within the first subset of columns during the access operation, and maintaining each column of a second subset of columns of the plurality of columns in a static state during the access operation.
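A behavioral sketch of the two modes follows. This is a Python model, not RTL; the function name and the way the first subset is selected are assumptions, and "static" columns are modeled simply by not reading them:

```python
def access_row(array, row, expansion_mode, first_subset):
    """Read one row of a bit-cell array. In expansion mode every column is
    operated; otherwise only the columns in first_subset are driven and
    the remaining columns stay static (not accessed at all)."""
    if expansion_mode:                     # first state: all columns active
        active = range(len(array[row]))
    else:                                  # second state: subset only
        active = first_subset
    return {col: array[row][col] for col in active}
```

In hardware, the equivalent of the `else` branch would gate the column drivers and sense amplifiers of the second subset to save power.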

    DETECTING VOICE REGIONS IN A NON-STATIONARY NOISY ENVIRONMENT

    Publication No.: US20200294534A1

    Publication Date: 2020-09-17

    Application No.: US16355676

    Application Date: 2019-03-15

    Inventor: A. Srinivas

    Abstract: Methods, devices, and systems for voice activity detection. An audio signal is received by receiver circuitry. A pitch analysis is performed on the received audio signal by pitch analysis circuitry. A higher-order statistics analysis is performed on the audio signal by statistics analysis circuitry. Logic circuitry determines, based on the pitch analysis and the higher-order statistics analysis, whether the audio signal includes a voice region. The logic circuitry outputs a signal indicating that the audio signal includes voice if the audio signal was determined to include a voice region or indicating that the audio signal does not include voice if the audio signal was determined not to include a voice region.
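A minimal sketch of the two analyses: autocorrelation is a common stand-in for pitch analysis, and excess kurtosis is one common higher-order statistic. The thresholds and the rule that both cues must agree are assumptions; the abstract leaves the logic circuitry's decision rule unspecified.

```python
def has_pitch(frame, sr=8000, fmin=80, fmax=400, threshold=0.4):
    """Crude autocorrelation pitch detector: look for a strong peak at a
    lag inside the human pitch range."""
    n = len(frame)
    energy = sum(x * x for x in frame) or 1e-12
    best = 0.0
    for lag in range(int(sr / fmax), min(int(sr / fmin), n - 1) + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(n - lag)) / energy
        best = max(best, r)
    return best > threshold

def excess_kurtosis(frame):
    """Fourth-order statistic: voiced speech tends to be super-Gaussian
    (positive excess kurtosis); stationary noise generally is not."""
    n = len(frame)
    mean = sum(frame) / n
    var = sum((x - mean) ** 2 for x in frame) / n or 1e-12
    m4 = sum((x - mean) ** 4 for x in frame) / n
    return m4 / var ** 2 - 3.0

def is_voice(frame, sr=8000):
    # Fusing the two cues with AND is an assumption about the logic stage.
    return has_pitch(frame, sr) and excess_kurtosis(frame) > 0.0
```

A glottal-pulse-like train (periodic and spiky) passes both tests, while a uniformly distributed signal fails the kurtosis test even when it is strongly autocorrelated.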

    ADAPTIVE CACHE RECONFIGURATION VIA CLUSTERING
    Invention Application

    Publication No.: US20200293445A1

    Publication Date: 2020-09-17

    Application No.: US16355168

    Application Date: 2019-03-15

    Abstract: A method of dynamic cache configuration includes determining, for a first clustering configuration, whether a current cache miss rate exceeds a miss rate threshold. The first clustering configuration includes a plurality of graphics processing unit (GPU) compute units clustered into a first plurality of compute unit clusters. The method further includes clustering, based on the current cache miss rate exceeding the miss rate threshold, the plurality of GPU compute units into a second clustering configuration having a second plurality of compute unit clusters fewer than the first plurality of compute unit clusters.
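A toy model of the reconfiguration decision. Halving the cluster count and the even CU-to-cluster mapping are assumptions; the abstract only requires moving to a configuration with fewer clusters when the miss rate is high:

```python
def reconfigure(num_cus, cluster_count, miss_rate, threshold):
    """Return the new cluster count and a CU -> cluster assignment.
    When the current miss rate exceeds the threshold, fall back to half
    as many clusters, so each cluster shares a larger effective cache."""
    if miss_rate > threshold and cluster_count > 1:
        cluster_count = max(1, cluster_count // 2)
    per_cluster = num_cus // cluster_count
    return cluster_count, [cu // per_cluster for cu in range(num_cus)]
```

Fewer clusters means more compute units share each cache partition, trading sharing conflicts for capacity when misses dominate.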

    METHOD AND APPARATUS FOR PEER-TO-PEER MESSAGING IN HETEROGENEOUS MACHINE CLUSTERS

    Publication No.: US20200293387A1

    Publication Date: 2020-09-17

    Application No.: US16887643

    Application Date: 2020-05-29

    Inventor: Shuai Che

    Abstract: Various computing network messaging techniques and apparatus are disclosed. In one aspect, a method of computing is provided that includes executing a first thread and a second thread. A message is sent from the first thread to the second thread. The message includes a domain descriptor that identifies a first location of the first thread and a second location of the second thread.
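The message format can be sketched as follows. The specific descriptor fields (node, device kind, thread id) and the mailbox routing are illustrative assumptions; the abstract only requires that the descriptor encode both threads' locations:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DomainDescriptor:
    """Identifies both endpoints of a message in a heterogeneous cluster."""
    src_node: int
    src_device: str   # e.g. "cpu" or "gpu"
    src_thread: int
    dst_node: int
    dst_device: str
    dst_thread: int

def send(mailboxes, desc, payload):
    """Route a (descriptor, payload) message to the destination thread's
    mailbox, keyed by the location carried in the descriptor."""
    key = (desc.dst_node, desc.dst_device, desc.dst_thread)
    mailboxes.setdefault(key, []).append((desc, payload))
```

Because the descriptor travels with the message, the receiver can reply without any external lookup of where the sender lives.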

    PROCESSING UNIT WITH MIXED PRECISION OPERATIONS

    Publication No.: US20200293286A1

    Publication Date: 2020-09-17

    Application No.: US16591031

    Application Date: 2019-10-02

    Abstract: A graphics processing unit (GPU) implements operations, with associated op codes, to perform mixed precision mathematical operations. The GPU includes an arithmetic logic unit (ALU) with different execution paths, wherein each execution path executes a different mixed precision operation. By implementing mixed precision operations at the ALU in response to designated op codes that delineate the operations, the GPU efficiently increases the precision of specified mathematical operations while reducing execution overhead.
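One mixed-precision path can be sketched as an fp16 x fp16 multiply accumulated at higher precision, dispatched by op code. The op-code value and the choice of operation are assumptions; the abstract does not enumerate the GPU's actual op codes:

```python
import struct

def to_fp16(x):
    """Round a Python float (double) to IEEE half precision and back,
    emulating an fp16 operand register."""
    return struct.unpack("e", struct.pack("e", x))[0]

def mad_f16_f32(a, b, acc):
    """Mixed-precision multiply-add: fp16 inputs, wider accumulate.
    (Python floats are doubles, so fp32 result rounding is not modeled.)"""
    return to_fp16(a) * to_fp16(b) + acc

OP_TABLE = {0x20: mad_f16_f32}  # op code -> execution path (illustrative)

def execute(opcode, *operands):
    """Dispatch an op code to its dedicated execution path."""
    return OP_TABLE[opcode](*operands)
```

Accumulating at higher precision than the inputs is the usual motivation for such paths: narrow operands save bandwidth while the wide accumulator limits rounding error.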

    Multi-tiered low power states
    Granted Patent

    Publication No.: US10775874B2

    Publication Date: 2020-09-15

    Application No.: US16210985

    Application Date: 2018-12-05

    Abstract: A computer processing device transitions among a plurality of power management states and at least one power management sub-state. From a first state, it is determined whether an entry condition for a third state is satisfied. If the entry condition for the third state is satisfied, the third state is entered. If the entry condition for the third state is not satisfied, it is determined whether an entry condition for the first sub-state is satisfied. If the entry condition for the first sub-state is determined to be satisfied, the first sub-state is entered, a first sub-state residency timer is started, and after expiry of the first sub-state residency timer, the first sub-state is exited, the first state is re-entered, and it is re-determined whether the entry condition for the third state is satisfied.
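The decision sequence from the first state can be sketched as a small state machine. The state names (S1, S1a, S3) and the callable entry conditions are illustrative:

```python
def step_from_s1(s3_ready, sub1_ready):
    """From the first state: enter the third state if its entry condition
    holds; otherwise enter the first sub-state and start its residency
    timer; otherwise remain in the first state.
    Returns (next_state, residency_timer_started)."""
    if s3_ready():
        return "S3", False
    if sub1_ready():
        return "S1a", True   # residency timer started on entry
    return "S1", False

def on_residency_expiry(s3_ready, sub1_ready):
    """When the S1a residency timer expires: exit the sub-state, re-enter
    the first state, and re-evaluate the transitions from there."""
    return step_from_s1(s3_ready, sub1_ready)
```

The sub-state thus acts as a bounded holding pattern: the system keeps returning to S1 and retrying the deeper S3 state each time the residency timer runs out.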

    IMPLEMENTING A MICRO-OPERATION CACHE WITH COMPACTION

    Publication No.: US20200285466A1

    Publication Date: 2020-09-10

    Application No.: US16297358

    Application Date: 2019-03-08

    Abstract: Systems, apparatuses, and methods for compacting multiple groups of micro-operations into individual cache lines of a micro-operation cache are disclosed. A processor includes at least a decode unit and a micro-operation cache. When a new group of micro-operations is decoded and ready to be written to the micro-operation cache, the micro-operation cache determines which set is targeted by the new group of micro-operations. If there is a way in this set that can store the new group without evicting any existing group already stored in the way, then the new group is stored into the way with the existing group(s) of micro-operations. Metadata is then updated to indicate that the new group of micro-operations has been written to the way. Additionally, the micro-operation cache manages eviction and replacement policy at the granularity of micro-operation groups rather than at the granularity of cache lines.
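The compaction and group-granularity replacement can be sketched as below. The cache geometry, the capacity unit (micro-ops per line), and the FIFO-at-group-granularity eviction are assumptions, not details from the abstract:

```python
class UopCache:
    """Set-associative micro-op cache that compacts several uop groups
    into one way (cache line) when they fit together."""

    def __init__(self, num_sets=4, num_ways=2, line_capacity=8):
        self.num_sets, self.cap = num_sets, line_capacity
        # each way is a list of (start_pc, group) entries sharing the line
        self.sets = [[[] for _ in range(num_ways)] for _ in range(num_sets)]

    def _used(self, way):
        return sum(len(group) for _, group in way)

    def insert(self, pc, group):
        ways = self.sets[pc % self.num_sets]
        # First try to place the new group next to existing groups.
        for way in ways:
            if self._used(way) + len(group) <= self.cap:
                status = "compacted" if way else "filled"
                way.append((pc, group))  # metadata update would go here
                return status
        # No room anywhere: evict older groups (not whole lines) from way 0.
        victim = ways[0]
        while victim and self._used(victim) + len(group) > self.cap:
            victim.pop(0)  # replacement at micro-op group granularity
        victim.append((pc, group))
        return "evicted-groups"
```

Note that eviction removes individual groups from a line until the newcomer fits, rather than invalidating the whole cache line, mirroring the group-granularity replacement policy in the abstract.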
