Latch circuit with a bridging device
    21.
    Granted invention patent
    Status: in force

    Publication number: US09077329B2

    Publication date: 2015-07-07

    Application number: US14151715

    Filing date: 2014-01-09

    Abstract: One embodiment of the present invention sets forth a technique for capturing and holding a level of an input signal using a latch circuit that presents a low number of loads to the clock signal. The clock is only coupled to a bridging transistor and a pair of clock-activated pull-down or pull-up transistors. The level of the input signal is propagated to the output signal when the storage sub-circuit is not enabled. The storage sub-circuit is enabled by the bridging transistor and a propagation sub-circuit is activated and deactivated by the pair of clock-activated transistors.
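
    The behavior described in this abstract can be sketched at the logic level. The model below is a minimal behavioral illustration, not the patented transistor circuit: the class name `TransparentLatch` is hypothetical, and the clock polarity (transparent while the clock is high) is an assumption, since the abstract does not state it.

```python
class TransparentLatch:
    """Behavioral model of a level-sensitive latch (hypothetical name)."""

    def __init__(self, q=0):
        self.q = q  # currently held output level

    def step(self, clk, d):
        # Assumed polarity: while clk is high the storage path is not
        # enabled and the input level propagates to the output.
        if clk:
            self.q = d
        # While clk is low the storage path holds the last captured level.
        return self.q


latch = TransparentLatch()
trace = [latch.step(clk, d) for clk, d in [(1, 1), (0, 0), (0, 1), (1, 0)]]
print(trace)
```

    The model captures only the capture-and-hold behavior; the low clock loading that the abstract emphasizes is a property of the transistor-level design and is not represented here.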

    Generalized acceleration of matrix multiply accumulate operations

    Publication number: US11816481B2

    Publication date: 2023-11-14

    Application number: US17890540

    Filing date: 2022-08-18

    Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.
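
    The three dot-product steps named in this abstract (partial products, exponent-based alignment, accumulation) can be sketched in software. This is an illustrative sketch under stated assumptions, not the patented datapath: the function names, the 11-bit significand width `MANT_BITS`, and the choice to align to the smallest exponent are all assumptions made for the example.

```python
import math

MANT_BITS = 11  # assumed significand width; not taken from the patent


def decompose(x):
    """Return (m, e) with integer m such that m * 2**e approximates x."""
    frac, exp = math.frexp(x)  # x == frac * 2**exp, 0.5 <= |frac| < 1
    return int(round(frac * (1 << MANT_BITS))), exp - MANT_BITS


def aligned_dot(a, b):
    # Step 1: partial products (multiply significands, add exponents).
    partials = []
    for x, y in zip(a, b):
        mx, ex = decompose(x)
        my, ey = decompose(y)
        partials.append((mx * my, ex + ey))
    if not partials:
        return 0.0
    # Step 2: align every partial product to a common exponent.
    # Aligning to the smallest exponent is lossless with Python integers;
    # this choice is an assumption of the sketch.
    base = min(e for _, e in partials)
    aligned = [p << (e - base) for p, e in partials]
    # Step 3: accumulate the aligned partial products with an adder.
    acc = 0
    for term in aligned:
        acc += term
    return math.ldexp(acc, base)


print(aligned_dot([1.5, 2.0, 0.25], [4.0, 0.5, 8.0]))  # 9.0
```

    Because the partial products share one exponent after alignment, the accumulation is plain integer addition. A fixed-width hardware adder cannot shift without bound the way Python integers can, and the abstract does not say how the datapath handles that.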

    Generalized acceleration of matrix multiply accumulate operations

    Publication number: US11797301B2

    Publication date: 2023-10-24

    Application number: US17141082

    Filing date: 2021-01-04

    Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.
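
    At the matrix level, the abstract says each element of the result is produced from a dot product of a pair of vectors taken from the operands. A minimal reference sketch of that structure follows, assuming the common form D = A x B + C with row-major lists of lists; the function name `mma` and the operand layout are illustrative assumptions, not details from the patent.

```python
def mma(a, b, c):
    """Return D = A x B + C, computing one dot product per output element."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions of A and B do not match")
    if len(c) != rows or any(len(row) != cols for row in c):
        raise ValueError("C must have the shape of the result matrix")
    d = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            row_vec = a[i]                              # row i of A
            col_vec = [b[k][j] for k in range(inner)]   # column j of B
            dot = sum(x * y for x, y in zip(row_vec, col_vec))
            out_row.append(dot + c[i][j])               # accumulate into C
        d.append(out_row)
    return d


print(mma([[1, 2], [3, 4]], [[5, 6], [7, 8]], [[1, 1], [1, 1]]))
# [[20, 23], [44, 51]]
```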

    GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS

    Publication number: US20210311734A1

    Publication date: 2021-10-07

    Application number: US17351175

    Filing date: 2021-06-17

    Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

    Stochastic rounding of numerical values

    Publication number: US10684824B2

    Publication date: 2020-06-16

    Application number: US16001838

    Filing date: 2018-06-06

    Abstract: A method, computer readable medium, and system are disclosed for rounding numerical values. A set of bits from an input value is identified as a rounding value. A second set of bits representing a second value is extracted from the input value and added to the rounding value to produce a sum. The sum is truncated to produce the rounded output value. Thus, the present invention provides a stochastic rounding technique that rounds up an input value as a function of a second value and a rounding value, both of which were obtained from the input value. When the second value and rounding value are obtained from consistent bit locations of the input value, the resulting output value is deterministic. Stochastic rounding, which is deterministic, is advantageously applicable in deep learning applications.
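
    The add-then-truncate scheme in this abstract can be illustrated on unsigned integers. The bit layout below is an assumption made for the sketch (the abstract does not say which bits are used): the lowest `frac_bits` bits serve as the rounding value, everything above them is the second value, and the sum is truncated by another `frac_bits` bits. The function name `round_with_own_bits` is hypothetical.

```python
def round_with_own_bits(x, frac_bits=8):
    """Round x using its own low bits as the rounding value.

    Assumed layout of x: [kept bits | fraction field | rounding field],
    where the two low fields are each frac_bits wide.
    """
    if x < 0:
        raise ValueError("sketch handles unsigned values only")
    mask = (1 << frac_bits) - 1
    rounding_value = x & mask        # bits reused as pseudo-random noise
    second_value = x >> frac_bits    # kept bits plus the fraction field
    total = second_value + rounding_value
    return total >> frac_bits        # truncate the fraction field


for value in (0x18000, 0x1807F, 0x18080, 0x180FF):
    print(hex(value), "->", round_with_own_bits(value))
```

    All four inputs have the same fraction field (0x80, exactly one half), yet two of them round down and two round up depending on their lowest bits. Repeating a call with the same input always gives the same output, which matches the abstract's point that the result is deterministic when the bit locations are fixed.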

    GROUND-REFERENCED SINGLE-ENDED SIGNALING CONNECTED GRAPHICS PROCESSING UNIT MULTI-CHIP MODULE
    26.
    Invention application
    Status: in force

    Publication number: US20140281383A1

    Publication date: 2014-09-18

    Application number: US13973952

    Filing date: 2013-08-22

    Abstract: A system of interconnected chips comprising a multi-chip module (MCM) includes a processor chip, a system functions chip, and an MCM package configured to include the processor chip, the system functions chip, and an interconnect circuit. The processor chip is configured to include a first ground-referenced single-ended signaling interface circuit. A first set of electrical traces is manufactured within the MCM package and configured to couple the first single-ended signaling interface circuit to the interconnect circuit. The system functions chip is configured to include a second single-ended signaling interface circuit and a host interface. A second set of electrical traces is manufactured within the MCM package and configured to couple the host interface to at least one external pin of the MCM package. In one embodiment, each single-ended signaling interface advantageously implements ground-referenced single-ended signaling.

    DUAL-TRIGGER LOW-ENERGY FLIP-FLOP CIRCUIT
    27.
    Invention application
    Status: in force

    Publication number: US20130278315A1

    Publication date: 2013-10-24

    Application number: US13921138

    Filing date: 2013-06-18

    CPC classification number: H03K3/36 H03K3/012 H03K3/356121

    Abstract: One embodiment of the present invention sets forth a technique for capturing and storing a level of an input signal using a dual-trigger low-energy flip-flop circuit that is fully static and insensitive to fabrication process variations. The dual-trigger low-energy flip-flop circuit presents only three transistor gate loads to the clock signal, and none of the internal nodes toggle when the input signal remains constant. One of the clock signals may be a low-frequency “keeper clock” that toggles less frequently than the other clock signal, which is input to two transistor gates. The output signal Q is set or reset at the rising clock edge using separate trigger sub-circuits. Either the set or reset may be armed while the clock signal is low, and the set or reset is triggered at the rising edge of the clock.
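
    The arm-then-trigger sequence in this abstract can be modeled behaviorally. The sketch below is an illustration under assumptions, not the patented circuit: the class name `DualTriggerFlipFlop` is hypothetical, and the arming rule (arm a trigger only when the input differs from the stored output) is an assumption chosen to reflect the statement that nothing toggles while the input stays constant.

```python
class DualTriggerFlipFlop:
    """Behavioral model with separate set and reset triggers (hypothetical)."""

    def __init__(self, q=0):
        self.q = q
        self.prev_clk = 0
        self.armed = None  # "set", "reset", or None

    def step(self, clk, d):
        if not clk:
            # Clock low: arm the trigger that would change Q, if any.
            if d and not self.q:
                self.armed = "set"
            elif not d and self.q:
                self.armed = "reset"
            else:
                self.armed = None
        elif not self.prev_clk:
            # Rising edge: fire whichever trigger was armed.
            if self.armed == "set":
                self.q = 1
            elif self.armed == "reset":
                self.q = 0
            self.armed = None
        self.prev_clk = clk
        return self.q


ff = DualTriggerFlipFlop()
waveform = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]
print([ff.step(clk, d) for clk, d in waveform])
```

    In this model a change on the input while the clock is high has no effect until the next low phase arms a trigger.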

    Generalized acceleration of matrix multiply accumulate operations

    Publication number: US11797302B2

    Publication date: 2023-10-24

    Application number: US17351161

    Filing date: 2021-06-17

    Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

    Fault injection architecture for resilient GPU computing

    Publication number: US11669421B2

    Publication date: 2023-06-06

    Application number: US17591481

    Filing date: 2022-02-02

    Abstract: Unavoidable physical phenomena, such as alpha particle strikes, can cause soft errors in integrated circuits. Materials that emit alpha particles are ubiquitous, and higher energy cosmic particles penetrate the atmosphere and also cause soft errors. Some soft errors have no consequence, but others can cause an integrated circuit to malfunction. In some applications (e.g., driverless cars), proper operation of integrated circuits is critical to human life and safety. To minimize or eliminate the likelihood of a soft error becoming a serious malfunction, detailed assessment of individual potential soft errors and subsequent processor behavior is necessary. Embodiments of the present disclosure facilitate emulating a plurality of different, specific soft errors. Resilience may be assessed over the plurality of soft errors and application code may be advantageously engineered to improve resilience. Normal processor execution is halted to inject a given state error through a scan chain, and execution is subsequently resumed.
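
    The halt, inject, resume flow in this abstract can be illustrated with a toy software model. Everything in the sketch is an assumption made for illustration: the workload (a running maximum held in one state register), the function names `run_max` and `campaign`, and the use of an XOR bit flip as a stand-in for shifting a corrupted state through a scan chain. It is not the GPU architecture the patent describes.

```python
def run_max(values, inject=None):
    """Compute a running maximum; optionally flip one state bit mid-run.

    inject is (step, bit): before processing values[step], execution is
    'halted', the given bit of the state register is flipped, and
    execution resumes.
    """
    best = 0  # the single state register of this toy machine
    for step, value in enumerate(values):
        if inject is not None and step == inject[0]:
            best ^= 1 << inject[1]  # stand-in for a scan-chain state write
        best = max(best, value)
    return best


def campaign(values, bits=8):
    """Inject every (step, bit) fault and record whether the output changed."""
    golden = run_max(values)
    outcomes = {}
    for step in range(len(values)):
        for bit in range(bits):
            outcomes[(step, bit)] = run_max(values, (step, bit)) != golden
    return golden, outcomes


golden, outcomes = campaign([3, 9, 4, 12, 5])
corrupting = sum(outcomes.values())
print(golden, corrupting, len(outcomes) - corrupting)
```

    Sweeping every injection point separates faults that are masked by later computation from those that corrupt the final result, which is the kind of per-error assessment the abstract says is needed to evaluate resilience.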

    GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS

    Publication number: US20210311733A1

    Publication date: 2021-10-07

    Application number: US17351161

    Filing date: 2021-06-17

    Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.
